The interrater reliability of physical examination tests that may predict the outcome or suggest the need for lumbar stabilization exercises

J Orthop Sports Phys Ther. 2013 Feb;43(2):83-90. doi: 10.2519/jospt.2013.4310. Epub 2013 Jan 14.

Abstract

Study design: Interrater reliability.

Objectives: (1) To examine the interrater reliability of an existing clinical prediction rule (CPR) to predict the success of lumbar stabilization exercises (LSE), and (2) to examine the interrater reliability of 4 clinical tests that may be useful in determining the need for LSE.

Background: Physical therapists commonly use LSE to manage patients with low back pain. The clinical efficacy of LSE is unclear. A CPR has been previously suggested to identify patients most likely to benefit from LSE. The passive lumbar extension test, lumbar extension load test, active straight leg raise test, and active hip abduction test are 4 clinical tests that may also suggest the need for LSE. The reliability of these tests has not been established sufficiently.

Methods: Thirty patients with low back pain, all participants in a larger randomized clinical trial, were each assessed with all tests by 2 independent examiners. Kappa coefficients with 95% confidence intervals (CIs) were calculated to establish the interrater reliability of the CPR and individual tests.
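
As an illustration of the statistic named in the Methods, the following is a minimal sketch of how a kappa coefficient and an approximate 95% CI could be computed for 2 examiners. It assumes Cohen's kappa for categorical ratings and a simple large-sample standard error; the abstract does not state which kappa variant or CI method was used. The function names (`cohen_kappa`, `kappa_ci`) and the ratings in the usage example are hypothetical and are not study data.

```python
import math


def cohen_kappa(rater_a, rater_b):
    """Return (kappa, p_o, p_e) for two raters' categorical ratings.

    p_o is the observed proportion of agreement; p_e is the agreement
    expected by chance from each rater's marginal category frequencies.
    """
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("both raters must rate the same non-empty set of patients")
    n = len(rater_a)
    categories = set(rater_a) | set(rater_b)
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    p_e = sum((rater_a.count(c) / n) * (rater_b.count(c) / n) for c in categories)
    if p_e == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (p_o - p_e) / (1 - p_e), p_o, p_e


def kappa_ci(kappa, p_o, p_e, n, z=1.96):
    """Approximate CI using a simple large-sample standard error (an assumption).

    Bounds are truncated to the valid kappa range [-1, 1].
    """
    se = math.sqrt(p_o * (1 - p_o) / (n * (1 - p_e) ** 2))
    return max(-1.0, kappa - z * se), min(1.0, kappa + z * se)


# Hypothetical ratings (1 = positive test, 0 = negative test), not study data.
examiner_1 = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
examiner_2 = [1, 1, 1, 1, 0, 0, 0, 0, 0, 1]

kappa, p_o, p_e = cohen_kappa(examiner_1, examiner_2)
lower, upper = kappa_ci(kappa, p_o, p_e, n=len(examiner_1))
print(f"kappa = {kappa:.2f}; 95% CI: {lower:.2f}, {upper:.2f}")
```

With few patients the standard error is large, so the interval is wide and the upper bound is often truncated at 1.00.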

Results: The interrater reliability of the CPR was excellent (κ = 0.86; 95% CI: 0.65, 1.00). The interrater reliability of the individual items making up the CPR, as well as that of the passive lumbar extension test, was substantial (κ = 0.64-0.73 and κ = 0.76, respectively; 95% CI: 0.46, 1.00). The interrater reliability of the active straight leg raise test (κ = 0.53; 95% CI: 0.20, 0.84) and lumbar extension load test (κ = 0.47; 95% CI: 0.14, 0.78) was moderate. The interrater reliability of the active hip abduction test was poor (κ = -0.09; 95% CI: -0.35, 0.27).

Conclusion: With the exception of the active hip abduction test, the clinical tests examined can be considered sufficiently reliable for clinical use. The relatively small sample size likely contributed to the fairly wide confidence intervals around some of the reliability indices.

MeSH terms

  • Adult
  • Confidence Intervals
  • Female
  • Humans
  • Israel
  • Low Back Pain / therapy*
  • Lumbar Vertebrae / injuries*
  • Male
  • Observer Variation
  • Physical Examination / methods*
  • Physical Therapy Modalities*
  • Predictive Value of Tests
  • Reproducibility of Results