Weighted 2 x 2 kappa coefficients: recommended indices of diagnostic accuracy for evidence-based practice

J Clin Epidemiol. 2009 Oct;62(10):1045-53. doi: 10.1016/j.jclinepi.2008.11.012. Epub 2009 Mar 10.

Abstract

Objectives: The diagnostic accuracy of a clinical test is typically evaluated by comparing the test's classification of individuals with their classification by a diagnostic gold standard. The most popular indices of diagnostic accuracy are sensitivity, specificity, and positive and negative predictive values. However, all of these measures are affected by the fact that some diagnostic decisions will be correct by chance. As a result, their values can differ widely from one index to another, and clinicians may be unsure whether the results indicate that a test is good or poor. In keeping with the principle of evidence-based practice that decisions should be based on evidence, not guesswork, we might expect measures of diagnostic accuracy to be chance corrected. The objective of this article is to advocate chance correction and to draw attention to indices that meet this requirement.
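
To make the chance problem concrete, here is a minimal Python sketch that computes the four conventional indices from a 2 x 2 table, together with the values a test unrelated to the disease would be expected to reach at the same prevalence and the same proportion of positive results. The function name and the counts are hypothetical illustrations, not taken from the article.

```python
def accuracy_indices(tp, fn, fp, tn):
    """Return (observed, chance) dicts for a 2 x 2 diagnostic table.

    tp, fn, fp, tn are counts of true positives, false negatives,
    false positives and true negatives (hypothetical argument names).
    """
    n = tp + fn + fp + tn
    prevalence = (tp + fn) / n          # proportion with the disease
    test_positive_rate = (tp + fp) / n  # proportion testing positive

    observed = {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (fp + tn),
        "ppv": tp / (tp + fp),
        "npv": tn / (fn + tn),
    }
    # Expected values if the test result were independent of disease status.
    chance = {
        "sensitivity": test_positive_rate,
        "specificity": 1 - test_positive_rate,
        "ppv": prevalence,
        "npv": 1 - prevalence,
    }
    return observed, chance


# Made-up counts: 1000 individuals, 5% prevalence, 10% test positive.
observed, chance = accuracy_indices(tp=40, fn=10, fp=60, tn=890)
for name in observed:
    print(f"{name}: observed {observed[name]:.3f}, "
          f"expected by chance {chance[name]:.3f}")
```

With these made-up counts the negative predictive value is about 0.99, but 0.95 would be expected by chance alone, whereas the sensitivity of 0.80 compares with a chance value of only 0.10. The raw indices therefore sit on very different baselines.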

Study design and setting: The principles underlying calculations of diagnostic accuracy are presented as a framework for understanding the problem and its solution.

Results: Disparities between different indices of diagnostic accuracy may be resolved by adjusting them to correct for chance effects. This produces a pair of weighted 2 x 2 "diagnostic" kappa coefficients offering a number of theoretical and practical advantages.
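
The abstract does not give the formulas, so the following Python sketch rests on an assumption: that each index is corrected in the usual kappa form, (observed - chance) / (1 - chance), with the chance value taken as what a test independent of disease status would achieve. The function names and counts are hypothetical. Under that assumption the four corrected indices collapse into two values, which is one way to see how a pair of coefficients can arise.

```python
def chance_corrected(observed, chance):
    """Kappa-style correction: 0 means chance level, 1 means perfect."""
    return (observed - chance) / (1 - chance)


def diagnostic_kappas(tp, fn, fp, tn):
    """Chance-corrected versions of the four conventional indices.

    Assumes no margin of the table is empty and that prevalence and the
    test-positive rate are strictly between 0 and 1; otherwise a
    division by zero occurs.
    """
    n = tp + fn + fp + tn
    p = (tp + fn) / n  # prevalence
    q = (tp + fp) / n  # proportion testing positive

    sensitivity = tp / (tp + fn)
    specificity = tn / (fp + tn)
    ppv = tp / (tp + fp)
    npv = tn / (fn + tn)

    return {
        "sensitivity": chance_corrected(sensitivity, q),
        "npv": chance_corrected(npv, 1 - p),
        "specificity": chance_corrected(specificity, 1 - q),
        "ppv": chance_corrected(ppv, p),
    }


# Made-up counts: 1000 individuals, 5% prevalence, 10% test positive.
kappas = diagnostic_kappas(tp=40, fn=10, fp=60, tn=890)
for name, value in kappas.items():
    print(f"chance-corrected {name}: {value:.3f}")
```

In this example the corrected sensitivity and corrected negative predictive value coincide (about 0.78), as do the corrected specificity and corrected positive predictive value (about 0.37). Algebraically, both members of each pair reduce to the same expression in the table's cell proportions, so the agreement is not specific to these counts.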

Conclusion: Routine use of weighted 2 x 2 kappa coefficients as indices of diagnostic accuracy is recommended.

MeSH terms

  • Data Interpretation, Statistical
  • Diagnostic Errors
  • Diagnostic Tests, Routine / standards*
  • Evidence-Based Medicine / methods*
  • Evidence-Based Medicine / standards
  • Humans
  • Predictive Value of Tests
  • Prevalence