Objective To review the diagnostic accuracy of the Ottawa Ankle and Midfoot Rules and explore if clinical features and/or methodological quality of the study influence diagnostic accuracy estimates.
Design Systematic review with meta-analysis.
Data sources MEDLINE, EMBASE, CINAHL, SPORTDiscus and Cochrane Library.
Eligibility criteria for selecting studies Primary diagnostic studies reporting the accuracy of the Rules in people with ankle and/or midfoot injury were retrieved. Diagnostic accuracy estimates, overall and for subgroups (patient’s age, profession of the assessor and setting of application), were made. Sensitivity analyses included studies with a low risk of bias and studies where all patients received radiographs.
Results 66 studies were included. The Ankle and Midfoot Rules had similar accuracy: sensitivity and negative likelihood ratios were high and homogeneous, whereas specificity and positive likelihood ratios were poor and heterogeneous (pooled sensitivity of the Ankle Rules, mean (95% CI): 99.4% (97.9% to 99.8%); specificity: 35.3% (28.8% to 42.3%)). Sensitivity of the Ankle Rules was higher in adults than in children, but the profession of the assessor did not appear to influence accuracy. Specificity was higher for the Midfoot than for the Ankle Rules. There were not enough studies to allow comparison according to setting of application. Studies with a low risk of bias and studies where all patients received radiographs provided lower accuracy estimates. Heterogeneity in specificity was not explained by assessor training, use of imaging in all patients or low risk of bias.
Conclusions Study features and the methodological quality influence estimates of the diagnostic accuracy of the Ottawa Ankle and Midfoot Rules.
Ankle sprains and fractures are common.1 While both have similar acute presentation, these injuries need to be distinguished as they are managed differently.2–4 The reference standard for diagnosing an ankle injury as a sprain or fracture involves the use of imaging, usually plain radiographs.4 However, the routine use of plain radiographs to diagnose an ankle injury contributes significantly to high healthcare costs, increases the waiting time in busy emergency departments and exposes patients to often unnecessary radiation.5 Therefore, the routine use of imaging should be avoided.
The Ottawa Ankle and Midfoot Rules are the most commonly used clinical prediction rules to identify patients with a low probability of ankle or midfoot fracture who do not require radiographic examination.6 The Rules state that patients should be referred for radiographic examination or other medical imaging if they have pain in the malleolar area (for the ankle) or the midfoot area (for the midfoot) together with either bone tenderness at specified sites (the malleoli for the ankle; the base of the fifth metatarsal or the navicular bone for the midfoot) or an inability to bear weight for four steps both immediately after injury and when admitted to the emergency department.7
Previous systematic reviews8 ,9 have reported that the Rules have high sensitivity but unclear specificity as individual studies provide heterogeneous specificity estimates. However, there are several limitations of these existing reviews. They primarily summarise the accuracy of the Rules when applied by medical doctors in secondary care (hospital emergency departments) but the healthcare system has undergone dramatic changes over the past decade, with the management of musculoskeletal injuries shifting to primary care and first contact care being provided by other health professionals, including nurses and physiotherapists. The impact of the profession of the person applying the Rules and setting of application on Rule accuracy has not been systematically investigated. The diagnostic research field has also progressed substantially since publication of the previous review,8 witnessing improvements in the design and reporting of studies,10 ,11 enhanced methods for evaluation of risk of bias12 and innovations in statistical methods for meta-analysis of diagnostic studies.13 A consideration of these factors in an updated review of the Rules has the potential to advance previous findings and address persisting uncertainties about specificity and sources of heterogeneity.
The aim of this review was to systematically assess the diagnostic accuracy of the Ottawa Ankle and Midfoot Rules. The impact of patient's age, profession of the assessor and setting of application, as well as risk of bias, on reported accuracy of the Rules was evaluated.
Data sources and searches
The review protocol was prospectively registered on PROSPERO (registration number 42013004723). The review followed the methods recommended by the Cochrane Collaboration14 and is reported according to the STARD statement.15
The search combined relevant keywords and Medical Subject Headings and was conducted from inception to 26 August 2014 in the following electronic databases: MEDLINE via PubMed, EMBASE via subscription to http://www.embase.com, CINAHL via EBSCO-host, SPORTDiscus via EBSCO-host and Cochrane Library via OVID (see online supplementary file 1). The WHO Trial Registry portal was also searched and citation tracking of the included studies and relevant systematic reviews was conducted. There were no language or publication restrictions.
Diagnostic studies were included if they: (1) reported on a cohort of patients presenting with midfoot and/or ankle trauma and an ankle or midfoot fracture was a differential diagnosis; (2) evaluated the diagnostic performance of the Ottawa Ankle and/or Midfoot Rules; (3) confirmed the diagnosis of an ankle or midfoot fracture with an adequate reference standard (eg, plain radiographs or proxy clinical measure such as a telephone follow-up); and (4) reported results in sufficient detail to allow reconstruction of contingency tables.
Data extraction and quality assessment
Two independent reviewers screened titles and abstracts and then the full text of the potentially eligible studies and extracted data from the included studies. Authors were contacted when there were incomplete or missing data.
Study quality was assessed by two independent reviewers using the second version of the Quality Assessment of Diagnostic Accuracy Studies tool (QUADAS-2),12 which is designed for use in systematic reviews of test accuracy studies and assesses the risk of bias and concerns about applicability in four domains: Patient selection (consecutive or random sample enrolled, case–control design avoided and inappropriate exclusions avoided); Index test (blinded interpretation of the Rules); Reference standard (correctly excluded a fracture and blinded interpretation); and Flow and timing (appropriate interval between application of the Rules and reference standard, all patients received the reference standard and were included in the analysis). All disagreements were resolved by discussion with a third reviewer.
Data synthesis and analysis
Diagnostic 2×2 contingency tables were created for each study detailing true positives, true negatives, false positives and false negatives and used to calculate study-level disease prevalence, likelihood ratios, sensitivity, specificity and their respective 95% CIs for the Ankle and Midfoot Rules. When it was not possible to distinguish between the Ankle and Midfoot, a general Ankle/Midfoot Rule estimate was provided.
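The study-level calculations described above can be sketched in a few lines. This is a minimal illustration only: the counts are hypothetical, the function names are ours, and the Wilson score interval is an assumed choice because the text does not state which CI method was used.

```python
import math


def wilson_ci(successes, total, z=1.96):
    """Wilson score interval for a proportion (assumed CI method, 95% by default)."""
    if total == 0:
        return (float("nan"), float("nan"))
    p = successes / total
    denom = 1 + z**2 / total
    centre = (p + z**2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return (max(0.0, centre - half), min(1.0, centre + half))


def accuracy_from_2x2(tp, fp, fn, tn):
    """Study-level accuracy measures from one 2x2 contingency table."""
    sens = tp / (tp + fn)   # fractures correctly flagged by the Rules
    spec = tn / (tn + fp)   # non-fractures correctly cleared by the Rules
    n = tp + fp + fn + tn
    return {
        "prevalence": (tp + fn) / n,
        "sensitivity": sens,
        "sensitivity_ci": wilson_ci(tp, tp + fn),
        "specificity": spec,
        "specificity_ci": wilson_ci(tn, tn + fp),
        "lr_positive": sens / (1 - spec) if spec < 1 else float("inf"),
        "lr_negative": (1 - sens) / spec if spec > 0 else float("inf"),
    }


# Hypothetical counts, not taken from any included study.
example = accuracy_from_2x2(tp=48, fp=160, fn=2, tn=90)
```

With these hypothetical counts, sensitivity is 48/50 and specificity is 90/250, so the likelihood ratios follow directly as sensitivity/(1 − specificity) and (1 − sensitivity)/specificity.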
Preliminary analyses used scatter plots of individual study estimates of sensitivity and specificity in receiver-operating characteristic space to assess visually clinical and statistical heterogeneity between the studies for each Rule. The bivariate model16 ,17 was used to compute summary estimates of sensitivity and specificity separately for the Ottawa Ankle, Midfoot and Ankle/Midfoot Rules. For studies where more than one profession applied the Rules on the same sample and one of them included a medical doctor, the accuracy provided by the doctor was used for the overall estimate in the meta-analysis.
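For readers unfamiliar with the bivariate model, its usual form is sketched below. The notation is ours and the binomial within-study likelihood is an assumption based on the standard formulation in the literature; the text itself only cites the model.16 ,17

```latex
% Within-study level: for study i with n_{1i} fractures and n_{0i} non-fractures
\mathrm{TP}_i \sim \mathrm{Binomial}(n_{1i},\, Se_i), \qquad
\mathrm{TN}_i \sim \mathrm{Binomial}(n_{0i},\, Sp_i)

% Between-study level: logit-transformed sensitivity and specificity are jointly normal
\begin{pmatrix} \operatorname{logit}(Se_i) \\ \operatorname{logit}(Sp_i) \end{pmatrix}
\sim \mathcal{N}\!\left(
\begin{pmatrix} \mu_A \\ \mu_B \end{pmatrix},
\begin{pmatrix} \sigma_A^2 & \sigma_{AB} \\ \sigma_{AB} & \sigma_B^2 \end{pmatrix}
\right)

% Summary estimates are obtained by back-transformation
\widehat{Se} = \operatorname{logit}^{-1}(\hat{\mu}_A), \qquad
\widehat{Sp} = \operatorname{logit}^{-1}(\hat{\mu}_B)
```

Under this notation, treating sensitivity as a fixed effect corresponds to setting \sigma_A^2 (and hence \sigma_{AB}) to zero.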
A within-study analysis was planned to assess the accuracy of the Rules (Ankle, Midfoot and Ankle/Midfoot) for three subgroups: adults versus children; when applied by a medical doctor versus other health professionals; and when applied at a hospital emergency room versus a community setting. Since there were not enough studies for a within-study analysis, we conducted a between-study analysis for these subgroups, for each Rule, when there were at least three studies evaluating a subgroup for that Rule. For each model, the Akaike Information Criterion was used to assess whether random effects for sensitivity (versus a fixed effect) should be included in the bivariate model. For some analyses where the number of studies was small, model convergence was only possible with a fixed effect for sensitivity. Pooled likelihood ratios derived from bivariate models were used to calculate the post-test probability of fracture for a positive (and negative) test result for each subgroup, assuming the pretest probability to be the overall mean prevalence of fracture for each subgroup.
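The post-test probability calculation described above follows from converting the pretest probability to odds, multiplying by the likelihood ratio and converting back. The sketch below is illustrative: it uses the pooled Ankle Rules sensitivity and specificity quoted in the abstract and the overall mean fracture prevalence as the pretest probability, and it derives likelihood ratios from those two values rather than using the pooled likelihood ratios from the bivariate models, which are not reproduced here.

```python
def post_test_probability(pretest_probability, likelihood_ratio):
    """Convert a pretest probability to a post-test probability via odds."""
    pretest_odds = pretest_probability / (1 - pretest_probability)
    post_test_odds = pretest_odds * likelihood_ratio
    return post_test_odds / (1 + post_test_odds)


# Illustrative inputs (assumption: LRs derived from pooled sensitivity/specificity).
pretest = 0.163                  # overall mean prevalence of fracture
sens, spec = 0.994, 0.353        # pooled Ankle Rules estimates from the abstract

lr_pos = sens / (1 - spec)       # likelihood ratio for a positive result
lr_neg = (1 - sens) / spec       # likelihood ratio for a negative result

p_fracture_if_positive = post_test_probability(pretest, lr_pos)
p_fracture_if_negative = post_test_probability(pretest, lr_neg)

print(f"Post-test probability after a positive result: {p_fracture_if_positive:.3f}")
print(f"Post-test probability after a negative result: {p_fracture_if_negative:.4f}")
```

Under these assumptions a negative result lowers the probability of fracture to well under 1%, whereas a positive result raises it only modestly above the pretest value, which is consistent with high sensitivity and poor specificity.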
Two sensitivity analyses were conducted. The first looked at the accuracy of studies that were rated as low risk of bias in the four domains of QUADAS-2 and the second included studies that used medical imaging as a reference standard in all patients. Imaging is the gold standard method to diagnose a fracture, reducing a potential source of bias.
To explore reasons for heterogeneity, a series of bivariate models were fitted for each Rule that included a binary covariate: assessor training (yes vs no or not clear), reference standard (medical imaging in all patients vs in some patients) and risk of bias (low risk of bias in all QUADAS-2 domains vs not low risk of bias in all domains) for both sensitivity and specificity. These covariates were chosen as they are known to influence accuracy estimates of diagnostic tests.18 The Akaike Information Criterion was again used to assess whether random effects for sensitivity should be included. A post hoc analysis looked at the direct comparison between the accuracy of the Ankle and Midfoot Rules using studies that evaluated both Rules, thereby reducing potential confounding. The bivariate model included a binary covariate for Rule, for both sensitivity and specificity and also allowed the variance of the random effects for specificity to differ for the two Rules. Bivariate models were fitted using Proc NLMIXED in SAS V.9.4 and figures were drawn using Review Manager V.5.3.
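The model choice based on the Akaike Information Criterion can be illustrated as follows. The log-likelihoods and parameter counts are hypothetical placeholders, not output from the SAS models, and the helper names are ours; the only fixed element is the standard definition AIC = 2k − 2 ln L, with the lower value preferred.

```python
def aic(log_likelihood, n_parameters):
    """Akaike Information Criterion: 2k - 2 ln(L). Lower is better."""
    return 2 * n_parameters - 2 * log_likelihood


def prefer_random_sensitivity(loglik_fixed, k_fixed, loglik_random, k_random):
    """True if the model with a random effect for sensitivity has the lower AIC."""
    return aic(loglik_random, k_random) < aic(loglik_fixed, k_fixed)


# Hypothetical fits: the random-effects model adds two parameters
# (a sensitivity variance and a covariance) for a small gain in log-likelihood.
use_random = prefer_random_sensitivity(
    loglik_fixed=-120.4, k_fixed=3,
    loglik_random=-118.9, k_random=5,
)
print("Include random effect for sensitivity:", use_random)
```

In this hypothetical case the gain in log-likelihood does not offset the penalty for the two extra parameters, so the fixed-effect model for sensitivity would be retained.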
Selection, characteristics and quality of studies
The search strategy yielded 261 studies and citation tracking added another 28 studies. After duplicates were removed, 140 titles and abstracts were screened and 94 full papers and eight abstracts for which a full paper was not found were assessed for eligibility. Finally, a total of 66 studies in 68 articles (56 prospective, 9 retrospective and 3 studies which could not be classified as prospective or retrospective) were included in the review, 60 full text7 ,19–77 and 8 abstracts78–85 (figure 1). The studies provided data for 22 273 patients with 3686 ankle or midfoot fractures. The prevalence of ankle and midfoot fracture across studies ranged from 0% to 35.0%, with a mean (SD) of 16.3% (6.6). The prevalence in sports centres was 11.0%. The mean age across studies ranged from 11.0 to 47.0, with a mean (SD) of 28.3 (10.4) years. One study was excluded from the pooled analysis as no fractures were detected in the sample enrolled.72 The characteristics of the included studies are provided in online supplementary file 2.
Overall, the majority of studies presented a high risk of bias in at least one domain, with only nine studies rating low risk of bias for all domains of QUADAS-2 (see online supplementary file 3).20 ,22 ,41 ,45 ,48 ,50 ,58 ,66 ,77 For the index test domain, 58% of studies were rated as low risk of bias, 51% were low risk for patient selection and flow and timing domains and only 37% for the reference standard domain. Regarding applicability, most studies rated low for patient selection (91%), 74% for the reference standard domain and 57% for index test.
Accuracy of the Ottawa Ankle and Midfoot Rules
Thirty-four studies (in 33 articles) reported the accuracy of the Ankle Rules;7 ,19 ,22 ,24–27 ,29 ,32 ,35–37 ,39 ,44 ,46 ,48 ,49 ,53 ,56 ,60 ,62 ,65 ,67–69 ,74–77 ,79 ,80 ,82 ,84 15 studies (in 14 articles) reported the accuracy of the Midfoot Rules;7 ,19 ,22 ,25 ,44 ,48 ,53 ,60 ,67–69 ,76 ,77 ,84 and for 31 studies it was not possible to distinguish between the ankle and midfoot, so a general Ankle/Midfoot Rules accuracy is provided20 ,21 ,23 ,28 ,30 ,31 ,33 ,38 ,40–43 ,45 ,47 ,50–52 ,54 ,55 ,57–59 ,61 ,63 ,66 ,70 ,71 ,73 ,78 ,83 ,85 (table 1 and figures 2 and 3). Pooled estimates of negative likelihood ratios and sensitivity were good, while positive likelihood ratios and specificity were poor. Study-specific estimates of sensitivity were generally homogeneous (ranging from 70% to 100%); however, there was substantial heterogeneity in specificities (6–85%).
Accuracy of the Ottawa Ankle and Midfoot Rules according to subgroups
The planned within-study comparisons of accuracy by subgroup (age: adults vs children; profession of the assessor: medical doctor vs other health professionals; and setting of application: hospital emergency room vs community setting) were not possible because too few studies reported these comparisons. Within studies, the age comparison was presented in only one study,51 the profession comparison was presented in three studies (two on the Ankle Rule32 ,67 and one on the Ankle/Midfoot Rule)31 and no study presented a direct comparison of setting.
Table 1 presents summary estimates of test accuracy and fracture prevalence, by subgroup, for each Rule. The only significant difference was found in the sensitivity of the Ankle Rules according to age, where sensitivity estimates were higher in the adult population.
Heterogeneity of specificity estimates was not explained by assessor training, use of the reference standard in all patients or low risk of bias in all QUADAS-2 domains.
The first sensitivity analysis looked at studies at low risk of bias for all domains of QUADAS-2, and included only three studies for the Ankle and the Midfoot Rules22 ,48 ,77 and six studies for the Ankle/Midfoot Rules.20 ,41 ,45 ,50 ,58 ,66 The second sensitivity analysis looked at studies in which the entire sample was assessed using medical imaging as the reference standard, and included 25 studies (in 24 articles) for the Ankle Rules,7 ,19 ,22 ,24–27 ,29 ,36 ,37 ,39 ,48 ,53 ,56 ,62 ,65 ,67 ,68 ,74 ,76 ,77 ,79 ,82 ,84 12 studies (in 11 articles) for the Midfoot Rules7 ,19 ,22 ,25 ,48 ,53 ,67 ,68 ,76 ,77 ,84 and 22 studies for the Ankle/Midfoot Rules.20 ,28 ,31 ,33 ,40 ,41 ,45 ,50–52 ,54 ,55 ,57–59 ,61 ,66 ,70 ,71 ,78 ,83 ,85 Accuracy estimates were lower in both sensitivity analyses than the overall estimates based on all studies (table 1).
Comparison of accuracy of Ankle and Midfoot Rules
The comparison between Ankle and Midfoot Rules accuracies was based on 15 studies (in 14 articles)7 ,19 ,22 ,25 ,44 ,48 ,53 ,60 ,67–69 ,76 ,77 ,84 that evaluated both Rules (figure 4). Bivariate modelling of these data with a covariate for Rules (Ankle vs Midfoot) showed that allowing the variance of the random effects for specificity to differ by Rule provided the best fit (based on the Akaike Information Criterion). The variability in specificity was higher for the Midfoot Rule. There was no evidence of a difference in sensitivity between the Rules (p=0.84), but weak evidence of higher specificity for the Midfoot Rule (p=0.068).
Sixty-six studies evaluating the diagnostic accuracy of the Ottawa Ankle and Midfoot Rules were included, more than double the number in the previous review.8 In contrast to the previous review, statistical pooling of sensitivity and specificity was undertaken, summary estimates of positive and negative likelihood ratios were computed and sources of heterogeneity were explored. The Rules were consistently found to have a high sensitivity and low negative likelihood ratio, indicating that a negative test result is highly informative in excluding a fracture of the ankle and midfoot and, therefore, the need for radiographic examination. Specificity and positive likelihood ratios were generally low and heterogeneous, and the post hoc analysis comparing the accuracy of the Rules showed that specificity was higher for the Midfoot than for the Ankle Rule. Assessor training, reference standard and low risk of bias did not explain heterogeneity in specificity estimates. The sensitivity analyses showed that estimates of accuracy were slightly lower in studies with a low risk of bias for all QUADAS-2 domains and in studies where all patients received medical imaging as the reference standard.
This meta-analysis provides the most contemporary and robust estimate of the diagnostic accuracy of the Ottawa Ankle and Midfoot Rules and demonstrates for the first time that the profession of the assessor does not appear to influence accuracy estimates. Strategies taken to ensure methodological rigour of this review included prospective registration and applying a sensitive search strategy. This search strategy resulted in the inclusion of double the number of primary studies compared to the previous review.8 Risk of bias of included studies was assessed using a valid and reliable tool12 and bivariate models16 ,17 were used to investigate factors influencing estimates of diagnostic accuracy. The limitations of this review are primarily related to the included studies. In particular, some prespecified analyses could not be conducted due to the small number of studies included in some categories. This also limited the strength of the conclusions on the performance of the Rules when applied by different health professionals and in different settings.
This is the first study to evaluate objectively the impact of assessor training, type of reference standard and low risk of bias on the estimates of specificity. Our review showed that these variables did not explain the observed variability. While the limited number of studies included in some categories could be a contributing factor, it is possible that other factors (eg, blinding of the assessor of the Rules) may be responsible for this variability.
The findings of this study suggest that application of the Rules would reduce unnecessary medical imaging by ∼30% across all settings and by 49% in sports centres (compared to imaging all patients). This indicates that a large proportion of people continue to receive medical imaging, adding costs to the healthcare system, increasing time spent in emergency departments and exposing patients to unnecessary radiation. Avoiding potentially unnecessary medical tests that may cause harm has been widely supported in the Choosing Wisely campaign86 in many countries. The creation of a secondary stepwise clinical decision tool to follow the Rules could allow a more refined selection of patients to receive imaging and perhaps increase the application of the Rules in different clinical settings.
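The text does not state how the ∼30% figure was derived. One plausible reconstruction, offered here as an assumption, is that the proportion of patients spared imaging equals the proportion testing negative on the Rules, that is, true negatives plus false negatives as a share of all patients. Using the pooled Ankle Rules sensitivity and specificity and the overall mean prevalence reported in this review:

```python
def proportion_spared_imaging(prevalence, sensitivity, specificity):
    """Share of all patients who test negative on the Rules and so avoid imaging."""
    false_negatives = prevalence * (1 - sensitivity)   # fractures missed by the Rules
    true_negatives = (1 - prevalence) * specificity    # non-fractures correctly cleared
    return false_negatives + true_negatives


# Inputs quoted in this review; the formula itself is our assumption.
spared = proportion_spared_imaging(prevalence=0.163, sensitivity=0.994, specificity=0.353)
print(f"Proportion spared imaging: {spared:.1%}")
```

Under these inputs the result is close to 30%, in line with the figure quoted for all settings. The sports centre figure cannot be checked in the same way because the corresponding specificity is not given in the text.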
This is the first review to provide preliminary evidence that accuracy estimates are similar when different health professionals apply the Rules. This is clinically relevant as the results suggest that clinicians who are not medical doctors, such as nurses or physiotherapists, can apply the Ottawa Ankle and Midfoot Rules to triage patients who present with an ankle and/or midfoot injury without any reduction in the diagnostic accuracy of the Rule.
In our review, the sensitivity of the Ankle Rules was significantly higher in adults than in children, meaning that the probability of missing an ankle fracture is higher in children than in adults. This difference could be related to the lack of bone maturity seen in children and to differences in pain perceptions between these two distinct populations. A refined version of the Rules for children, taking those factors into account, could potentially help in improving the sensitivity and perhaps specificity of the Rules in this population.
Future research in this area should focus on two key areas. First, a robust study is required to evaluate the diagnostic accuracy of the Rules when applied in a primary care community setting, to strengthen the clinical applicability of the Rules. Second, the creation of a stepwise tool to be used in conjunction with the Ottawa Ankle and Midfoot Rules, or refinement of the Rules to improve specificity, would help to further reduce the rates, costs and consequences associated with current use of medical imaging.
This review demonstrates that estimates of the diagnostic accuracy of the Ottawa Ankle and Midfoot Rules are influenced by features of the population (higher accuracy in adults) and methodological quality of the study (higher quality studies report lower accuracy). Although the accuracy of the Rules remains unclear in community settings, the available evidence suggests that different health professionals can apply the Rules without compromising accuracy.
What are the findings?
This review demonstrates that the Ottawa Ankle and Midfoot Rules are able to correctly exclude fractures in most patients; however, as the Rules dictate imaging all who test positive, the low specificity and low prevalence of fracture mean that many people without fracture will still undergo imaging and exposure to radiation.
Different health professionals can apply the Rules without compromising accuracy.
Accuracy of the Rules remains unclear in community settings and requires further investigation.
Assessor training, use of medical imaging as the reference standard and low risk of bias did not explain specificity heterogeneity.
What is already known?
Previous reviews support the use of the Ottawa Ankle and Midfoot Rules applied by medical doctors; however, in emergency departments, other health professionals, for example, nurses and physiotherapists, are now also responsible for the triage of patients and application of the Rules.
Accuracy of the Rules in community settings and when applied by different health professionals is unclear.
Variability in specificity has been previously reported, but its source has not yet been investigated.
Contributors PRB, C-WCL, ZAM, CGM and AMM conceived and designed the study. PM, PRB, C-WCL, CGM and AMM analysed and interpreted the data. PRB, C-WCL, PM, ZAM, CGM and AMM contributed by drafting the article. PM provided statistical expertise. PRB, C-WCL, ZAM, CGM, AMM collected and assembled the data. PRB is guarantor. All authors participated in the revision and final approval of the manuscript.
Funding PRB was funded by the International Postgraduate Research Scholarship (IPRS) and Australian Postgraduate Award (APA). C-WCL and CGM have fellowships that are funded by the National Health and Medical Research Council, Australia.
Competing interests None declared.
Provenance and peer review Not commissioned; externally peer reviewed.