Why screening tests to predict injury do not work—and probably never will…: a critical review
Roald Bahr (1, 2)
  1. Department of Sports Medicine, Oslo Sports Trauma Research Center, Norwegian School of Sport Sciences, Oslo, Norway
  2. Aspetar Orthopaedic and Sports Medicine Hospital, Doha, Qatar
Correspondence to Professor Roald Bahr, Department of Sports Medicine, Oslo Sports Trauma Research Center, Norwegian School of Sport Sciences, P.O. Box 4014, Ullevål Stadion, Oslo 0806, Norway; roald.bahr{at}nih.no

Abstract

This paper addresses if and how a periodic health examination to screen for risk factors for injury can be used to mitigate injury risk. The key question asked is whether it is possible to use screening tests to identify who is at risk for a sports injury, in order to address the deficit through a targeted intervention programme. The paper demonstrates that to validate a screening test to predict and prevent sports injuries, at least three steps are needed. First, a strong relationship needs to be demonstrated in prospective studies between a marker from a screening test and injury risk (step 1). Second, the test properties need to be examined in relevant populations, using appropriate statistical tools (step 2). Unfortunately, there is currently no example of a screening test for sports injuries with adequate test properties. Given the nature of potential screening tests (where test performance is usually measured on a continuous scale from low to high), substantial overlap is to be expected between players with high and low risk of injury. Therefore, although a number of tests demonstrate a statistically significant association with injury risk, and therefore help the understanding of causative factors, such tests are unlikely to be able to predict injury with sufficient accuracy. The final step needed is to document that an intervention programme targeting athletes identified as being at high risk through a screening programme is more beneficial than the same intervention programme given to all athletes (step 3). To date, there is no intervention study providing support for screening for injury risk.

  • Anterior cruciate ligament
  • Assessment
  • Epidemiology
  • Hamstrings
  • Review

Imagine that you are planning an injury prevention programme for your team. Chances are that you will consider including a periodic health examination (PHE) to screen athletes for injury risk as a key component. As outlined in the second step of the van Mechelen model,1 the classic approach to sports injury prevention research, it is necessary to understand the risk factors and injury mechanisms that play a part in the occurrence of sports injuries to develop a targeted prevention programme.2 ,3 This paper addresses this step, focusing on if and how a PHE to screen for risk factors for injury can be used to mitigate injury risk. I will use hamstring and anterior cruciate ligament (ACL) injuries, two of the most common injuries in team sports,4 ,5 to illustrate key issues. The key question is: ‘Is it possible to use screening tests to identify who is at risk for a hamstring or ACL injury—in order to address the deficit through a targeted intervention programme?’

The purpose of screening

Screening is a strategy used in a population to detect a disease in individuals without signs or symptoms of that disease. The intention is to identify pathological conditions early, thus enabling earlier intervention and management in the hope of reducing future morbidity and mortality. Perhaps the most famous, and successful, example is the infant screening programmes instituted worldwide in the early 1960s for phenylketonuria (Følling's disease).6 If left untreated, phenylketonuria leads to severe brain function abnormalities. In contrast, patients who follow the prescribed dietary treatment from birth may have no symptoms at all. Recent screening programmes include breast cancer screening with mammography and prostate cancer screening with a blood test measuring prostate-specific antigen. However, it should be noted that although screening may lead to an earlier diagnosis, not all programmes have been shown to be beneficial and the value of current programmes for breast and prostate cancer screening is being debated.7 ,8

To ensure that screening programmes confer the benefits intended, the WHO published the Wilson-Jungner criteria for appraising a screening programme.9 These are the main criteria: (1) that the condition being screened for is an important health problem (depending not just on how serious the condition is, but also how common it is), (2) that there is a detectable early stage, (3) that treatment at an early stage is of more benefit than at a later stage and (4) that a suitable test is available to detect disease in the early stage. Clearly, injuries represent an important health problem in many sports (criterion 1). However, criteria 2–4 need adaptation when being applied to the case of sports injury prevention.

First, while screening for breast cancer involves detecting established disease as early as possible, screening for injury risk usually involves using a performance test to detect impairments which predispose the individual to injury (eg, hamstring muscle weakness, poor knee alignment). This highlights an important difference between disease detection and injury prediction. When screening for disease, the individual is classified as healthy or sick; the outcome is dichotomous (yes/no). When risk factors for injuries are assessed, such as eccentric hamstring strength or knee control in a vertical drop jump test, the outcome is usually continuous. Therefore, one more step is needed to make the test useful in clinical practice: the continuous variable must be translated to a dichotomous outcome, that is, whether the athlete is at increased risk or not (yes/no).

Second, when screening for disease, the objective is to initiate treatment as early as possible. In sports injury prevention, the objective is early intervention to minimise the risk factor before injury occurs. Examples include a strength training programme, targeting players with low hamstring strength, or balance training to improve knee control, targeting at-risk athletes identified through a vertical drop jump test.

Risk factors can be modifiable and non-modifiable, and screening tests typically measure modifiable factors such as strength or knee control since these can be targeted for change, for example, through specific training programmes. However, it should be noted that non-modifiable factors (such as gender or previous injury history) can be used as well, to target intervention measures to the subgroup thought to be at increased risk.

Developing a screening programme

Research on risk factors for injury is advocated for two reasons: to help understand why injuries happen and to predict who is at risk of injury. These two concepts are often, and erroneously, confused. One common misconception is that all it takes to develop a screening test is to identify a statistically (highly) significant association between the result from a screening test and increased injury risk. Typically, exploratory studies will have a cohort of athletes undergo a series of tests during the preseason to identify potential risk factors for injury and then injuries are recorded prospectively during the subsequent competitive season. If a significant association is identified between one or more factors and injury risk, it may be tempting to conclude that these can be used to predict who is at risk of injury. However, as illustrated in figure 1, this is only the first step towards a validated screening programme.

Figure 1

Three research steps needed to develop and validate a screening programme.

The next step required is to repeat the same study using the exact same screening test, but this time to use predetermined cut-off criteria to separate athletes with high risk from the rest. This needs to be performed in cohorts representing all potential user groups for the screening test. In this second step, the question is not how strong the association between the test result and injury risk is (eg, OR, p value), but how well the test predicts who becomes injured and who does not in a new athlete population, different from the one used to develop the test criteria.

Once a test has been developed and validated with acceptable test properties, the final step is to examine the efficacy of a screening programme. As noted above, non-modifiable factors may be relevant for stratification purposes, but the ultimate purpose of athlete screening is to identify athletes at risk and reduce their risk by addressing modifiable risk factors. Therefore, a prerequisite for a screening programme to be effective is that methods exist to modify the risk factor before injury occurs. The final step should be completed as a randomised controlled trial, where the treatment group receives the combined screening and intervention programme.

The treatment group outcome (injury rate) can be compared with that of a control group, which trains as usual, but should also be compared with that of a control group where all athletes are given the prevention programme. This is another issue that separates disease screening from athlete screening. If the disease is breast cancer, treatment is obviously only relevant for those identified with disease (or early stages of disease). However, if the goal is to prevent ACL or hamstring injury, the intervention can be offered to all athletes. The delivery cost is in most cases the same; there is usually no risk associated with the prevention programme per se, and the training may even improve sports performance. In other words, for a screening test to be relevant, it needs to capture the majority of athletes with increased injury risk, so they do not miss the opportunity to prevent injury through targeted training programmes. Ideally, it should also be able to separate athletes with low risk from the rest of the group, so they do not waste time doing prevention programmes they do not need.

Screening test properties

The ability of a test to predict injury is often described using the same test properties as those used for diagnostic tests, that is, sensitivity (does the test capture all those with injury), specificity (does it capture only those with injury), positive predictive value (how many with a positive test are injured) and negative predictive value (how many with a negative test are not injured). The following will explain these concepts and examine their relevance in the athlete screening setting, using data on ACL and hamstring injuries as examples.

Hewett et al10 introduced the vertical drop jump test as a screening test for ACL injury in female athletes in 2005 based on a prospective cohort study. Of 205 young female athletes tested in the preseason, 9 went on to suffer an ACL rupture. Of a range of different movement characteristics compared between injured and uninjured players, they observed the strongest association with injury risk for peak external knee abduction moment during landing, concluding that this factor predicted ACL injury status with 78% sensitivity and 73% specificity. This study, although the sample is small, is a good example of the first step towards a screening test.

Figure 2 has been adapted to illustrate their data and demonstrate one key challenge when developing a screening test: There is substantial overlap in test results (external knee abduction moment) between the injured and uninjured groups; the test does not separate these into two distinctly different populations. This should not be surprising, as most of the tests that potentially could be used to screen for injury risk measure physical performance characteristics such as strength, flexibility, balance or reaction time. In a relatively homogeneous group of athletes, these characteristics typically follow a normal distribution. Unless the relationship between test score and injury risk is extremely strong, considerable overlap in test scores should therefore be expected between injured and uninjured athletes.
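
The overlap argument above can be made concrete with a minimal sketch. It assumes two normally distributed groups with equal spread; the mean differences are hypothetical values chosen for illustration and are not taken from any of the studies cited here. The helper name `shared_area` is likewise only illustrative.

```python
from statistics import NormalDist

# Hypothetical test-score distribution for uninjured athletes,
# expressed in standard deviation (SD) units. Not study data.
UNINJURED = NormalDist(mu=0.0, sigma=1.0)


def shared_area(mean_difference_sd):
    """Fraction of area shared by the injured and uninjured score
    distributions when the group means differ by the given number of SDs
    (both groups assumed normal with the same spread)."""
    injured = NormalDist(mu=mean_difference_sd, sigma=1.0)
    return UNINJURED.overlap(injured)


for d in (0.2, 0.5, 1.0, 2.0, 4.0):
    print(f"mean difference {d:.1f} SD -> shared area {shared_area(d):.0%}")
```

Under these assumptions, the two distributions only begin to separate into distinct populations when the group means differ by several standard deviations, which is far larger than the differences typically reported for physical performance tests.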

Figure 2

Schematic representation of data from Hewett et al,10 illustrating the relationship between external knee abduction moment (reported as Nm adjusted for body height and weight) and risk of ACL injury. Uninjured players are shown in grey, while athletes who went on to suffer an ACL injury during the season are shown in black. The dotted lines denoted A, B and C illustrate three alternative cut-off values. Note that the relative proportion of injured (N=9) to uninjured athletes (N=196) is not to scale, as each injured athlete is depicted by a full-size figure.

This is different from screening for early disease, where the screening test is designed to have a yes/no outcome. The mammography programme screens for the presence of a tumour or not. Prostate cancer screening is based on a blood test where most individuals with disease (although not all) display a markedly increased serum level compared with the general population.

For an athlete screening test, the critical question is where the cut-off value separating high-risk and low-risk groups should be set. Sensitivity and specificity are inversely related. This means that if you want to capture all injured players (100% sensitivity), specificity suffers (more uninjured athletes will be classified as having high risk). In figure 2, scenario A results in a sensitivity of only 44%, that is, only four of nine injured athletes are classified as high risk. Scenario B results in a sensitivity of 78% (the best fit with the data), while the cut-off depicted for scenario C is needed to capture 8 of the 9 injured players. However, specificity will then have dropped, from 93% in scenario A to 70% in scenario C. The positive predictive value is low in all scenarios, ranging from 14% to 7%.
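
The inverse relationship between sensitivity and specificity can be sketched as follows. This is an illustration under stated assumptions, not a reconstruction of figure 2: both groups are assumed to be normally distributed with equal spread, the injured group is assumed to score 0.5 SD higher on average, and a higher score is taken to mean higher risk. The function name `properties_at` and the three cut-offs are hypothetical.

```python
from statistics import NormalDist

# Hypothetical score distributions in SD units (assumptions, not study data).
UNINJURED = NormalDist(mu=0.0, sigma=1.0)
INJURED = NormalDist(mu=0.5, sigma=1.0)


def properties_at(cutoff):
    """Return (sensitivity, specificity) when every athlete scoring above
    the cut-off is classified as high risk."""
    sensitivity = 1.0 - INJURED.cdf(cutoff)   # injured athletes flagged
    specificity = UNINJURED.cdf(cutoff)       # uninjured athletes not flagged
    return sensitivity, specificity


# Moving the cut-off from conservative to liberal.
for cutoff in (1.5, 0.5, -0.5):
    sens, spec = properties_at(cutoff)
    print(f"cut-off {cutoff:+.1f} SD: sensitivity {sens:.0%}, specificity {spec:.0%}")
```

Lowering the cut-off captures more of the athletes who go on to be injured, but only by flagging a growing share of those who do not.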

It follows from this that the optimal cut-off value for screening purposes is not necessarily the value representing the best fit. If the intervention is costly (for athletes this usually means time consuming), a conservative cut-off (high specificity) may be more appropriate. But if the intervention is easy, has no side effects and is highly effective, a cut-off with high sensitivity is more reasonable.

However, the all-important next step involves using the same test, applying a predetermined cut-off value, on a new population of athletes to: (1) confirm the association between risk factor and injury risk, and (2) test the performance of the cut-off value selected. Several groups have examined the vertical drop jump test and, unfortunately, other studies have not been able to confirm that there is an association between knee abduction and injury risk.11 ,12 The most stringent study was by Krosshaug et al,12 explicitly designed to validate the Hewett test in a cohort of >700 elite female football and handball players, of whom 42 suffered a new non-contact ACL injury. They tested five predetermined candidate risk factors in separate logistic regression analyses, with new ACL injury as the outcome: (1) knee valgus angle at initial contact, (2) peak knee abduction moment, (3) peak knee flexion angle, (4) peak vertical ground-reaction force and (5) medial knee displacement. While knee abduction moment was not associated with injury risk, ACL-injured players displayed greater total medial knee displacement during landing, as shown in figure 3 (although only when players with previous ACL injury were included in the analyses). However, these data once again illustrate the main challenges with athletic screening tests: the risk factor is continuous and there is substantial overlap between groups. It can be seen clearly from figure 3, where the mean difference in knee displacement was only 5 mm, that it is not possible to select a cut-off value to predict who is at risk and who is not.

Figure 3

Frequency diagram with Gaussian regression lines of medial knee displacement (cm) in 42 injured (top panel) and 669 uninjured knees (lower panel). Adapted from Krosshaug et al.12

Hamstring injuries are also common, and a recent meta-analysis demonstrated that older age, increased quadriceps peak torque and history of hamstring injury were associated with increased risk of hamstring muscle strain-type injuries in sport.13 However, the authors also observed that studies were small, as previously noted.14 In a recent study, van Dyk et al15 therefore examined the relationship between injury risk and various strength measures in 614 football players; during four seasons, 190 of these suffered a hamstring strain injury. They observed that eccentric hamstring strength at 60°/s was independently associated with injury risk (OR 1.37 per 1 Nm/kg difference). However, as illustrated in figure 4, again there is substantial overlap between injured and uninjured players, which clearly illustrates that a screening test based on eccentric hamstring strength cannot be used to predict injury risk.

Figure 4

Frequency diagram with Gaussian regression lines of body weight (BW)-adjusted hamstrings eccentric torque at 60°/s (Nm) in 190 injured (top panel) and 424 uninjured players (lower panel). Adapted from van Dyk et al.15

Both of these examples illustrate that while a statistically significant association indicates that there may be a causal relationship between a specific test result and injury risk, this is not sufficient to use the test to predict who is at risk of injury. Markers proposed for classifying or predicting risk in individual participants must be held to a much higher standard than merely being associated with outcome.16 ,17

It should be noted that there are more appropriate statistical measures than sensitivity, specificity, positive and negative predictive values and ORs, which should be used to describe the predictive ability of a screening test, such as likelihood ratio18 ,19 or receiver operating characteristic curve analyses.16 In the examples used here, receiver operating characteristic curve analyses revealed an area under the curve of only 0.60 (vertical drop jump test)12 and 0.56 (eccentric hamstring strength),15 where a value of 1.0 indicates perfect prediction and 0.5 indicates a truly useless test (one no better at identifying true positives than flipping a coin). This emphasises that more appropriate statistical methods confirm that these markers cannot be used as screening tests to predict ACL or hamstring injury, respectively.
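
To indicate what an area under the curve of this size means, the sketch below does two things. First, it computes the area directly from two sets of scores as the probability that a randomly chosen injured athlete scores higher than a randomly chosen uninjured athlete (ties counted as half). Second, it converts an area into the standardised mean difference it would imply if, as an assumption made here and not in the cited studies, both groups were normally distributed with equal spread. The function names and the example scores are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def auc_from_scores(injured_scores, uninjured_scores):
    """Area under the ROC curve, computed as the share of injured/uninjured
    pairs in which the injured athlete has the higher score (ties = 0.5)."""
    wins = 0.0
    for x in injured_scores:
        for y in uninjured_scores:
            if x > y:
                wins += 1.0
            elif x == y:
                wins += 0.5
    return wins / (len(injured_scores) * len(uninjured_scores))


def implied_mean_difference(auc):
    """Mean difference in SD units implied by an AUC, ASSUMING two normal
    distributions with equal spread."""
    return sqrt(2.0) * NormalDist().inv_cdf(auc)


# Made-up scores, for illustration only.
print(auc_from_scores([12, 15, 18, 21], [10, 13, 14, 16, 19, 20]))

for auc in (0.56, 0.60, 0.90):
    print(f"AUC {auc:.2f} -> about {implied_mean_difference(auc):.2f} SD between group means")
```

Under the equal-spread normal assumption, areas close to 0.5 correspond to group means that differ by only a fraction of a standard deviation.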

Combining information on several different markers may improve predictive ability. However, even in larger studies, where the sample size is sufficient to perform multivariate analyses, the results are not impressive. Recent studies from Australia illustrate this.20–22 A novel test for eccentric hamstring (knee flexor) strength based on the Nordic hamstring exercise was completed during the preseason in three cohorts of 210 elite Australian football,20 178 rugby union21 and 152 association football (soccer) players.22 In addition, previous injury, age, biceps femoris fascicle length and between-leg strength imbalance were included in multivariate models. However, the association with injury risk did not improve markedly when adding these factors to the models. The same studies also illustrate the importance of validating the cut-off value for eccentric hamstring strength chosen to determine risk in different cohorts; the best fit with injury risk was 256 N,20 268 N21 and 337 N22 in the three athlete groups.

Categorical risk factors as screening markers

Also in sport, there are examples of binary categorical risk factors, such as history of previous injury (yes/no) and sex (male/female), and the question is how these behave as markers for injury. Most such markers are non-modifiable, although it may be argued that history of previous injury represents a modifiable factor, at least for some injury types. The risk of reinjury is highest immediately after return to sport, and wanes with time.23 ,24 One example is that after an ankle sprain, the reinjury rate is about 50% during the first 6 months after return to play, but only 4% after 2 years, the same as for healthy ankles.23 Another study shows that graft ruptures after ACL surgery also tend to occur within the first 6 months.25 The explanation is probably that, with time, injured ligaments and muscles heal and their functional properties (strength, balance, neuromuscular control) improve.

Nevertheless, a consistent finding across most injury types and sports is that a history of previous injury is by far the strongest risk factor for injury, with very impressive ORs, often in the 2–6 range.12 ,13 ,26–29 Table 1 shows an example based on data from a one-season prospective study in Icelandic football, where players were asked about previous injuries before the start of the season and new injuries were recorded throughout the season.30 This study observed the highest OR ever reported for history of previous injury as a risk factor for hamstring strains; the OR was 7.4 (95% CI 2.9 to 19.0, p<0.001, univariate logistic regression).

Table 1

Comparison of the risk of new hamstring strains between players who previously had sustained such an injury and players with no previous injury

However, the question is how well this marker predicts a new hamstring injury. Using the traditional measures for the accuracy of diagnostic tests, the sensitivity (10/19) was 53%, the specificity (433/497) was 87%, the positive predictive value (10/74) was 14% and the negative predictive value (433/442) 98% in this sample.
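
The four measures can be reproduced from the fractions quoted above. The sketch below takes its counts from those fractions: 10 injured players with a previous injury, 9 injured players without, 64 uninjured players with a previous injury (74 minus 10) and 433 uninjured players without. It also computes the positive and negative likelihood ratios from the same counts; these two values are derived here and are not reported in the text. The function name `screening_properties` is illustrative.

```python
def screening_properties(tp, fn, fp, tn):
    """Standard test properties from a 2x2 table.

    tp: marker positive, injured      fn: marker negative, injured
    fp: marker positive, uninjured    tn: marker negative, uninjured
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "positive predictive value": tp / (tp + fp),
        "negative predictive value": tn / (tn + fn),
        # Likelihood ratios derived from the two values above.
        "positive likelihood ratio": sensitivity / (1.0 - specificity),
        "negative likelihood ratio": (1.0 - sensitivity) / specificity,
    }


# Counts taken from the fractions quoted in the text (10/19, 433/497, 10/74, 433/442).
results = screening_properties(tp=10, fn=9, fp=64, tn=433)
for name, value in results.items():
    print(f"{name}: {value:.2f}")
```

The high negative predictive value mainly reflects how few players were injured at all (19 of 516), which is why it says little about how well the marker discriminates.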

In other words, if this marker had been used to predict injury (ie, to decide who needed an intervention programme), almost half of the players (9 out of 19) who went on to suffer an injury would have been denied the intervention. To prevent hamstring strains, the Nordic hamstring exercise programme has been developed as a highly effective intervention,31–33 which is easy to do and has no side effects when performed correctly. Therefore, in this case, it seems inappropriate to use a marker with low sensitivity.

Still, it may be argued that since injury risk is much higher among players with a history of previous injuries, they are the ones who should be targeted with a prevention programme (or perhaps a more intensive rehabilitation programme before and continuing after return to play). This view has some merit, as illustrated by the randomised trial by Petersen and colleagues,32 testing the effect of the Nordic hamstring exercise programme on hamstring injury risk. They showed that while 25 players without history of previous injury (95% CI 15 to 72 players) needed to perform the exercise programme to prevent one new injury (the number needed to treat), only 3 players with a history of previous injury (2 to 6) had to use the programme to prevent one recurrent injury. However, it should be noted that even in the group without previous injury, the preventive effect was substantial; a 59% reduction in injury risk was seen in this group (compared with an impressive 86% among players with a history of injury). As nearly 50% of hamstring injuries happen to players with no previous injury, a coach may therefore want to offer the programme to the entire team. This illustrates that when data from screening and intervention studies (based on screening) are available, informed decisions can be made about whether a prevention programme should be introduced and who should use it.
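
The number needed to treat depends on the baseline risk as well as on the relative effect of the programme, which is why it can differ so much between subgroups. The sketch below shows the arithmetic only. The baseline risks and the relative risk reduction are hypothetical inputs chosen for illustration; they are not the figures from the trial, and the function name is an assumption.

```python
def number_needed_to_treat(control_risk, relative_risk_reduction):
    """Players who must follow the programme to prevent one injury.

    control_risk: probability of injury without the programme (0 to 1)
    relative_risk_reduction: proportional reduction in that risk (0 to 1)
    """
    absolute_risk_reduction = control_risk * relative_risk_reduction
    return 1.0 / absolute_risk_reduction


# Hypothetical inputs: same relative effect, different baseline risk.
low_risk_group = number_needed_to_treat(control_risk=0.05, relative_risk_reduction=0.60)
high_risk_group = number_needed_to_treat(control_risk=0.40, relative_risk_reduction=0.60)

print(f"low baseline risk:  about {low_risk_group:.0f} players per injury prevented")
print(f"high baseline risk: about {high_risk_group:.0f} players per injury prevented")
```

With an identical relative effect, the group with the higher baseline risk needs far fewer players in the programme for each injury prevented. A large number needed to treat in the lower-risk group therefore does not mean that the programme works poorly in that group.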

Another example illustrating the issue of stratification of preventive interventions is the sex difference in ACL injury risk. Studies have shown that ACL injuries are anywhere between twofold and fivefold more common among women than men, depending on age group and sport.34 ,35 This is reflected in the research carried out on risk factors for ACL injuries; the studies performed to examine the performance of the vertical drop jump test have been performed on women only.10–12 It is also echoed in the intervention trials performed to test the effect of various prevention programmes for ACL injuries, which, almost exclusively, are performed on females only.36 It may be expected that this is also reflected in how individuals and teams have taken up such programmes; ACL injury prevention programmes are most likely adopted almost exclusively by female athletes. However, it should be noted that these are decisions that have not been based on using sex as a predictive marker for injury risk as such, but rather on the high prevalence of ACL injuries among female athletes. There are examples of populations of male athletes, such as professional football players in the Gulf region, where the prevalence of ACL injury seemingly is sufficient to warrant preventive initiatives.25

As a final note on categorical, non-modifiable risk factors, these can be used to make individual decisions, as well: “I am a young woman. I have had one ACL injury. I should probably give up basketball.” This decision is not helped by a vertical drop jump test or other physical tests.

Should we discontinue PHEs?

This paper demonstrates that to validate a screening test to predict and prevent sports injuries, at least three steps are needed. First, there needs to be a strong relationship between the marker and injury risk. Second, the test properties need to be examined in relevant populations, using appropriate statistical tools. Unfortunately, there is currently no example of a screening test for sports injuries with adequate test properties. The third and final step would be to document that a screening-based intervention is more beneficial than intervention alone. However, given the nature of existing screening tests (where test performance is measured on a continuous scale from low to high), substantial overlap is typically seen between players with high and low risk of injury. Therefore, although the factor tested may demonstrate a highly significant relationship with injury risk, and in this way improve the understanding of causative factors, such tests are unlikely to be able to predict injury with sufficient accuracy.

While predicting future injury risk through screening tests is unrealistic, a PHE or pre-participation examination can serve several other purposes, as outlined in the IOC consensus statement on periodic health evaluation of elite athletes.37 First and foremost, it includes a comprehensive assessment of the athlete's current health status, and, typically, it is the entry point for medical care of the athlete. As demonstrated by Bakken and colleagues,19 in a large cohort of professional football players, the majority of athletes presented with at least one current health condition and one in three with a musculoskeletal condition requiring some form of follow-up. Other potential benefits of regular health examinations include establishing rapport between the medical team and the athlete, reviewing medications and supplements to avoid inadvertent doping, establishing a performance baseline for the athlete in the healthy state, and, in some settings, satisfying the medicolegal duties of care.37 Nevertheless, the IOC consensus statement concluded that large-scale population-based studies are needed to evaluate the components of history and examination that can be used to identify athletes at risk, intervene and change outcome, and recommended that programmes on PHEs be set up and conducted as research projects. The current paper serves to reinforce those conclusions.

What are the findings?

  • To validate a screening test to predict and prevent sports injuries, at least three steps are needed: (1) a strong relationship must be demonstrated in prospective studies between a marker from a screening test and injury risk; (2) the test properties of the marker must be validated in relevant populations, using appropriate statistical tools; (3) an intervention programme targeting athletes identified as being at high risk using the marker must be more beneficial than the same intervention programme given to all athletes.

  • To date, there is no screening test available to predict sports injuries with adequate test properties and no intervention study providing evidence in support of screening for injury risk.

Acknowledgments

The author would like to thank his colleagues at the Oslo Sports Trauma Research Center and Ian Shrier for helpful comments on the manuscript.

Footnotes

  • Correction notice This paper has been amended since it was published Online First. The author's affiliations have been updated.

  • Funding The Oslo Sports Trauma Research Center has been established at the Norwegian School of Sport Sciences through generous grants from the Royal Norwegian Ministry of Culture, the South-Eastern Norway Regional Health Authority, the IOC, the Norwegian Olympic Committee & Confederation of Sport, and Norsk Tipping AS.

  • Competing interests None declared.

  • Provenance and peer review Not commissioned; externally peer reviewed.
