Better reporting standards are needed to enhance the quality of hop testing in the setting of ACL return to sport decisions: a narrative review
  1. Paul Read1,2,
  2. Sean Mc Auliffe3,
  3. Mathew G Wilson1,4,
  4. Gregory D Myer5,6,7
  1. 1 Research Department, Aspetar Orthopaedic and Sports Medicine Hospital, Doha, Qatar
  2. 2 School of Sport and Exercise, University of Gloucestershire, Gloucester, UK
  3. 3 Department of Physical Therapy and Rehabilitation Sciences, Qatar University, Doha, Qatar
  4. 4 Institute of Sport Exercise and Health (ISEH), University College London, London, UK
  5. 5 Division of Sports Medicine, Cincinnati Children’s Hospital Medical Center, Cincinnati, Ohio, USA
  6. 6 Department of Pediatrics and Orthopaedic Surgery, College of Medicine, University of Cincinnati, Cincinnati, Ohio, USA
  7. 7 The Micheli Center for Sports Injury Prevention, Boston, MA, USA
  1. Correspondence to Dr Paul Read, Research Department, Aspetar Orthopaedic and Sports Medicine Hospital, PO Box number: 29222, Doha, Ad Dawhah, Qatar; paulread10{at}


Background/aim There is a lack of consistency in return to sport (RTS) assessments, in particular hop tests, to predict who will sustain a reinjury following anterior cruciate ligament (ACL) reconstruction. Inconsistent test battery content and methodological heterogeneity might contribute to variable associations between hop test performance and subsequent injury. Our aim was to investigate whether commonly used hop tests are administered in a consistent manner and in accordance with reported guidelines.

Methods We conducted a narrative review of studies that examined whether hop testing could differentiate RTS pass rates, reinjury and rerupture in athletes after ACL reconstruction. Our specific focus was on the methodological procedures of hop testing as this component is widely used to evaluate patients’ function and readiness to RTS.

Main findings Substantial variation exists in RTS hop test administration, scoring and interpretation. Authors often failed to report important details of methods such as warm up activities, randomisation, number of trials, rest periods and landing requirements.

Conclusion We recommend researchers provide clearer descriptions of how hop tests are performed to increase standardisation and promote accurate data collection. Absence of reporting to describe test methods and using different test procedures makes it difficult to compare study findings.

  • ACL
  • exercise rehabilitation
  • exercise testing

This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See:



Return to sport (RTS) decision making following anterior cruciate ligament (ACL) reconstruction is a complex process involving many factors. A criterion-based approach is now accepted,1 where a range of tests are used in various combinations. Passing a test battery including a series of single leg hop and isokinetic tests was associated with lower rerupture rates following RTS,1 2 and an increased likelihood of returning to previous sporting levels.3

However, hop and isokinetic tests do not consistently predict successful outcomes following ACL rehabilitation. Losciale et al 4 reported no associations between the use of RTS discharge tests and greater risk of reinjury, stating the low quality of evidence affects our ability to make definitive conclusions. Toole et al 5 reported that many young athletes had been cleared to RTS by their surgeon and rehabilitation specialist but failed to pass the RTS cut-offs cited in the literature. A recent 2019 systematic review6 demonstrated that only 23% of patients passed RTS test batteries. These authors6 also suggested an apparent paradox that ‘passing’ an RTS battery was associated with a greater risk of injury to the contralateral limb.

That conclusion led us, and others,7 to question whether the existing data relating to RTS and subsequent reinjury display too much clinical variability among the patient groups (race, age, sex, level of performance, type of ACL surgery, other associated injuries) and inconsistent application of test batteries (different clinicians and studies use a widely ranging set of tests). On deeper reflection, and after scoping the literature, we wondered whether the tests themselves (eg, the hop tests) are described clearly and used in a reproducible manner. For RTS tests to be valid and generalised across clinical settings, standardised methods are required for administration, scoring and interpretation.

The primary aim of this narrative review was to provide an overview of the RTS testing batteries of the studies included8–20 in the 2019 synthesis of the literature6 (table 1), and the cited studies in the methods sections of these manuscripts21–35 (table 2). Our focus was applied to hop testing methods and administration procedures following ACL reconstruction. This current review6 provided the most comprehensive and up-to-date body of literature in the area of RTS testing and numerous discussions followed its publication. Our second aim was to demonstrate how even minor alterations to the test protocol might affect interpretation and subsequent RTS decisions.

Table 1

Studies cited in the 2019 systematic review of RTS outcomes following ACL reconstruction6

Table 2

Studies cited in the methods of manuscripts included in the 2019 systematic review6


Table 1 shows that a wide range of approaches in test administration, scoring and interpretation were used, and important methodological details were infrequently reported. To illustrate this point, we created a colour code system to indicate the methodological quality of each of the included studies (green=aligned with current evidence; yellow=reported but not aligned with current evidence; red=not stated) (tables 1 and 2). A discussion of why these factors affect test outcomes is provided below.

Test order

Substantial variation existed across the included studies in relation to testing order. No studies randomised the order of testing, seven did not state the sequence performed, and the remaining studies included additional hop/jump tests.

Randomisation of test order will reduce the potential of an order effect and control fatigue. However, if a ‘battery’ of related tests is being used, it is prudent to standardise the order. A logical sequence is to perform non-fatiguing tests requiring high skill movements and coordination first. Therefore, power tests (hops) may be preferentially performed prior to other RTS tests such as those requiring maximal strength (isokinetic dynamometry).36 In athletes following ACL reconstruction, considering the loading requirements for each hop test is important. We suggest moving in sequence from the least to most demanding task and propose the following test order: (1) timed hop; (2) single hop; (3) triple-hop and (4) cross-over hop.

Influence of preceding tests measuring different performance constructs

Test order should consider the physiological energy systems targeted, as well as the athlete’s training age/experience. Performing hop tests soon after or even before maximal strength tests can have a pronounced effect on the test outcomes. Table 1 indicates that four studies performed isokinetic testing first, while two studies employed isokinetic testing following hop tests. Other studies also included either isokinetic or isometric knee strength assessments, but the test order was not stated.

Following a bout of resistance training, the involved musculature is in both a fatigued and potentiated state. The balance between these two factors determines the subsequent muscle performance.37 An acute enhancement in muscle function following intense activity is defined as postactivation potentiation,38 with significant improvements frequently shown in resistance exercise performance following plyometric activities.39–41 However, modulating factors can influence these positive benefits. A larger effect is shown for stronger individuals and those with greater resistance training experience. The length of the rest interval is also dependent on the individual’s level of strength, with weaker athletes requiring longer rest periods following the previous activity.41

Quadriceps muscle fatigue can affect hop performance in athletes following ACL reconstruction.42 Single leg hop performance was compared in both a non-fatigued state, and immediately following a pre-exhaustion set (as many reps as possible) of knee extensions at 50% of 1 repetition maximum (RM) strength. All participants displayed ‘pass’ limb symmetry scores (LSI>90%) in the non-fatigued condition, but only 68% ‘passed’ the test following the pre-exhaustion protocol. Thus, pre-exhaustion of the quadriceps may be deleterious to single leg hop performance after ACL reconstruction.
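To make the ‘pass’ criterion concrete, the following is a minimal sketch of a limb symmetry calculation. It assumes the commonly used definition of LSI (involved score divided by uninvolved score, multiplied by 100) for distance-based hops; the review does not spell out the formula, and the function names and distances are hypothetical.

```python
def limb_symmetry_index(involved: float, uninvolved: float) -> float:
    """Return the LSI (%) for a distance-based hop test.

    Assumes the common definition involved / uninvolved * 100; the
    review itself does not state the formula.
    """
    if uninvolved <= 0:
        raise ValueError("uninvolved score must be positive")
    return involved / uninvolved * 100.0


def passes_lsi(involved: float, uninvolved: float, cutoff: float = 90.0) -> bool:
    """True when the LSI exceeds the cut-off (LSI > 90% in the text)."""
    return limb_symmetry_index(involved, uninvolved) > cutoff


# Hypothetical hop distances in cm, for illustration only.
fresh = limb_symmetry_index(138.0, 150.0)     # non-fatigued condition
fatigued = limb_symmetry_index(129.0, 150.0)  # after a pre-exhaustion set
```

With these illustrative numbers the same athlete would be classified as a ‘pass’ when fresh and a ‘fail’ when fatigued, which is why the preceding test order needs to be reported.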

Assessing athletes under conditions of fatigue may provide valuable information to the tester; however, the inconsistencies in test order displayed in the studies reviewed limit the interpretation of results and our ability to generalise the findings. The number of warm up trials for isokinetic protocols was also frequently not stated, and a range of strength test modes and speeds were used (60°/s, 90°/s and 180°/s), further limiting extrapolation of findings across all studies.

Limb testing order

Most studies tested the uninvolved before the involved limb. Only one study randomised limb order, and the others did not state the order limbs were tested. If the task is always performed on the uninvolved limb first as was the case in six studies, performance on the involved limb may benefit from a learning effect from the preceding hops on the other side. Research studies should counterbalance the limb order where the participant sample is divided in half, and the limb order is reversed for each group to remove this confounding factor. Clinicians may wish to randomise the limb order for each hop performed during the test battery so that the uninvolved limb is not always first.
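The counterbalancing described above can be sketched as follows. This is an illustrative allocation only: the function name, participant identifiers and the use of a seeded random shuffle to form the two groups are assumptions, not a procedure taken from the reviewed studies.

```python
import random


def counterbalance_limb_order(participant_ids, seed=None):
    """Split participants into two halves with reversed limb order.

    Half the sample hops on the uninvolved limb first and the other half
    on the involved limb first; group membership is assigned at random.
    With an odd sample size the extra participant starts on the involved limb.
    """
    rng = random.Random(seed)
    shuffled = list(participant_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    orders = {}
    for index, pid in enumerate(shuffled):
        if index < half:
            orders[pid] = ("uninvolved", "involved")
        else:
            orders[pid] = ("involved", "uninvolved")
    return orders


# Hypothetical participant identifiers.
assignment = counterbalance_limb_order(["P01", "P02", "P03", "P04"], seed=1)
```

Recording the allocated order for each participant alongside the results also makes it straightforward to report limb order in the methods.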

Rest periods

No consistent prescription of rest periods between tests/trials was applied across the studies included in the prior synthesis of the literature.6 Protocols ranged from 30 s to 3 min, with many studies not stating how long participants rested. Previous recommendations have indicated work to rest ratios of at least 1:5 during plyometric exercise.43 Thus, counterbalancing limb order and 30 s rest between trials should provide adequate recovery and control for order and fatigue effects.
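As a rough check of a rest prescription against the 1:5 work to rest ratio, the arithmetic can be sketched as below. The trial durations are hypothetical; the review does not report how long an individual hop trial lasts.

```python
def minimum_rest_seconds(work_seconds: float, rest_ratio: float = 5.0) -> float:
    """Minimum rest implied by a 1:rest_ratio work to rest ratio."""
    return work_seconds * rest_ratio


def rest_is_adequate(work_seconds: float, rest_seconds: float,
                     rest_ratio: float = 5.0) -> bool:
    """True when the prescribed rest meets or exceeds the ratio-based minimum."""
    return rest_seconds >= minimum_rest_seconds(work_seconds, rest_ratio)


# Hypothetical example: a trial lasting about 3 s needs at least 15 s of rest,
# so a 30 s rest period satisfies a 1:5 ratio.
required = minimum_rest_seconds(3.0)
adequate = rest_is_adequate(3.0, 30.0)
```

Under this assumption, 30 s of rest covers any trial lasting up to 6 s.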

Reporting the best versus the mean score

Many studies reported either the best trial (three studies) or mean score (four studies), while the remaining 10 studies did not state which data were used for reporting. There are no clear recommendations for the preferred method (best vs mean score) to report hop test performance, and this is often based on individual preference. For countermovement jump tests, the mean score is more sensitive than the best recorded trial to detect performance changes and monitor neuromuscular fatigue.44 Similarly, systematic bias was shown between test sessions attended a week apart for the best but not mean asymmetry score during an isometric squat test, with lower test retest reliability.45 Given that a limb symmetry index is a composite score derived from the performance of individual limbs, average scores may capture some of the inconsistencies between trials and the innate variability of asymmetry.45 Thus, it is possible that the mean score is preferable when measuring hop performance following ACL reconstruction.
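The choice between best and mean scores can change the resulting limb symmetry index, as the short sketch below illustrates. The trial distances are invented for illustration, and the LSI is assumed to be involved divided by uninvolved, multiplied by 100.

```python
from statistics import mean


def lsi_best_and_mean(involved_trials, uninvolved_trials):
    """Return (LSI from best trials, LSI from mean of trials) in percent."""
    best = max(involved_trials) / max(uninvolved_trials) * 100.0
    average = mean(involved_trials) / mean(uninvolved_trials) * 100.0
    return best, average


# Hypothetical single hop distances (cm): the involved limb is inconsistent.
involved = [120.0, 126.0, 144.0]
uninvolved = [148.0, 150.0, 152.0]
lsi_best, lsi_mean = lsi_best_and_mean(involved, uninvolved)
```

Here one good trial on the involved limb lifts the best-score LSI above 90%, while the mean-score LSI stays below it, so the scoring method alone would decide whether this athlete ‘passes’.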

Including task constraints to avoid movement compensation

Absence of task constraints can affect scores obtained during hop testing due to movement compensations. Only six studies mentioned any landing requirements, with descriptions including ‘stable’, ‘controlled’, ‘without losing balance’ and either ‘hold for 2 s’ or ‘hold for 2–3 s’.

When hop tests are used to assess lower extremity power, it may also be prudent to consider and control contributions from the upper extremities. The inclusion of an arm swing during jumping tasks has been shown to augment performance due to an increase in lower extremity work performed.46 Therefore, when comparing results across different studies, arm placement should be controlled, or at least communicated. None of the included studies described the role of arm movement during testing. In the studies they cited (table 2), inconsistency was present with two requesting athletes place their hands behind the back, while four permitted arm use. Proponents of arm use suggest it is more ‘functional’, but it could be argued that hop testing itself is not truly ‘functional’ and lacks ecological validity for most sports. This was acknowledged by Noyes et al 32 who stated ‘the four hop tests used in this study were not sensitive enough to detect their functional limitations. This could be due to the nature of the tests themselves; they are performed in a safe environment instead of the playing field, there are no opponents or objects to undermine the participant’s concentration, and the activities involve simple hopping with no cutting or twisting motions’.

Ageberg and Cronström47 examined single leg hop performance in participants with lower extremity injury under two conditions: (1) with arms free and (2) arms behind back. Hop distance was shorter on their involved versus uninvolved leg when the arms were placed behind their back, and nearly twice as many participants displayed ‘abnormal’ LSI (<90%). Agreement between the two conditions was also poor. These findings highlight the importance of limiting arm use, and that a stricter protocol should be adopted to avoid overestimating knee function.

Familiarisation and practice trials

Table 1 shows that no studies included a separate familiarisation session where athletes were provided with opportunities to practise the tests prior to data collection. There was also inconsistency in the number of practice trials, ranging from 3–5, 2–3, 1 and ‘a few’. This affects data interpretation, as systematic learning will occur between trials and test sessions in both ‘healthy’ recreational athletes24 and patients following ACL reconstruction.25 34 Significant differences in hop performance have been reported between test sessions 1 and 2, but not sessions 2 and 3 in previous research.25 34 These findings indicate a learning effect, supporting the need for a separate familiarisation session prior to testing.

The number of practice trials provided should allow for adequate familiarisation. Munro and Herrington30 showed that learning effects were present in all four hop tests, where scores improved across trials. Single and triple-hop for distance test scores stabilised after three trials, whereas cross-over hop scores stabilised after four trials. The timed hop stabilised after four trials in women and three in men. Therefore, practice trials should be provided to ensure a more reliable test outcome. Specifically, three practice trials are recommended for the single and triple-hop tests, with an additional trial included for the cross-over hop and timed hop.

Warm up procedures

A warm up is common practice in athletic endeavours. Table 1 shows 12 of the included studies did not state what/if any warm up procedures were applied, while five studies included either cycling, walking or running, with durations ranging from 5 to 10 min. Additional warm up activities included static stretching (one study) and dynamic exercises such as squats and toe raises (one study).

An active warm up can improve performance48 and test outcomes might vary with such a large variation in physiological readiness. It has been shown that practice trials alone are insufficient to elicit maximal strength and jumping performance, which are positively related to muscle temperature.49 50 It is also recommended to avoid static stretching immediately prior to testing as jump performance has been shown to reduce compared with dynamic warm up protocols.51

Selecting the optimal warm up for athletes will depend on a variety of factors. We recommend a protocol consisting of general cardiovascular activity (eg, stationary cycling or jogging performed at approximately 60% of maximum perceived effort) for 5 min and task-specific activities (squats, lunges, practice jumps / hops, etc) to increase muscle temperature and movement pattern sequencing.

Other considerations

Results might be affected by a range of other factors beyond the scope of this review including the use of a knee brace, shoe-surface interaction and the athlete’s state of readiness. These aspects should be considered in the study design and reported within the methods section. In addition, test administrators should be well trained and have a thorough understanding of all protocols and procedures. Examples applied to hop tests have been outlined below.

Point of measurement

Only one study reported this procedure, where measurement was to the heel. Of the cited studies, four measured the distance to the heel and two to the toe. Measuring the distance hopped to the toe does not account for the wide variation in foot length, which is irrespective of the horizontal hop distance achieved during the task. Differences in the point of measurement (heel vs toe) can also affect the LSI score. Practitioners are encouraged to measure the distance hopped from the start line to the participant’s heel. It is also prudent to report the absolute hop distances, and scores relative to leg length or body height to account for different anthropometric profiles, enabling comparisons of performance across different studies and athletes of the same sport/playing level.
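The effect of the measurement point on LSI can be illustrated with a small worked example. It assumes that measuring to the toe simply adds one foot length to the heel-based distance on both limbs, and that LSI is involved divided by uninvolved, multiplied by 100; the distances, foot length and leg length are hypothetical.

```python
def limb_symmetry_index(involved_cm: float, uninvolved_cm: float) -> float:
    """LSI (%) assuming involved / uninvolved * 100."""
    return involved_cm / uninvolved_cm * 100.0


def normalised_distance(distance_cm: float, leg_length_cm: float) -> float:
    """Hop distance expressed as a multiple of leg length."""
    return distance_cm / leg_length_cm


FOOT_LENGTH_CM = 27.0  # hypothetical foot length
heel_involved, heel_uninvolved = 120.0, 150.0  # hypothetical heel-based distances

lsi_heel = limb_symmetry_index(heel_involved, heel_uninvolved)
lsi_toe = limb_symmetry_index(heel_involved + FOOT_LENGTH_CM,
                              heel_uninvolved + FOOT_LENGTH_CM)
relative = normalised_distance(heel_uninvolved, 90.0)  # hypothetical 90 cm leg length
```

Because the same foot length is added to both limbs, the toe-based LSI is pulled towards 100% (about 83% vs 80% here), making the deficit look smaller than the heel-based measurement suggests.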

Reporting details of the test raters

In cases where more than one person is conducting RTS testing, inter-rater reliability should be examined prior to data collection and the relevant statistics should be reported. To illustrate this point, consider the timed hop and the many inherent sources of error. The timer starts when the athlete’s heel leaves the ground at the beginning of the test and stops when the athlete completes the 6-m distance. Thus, clinicians encounter four potential sources of error: (1) detecting the heel raise; (2) hitting start; (3) visually observing the athlete complete the 6 m distance and (4) hitting stop. Ensuring appropriate consistency between raters prior to testing is therefore essential to ensure accurate data collection.

Check for systematic bias between trials

Raters should check the scores of each individual trial during hop testing to control for systematic bias. An athlete’s score can progressively improve during the test (due to learning, increased confidence or warm up effects) or get worse (maybe from fatigue or wavering motivation). In the studies reviewed, only three stated that they checked for systematic bias. The most frequent protocols were ‘if subjects increased their hop performance in all three trials, additional hops were performed until no increase was seen’.11 18 Beischer et al 8 stated ‘if the test administrator felt that an even better result could be achieved, one or two additional hops were allowed’. An alternative is to use values greater than the minimal detectable change (MDC) to indicate the minimal amount of change required to determine if the observed performance increases or decreases between trials are ‘real’, accounting for the measurement error in the test. MDC values for the four hop tests have been reported in ACL patients (single hop: 8.09%; triple-hop: 10.02%; cross-over hop: 12.25%; timed hop: 12.96%; overall combination of hop tests: 7.05%).34 To be confident a ‘real’ systematic increase in performance has been observed, a general guideline of ~10% may be applied, although this may increase for the cross-over and timed hop tests.
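One way to apply the MDC values between trials is sketched below. It assumes the reported percentages can be compared directly with the relative change between two consecutive trials, which is one reading of the guideline rather than a procedure specified in the cited work; the dictionary keys, function names and example distances are hypothetical.

```python
# MDC values (%) reported for ACL patients in the text above
# (cross-over value read as 12.25).
MDC_PERCENT = {
    "single_hop": 8.09,
    "triple_hop": 10.02,
    "crossover_hop": 12.25,
    "timed_hop": 12.96,
}


def percent_change(previous: float, current: float) -> float:
    """Relative change from the previous trial to the current trial (%)."""
    return (current - previous) / previous * 100.0


def exceeds_mdc(previous: float, current: float, test: str) -> bool:
    """True when the change between trials is larger than the test's MDC."""
    return abs(percent_change(previous, current)) > MDC_PERCENT[test]


# Hypothetical single hop distances (cm) on consecutive trials.
small_gain = exceeds_mdc(140.0, 150.0, "single_hop")  # about 7.1% change
large_gain = exceeds_mdc(140.0, 155.0, "single_hop")  # about 10.7% change
```

In this sketch only the larger gain would prompt an additional trial, since the smaller one falls within the measurement error of the test.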

Description of hop test procedures in research studies

Several important methodological procedures were often not stated in the studies reviewed (table 1). Even when citing the methods of previous research (table 2), it was not possible to determine which procedures had been used. An example of this is the study of Kyritsis et al.2 Text from their manuscript states: ‘Single and triple hop distance tests were used’.24–26 The authors cited three manuscripts26 34 35 indicating further procedural information. However, there were differences between these studies in test administration (table 2) and the original manuscript2 did not specify which procedures from each of these studies were employed. Inadequate reporting makes it harder to replicate methodological procedures and threatens not only the external validity (ie, generalisability or applicability) of experimental studies but also that of subsequent systematic reviews. A previous systematic review examined the measurement properties of the hop tests and their relationships with future knee injury.52 The authors reported the methods and terminology varied greatly across studies and stated that a lack of standardisation limits the generality of the findings. We believe the heterogeneity in how data are collected and reported could, at least in part, account for the equivocal results found within the synthesised literature6 and other systematic reviews in this area of research,4 which might influence current practice recommendations. Furthermore, the results of hop testing will almost certainly vary across patient groups (athletic vs non-athletic, older vs younger, presence of comorbidities vs no comorbidities, etc). A clear description of study population characteristics can aid clinicians in their interpretation of the data and translation of results into clinical practice.


Even small alterations to hop test procedures can affect performance outcomes and we propose the following:

  • Increased rigour of test methods and reporting standards to enhance the quality and reproducibility of future research that examines RTS outcomes following ACL reconstruction.

  • A standardised approach is needed with specific protocol instructions to obtain accurate data, heighten test sensitivity, and avoid overestimating or underestimating knee function.

  • The procedural review guidelines provided in tables 1 and 2 could form part of a reporting checklist for future research.
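As an illustration of how such a reporting checklist might be operationalised, the sketch below records the procedural items discussed in this review and lists those a study left unstated. The class and field names are hypothetical and are drawn from the section headings of this review, not from the contents of tables 1 and 2.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class HopTestMethodsReport:
    """Methods details a study could report; None means 'not stated'."""

    test_order: Optional[str] = None
    limb_order: Optional[str] = None
    rest_period_seconds: Optional[float] = None
    score_used: Optional[str] = None          # e.g. "best" or "mean"
    landing_requirement: Optional[str] = None
    arm_position: Optional[str] = None
    familiarisation_session: Optional[bool] = None
    practice_trials: Optional[int] = None
    warm_up: Optional[str] = None
    measurement_point: Optional[str] = None   # e.g. "heel" or "toe"
    rater_reliability: Optional[str] = None
    systematic_bias_check: Optional[str] = None


def unreported_items(report: HopTestMethodsReport) -> List[str]:
    """List the checklist items a study left unstated."""
    return [f.name for f in fields(report) if getattr(report, f.name) is None]


# Hypothetical study that reports only two of the items.
example = HopTestMethodsReport(measurement_point="heel", practice_trials=3)
missing = unreported_items(example)
```

A structure like this mirrors the ‘not stated’ category of the colour code system, making gaps in a methods section easy to see before submission.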

What is already known?

  • There is a lack of consistency in the ability of anterior cruciate ligament (ACL) return to sport (RTS) tests to predict who has a successful clinical outcome following ACL rehabilitation and who has a serious reinjury.

  • We and other clinicians suspect there is insufficient rigour in how the hop tests are administered during an RTS battery.

What are the new findings?

  • Authors reporting procedures used in RTS testing frequently did not detail how they administered the hop tests.

  • We found substantial variation in RTS test administration, scoring and interpretation in different studies that reported hop testing post-ACL reconstruction.

  • Small alterations to hop test procedures can affect performance outcomes, making it difficult to compare the findings of the respective studies; a more standardised approach is required.

  • There is a need for better quality in the reporting of hop test methods to allow practitioners and researchers to compare the findings of different studies. This is an important prerequisite to testing the utility (or not) of RTS batteries (that include the hop test).



  • Twitter @paulread1010, @Seaniemc89, @gregmyer11

  • Contributors The lead author and one other author conceptualised the article contents. The lead author wrote the original draft of the manuscript. All authors contributed substantially to editing the manuscript in preparation to submit a final draft as well as subsequent revisions. All authors read and approved the final copy of the manuscript.

  • Funding The authors have not declared a specific grant for this research from any funding agency in the public, commercial or not-for-profit sectors.

  • Competing interests None declared.

  • Patient consent for publication Not required.

  • Provenance and peer review Not commissioned; externally peer reviewed.