Background Categorical grading and other measurable MRI parameters are frequently utilised for predicting the outcome of hamstring injuries. However, the reliability and smallest detectable difference (SDD) have not been previously evaluated. It therefore remains unclear if the variability in previously reported results reflects reporting variation or actual injury status.
Methods 25 hamstring injuries were scored by two experienced radiologists using the Peetrons grading and specific prognostic MRI parameters of the oedema: distance from the ischial tuberosity (cm), extent (craniocaudal, anteroposterior and mediolateral; cm), maximum cross-sectional area (%) and volume (cm3). The interobserver and intraobserver reliability was calculated along with the SDDs for each scale variable.
Results There were 3 grade 0 (12%), 11 grade 1 (44%), 9 grade 2 (36%) and 2 grade 3 (8%) injuries. Cronbach's α values for grading were 1.00 (inter) and 0.96 (intra), respectively. The intraclass correlation coefficients for the prognostic MRI parameters were between 0.77 and 1.0. The SDDs varied between each parameter.
Conclusions Excellent interobserver and intraobserver reliability was found for grading and prognostic MRI parameters in acute hamstring injuries. In daily practice and research, we can be confident that scoring hamstring injuries by experienced radiologists is reproducible. The documented SDDs allow meaningful clinical inferences to be made when assessing observed and reported changes in MRI status.
Muscle injuries account for up to 30% of all sporting injuries, with the hamstring complex being the most frequently injured site.1–4 MRI is considered useful in confirming injury diagnosis, severity and prognosis, with categorical and continuous scoring systems constituting validated indicators of time to return to a sport.5–9
A recent cohort study in European football established the clinical relevance of a widely used categorical grading system.10,11 However, hamstring injuries may be considered a heterogeneous group and other researchers have focused on prognostic MRI parameters such as intramuscular location and extent of the injury.7,9 For example, the location, in particular the continuous distance to the ischial tuberosity, has a fair5 to good6 correlation with time to return to preinjury function. Similarly, measurements of the extent of the injury in three planes have shown correlation coefficients between 0.39 and 0.74 (table 1).5–8 With increasing MRI availability, understanding of the clinical relevance of each of these variables continues to evolve.
Despite the frequent application of these MRI parameters, there are no data published regarding the reliability and smallest detectable differences (SDDs) in the MRI interpretation of hamstring muscle injuries. As a result, it remains unclear if the variability in study findings reflects a variability in the reporting or actual MRI status. The aim of this study was to evaluate the interobserver and intraobserver reliability and document SDDs of MRI grading and other prognostic parameters in acute hamstring injuries.
The investigation formed part of a randomised controlled trial evaluating acute hamstring injuries (ClinicalTrial.gov number NCT01812564). Approval was obtained from the Ethics Committee of Aspetar, Qatar Orthopaedics and Sports Medicine Hospital and informed consent was obtained from all included patients.
Patients were recruited between November 2009 and December 2012 at an orthopaedic and sports medicine hospital in Qatar. For this substudy, 25 patients out of the recruited cohort who met distinct inclusion criteria (acute onset of posterior thigh pain, MRI performed within 5 days from injury, age >18 years and male) were randomly selected. One investigator randomly selected 25 patients by circling the unique anonymised patient study number on a list of all patients.
The players were positioned supine and examined with a 1.5 Tesla Siemens Espree scanner. In addition to a phased array coil, two body matrix coils were strapped over the thigh and centred over the painful area, as identified by the athlete and marked by the physician. Axial and coronal proton density images with fat saturation were obtained along the longitudinal axis of the thigh (TR/TE 3490/27 and a 512×326 matrix for the coronal images; TR/TE 3000/32 and a 512×333 matrix for the axial images), with one signal average each. The field of view was 25 cm for the coronal images and 24 cm for the axial images, with a 3.5 mm section thickness and no gap.
Prior to the study, two radiologists were familiarised with the MRI scoring protocol, in a trial involving 10 patients. Each radiologist scored the MRIs in random order between May 2012 and January 2013. Radiologist one (EA), who was also involved in other hamstring diagnostic studies, scored 128 MRIs in this period. During this process, MRIs were randomly allocated each week in sets of 3–5 with at least 2 months between the first and second evaluations of the same MRI. Radiologist two (BR) scored sets of 3–5 MRIs on a weekly basis in the same manner.
The radiologists, each with more than 9 years of experience in musculoskeletal radiology and blinded to the clinical status of the injury, independently interpreted the MRIs, scoring them according to a modified Peetrons classification system:10,11 grade 0: no abnormalities; grade I: oedema without architectural distortion; grade II: oedema with architectural distortion; and grade III: complete tear.
Additional prognostic MRI parameters measured were: craniocaudal, transverse and anteroposterior dimensions (cm) of identified oedema, and distance from the most proximal site of oedema to the ischial tuberosity (cm). We subsequently calculated the volume (cm3) of muscle involved and the maximum involved cross-sectional area as a percentage of the total muscle cross-sectional area in the transversal plane.
When more than one muscle was involved, the muscle with the most extensive oedema or tear was scored.
Interobserver and intraobserver reliability was calculated with a one-way random model. For the categorical variable of overall grade, a scoring system (with choices of 0, 1, 2 or 3) per observer per hamstring injury was recorded.10,11 The interobserver reliability for these measures was estimated using Cronbach's α. For the parametric values, the intraclass correlation coefficient (ICC(2,1)) was calculated to estimate reliability. The inter-rater reliability is considered excellent if the ICC is >0.75, fair to good if 0.4<ICC<0.75 and poor if ICC is <0.4.12 The SDD was calculated from the inter-rater reliability analysis.
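As an illustration of the reliability statistics described above, the following is a minimal Python sketch and not the analysis code used in this study. It assumes the Shrout and Fleiss two-way random-effects, single-measure form of ICC(2,1). It also assumes a commonly used convention for the SDD, namely SDD = 1.96 × √2 × SEM with SEM = SD × √(1 − ICC), because the text does not state the formula that was applied. The function names and the example measurements are hypothetical.

```python
import math
from statistics import mean, stdev


def icc_2_1(scores):
    """ICC(2,1) for a table with one row per injury and one column per rater.

    Assumed form: Shrout and Fleiss two-way random effects, single measure.
    """
    n = len(scores)        # number of injuries (subjects)
    k = len(scores[0])     # number of raters
    values = [v for row in scores for v in row]
    grand = mean(values)
    row_means = [mean(row) for row in scores]
    col_means = [mean([row[j] for row in scores]) for j in range(k)]

    # Sums of squares for a two-way layout without replication
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((v - grand) ** 2 for v in values)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


def classify_icc(icc):
    """Apply the cut-offs quoted in the text.

    The text leaves the exact boundary values (0.4 and 0.75) unassigned;
    here they fall into the 'fair to good' band, which is an assumption.
    """
    if icc > 0.75:
        return "excellent"
    if icc >= 0.4:
        return "fair to good"
    return "poor"


def smallest_detectable_difference(scores, icc):
    """Assumed convention: SDD = 1.96 * sqrt(2) * SEM, SEM = SD * sqrt(1 - ICC)."""
    sd = stdev([v for row in scores for v in row])
    sem = sd * math.sqrt(max(0.0, 1.0 - icc))  # guard against rounding above 1
    return 1.96 * math.sqrt(2) * sem


# Hypothetical craniocaudal oedema lengths (cm): one row per injury,
# columns are radiologist one and radiologist two.
lengths_cm = [[12.1, 11.8], [7.4, 7.9], [15.0, 14.6], [3.2, 3.5], [9.8, 10.1]]
icc = icc_2_1(lengths_cm)
print(classify_icc(icc), round(smallest_detectable_difference(lengths_cm, icc), 2))
```

The same functions could be applied to the first and second readings of a single radiologist to obtain the intraobserver values.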
A total of 316 patients met the inclusion criteria, and all 25 selected MRIs were included in the analysis. Patient characteristics are presented in table 2. There were 3 grade 0 (12%), 11 grade 1 (44%; biceps long head (BLH) injuries), 9 grade 2 (36%; 6 BLH, 2 semitendinosus (ST) and 1 semimembranosus (SB) injury) and 2 grade 3 (8%) injuries. The mean values of the prognostic MRI parameters, reliability and SDD data are presented in table 3.
In this study, the interobserver and intraobserver reliability for MRI grading and prognostic parameters in acute hamstring injuries was excellent. When experienced radiologists report MRI data on hamstring strain injuries, we can be confident that their detailed assessment of these muscles is reproducible. This finding, which has not been reported previously, is clinically relevant given the increasing use of MRI in the diagnosis and prognosis of hamstring injuries.
Values for SDDs are critical for our understanding of comparative prognosis between patients and continuous evaluation of individual patients. Despite this, the SDDs for hamstring muscle evaluation with MRI have not been reported previously. The SDD for oedema measurement of approximately 1.0–1.5 cm for three planes highlights that very small differences may accurately reflect true variation in oedema in these planes. The SDDs presented here may be used both when comparing between different patients (for the assessment of relative prognosis), and for serial imaging of the same patient to clarify if the reported changes in MRI parameters (eg, in the length of tear, extent of oedema) are potentially due to measurement error (changes less than the SDD) or not (changes greater than the SDD).
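The interpretation rule described above can be sketched in a few lines of Python. This is only an illustration: the function name, the example measurements and the SDD value of 1.2 cm are hypothetical, chosen from within the approximate 1.0–1.5 cm range quoted in the text, and are not values taken from table 3.

```python
def interpret_change(first_cm, second_cm, sdd_cm):
    """Compare the change between two measurements with the SDD.

    The text does not say how a change exactly equal to the SDD should be
    read; here it is treated as not exceeding the SDD (an assumption).
    """
    change = abs(second_cm - first_cm)
    if change > sdd_cm:
        return "exceeds SDD"
    return "within measurement error"


# Hypothetical serial imaging of one patient: oedema length in cm
print(interpret_change(8.0, 7.2, sdd_cm=1.2))  # change of 0.8 cm
print(interpret_change(8.0, 6.0, sdd_cm=1.2))  # change of 2.0 cm
```

The same comparison applies between two different patients, with the two arguments being their respective measurements of the same parameter.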
Categorical scales for grading muscle injuries are pragmatic and popular with clinicians and patients.10,11,13 This is despite the seemingly arbitrary delineation of grade descriptions. The SDD data illustrated here may provide guidance for the development of sensible cut-off points for any planned subgroup analyses.
Increasingly, studies are utilising MRIs for assessing hamstring muscle injury, its location, extent and relationship with prognosis.6,10 However, the ultimate significance of many of the imaging findings described remains to be determined and, as such, the importance of a clear history and examination must not be lost.13,14
While these data suggest good levels of reliability, it is important to note that this was between two experienced musculoskeletal radiologists, after a familiarisation trial involving 10 patients, using 1.5 Tesla field strength and high-resolution MRI (3.5 mm slices). One should be cautious about extrapolating these data to less experienced radiologists, who may not have such an opportunity for familiarisation and appraisal. Although our reported ICCs were excellent, this may not translate to other studies, and an assessment of reliability should be considered essential in any future research or analysis of MRI grade subgroups. Future research may nevertheless utilise these reliability and SDD data to clarify the nature of the relations between MRI parameters and clinical outcomes.
Similarly, future technical developments may deem 1.5 Tesla MRI to have inadequate sensitivity for specific variables of interest in muscle injury diagnosis, as 3.0 Tesla MRI already appears to be clinically more sensitive. However, in all recently published high-level studies, 1.5 Tesla MRI was used to classify the hamstring injury (Ekstrand et al2, Askling et al6 and Silder et al15). Future research may include reliability studies on 3.0 Tesla MRI.
In conclusion, this is the first study to evaluate the interobserver reliability and SDD in assessing the MRI grading, location and extent of hamstring injuries. An excellent interobserver and intraobserver reliability was found. The SDDs presented allow clinically meaningful inferences to be made when comparing within-subjects and between-subjects with hamstring muscle injuries.
Contributors BH designed the study, monitored the data collection, interpreted the data, and drafted and revised the paper. RW analysed and interpreted the data and revised the paper. EA and BR analysed the MRIs, interpreted the data and revised the paper. CG interpreted the data and revised the paper. JT designed the study, monitored the data collection, interpreted the data, and drafted and revised the paper.
Competing interests None.
Ethics approval Approval was obtained from the Ethics Committee of Aspetar, Qatar Orthopaedics and Sports Medicine Hospital.
Provenance and peer review Not commissioned; externally peer reviewed.