Background There is currently no standardised MRI evaluation protocol for athletes who present with symptoms that may relate to the pubic symphysis, the pubic bones, and the adductor muscle insertions. We outline the protocol and reliability data.
Material and methods Three musculoskeletal radiologists developed an 11-element MRI evaluation protocol defined according to precise criteria and illustrated in a pictorial atlas. Eighty-six male athletes (soccer players and non-soccer players) underwent standardised 3 Tesla MRI of the pelvis. Two external musculoskeletal radiologists were trained to use the protocol and pictorial atlas during two sessions of 2–4 h each. Each radiologist rated all 86 MRI independently. One radiologist evaluated the scans once, the other twice 2 months apart. Cohen κ statistics were used to determine intraobserver and interobserver agreement.
Results The main findings were (1) substantial intraobserver (κ range 0.65–0.67) and moderate interobserver (κ range 0.45–0.52) agreement in rating pubic bone marrow oedema, (2) substantial-to-moderate intraobserver (κ range 0.49–0.72) and moderate-to-fair interobserver (κ range 0.21–0.52) agreement in rating most other MRI findings, and (3) slight intraobserver and interobserver (κ range −0.06 to 0.05) agreement in rating adductor longus tendinopathy.
Conclusions The Copenhagen Standardised MRI protocol demonstrated moderate-to-substantial reliability in rating bone marrow oedema, and varied from fair-to-substantial agreement for the majority of MRI features, but showed only slight agreement in rating adductor longus tendinopathy. This rigorous investigation also confirms that while MRI evaluation seems to provide reasonable reliability in rating pubic bone marrow oedema, the evaluation of adductor tendinopathy in a clinical and research setting needs further resolution by continued development and testing of MRI acquisition protocols.
Athletes participating in sports such as soccer, rugby and ice-hockey1–3 are frequently affected by hip and groin pain.4 ,5 MRI is often used as part of the clinical assessment and in groin/hip pain research in the sport setting. MRI allows a detailed three-dimensional visualisation of the entire pelvis. Fat-saturated fluid-sensitive sequences combined with conventional T1-weighted sequences enable detection of fluid-related signal intensity (SI) changes, with excellent visualisation of soft tissues and bone marrow.6 ,7 Five types of MRI findings have been reported in long-standing athletic groin pain: degenerative changes around the symphysis,8–14 the secondary cleft sign,11 ,13 ,15 pathology at the adductor muscle insertions to the pubic bones,8 ,14–18 pubic bone marrow oedema (BMO),8–15 ,18–21 and recently the superior cleft sign.22
Only four studies of MRI in groin pain in athletes have included reliability assessments.8 ,11 ,12 ,19 These studies classified MRI findings differently due to different diagnostic criteria, and reported them without a detailed description of specific scan sequences and image evaluation protocols used for analysis,23 thus making it difficult for others to reproduce these results (table 1). No previous MRI study has assessed pathological findings in athletes with long-standing groin pain, using a detailed MRI evaluation protocol and atlas with images easily available for the reader.23
Therefore we developed a standardised MRI evaluation protocol for hip and/or groin pain in athletes with a corresponding pictorial atlas, and we assessed its intra- and interobserver reproducibility when used by two musculoskeletal radiologists.
Material and methods
This paper includes data from a large cohort study exploring hip and/or groin pain, self-reported outcome, clinical data, muscle strength, range of motion and radiological findings in male soccer players.24 ,25 All participants provided written informed consent according to the Helsinki Declaration prior to inclusion in the study, which was approved by the Danish National Committee on Health Research Ethics (H-2-2010-127), and the Danish Data Protection Agency (2011-41-5964).
Forty subelite soccer clubs (Division 1–4) in Eastern Denmark were contacted and informed about the study. Players suffering from hip and/or groin pain were offered the opportunity to be examined clinically at our institution as part of the research project. Forty-eight symptomatic male soccer players referred themselves for research purposes. Two were excluded because of acute injuries needing treatment, thereby preventing the blinding of observers. Twenty activity-matched asymptomatic male soccer players and 20 asymptomatic male non-soccer playing athletes (all participants without hip and/or groin pain in the previous year) were also included. The latter had never played soccer in an organised setting and were practising sports where kicking movements, running and sharp turns were not predominant: fitness (N=9), martial arts (N=3), running (N=3), cycling (N=2), parkour (N=1), kayak (N=1) and basketball (N=1).
Thus, our total participant cohort consisted of 86 male participants representing a mixed group of symptomatic and asymptomatic soccer and non-soccer players (age range: 18–41 years; mean age 24.3±3.86 years (SD); training hours/week: median 6.0, IQR 4.0–8.0) included prospectively from August 2011 to July 2012. Inclusion and exclusion criteria for study participants are listed in table 2. All participants completed a Hip and Groin Outcome Score (HAGOS) questionnaire, developed and validated previously at the Sports Orthopedic Research Center,26 to assess and document the severity of their self-reported symptoms.
All participants underwent an identical MRI protocol performed on a 3 Tesla Siemens Magnetom Verio system (Siemens, Erlangen, Germany), with a surface coil (32 channel high-resolution body coil) centred at the pubic symphysis and covering the pelvic area. Participants were examined in the supine position. Eight scan sequences were performed, and MR parameters are displayed in table 3. The axial oblique plane was tilted 50° from the horizontal plane, and oriented parallel to the long axis of the superior pubic rami (figure 1). A radiologist (SB), unaware of any clinical information on study participants, and communicating with them only to provide instructions about the scan session, performed all MRI and stored them in the institution's Picture Archiving and Communication System (PACS).
MRI evaluation protocol development
A work group was formed to develop a detailed MRI evaluation protocol, and consisted of two experienced musculoskeletal radiologists (BHB and MB, with a combined 25 years of musculoskeletal and sports radiological experience) and one junior radiologist (SB, 3 years of experience). The work group used PC workstations with imaging software Impax ES, DS3000 (Agfa HealthCare, Mortsel, Belgium) and dedicated high-resolution viewing monitors, using all window settings including magnification, but without access to clinical information. All 86 MRI were included in the same work list, and reordered according to the date of birth.
To develop the standardised MRI evaluation protocol, the work group started with a review of the existing scientific literature on radiological findings in athletic groin pain,23 and identified eight MRI findings: BMO, adductor longus tendinopathy, secondary cleft sign, superior cleft sign, rectus abdominis tendinopathy, and three types of degenerative changes at the symphysis (symphyseal sclerosis, subchondral cysts/joint surface irregularities, and central disc protrusion). The work group defined a standardised measurement technique for symphyseal-related BMO, measuring its extent on an ordinal scale (grade 0–3) from the symphyseal joint margin on axial oblique sequences along the long axis of the involved pubic ramus. Fatty infiltration in the bone marrow was added, although it has never been validated in the literature. The initial MRI evaluation protocol thus consisted of nine different categories of MRI findings (V.1), whose presence or absence (including particular MRI sequences used) were defined and described in detail for standardisation purposes.
The evaluation protocol was used to assess all 86 MRI in an initial consensus reading performed by the three members of the work group sitting together, blinded to all clinical details. Subsequently the work group agreed to include two features that enabled better anatomical discrimination: the adductor longus musculotendinous lesion (to differentiate the increased signal at the myotendinous junction from the increased signal within the tendon), and the parasymphyseal high-intensity line. The latter is a high-intensity line visible on fluid-sensitive sequences within the pubic bone underlying and parallel to the subchondral symphyseal bone plate. As opposed to the secondary cleft sign, it does not communicate with the symphyseal joint space. Eleven MRI findings were thus included in the second version of the evaluation protocol.
We created a standardised pictorial atlas to illustrate normal and pathological findings with characteristic images carefully selected according to the MRI sequence (imaging plane and weighting) described in the evaluation protocol. The work group presented the final MRI evaluation protocol and pictorial atlas to three experts in athletic hip and groin pain (one orthopaedic surgeon (PH), one physiotherapist (KT) and one musculoskeletal ultrasound radiologist (MBN)) at a meeting to discuss and agree on the clinical relevance of each of the eleven MRI features included in the MRI evaluation protocol (figure 2). The evaluation protocol and pictorial atlas are illustrated in table 4 and in an online supplementary file.
A second blinded consensus reading of all 86 scans was performed by the work group according to the final version of the MRI evaluation protocol and atlas, in a manner similar to the first joint consensus reading. As there exists at present no gold standard with which to compare MRI images of athletes with long-standing groin pain, this second consensus reading was used as the best available option to assess the prevalence of various MRI appearances in the cohort.
Two experienced external musculoskeletal radiologists were subsequently selected to assess the intraobserver and interobserver agreement when evaluating all 86 MRI independently and blinded according to the evaluation protocol. They had been involved neither in the project nor in developing the protocol, and they worked with sports medicine radiology, although not assessing athletes with long-standing groin pain on a daily basis. One (MC-P) had 20 years of expertise in musculoskeletal radiology, and the other (EM) had 12 years of musculoskeletal MRI experience at major hospitals in Copenhagen, Denmark. They were provided with the MRI evaluation protocol and a high-quality printout of the atlas. A radiologist from the work group (SB) instructed them during two sessions of 2–4 h each, where the evaluation protocol and atlas were carefully reviewed and discussed. Subsequently, each observer evaluated five MRI of male patients not included in the study under supervision of the instructor (SB). Finally, one radiologist (EM) assessed all 86 scans in a first and second reading 2 months apart. The second reading was performed without access to the first reading, and scans were ordered in the same sequence as the first reading to ensure an equal time lapse between readings. The other radiologist (MC-P) performed a single reading of all scans. Reading the 86 scans took the first radiologist (EM) 12 h for the first reading and 6 h for the second reading, and took the other radiologist (MC-P) 15 h.
The two readings by the same radiologist (EM) were compared for intraobserver agreement, and the first readings by both radiologists (EM and MC-P) for interobserver agreement. All MRI findings were evaluated as binary variables, and BMO additionally as an ordinal variable (grade 0–3). Unweighted Cohen κ statistics were used for the binary variables to determine intraobserver and interobserver agreement on each side (right/left), and linear-weighted κ statistics for BMO on an ordinal scale.27 ,28 Agreement was expressed as κ values between 0 and 1 with 95% CI, and interpreted according to the recommendation of Landis and Koch29 (table 5). To ensure that a significant κ value could be calculated for as many variables as possible, the blinded consensus reading performed by the work group was used to establish the approximate prevalence of MRI findings in our cohort.30 Prevalences ranged from 10% to 70%, but were below 10% for the secondary cleft sign and rectus abdominis tendinopathy. We had no assumptions on a minimum acceptable value of κ, and set it at zero in the null hypothesis. If the true κ is 0.40 and the prevalence ranges from 10% to 70%, 54 observations provide 90% power at the 5% significance level in a one-tailed test of κ=0. We included all 86 MRI to maximise the power of the study, and calculated 95% CI to reflect sampling error, because if the prevalence of a finding is either very high or very low, chance agreement is also high, and the κ value for the given finding is reduced accordingly.30 Weighted κ analyses were performed with statistical software R (Vienna, Austria),31 and all other analyses with SPSS Statistics V.19.0.
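The two agreement statistics named above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the analysis code used in the study (which was run in R and SPSS): the function name `cohen_kappa` and the toy ratings are hypothetical, and linear weights are taken as |i − j|/(k − 1) for k ordered categories.

```python
def cohen_kappa(ratings_a, ratings_b, categories, weights=None):
    """Cohen's kappa for two sets of ratings of the same cases.

    weights=None gives the unweighted statistic; weights="linear" gives
    linear-weighted kappa, where `categories` must be listed in order.
    """
    n = len(ratings_a)
    k = len(categories)
    index = {c: i for i, c in enumerate(categories)}

    # Cross-tabulate the two readings as counts.
    table = [[0] * k for _ in range(k)]
    for a, b in zip(ratings_a, ratings_b):
        table[index[a]][index[b]] += 1

    row = [sum(table[i]) / n for i in range(k)]
    col = [sum(table[i][j] for i in range(k)) / n for j in range(k)]

    def disagreement(i, j):
        if weights == "linear":
            return abs(i - j) / (k - 1)
        return 0.0 if i == j else 1.0

    observed = sum(disagreement(i, j) * table[i][j] / n
                   for i in range(k) for j in range(k))
    expected = sum(disagreement(i, j) * row[i] * col[j]
                   for i in range(k) for j in range(k))
    if expected == 0:
        return float("nan")  # both readings used a single category only
    return 1.0 - observed / expected


# Toy data (not study data): a binary finding rated twice in 10 scans.
reading_1 = [1, 1, 1, 0, 0, 0, 1, 0, 1, 0]
reading_2 = [1, 1, 0, 0, 0, 1, 1, 0, 1, 0]
kappa_binary = cohen_kappa(reading_1, reading_2, categories=[0, 1])

# Toy data (not study data): an ordinal grade 0-3 rated twice in 8 scans.
grade_1 = [0, 0, 1, 1, 2, 2, 3, 3]
grade_2 = [0, 1, 1, 2, 2, 3, 3, 3]
kappa_ordinal = cohen_kappa(grade_1, grade_2, categories=[0, 1, 2, 3],
                            weights="linear")

print(f"{kappa_binary:.2f} {kappa_ordinal:.2f}")  # prints "0.60 0.70"
```

With linear weights, a disagreement of one grade is penalised one third as much as a disagreement of three grades, which is why the weighted statistic suits the ordinal BMO grade while the unweighted statistic suits present/absent findings.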
Results
The intraobserver and interobserver agreement analyses are displayed in table 6.
The intraobserver agreement was substantial in rating the presence of BMO (κ=0.67 and 0.65 for the right and left side, respectively), BMO on a grading scale 0–3 (κ=0.65 and 0.66), fatty infiltration in the bone marrow (κ=0.72 and 0.69), superior cleft sign (κ=0.65 and 0.66) and rectus abdominis tendinopathy (κ=0.66). Agreement was moderate for the parasymphyseal high-intensity line (κ=0.50 and 0.58), secondary cleft sign (κ=0.49 and 0.65) and disc protrusion (κ=0.52). However, rating adductor longus tendinopathy yielded poor-to-slight agreement (κ=−0.06 and 0.05).
Overall, there was moderate interobserver agreement in rating the presence of BMO (κ=0.46 and 0.52), BMO on a grading scale (κ=0.48 and 0.45) and the parasymphyseal high-intensity line (κ=0.52 and 0.50). There was moderate-to-fair agreement in rating fatty infiltration in the bone marrow (κ=0.33 and 0.33), the superior cleft sign (κ=0.48 and 0.23), the secondary cleft sign (κ=0.21 and 0.28), and central disc protrusion (κ=0.33). For adductor longus tendinopathy (κ=0.02 and 0.05) and rectus abdominis tendinopathy (κ=0.31 and −0.12), the κ values were low, indicating only slight agreement.
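The verbal labels attached to the κ values above follow the Landis and Koch bands. As table 5 is not reproduced in this text, the sketch below assumes the commonly cited cut-offs (below 0 poor, 0.00–0.20 slight, 0.21–0.40 fair, 0.41–0.60 moderate, 0.61–0.80 substantial, 0.81–1.00 almost perfect); the function name `landis_koch` is hypothetical.

```python
def landis_koch(kappa):
    """Map a kappa value to the commonly cited Landis and Koch label."""
    if kappa < 0:
        return "poor"
    if kappa <= 0.20:
        return "slight"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"


# A few of the kappa values reported in the text.
reported = {
    "BMO presence, interobserver, right": 0.46,
    "fatty infiltration, intraobserver, right": 0.72,
    "secondary cleft sign, interobserver, right": 0.21,
    "adductor longus tendinopathy, intraobserver, right": -0.06,
    "adductor longus tendinopathy, intraobserver, left": 0.05,
}
for finding, kappa in reported.items():
    print(f"{finding}: kappa={kappa:+.2f} -> {landis_koch(kappa)}")
```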
Discussion
We report what we hope will be adopted as a standardised MRI protocol for evaluation of the pubic symphysis, the pubic bones and the adductor muscle insertions in athletes with hip and groin pain. As with any clinical instrument, we are open to modification of the Copenhagen Standardised MRI protocol (CSM protocol for short) as the technology changes, and in response to new data.
We found (1) substantial intraobserver and moderate interobserver agreement in rating BMO; (2) substantial-to-moderate intraobserver and moderate-to-fair interobserver agreement in rating fatty infiltration in the bone marrow, the superior cleft sign, the parasymphyseal high-intensity line, the secondary cleft sign, the adductor longus musculotendinous lesion, and degenerative changes at the symphysis; (3) poor-to-slight intraobserver and interobserver agreement in rating adductor longus tendinopathy and (4) substantial intraobserver, but moderate-to-slight interobserver agreement in rating rectus abdominis tendinopathy.
Bone marrow oedema
We measured the extent of BMO in a standardised manner from the symphyseal joint margin on axial oblique sequences along the long axis of the involved pubic ramus. Our results suggest that this technique is reliable. However, detecting the presence of BMO on fluid-sensitive sequences is difficult because the normal pubic bone marrow is heterogeneous,32 there may be local magnetic field inhomogeneities,33 and motion and truncation artefacts from the urinary bladder mimic diffuse increased SI in the pubic bone marrow.34 The two external radiologists may have differed in rating BMO because (1) they disagreed on whether increased SI in the bone marrow represented BMO or bladder artefacts, and (2) they subjectively evaluated the regional extent of increased SI differently, thereby disagreeing on BMO grade (0–3). In comparison, of the previous studies assessing the reliability of grading BMO, one12 reported an interobserver κ value of 0.85 and another19 reported intraclass correlation coefficient (ICC) values of 0.52 for both intraobserver and interobserver agreement, but neither described a detailed grading method, thereby preventing retesting of reported results.
Pubic symphysis and rectus abdominis
Our evaluation protocol was specifically aimed at the pubic symphysis region, requiring in-depth knowledge and experience in reviewing pubic anatomy that may only be acquired through regular assessment of this patient category. The difficulty of MRI interpretation is likewise suggested by a previous MRI study on athletic long-standing groin pain that reported reliability values similar to ours8 (interobserver κ values from 0.2 to 0.51 when assessing individual areas of anatomy). Studies assessing the secondary cleft sign found excellent agreement (κ=1.0) for simultaneously identifying a secondary cleft sign at symphyseal cleft injection fluoroscopy and MRI,11 and substantial agreement for identifying this sign on MRI of asymptomatic hockey players (intraobserver and interobserver ICC of 0.64 and 0.61, respectively).19 The lower intraobserver and interobserver κ values for the secondary cleft sign and rectus abdominis tendinopathy in our study may not reflect the true agreement30 because of an overall low expected prevalence of secondary cleft signs (5%) and rectus abdominis tendinopathy (<10%) in our cohort. The reported MRI prevalence of rectus abdominis lesions differs: it was low (0%) in a cross-sectional study assessing asymptomatic athletes,19 but high (65% of clinically suspected rectus lesions) in a retrospective study assessing symptomatic athletes.15 In the latter, however, the clinical examination process was not described and cannot therefore be reproduced.
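The effect of low prevalence on κ can be illustrated with two hypothetical 2×2 tables that share the same observed agreement. The counts below are invented for illustration and are not study data; the function name `kappa_2x2` is likewise hypothetical.

```python
def kappa_2x2(both_pos, only_r1_pos, only_r2_pos, both_neg):
    """Unweighted Cohen's kappa from the four cells of a 2x2 table."""
    n = both_pos + only_r1_pos + only_r2_pos + both_neg
    p_observed = (both_pos + both_neg) / n
    p1 = (both_pos + only_r1_pos) / n  # positive rate, reader 1
    p2 = (both_pos + only_r2_pos) / n  # positive rate, reader 2
    p_chance = p1 * p2 + (1 - p1) * (1 - p2)
    return (p_observed - p_chance) / (1 - p_chance)


# Both tables: 100 scans, readers agree on 90 of them.
balanced = kappa_2x2(45, 5, 5, 45)  # finding present in about half the scans
rare = kappa_2x2(1, 5, 5, 89)       # finding present in about 6% of scans

print(f"{balanced:.2f} {rare:.2f}")  # prints "0.80 0.11"
```

In the second table, chance agreement is already about 0.89 because both readers call most scans negative, so 90% observed agreement yields a κ of only about 0.11.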
Irrespective of the anatomical region, MRI scans contain inherent image artefacts.35 Moreover, agreement between radiologists varies even if observers have extensive experience in evaluating the anatomical region and patient demography of interest. MRI studies examining pathologies in other anatomical areas have reported poor reliability in assessing tendinopathy (κ=0.12–0.60),36 and variable reliability in assessing Modic changes37 in patients with low back pain38–40 and knee osteoarthritis pathology41 depending on the experience of observers. Taken together, these data highlight the importance of a personal learning curve, and underline that subjective differences between radiologists interpreting MRI have a fundamental influence on the radiological workup of athletes with groin pain.
No previous studies have assessed the reproducibility in rating adductor longus tendinopathy, musculotendinous lesion and the superior cleft sign.22 By standardising the MRI protocol, we aimed to achieve agreement for these adductor-related findings, but we obtained fair-to-moderate interobserver and substantial intraobserver agreement in rating the superior cleft sign and the adductor musculotendinous lesion, and low κ values in rating adductor tendinopathy. One observer (EM) assigned 97% positive adductor tendinopathy at the first reading and 78% at the second reading, whereas the other observer (MC-P) reported 50%, indicating that the first observer had a lower subjective threshold for assigning increased adductor tendon SI. Our reliability analysis suggests that interpreting images according to our definition of adductor longus tendinopathy as “increased signal intensity within the adductor longus tendon on fluid-sensitive sequences and/or bulging of the tendon” may have been influenced by the presence of artefacts in the images. Increased signal within a tendon can be overestimated at MRI, because of artefacts such as the magic-angle phenomenon42 or interdigitation of muscular and tendinous fibres near their bony insertion.43 Such artefacts are well known in shoulder MRI of the supraspinatus and infraspinatus tendons,43 ,44 and patellar tendons45 mimicking tendinopathy in tendons of asymptomatic individuals.46 Evaluation of adductor longus tendinopathy and rectus abdominis tendinopathy may in future be improved through modifications of MRI acquisition protocols, such as using thinner slices and smaller slice gaps, possibly also by using intravenous gadolinium, and volumetric scan techniques.
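A difference in how often two readings call a finding positive caps the κ that can be reached, whatever the case-by-case agreement. The sketch below computes that ceiling from the two positive rates alone. It assumes, for illustration only, that the percentages quoted above apply to a single side and reading pair (the text does not break them down by side); the function name `max_kappa` is hypothetical.

```python
def max_kappa(p1, p2):
    """Largest kappa attainable when two readings have positive rates p1 and p2."""
    # Best case: the reading with fewer positives agrees on all of them,
    # and likewise for the negatives.
    p_observed_max = min(p1, p2) + min(1 - p1, 1 - p2)
    p_chance = p1 * p2 + (1 - p1) * (1 - p2)
    return (p_observed_max - p_chance) / (1 - p_chance)


inter = max_kappa(0.97, 0.50)  # EM first reading vs MC-P
intra = max_kappa(0.97, 0.78)  # EM first vs second reading

print(f"{inter:.2f} {intra:.2f}")  # prints "0.06 0.20"
```

Under these assumptions, the ceiling of about 0.06 for the interobserver comparison is in line with the low κ values reported for adductor longus tendinopathy.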
Strengths and limitations
This is the first standardised MRI evaluation protocol for assessment of athletes with hip and/or groin pain. It was developed in a stepwise manner to ensure a detailed illustration of the method used by the work group; it was subsequently tested by two unbiased blinded external radiologists to provide a realistic assessment of its reproducibility. The cohort consisted of a representative group of athletes (symptomatic and asymptomatic soccer players, and asymptomatic non-soccer players) who may present routinely at a clinical sports medicine practice. The experienced musculoskeletal radiologists assessed images individually, blinded, and at their own pace in a realistic setting, guided by a concise systematic evaluation protocol and pictorial atlas.
There are a number of limitations to this study. Evaluation of rectus abdominis tendinopathy and the secondary cleft sign was limited by their low prevalence, and we obtained wide CIs for κ values of the superior cleft sign, adductor musculotendinous lesion and rectus abdominis tendinopathy. Second, MRI interpretation is difficult to standardise as it is inherently subjective, and as there exists no gold standard with which to compare images. Differences in interobserver and intraobserver assessment of MRI findings occurred because the radiologists would rate individual patients differently, because they would systematically rate a given MRI finding differently as present or absent, or because they would use specific MRI sequences differently for evaluation.
Finally, MRI scans contain inherent image artefacts,35 that may simulate pathological conditions and produce pitfalls in interpretation when evaluating the pubic bones and adductor/rectus abdominis tendons of athletes with long-standing groin pain. The most prominent artefacts in the groin region are motion artefacts from the urinary bladder that follow the phase-encoding axis of the image,34 truncation artefacts appearing as parallel striations close to tissue interfaces where there is an abrupt and marked change in SI,47 and magic angle artefacts48 visualised as increased intratendinous SI on scan sequences with a short echo time, which occur when tendons lie at an angle of approximately 55° to the main magnetic field.42
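The angle of approximately 55° quoted for the magic-angle artefact is the angle at which the orientation term 3cos²θ − 1 of the dipolar interaction vanishes. A short sketch of that calculation follows; the names `dipolar_factor` and `magic_angle_deg` are illustrative.

```python
import math


def dipolar_factor(theta_deg):
    """Orientation term 3*cos^2(theta) - 1 for a fibre at theta to the main field."""
    c = math.cos(math.radians(theta_deg))
    return 3.0 * c * c - 1.0


# Solve 3*cos^2(theta) - 1 = 0 for theta between 0 and 90 degrees.
magic_angle_deg = math.degrees(math.acos(1.0 / math.sqrt(3.0)))

print(f"{magic_angle_deg:.1f}")  # prints "54.7"
```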
Conclusion
We developed a standardised MRI evaluation protocol, and overall achieved moderate-to-substantial intraobserver and fair-to-moderate interobserver agreement for most MRI findings, and slight-to-poor agreement for adductor longus tendinopathy (κ from −0.06 to 0.05) in this methodologically rigorous investigation. While the CSM protocol seems to provide reasonable reliability for rating BMO and most other MRI features, adductor longus tendinopathy and rectus abdominis tendinopathy still need resolution by further testing of MRI acquisition protocols.
What are the new findings?
We present the first standardised MRI evaluation protocol designed for patients with athletic hip and/or groin pain. The protocol includes eight specific imaging sequences that take approximately 40 min to complete.
We report 11 features of possible pathology in the region that includes the symphyseal joint, the pubic bones, and the adductor and rectus abdominis insertions at the pubic bones.
The standardised MRI evaluation protocol for hip and/or groin pain in athletes overall achieved moderate-to-substantial intraobserver (κ ranging from 0.49 to 0.72) and fair-to-moderate interobserver (κ ranging from 0.21 to 0.52) agreement for most MRI findings. Rating adductor longus tendinopathy yielded poor-to-slight agreement (κ=−0.06 and 0.05).
How might it impact on clinical practice in the near future?
If adopted by radiologists, the Copenhagen Standardised MRI protocol will enable clinicians, researchers and policymakers to compare MRI findings in athletes with hip and/or groin pain.
The authors wish to thank Professor Carsten Thomsen and radiographer Poul Henrik Frandsen for their contribution to programming MRI sequences; Ulla Brasch Mogensen, Department of Biostatistics, Copenhagen University for statistical assistance; medical student Martin Nielsen for practical assistance; and the medical and physiotherapy students who helped with the initial contact with soccer coaches and teams.
Contributors SB has substantially contributed to the planning of the study, conception, design, drafting and revision of the manuscript, as well as to the collection, analysis and interpretation of data. KT, BHB, MB, EM, MC-P, MBN and PH have substantially contributed to the conception, design and revision of the manuscript, as well as to the interpretation of data, and they also gave final approval of the version to be published.
Competing interests None.
Ethics approval The Danish National Committee on Health Research Ethics (H-2-2010-127), and the Danish Data Protection Agency (2011-41-5964).
Provenance and peer review Not commissioned; externally peer reviewed.