The development and validation of a scoring system for shoulder injuries in rugby players
  1. Simon Benedict Roberts1,
  2. Lennard Funk2,3
  1. 1Department of Trauma and Orthopaedic Surgery, University of Edinburgh/South East Scotland Deanery, Edinburgh, Midlothian, UK
  2. 2Department of Orthopaedic & Sports Science, University of Salford, Salford, Lancashire, UK
  3. 3Wrightington Upper Limb Unit, Wrightington Hospital, Wrightington Wigan and Leigh NHS Trust, Wrightington, Lancashire, UK
  1. Correspondence to Simon B Roberts, Department of Trauma and Orthopaedic Surgery, University of Edinburgh/South East Scotland Deanery, 12 Littlejohn Avenue, Edinburgh, Midlothian EH10 5TG, UK; simonroberts100{at}


Background Shoulder injuries are relatively common among professional rugby players and result in a large proportion of days absent from training and competition. No instrument exists that is designed and validated to assess function or outcome following therapeutic interventions in rugby players sustaining shoulder injuries. The objective was to develop and validate an athlete-reported scoring system to assess shoulder function in rugby players following shoulder injuries.

Methods Potential items for the scoring system were identified by a literature review of shoulder-specific scoring systems (n=46), and by interviewing professional rugby players (n=38) and medical staff (n=12). Redundant and clinician-assessed items were excluded. A second set of interviews with rugby players (n=8) determined the frequency importance product (FIP) of potential items. The 20 items with the highest FIPs were selected for the provisional Rugby Shoulder Score (RSS), which was tested for internal consistency and reliability by administering it to rugby players with stable shoulder injuries (n=11).

Results The literature review and interviews identified 575 items, of which 105 items were neither clinician-assessed nor redundant. Twenty items with the highest FIPs were selected for the RSS. The RSS demonstrated excellent internal consistency (Cronbach's α=0.96) and reliability (intraclass correlation coefficient=0.941, paired Student's t test p>0.05).

Conclusions A reliable athlete-reported scoring system for assessing shoulder injuries in rugby players has been developed that incorporates the most important factors for rugby players recovering from shoulder injuries. Further prospective testing of the instrument is being undertaken to determine its discriminative and evaluative functions and construct validity.

Introduction


Shoulder injuries comprise between 9% and 11% of injuries among professional rugby players,1–3 resulting in a high proportion of days absent from competitive sport.1,2 Shoulder injuries in rugby players have a high recurrence rate,3 indicating that effective treatment and rehabilitation are hard to accomplish and that significant scope exists for improvements in treatment strategies. No scoring system or outcome instrument has been specifically designed or validated for use in rugby players.4 Development of a scoring system specifically for shoulder injuries in rugby players would enable improved evaluation of these injuries and outcomes after treatment interventions.

Several instruments exist to assess disease of the upper limb and the shoulder specifically, including the ASES,5 Constant Score,6 DASH,7 Oxford Shoulder Scores,8,9 Rating Sheet for Bankart Repair,10 RC-QoL, Shoulder Rating Questionnaire,11 Simple Shoulder Test,12 SPADI,13 UCLA Shoulder Score14 and Western Ontario Shoulder Indices.15–17 These are not appropriate for use as athlete-reported scoring systems in rugby players as they are validated only for use in the general population and are specific to a single disease or operation, are incompletely validated, or include clinician-assessed variables.

The requirements of shoulder function for rugby players are different to those of the general population. Elite athletes often continue to compete despite shoulder injuries, manifest symptoms only during training and competition, function well in activities of daily living and obtain satisfactory scores on existing shoulder scoring systems when injured. It is therefore appropriate to develop assessment tools specifically for groups of athletes with unique functional requirements to optimise the assessment of outcomes in these groups.

Three shoulder outcome scores have been previously developed for the athlete's shoulder. Tibone and Bradley arbitrarily developed an athlete's shoulder outcome score, which was modified by Kuhn and Hawkins to improve the evaluation of an athlete's return to preinjury performance level,18 although no published data are available regarding validation of these two instruments. The Kerlan-Jobe Orthopaedic Clinic Score19 is a recently developed, athlete-reported outcome measure for the upper limb but is not specific to shoulder pathology.

The objective of this research was to develop and validate an athlete-reported scoring system for the assessment of shoulder injuries in rugby players. The scoring system is designed to assess shoulder function in rugby players after injury (discriminative function), as well as before and after treatment interventions including surgery (evaluative function). The scoring system will be tested for reliability, responsiveness, discriminative and evaluative functions, and construct validity. It is intended to permit performance-based assessment of outcomes in this high-demand patient group and to assist in the development of evidence-based treatment strategies to improve outcomes for rugby players suffering shoulder injuries.

Materials and methods

Participant recruitment

The aim was to develop an athlete-reported scoring system for assessing the severity and treatment of shoulder injuries in rugby players. Rugby players employed in the professional Rugby Union and Rugby League teams in the UK and Ireland were invited to participate in the research.

Item generation

Identification of potential items for inclusion in the scoring system must be comprehensive as only items included in this stage can be selected for the final scoring system, and items cannot be introduced after this stage. The items for the scoring system were generated in three phases that included literature review, interviewing clinician experts (team physiotherapists) and interviewing rugby players.

A literature review was performed using the MEDLINE database to search for existing relevant assessment tools and to collate all items from these tools. The literature search keywords included ‘orthopaedic’, ‘outcome measure’, ‘trauma’, ‘upper extremity’, ‘upper limb’, ‘shoulder’, ‘arm injury’, ‘instrument’, ‘assessment’, ‘score’ and ‘function’. Medical journals regularly publishing research regarding outcome measures in orthopaedic and trauma surgery were searched individually for outcome instruments involving patient self-evaluation of musculoskeletal disorders of the upper extremity with previous use in athletes or orthopaedic and trauma patients. Journals reviewed included Journal of Bone and Joint Surgery (Am+Br), Journal of Hand Surgery, Journal of Trauma, Injury and Journal of Shoulder and Elbow Surgery. All items were extracted from identified outcome instruments.

Professional rugby club physiotherapists and professional rugby players then completed questionnaires to identify items representing factors and functions affected by significant shoulder injuries, and whose recovery was important to a player's return to availability for competitive sport. Rugby physiotherapists and players were asked to list ‘functions or abilities of your arm or body that have been or were affected by your shoulder injury’, ‘symptoms that developed from shoulder injuries’ and ‘psychological or quality-of-life issues that developed or were experienced due to shoulder injuries’ and whose resolution was important in returning to full training and/or competitive sport. A significant shoulder injury was defined as ‘any shoulder injury preventing a player from competitive sport or full training’. Only players who had previously sustained a significant shoulder injury and subsequently recovered to competitive sport were invited to complete the questionnaire. To ensure that the full spectrum of types of significant shoulder injury could be included, no restrictions were placed on the underlying shoulder diagnosis. To ensure that the questionnaires were exhaustive, physiotherapists and players completed the questionnaires until five consecutive interviews with each type of professional failed to generate any new items.

Item reduction

The literature review and interviews generated a list of potential items for inclusion in the scoring system. Item reduction involved deciding which items from this list should be retained for the scoring system. Redundant items were first eliminated. As the scoring system was designed as an athlete-reported scoring system, clinician-assessed and objective items were then discarded. Professional rugby players who had previously sustained and recovered from a significant shoulder injury were then interviewed to rate the importance of each remaining item on a scale of 1 (unimportant) to 7 (very important), and to record whether or not the issue represented by the item was experienced during the player's shoulder injury. The frequency of players with shoulder injuries experiencing each item was calculated. The frequency importance product (FIP) was generated for each remaining item by multiplying the two values together (total frequency × mean importance score). The 20 highest scoring items were then retained for the provisional scoring system for validity testing.
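The FIP ranking described above can be illustrated with a minimal sketch. The function names and the example ratings are hypothetical and are not data from this study. The sketch assumes that 'total frequency' is the count of interviewed players who experienced the item and that the mean importance score is averaged over all interviewed players; the text does not specify either detail.

```python
from statistics import mean


def frequency_importance_products(responses):
    """Return {item: FIP}, where FIP = frequency x mean importance.

    `responses` maps each item to a list of (experienced, importance)
    pairs, one pair per interviewed player. Importance uses the 1-7 scale.
    Assumption: frequency is a count, and importance is averaged over all
    interviewed players, not only those who experienced the item.
    """
    fips = {}
    for item, ratings in responses.items():
        frequency = sum(1 for experienced, _ in ratings if experienced)
        mean_importance = mean(importance for _, importance in ratings)
        fips[item] = frequency * mean_importance
    return fips


def select_top_items(fips, n_items=20):
    """Rank items by FIP (highest first) and keep the first n_items."""
    ranked = sorted(fips, key=fips.get, reverse=True)
    return ranked[:n_items]


# Hypothetical ratings from three players (illustration only).
example = {
    "tackling": [(True, 7), (True, 6), (True, 7)],
    "passing": [(True, 5), (False, 4), (True, 6)],
    "sleeping": [(False, 2), (True, 3), (False, 1)],
}
fips = frequency_importance_products(example)
top_two = select_top_items(fips, n_items=2)
```

With these illustrative ratings, an item that every player experienced and rated highly ranks above an item that was rated as moderately important but experienced by fewer players.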

Internal consistency

Internal consistency of the 20 items selected for the scoring system was calculated by determining the correlation between mean scores of individual items and the mean total score, reported as the item reliability coefficient (Cronbach's α).
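As an illustration only, Cronbach's α can be computed from a respondents-by-items table of scores using the standard variance-based formula, α = k/(k−1) × (1 − Σ item variances / variance of total scores). The sketch below assumes this standard formula; the analysis in this study was run in SPSS, and the function names and example scores are hypothetical. The second function recomputes α with each item left out in turn, which is one common way to check whether removing a single item changes the result.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha; `scores` has one row per respondent, one column per item."""
    k = len(scores[0])
    item_variances = [variance(column) for column in zip(*scores)]
    total_variance = variance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


def alpha_if_item_deleted(scores):
    """Return a list with alpha recomputed after dropping each item in turn."""
    k = len(scores[0])
    return [
        cronbach_alpha(
            [[value for j, value in enumerate(row) if j != drop] for row in scores]
        )
        for drop in range(k)
    ]


# Hypothetical seven-point responses: five respondents, three items.
example = [
    [7, 6, 7],
    [5, 5, 6],
    [3, 2, 3],
    [6, 6, 5],
    [2, 3, 2],
]
alpha = cronbach_alpha(example)
alphas_without_each_item = alpha_if_item_deleted(example)
```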

Scoring system formatting

Individual items were refined to be concise and to exclude any ambiguity, jargon, leading or value-laden terms. A seven-point Likert scale was selected as the response format. The scoring system was tested for interpretability using the Flesch-Kincaid Grade Level and the Flesch Reading Ease Score. The interpretation of each item by rugby players was checked.
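For reference, both readability indices are simple functions of the word, sentence and syllable counts of a text. The sketch below uses the published Flesch Reading Ease and Flesch-Kincaid Grade Level formulas. How the counts were obtained in this study is not stated, so the counts are passed in directly and the example values are hypothetical.

```python
def flesch_reading_ease(words, sentences, syllables):
    """Flesch Reading Ease Score; higher values indicate easier text."""
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


def flesch_kincaid_grade(words, sentences, syllables):
    """Flesch-Kincaid Grade Level, expressed as a US school grade."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


# Hypothetical counts for a short passage (illustration only).
ease = flesch_reading_ease(words=100, sentences=5, syllables=150)
grade = flesch_kincaid_grade(words=100, sentences=5, syllables=150)
```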

Scoring system reliability

To determine whether or not the scoring system generated reproducible results, the scoring system was completed by rugby players on two occasions 2 weeks apart. Rugby players with chronic and/or stable shoulder injuries who were not undergoing active treatment were invited to participate in this reliability testing. The mean test–retest item score differences and SD of these differences were calculated. The intraclass correlation coefficient (ICC) for the agreement between repeated measures was calculated.20 To assess the distribution of repeated measures, the paired Student's t test was calculated (significance level p=0.05).
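The test–retest statistics can be sketched as follows. The ICC model used in this study is not specified, so the sketch assumes a two-way random-effects, absolute-agreement, single-measure ICC, often written ICC(2,1). The function names and the example totals are hypothetical. Only the paired t statistic and its degrees of freedom are computed; the p value would be read from t tables or a statistics package.

```python
from math import sqrt
from statistics import mean, stdev


def icc_2_1(test, retest):
    """ICC(2,1) for two occasions (assumed model, not stated in the text)."""
    n, k = len(test), 2
    rows = list(zip(test, retest))
    grand = mean(test + retest)
    # Sums of squares for subjects (rows), occasions (columns) and total.
    ss_rows = k * sum((mean(row) - grand) ** 2 for row in rows)
    ss_cols = n * sum((mean(col) - grand) ** 2 for col in (test, retest))
    ss_total = sum((x - grand) ** 2 for x in test + retest)
    ss_error = ss_total - ss_rows - ss_cols
    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


def paired_t_statistic(test, retest):
    """Return (t, degrees of freedom) for the paired Student's t test."""
    differences = [b - a for a, b in zip(test, retest)]
    n = len(differences)
    t = mean(differences) / (stdev(differences) / sqrt(n))
    return t, n - 1


# Hypothetical total scores for five players on two occasions.
first = [100, 80, 120, 90, 110]
second = [102, 79, 118, 93, 108]
icc = icc_2_1(first, second)
t_value, df = paired_t_statistic(first, second)
```

A high ICC together with a t statistic close to zero corresponds to scores that both rank players consistently and show no systematic shift between occasions.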

Statistical analysis and ethical approval

All statistical analyses were performed using the Statistical Package for Social Sciences (SPSS V.15.0). Local ethics committee approval was granted by the Research Governance and Ethics Committee of the University of Salford for this study.

Results


Item generation

The literature search identified 2601 articles and 61 scoring systems applicable to the shoulder. Fifteen of these scoring systems were irretrievable. The 46 retrievable scoring systems were reviewed and 481 items were extracted. The structure of the 46 reviewed scoring systems is summarised in table 1. Interviews with professional rugby club physiotherapists (n=12) and players (n=38) generated a further 94 items, creating a total of 575 items.

Table 1

Assessment methods, domains assessed and response methods of existing shoulder scoring systems (n=46)

Item reduction

Item reduction refined the item total to 105 items. The demographics of rugby players interviewed to determine the FIP of these items are shown in table 2. The 20 items with the highest frequency importance products are shown in table 3. A flow diagram of the item selection process is shown in figure 1.

Table 2

Demographics of rugby players interviewed to determine the frequency importance products of items

Table 3

Items selected for the Rugby Shoulder Score from item reduction interviews with rugby players (n=8)

Figure 1

Flow diagram of item selection for the Rugby Shoulder Score.

Internal consistency

The provisional scoring system was completed on two occasions by 11 rugby players with chronic and/or stable shoulder injuries whose demographics are shown in table 4. The internal consistency was calculated and the Cronbach's α for the scoring system was 0.96. All items correlated with the total score at >0.4. The removal of any single item did not improve the Cronbach's α result (table 5).

Table 4

Demographics of rugby players completing the Rugby Shoulder Score for internal consistency and reliability testing

Table 5

Internal consistency results for the Rugby Shoulder Score tested on rugby players with chronic stable shoulder injuries (n=11)

Scoring system formatting

The 20 items with the highest FIPs were formatted into concise statements for the scoring system (table 6). The Flesch-Kincaid level was scored at 9, indicating that a ‘grade 9’ student's appreciation of English was required to understand the items. The Flesch Reading Ease Score was calculated at 46. None of the items were misinterpreted by the rugby players interviewed.

Table 6

Rugby Shoulder Score

Scoring system reliability

The mean scores of items rated by rugby players demonstrated no significant differences for the initial and retest results (table 7). The ICC for the total score on the two occasions was high at 0.941, indicating excellent reliability. Only one item demonstrated a low ICC value of less than 0.4 (item 16). Four further items showed fair reliability indicated by ICC values between 0.4 and 0.75 (items 1, 6, 9 and 15). The remaining 15 items demonstrated high ICC values of greater than 0.75.
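The reliability bands used above can be expressed as a small helper. This is a sketch with a hypothetical function name; treating the boundary values 0.4 and 0.75 as 'fair' is an assumption, since the text does not state how values exactly on a boundary were classified.

```python
def reliability_band(icc):
    """Classify an ICC value using the bands described in the text."""
    if icc < 0.4:
        return "low"
    if icc <= 0.75:  # boundary handling is an assumption
        return "fair"
    return "high"


# Values reported in this article: total score and item 16.
total_band = reliability_band(0.941)
item_16_band = reliability_band(0.392)
```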

Table 7

Reliability and validation results for the Rugby Shoulder Score (n=11)

Discussion


The development of validated outcome measures is important to improve the practice of evidence-based medicine. Patient-reported variables reflect a patient's function more accurately than clinician-assessed variables.21 The most relevant single outcome measure for athletes is a return to the preinjury level of functioning and performance in their usual sport.22

Contact athletes may only experience symptoms during training and competitive sport, not during activities of daily living. There is no outcome measure designed and validated specifically for shoulder injuries in rugby players. Established shoulder scoring systems that are currently used to assess shoulder injuries, such as the Constant Score,6 Oxford Shoulder Score8 and Oxford Shoulder Instability Score,9 were developed for use in the general population and have not been validated for use in any group of contact athletes. These instruments either include clinician-assessed variables or focus on activities of daily living. Consequently, existing shoulder scoring systems are not suitable for evaluating shoulder function in high-demand contact athletes such as rugby players.

Currently, three scoring systems exist that have been designed for shoulder injuries in athletes. Tibone and Bradley developed a scoring system for shoulder function in athletes. This was modified by Kuhn and Hawkins to create the Athlete Shoulder Assessment Tool.18 The third existing scoring system is the Kerlan-Jobe Orthopaedics Clinic Score.19 These scoring systems have neither been designed nor validated for use in any group of contact athletes. The purpose of this research was to develop and validate an athlete-reported scoring system for the specific evaluation of shoulder function in rugby players.

The first stage in developing an athlete-reported shoulder scoring system was to review all existing scoring systems that are relevant to the shoulder to generate a list of potential items for the new scoring system. The literature search identified 61 distinct scoring systems, 46 of which were retrievable. A review of the use of outcome scores in shoulder surgery in 2005 identified a total of 44 different shoulder scores.23 This suggests that the literature search in the item generation phase was comprehensive. The majority (60.9%) of reviewed scoring systems included patient-assessed items only (table 1). This most likely reflects the trend over the last decade to develop patient-reported outcome measures.23 A review of the distribution of items in scoring systems across different domains confirmed that most scoring systems evaluated power (56.5%), range of movement (71.7%), function in necessary activities of daily living (87%) and pain (76.1%). Only a minority of scoring systems (28.3%) directly assessed patient satisfaction, which is now recognised as a reliable indicator of outcome.

Following the literature review and completion of questionnaires by professionals, 20 items were selected for the provisional scoring system by calculating the FIP.15–17 One limitation of the methodology in developing this scoring system is that only eight players were interviewed to determine the FIPs of 105 potential items. A balance must be obtained between the number of items reviewed and the number of players to be interviewed. As scoring systems for shoulder injuries in contact athletes or rugby players have not been previously developed, a comprehensive evaluation of items of relevance to rugby players was favoured over sampling of a large number of players. The included players represented a range of positions, player ages and severity of injury (table 2). A limit of 20 items was chosen as this represented a reasonable maximum question load for a respondent. The identification and exclusion of redundant items within the scoring system can be performed with confidence when more players have completed the scoring system. It is anticipated that this analysis can be performed when more players (n=100) have completed the scoring system by eliminating items that correlate highly with each other when assessed using the Pearson correlation coefficient.
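The planned redundancy analysis could take a form similar to the sketch below, which flags pairs of items whose scores correlate highly across players. The function names, the example scores and the threshold of 0.9 are all hypothetical; no threshold is specified for the planned analysis.

```python
from itertools import combinations
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return covariance / (spread_x * spread_y)


def redundant_pairs(item_scores, threshold=0.9):
    """Return (item_a, item_b, r) for item pairs with r >= threshold.

    `item_scores` maps each item to its scores, one per player.
    The default threshold is an illustrative assumption.
    """
    flagged = []
    for item_a, item_b in combinations(item_scores, 2):
        r = pearson_r(item_scores[item_a], item_scores[item_b])
        if r >= threshold:
            flagged.append((item_a, item_b, r))
    return flagged


# Hypothetical item scores from five players (illustration only).
example = {
    "overhead_catch": [7, 5, 3, 6, 2],
    "overhead_press": [6, 5, 3, 7, 2],
    "sleeping": [3, 6, 4, 3, 5],
}
pairs = redundant_pairs(example)
```

Flagged pairs would be candidates for review only; deciding which item of a pair to remove would still depend on its FIP, reliability and clinical relevance.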

On comparison of the generated scoring system with previous outcome measures, this scoring system contains eight items (items 9, 12–18; table 7) that have not been identified in any shoulder scoring system previously. This indicates that several of the conditioning exercises and skills that are affected by shoulder injuries in rugby players, and are important for their return to full training and competitive sport, are not evaluated in existing shoulder scoring systems. The number of items in the scoring system covering specific domains shows a similar pattern to the domains covered by previous scoring systems (table 1).

The Flesch-Kincaid Grade Level and Flesch Reading Ease Score were computed to assess the readability of the scoring system. The Grade Level for the new scoring system was 9 and the Flesch Reading Ease Score was 46, indicating that the items were easy to comprehend. No rugby players demonstrated or reported any difficulty in understanding the items, indicating that the scoring system was adequately formatted for its intended purpose. The Likert scale response format, which is used in the majority of existing shoulder outcome measures (82.6%), was selected as it has been shown to be more reliable. Weighting of items was not performed as this is not necessary if items with very low FIPs are eliminated from the final scoring system.24

The ability of the items of the scoring system to measure the same general latent variable (shoulder function for rugby players in this research) can be assessed by estimating the tool's internal consistency, which is statistically reported as Cronbach's α. An ideal scoring system would contain items that are related and internally consistent, but which each also provide unique information. A Cronbach's α result of less than 0.6 indicates a lack of cohesion, whereas a score of 0.8–0.95 indicates good consistency.8,25 The scoring system had an overall internal consistency result of 0.96 (table 5), suggesting that although the items form a cohesive scoring system, some redundancy may exist between items. The exclusion of any single item did not bring the internal consistency result within the desirable range (0.8–0.95). Two or more redundant items may therefore exist. Redundant items may include those assessing overhead function (items 10, 11 and 13), which had similar mean scores and SD. These results provide supportive evidence for the development of a cohesive scoring system, but definitive identification and elimination of redundant items require further testing of the scoring system, as described previously.

A reliable outcome measure should produce a similar result on repeated measures of a patient if their condition is unchanged. The most appropriate test statistic for evaluating reliability is the ICC.26 The interval period for reliability testing was selected as 2 weeks as this was deemed to be long enough for the players to have forgotten previous responses but a sufficiently short period to minimise the probability of their level of injury changing. The mean scores of all items were similar at the two time points (table 7). The ICC value was greater than 0.75 for 15 items, indicating acceptable reliability. Further, four items demonstrated ‘fair’ reliability. The only item that showed poor reliability was ‘Carrying a ball with strength in the crook of your arm’ (ICC=0.392). This item will be eliminated from the scoring system if it shows similarly poor reliability after more subjects have been tested. The total score for the scoring system demonstrated excellent reliability (ICC=0.941, table 7).17 If this level of reliability is maintained after further validation of the scoring system, it indicates that it could be used not only for large-scale research purposes but also for decision-making regarding treatments for individual athletes.16 Eleven players were used to test the reliability of the scoring system over time. Although this may seem a small sample, it produced a statistically valid level of reliability using the ICC. The paired Student's t test results (table 7) demonstrated no significant differences in either individual items or total score over the two time points, indicating that the distribution of the repeated results was not different, and providing further evidence of the reliability of the scoring system.

This research has developed a reliable provisional scoring system for the evaluation of shoulder function in rugby players. We are continuing this work by validating the scoring system for its responsiveness to change in injury status by recruiting rugby players with new shoulder injuries to complete the scoring system before treatment and during rehabilitation, and by determining the minimal important difference in the overall score that represents a significant change in the clinical condition.

What are the new findings?

  • This is the first study to define the most important aspects of shoulder function for professional rugby players.

  • This study identified eight new aspects of shoulder function that are important for rugby players and are not assessed by existing shoulder scores.

  • This study developed the first reliable athlete-reported scoring system for the assessment of shoulder injuries in rugby players.

How might it impact on clinical practice in the near future?

This scoring system may

  • Provide an accurate assessment of injury severity and shoulder function in rugby players after shoulder injuries.

  • Determine the most appropriate treatment interventions for rugby players sustaining shoulder injuries.

  • Gauge rehabilitation of shoulder function in rugby players after shoulder injuries.





  • Contributors SBR designed the study. He also collected, analysed and interpreted the data, besides drafting and finally approving the paper. LF was involved in the initial study conception, study design, interpretation of data and drafting of the paper and finally approved the paper. SBR and LF can act as guarantors for the work described in this paper.

  • Competing interests None.

  • Ethics approval Research Governance and Ethics Committee of the University of Salford.

  • Provenance and peer review Not commissioned; externally peer reviewed.

  • ▸ References to this paper are available online at
