The patellofemoral pain and osteoarthritis subscale of the KOOS (KOOS-PF): development and validation using the COSMIN checklist
  1. Kay M Crossley1,
  2. Erin M Macri2,
  3. Sallie M Cowan3,
  4. Natalie J Collins4,
  5. Ewa M Roos5
  1. 1 School of Allied Health, College of Science, Health and Engineering, La Trobe University, Victoria, Australia
  2. 2 Department of Family Practice, Centre for Hip Health and Mobility, University of British Columbia, Vancouver, Canada
  3. 3 Department of Physiotherapy, The University of Melbourne, Parkville, Australia
  4. 4 School of Health and Rehabilitation Sciences, The University of Queensland, Brisbane, Australia
  5. 5 Department of Sports Science and Clinical Biomechanics, Musculoskeletal Function and Physiotherapy, University of Southern Denmark, Odense, Denmark
  1. Correspondence to Kay M Crossley; K.Crossley{at}


Background Patellofemoral pain and osteoarthritis are prevalent and associated with substantial pain and functional impairments. Patient-reported outcome measures (PROMs) are recommended for research and clinical use, but no PROMs are specific for patellofemoral osteoarthritis, and existing PROMs for patellofemoral pain have methodological limitations. This study aimed to develop a new subscale of the Knee injury and Osteoarthritis Outcome Score for patellofemoral pain and osteoarthritis (KOOS-PF), and evaluate its measurement properties.

Methods Items were generated using input from 50 patients with patellofemoral pain and/or osteoarthritis and 14 health and medical clinicians. Item reduction was performed using data from patellofemoral cohorts (n=138). We used the COnsensus-based Standards for the selection of health Measurement INstruments (COSMIN) guidelines to evaluate reliability, validity, responsiveness and interpretability of the final version of KOOS-PF and other KOOS subscales.

Results From an initial 80 generated items, the final subscale included 11 items. KOOS-PF items loaded predominantly on one factor, pain during activities that load the patellofemoral joint. KOOS-PF had good internal consistency (Cronbach’s α 0.86) and adequate test–retest reliability (intraclass correlation coefficient 0.86). Hypothesis testing supported convergent, divergent and known-groups validity. Responsiveness was confirmed, with KOOS-PF demonstrating a moderate correlation with Global Rating of Change scores (r 0.52) and large effect size (Cohen’s d 0.89). Minimal detectable change was 2.3 (groups) and 16 (individuals), while minimal important change was 16.4. There were no floor or ceiling effects.

Conclusions The 11-item KOOS-PF, developed in consultation with patients and clinicians, demonstrated adequate measurement properties, and is recommended for clinical and research use in patients with patellofemoral pain and osteoarthritis.

  • Osteoarthritis
  • Questionnaire
  • Measurement
  • Knee


Introduction

Patellofemoral pain (PFP) and related conditions such as patellofemoral osteoarthritis (OA) are prevalent1–6 and have substantial impact on pain, function, activities of daily living and quality of life.7–13 Currently, there is no gold standard, condition-specific patient-reported outcome measure (PROM) designed to assess patellofemoral pain or OA. This is reflected in the vast array of PROMs used in studies,14–28 including knee-specific scales (rather than patellofemoral specific)29–32 or generic instruments.15 16 22 33

Up to 37 knee-related outcome measures were identified in recent systematic reviews, with limited utility for patellofemoral pain.34 35 Howe et al 35 reported only one scale, Kujala’s Anterior Knee Pain Scale (AKPS),14 to be of ‘sufficient quality’ for use in patellofemoral pain. The AKPS, commonly used for patellofemoral pain,16 36–40 has established test–retest reliability, internal consistency16 22 24 and responsiveness where self-reported improvements correlated with AKPS scores.16 Its limitations include an unspecified time frame for symptoms; a lack of unidimensionality; the use of ‘jargon’ language (‘atrophy’, ‘flexion deficiency’ and ‘subluxation’); the presence of only three or four response options for 7 of the 13 items (a loss of discriminative information); arbitrary weighting of response options; and concerns about inadequate content validity.15 24

Another systematic review41 evaluated five PROMs based on frequency of use in the literature, and recommended the Knee Outcome Survey-Activities of Daily Living Scale (KOS-ADL) (above the AKPS) for use with patellofemoral pain. However, the KOS-ADL is less frequently used than the AKPS, and its measurement properties have not been evaluated in an exclusive patellofemoral pain sample.31 41 As with the AKPS, there are concerns regarding a lack of content validity for KOS-ADL. The definition of content validity has evolved to require that patients be counted among the ‘experts’; thus, their input during PROM development is vital to ensure accurate reflection of the patient experience.42 43

Patellofemoral OA is increasingly being recognised as an important source of pain and disability in those with chronic knee pain44 45 and knee OA.4 7 8 However, there are no known PROMs for this condition, with recent trials using the Knee injury and Osteoarthritis Outcome Score (KOOS)46 to measure treatment outcomes.47 48 This is problematic given that the symptoms and characteristics of patellofemoral OA often differ from tibiofemoral OA.49–51 Given the lack of an existing suitable PROM for patellofemoral pain and OA, we aimed to (1) develop a new KOOS subscale, the ‘Patellofemoral pain and osteoarthritis’ subscale (KOOS-PF) to evaluate patellofemoral pain and OA and (2) evaluate the measurement properties of this subscale.

Methods

We conducted this study in three parts: instrument development (phase 1); item reduction (phase 2) and final evaluation of the fully developed subscale’s measurement properties (phase 3).

Phase 1: subscale development

To ensure content validity, initial items were generated with input from patients with patellofemoral pain and/or OA, and clinical and research experts in the field of patellofemoral disorders. We surveyed participants using open-ended questions, and included all suggested items in the initial draft instrument. Fifty patients completed the surveys to identify items relating to their knee that were important to them. Nineteen of these participants were enrolled in a study on patellofemoral pain (eligibility criteria: 18–35 years, insidious onset of retropatellar or peripatellar pain > 6 weeks, provoked by at least two activities including running, stair ambulation, kneeling, squatting, prolonged sitting),52 and 31 participants were in a study on patellofemoral OA (eligibility criteria: 40+ years, anterior patellar or retropatellar pain on most days of past month during activities including stair ambulation, rising from sitting and squatting, and radiographic evidence of patellofemoral > tibiofemoral OA).48

Fourteen health and medical clinicians with patellofemoral expertise (treating a minimum of one patient with patellofemoral pain or OA per week) provided input via a similar open-ended survey. These included orthopaedic surgeons (n=3), rheumatologists (n=1), sports physicians (n=2), medical doctors (n=1) and physiotherapists (n=7). In addition, 14 researchers who attended the International Patellofemoral Pain Research Retreat (Baltimore, 2009) completed a survey. A small focus group (KMC, EMR, SMC) then refined the items and drafted the preliminary patellofemoral subscale. As for the other KOOS subscales, individuals answer items regarding symptoms during the past week, using five graded response options. The total subscale is scored from 0 to 100, with 100 representing no disability and 0 representing maximum disability.53
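The scoring rule described above can be illustrated with a short sketch. It assumes the usual KOOS convention that each item is coded 0 to 4 (0 = no problem) and that the subscale score is 100 minus the mean item score rescaled to 0 to 100. The function name `koos_subscale_score` and the `min_fraction_answered` parameter are hypothetical; the 50% completion rule is the one cited later in this article for omitted items.

```python
from typing import Optional, Sequence


def koos_subscale_score(
    responses: Sequence[Optional[int]], min_fraction_answered: float = 0.5
) -> Optional[float]:
    """Return a 0-100 subscale score, or None if too few items were answered.

    Each response is an integer 0-4 (0 = no problem, 4 = extreme problem);
    None marks an unanswered item. This coding is an assumption of the sketch.
    """
    answered = [r for r in responses if r is not None]
    if any(r < 0 or r > 4 for r in answered):
        raise ValueError("item responses must be integers from 0 to 4")
    if not responses or len(answered) / len(responses) < min_fraction_answered:
        return None  # subscale treated as invalid
    mean_item = sum(answered) / len(answered)
    # Rescale so that 100 = no disability and 0 = maximum disability.
    return 100.0 - (mean_item * 100.0 / 4.0)
```

For example, a respondent who answers "1" to every item they complete would score 75 under these assumptions, and a respondent who skips more than half of the items would receive no score.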

Phase 2: item reduction


Item reduction (phase 2) and evaluation of measurement properties (phases 2 and 3) were performed using PROM data from field testing of two patellofemoral pain/OA cohorts. The Brisbane cohort (age ≥18 years) was a convenience sample recruited via advertising in staff and student newsletters and noticeboards at The University of Queensland, as well as external print advertising. The Melbourne cohort (age 26 to 50 years) was a convenience subsample of participants in a prospective longitudinal study, primarily recruited through staff and student newsletters at The University of Melbourne, as well as via local physiotherapy and sports medicine clinics. Inclusion criteria for both cohorts were as follows: (1) peripatellar or retropatellar pain aggravated by activities that load the patellofemoral joint (eg, squatting, ascending/descending stairs, running); (2) pain severity rated at least 30 mm on a 100 mm visual analogue scale (VAS); and (3) pain of at least 3 months duration. In addition, volunteers in the Melbourne cohort also had pain during aggravating activities on most days of the previous month.

Exclusion criteria for the Brisbane cohort were as follows: (1) diffuse or generalised knee pain; (2) history of total knee or total hip replacement and (3) severe trauma to the target knee in the previous year (eg, meniscal injury, surgery). Exclusion criteria for the Melbourne cohort were as follows: (1) concomitant pain from other knee structures, hip or lumbar spine that may impede testing procedures; (2) recent knee injections (prior 3 months); (3) planned or previous lower limb surgery; (4) moderate to severe concomitant tibiofemoral OA (Kellgren and Lawrence grade >3 on X-ray)54; (5) systemic medical conditions (eg, rheumatoid arthritis, inflammatory joint disease); (6) physical inability to undertake testing procedures; (7) inability to understand written and spoken English or (8) contraindications to X-ray (eg, pregnancy).

Data collection

At baseline, participants completed several questionnaires in paper format or in an online survey. This included preliminary patellofemoral items, KOOS, AKPS and Medical Outcomes Study 36-item Short Form Health Survey (SF-36),55 56 along with demographic questions (age, sex, height, weight, knee pain history, current knee-related limitations, surgical history, physical activity and format preference (ie, online vs paper)). Alternate forms reliability found paper and online methods to be comparable (see appendix). To assess test–retest reliability, we asked participants to complete the same questionnaires within 1 to 2 weeks of baseline.42 57 58 We selected this time interval so that it was sufficiently long to minimise recall of previous responses, and sufficiently short to reduce the chance of change in the participants’ condition,57 59 as demonstrated in a previous PFP study.60 Finally, to assess responsiveness participants were asked to complete questionnaires 3 months after baseline, together with a Global Rating of Change (GROC) score. To enhance participant adherence, we sent up to three reminders at each time point via telephone and/or e-mail if completed questionnaires were not returned within the expected time frame (ie, 1–2 weeks for paper versions, 3–5 days for online versions).

Item reduction

All statistical analyses were performed using Stata Intercooled 13.0 (StataCorp, TX, USA). To consider an item for deletion, we evaluated the following performance indicators: (1) endorsement by more than 50% of participants on the ‘no problem’ response option (ie, score of 0); (2) mean item score of <1 on a 0–4 best to worst scale; or (3) item test–retest reliability, defined as intraclass correlation coefficient (ICC 3.1) absolute agreement <0.50.58 Additionally, we considered missing items (and reasons for missing), feedback from participants, clinical considerations and changes to internal consistency, specifically item–test correlation, item–rest correlation, interitem covariance and Cronbach’s α (with an item removed as well as the full test α). Four study group members (KMC, EMM, NJC, EMR) reviewed any items that performed on the cusp of these cut points, weighing clinical and statistical importance, before making final decisions on item removal.
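As an illustration only, the first two deletion indicators can be expressed as simple checks on one item's responses. The sketch assumes items are coded 0 to 4 with 0 meaning 'no problem'; the function name `flag_item` and the dictionary keys are hypothetical, and the reliability criterion (3) is not covered here.

```python
from statistics import mean
from typing import Dict, List


def flag_item(scores: List[int]) -> Dict[str, bool]:
    """Flag one item against deletion indicators (1) and (2).

    scores: one 0-4 response per participant (0 = 'no problem', assumed coding).
    """
    share_no_problem = sum(1 for s in scores if s == 0) / len(scores)
    return {
        # Indicator (1): more than 50% of participants chose 'no problem'.
        "over_half_no_problem": share_no_problem > 0.5,
        # Indicator (2): mean item score below 1 on the 0-4 scale.
        "mean_below_one": mean(scores) < 1,
    }
```

A flagged item would still be reviewed against the other considerations listed above rather than removed automatically.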

Phase 3: evaluation of measurement properties

We evaluated the final KOOS-PF measurement properties according to the COnsensus-based Standards for the selection of health Measurement INstruments (COSMIN) guidelines.43 Specifically, we evaluated reliability (internal consistency, test–retest reliability and measurement error), construct validity (structural validity, convergent and divergent hypothesis testing, known-groups validity) and responsiveness. We also evaluated interpretability (smallest detectable change (SDC), minimal important change (MIC), minimal important difference (MID) and floor and ceiling effects). Measurement properties of the five original KOOS subscales (Pain, Symptoms, Function in daily living (ADL), Function in sport and recreation (Sport/Rec) and knee-related Quality of Life (QOL)) were also evaluated.

Internal consistency

Using baseline KOOS-PF data, we calculated Cronbach’s α and average interitem covariance.61 62 Cronbach’s α values between 0.7 and 0.95 were deemed to be adequate.42 A low α suggests lower correlation among subscale items and limits interpretability of a summed overall score. A very high α suggests item redundancy. Average interitem covariance was deemed to be adequate if at least 0.25.57
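For readers unfamiliar with the statistic, a minimal sketch of Cronbach's α follows, using the standard formula α = k/(k − 1) × (1 − Σ item variances / variance of total scores). It assumes complete cases only; the function names are hypothetical, and the adequacy bounds are those stated above.

```python
from statistics import variance
from typing import List


def cronbach_alpha(data: List[List[float]]) -> float:
    """data[p][i] is participant p's score on item i (complete cases only)."""
    k = len(data[0])
    if k < 2 or len(data) < 2:
        raise ValueError("need at least two items and two participants")
    # Sample variance of each item across participants.
    item_variances = [variance([row[i] for row in data]) for i in range(k)]
    # Sample variance of each participant's summed score.
    total_variance = variance([sum(row) for row in data])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


def alpha_is_adequate(alpha: float) -> bool:
    """Apply the 0.7 to 0.95 range described in the text."""
    return 0.7 <= alpha <= 0.95
```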

Test–retest reliability

We used a Bland-Altman plot to confirm homoscedasticity.63 We then used ICC (2.1) absolute agreement to evaluate reliability,64 with values ≥0.7 considered adequate.42 58 We also reported the SEM. This is the equivalent of SD × √(1 − ICC) (where SD is the standard deviation of the observed scores).65
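The following sketch shows one way to compute a two-way random-effects, absolute-agreement, single-measures ICC (often written ICC(2,1)) from a participants-by-occasions table, together with an SEM. It assumes the common mean-squares formula for this ICC and the common formula SEM = SD × √(1 − ICC); both function names are hypothetical, and this is not the Stata routine used in the study.

```python
from math import sqrt
from typing import List


def icc_2_1(data: List[List[float]]) -> float:
    """data[p][j] is participant p's score on occasion j (no missing values)."""
    n = len(data)     # participants
    k = len(data[0])  # occasions (2 for test-retest)
    grand = sum(sum(row) for row in data) / (n * k)
    row_means = [sum(row) / k for row in data]
    col_means = [sum(row[j] for row in data) / n for j in range(k)]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in data for x in row)
    ms_rows = ss_rows / (n - 1)   # between-participant mean square
    ms_cols = ss_cols / (k - 1)   # between-occasion mean square
    ms_error = (ss_total - ss_rows - ss_cols) / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


def sem_from_icc(sd: float, icc: float) -> float:
    """Standard error of measurement, assuming SEM = SD * sqrt(1 - ICC)."""
    return sd * sqrt(1 - icc)
```

Because this ICC form penalises systematic differences between occasions, a constant shift in every participant's retest score lowers the coefficient even when the rank order is preserved.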

Structural validity

We conducted exploratory factor analysis to explore unidimensionality of the 11 items of the KOOS-PF. Using raw data (ie, missing items not replaced), we used parallel analysis and created scree plots, and conducted promax rotation with Kaiser normalisation, to determine the number of factors retained. Items were considered to load on a factor if the factor loading was 0.30 or greater.66 A sample size of between 4 and 10 participants per item is recommended for factor analysis, with at least 100 participants in the sample.42

Convergent and divergent validity

We evaluated construct validity by formulating a priori hypotheses for expected correlations between the KOOS-PF, AKPS and SF-36.

The AKPS consists of 13 items relating to specific symptoms and aggravating activities associated with patellofemoral pain (eg, stairs, squatting, prolonged sitting, pain). Participants select one response for each item. All items are scored on a weighted basis, and summed to give a score out of 100, where 0 represents maximal disability and 100 represents no disability.

We used the Australian English version of the SF-36 version 2 questionnaire. The SF-36 is used extensively in the literature, with well-documented validity and reliability across multiple populations. This generic measure of health is divided into eight subscales which are combined and weighted to form two large subscales, the physical component summary and mental component summary. Scores are obtained through use of proprietary software, and are out of a maximum of 100, where 0 represents poor health and 100 represents perfect health.

We used Pearson’s correlation coefficients to evaluate convergent and divergent validity. The following a priori hypotheses were posited regarding the strength and direction of correlations between KOOS-PF, AKPS and SF-36:

(H1): KOOS-PF will demonstrate the strongest positive correlation with the AKPS, followed by the SF-36 physical component summary (convergent validity).

(H2) KOOS-PF will demonstrate the weakest correlation with the SF-36 mental component summary (divergent validity).

Known-groups validity

At baseline, participants were asked ‘How would you rate your knee pain now?’ and offered categorical response options ranging from no pain to severe pain. The following a priori hypothesis was posited:

(H3): Participants who rate their knee pain as ‘moderate’ or worse will have lower KOOS-PF scores than those who rate their pain as ‘no problem’ or ‘mild’.

Welch’s two-sample t-test was used to evaluate known-groups validity, with significance set at p<0.05.
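As a hedged illustration of the test named above, Welch's t statistic and its Welch-Satterthwaite degrees of freedom can be computed from group summary statistics alone. The function name `welch_t` is hypothetical, and the sketch stops short of a p value, which would require a t-distribution routine from a statistics package.

```python
from math import sqrt
from typing import Tuple


def welch_t(
    mean1: float, sd1: float, n1: int, mean2: float, sd2: float, n2: int
) -> Tuple[float, float]:
    """Return (t statistic, approximate degrees of freedom) for two groups."""
    v1 = sd1 ** 2 / n1  # squared standard error of group 1 mean
    v2 = sd2 ** 2 / n2  # squared standard error of group 2 mean
    t = (mean1 - mean2) / sqrt(v1 + v2)
    # Welch-Satterthwaite approximation; does not assume equal variances.
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df
```

Unlike Student's t-test, this approach does not pool the two variances, which suits groups of unequal size and spread.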

Responsiveness

The GROC score is a single-item questionnaire that reads, ‘Place an ‘X’ in the box which best represents the change in pain in your study knee since you last completed the questionnaires’. Five response options (Likert scale) range from ‘much worse’ (score of 0) to ‘much better’ (score of 4).

No intervention was offered between test administrations, although we did not restrict participants from using self-management strategies (eg, analgesics, non-steroidal anti-inflammatory drugs). Thus, responsiveness was based on the natural course of each participant’s condition and their self-evaluated direction and amount of change in pain over the 3 months.

A priori hypotheses

(H4) KOOS-PF will demonstrate a moderate positive correlation with GROC scores.

(H5) The within-group effect size (Cohen’s d) of KOOS-PF for ‘any improvement’ (GROC score of 3 or 4) will be larger than that of the remaining KOOS subscales.

(H6) Participants who rate their change in pain as ‘any improvement’ (ie, GROC score of 3 or 4) will have higher KOOS-PF change scores than those who report ‘no change’ (score of 2) using an unpaired Welch’s t-test.

Interpretability

For individual changes, the smallest detectable change at 90% confidence (SDC90) was estimated as 1.65 × √2 × SEM.67 For evaluating group changes, we estimated the SDC90 as 1.65 × √2 × SEM / √n.68
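The two SDC90 expressions can be checked with a few lines of arithmetic. The sketch below assumes the group formula divides the individual value by the square root of the group size n; the function names and the constant `Z_90` are hypothetical labels.

```python
from math import sqrt

Z_90 = 1.65  # z value used in the text for 90% confidence


def sdc90_individual(sem: float) -> float:
    """Smallest detectable change for one person: 1.65 * sqrt(2) * SEM."""
    return Z_90 * sqrt(2) * sem


def sdc90_group(sem: float, n: int) -> float:
    """Smallest detectable change for a group of n (assumed division by sqrt(n))."""
    return sdc90_individual(sem) / sqrt(n)
```

With the SEM of 6.8 reported for KOOS-PF in this study, `sdc90_individual(6.8)` evaluates to approximately 15.9, in line with the reported individual-level value of 16.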

We estimated the within-group MIC using an anchor-based method by reporting the mean KOOS-PF change score for participants who reported ‘slight improvement’ in pain using the GROC (score of 3). We defined the between-group MID as the difference in mean change scores between those reporting ‘slight improvement’ and those with ‘no change’ (score of 2).67
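The anchor-based definitions above reduce to two group means. In this sketch, change scores are grouped by GROC score (2 = no change, 3 = slight improvement); the function name `mic_and_mid` and the input layout are hypothetical, and the example values are invented for illustration.

```python
from statistics import mean
from typing import Dict, List, Tuple


def mic_and_mid(changes_by_groc: Dict[int, List[float]]) -> Tuple[float, float]:
    """Return (MIC, MID) from change scores keyed by GROC score."""
    slight_improvement = mean(changes_by_groc[3])  # within-group MIC
    no_change = mean(changes_by_groc[2])
    # MID: difference in mean change between 'slight improvement' and 'no change'.
    return slight_improvement, slight_improvement - no_change


# Invented data: two participants per GROC category.
mic, mid = mic_and_mid({2: [0.0, 4.0], 3: [10.0, 18.0]})
print(mic, mid)  # 14.0 12.0
```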

Finally, we reported floor and ceiling effects defined as 15% or more of the sample scoring the lowest or highest scores possible on the KOOS-PF, respectively.
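A minimal sketch of the floor and ceiling check follows, assuming subscale scores on the 0 to 100 range and the 15% threshold stated above. The function name and parameter names are hypothetical.

```python
from typing import Dict, Sequence


def floor_ceiling_effects(
    scores: Sequence[float],
    threshold: float = 0.15,
    lowest: float = 0.0,
    highest: float = 100.0,
) -> Dict[str, bool]:
    """Report whether at least `threshold` of the sample sits at either extreme."""
    n = len(scores)
    floor_share = sum(1 for s in scores if s == lowest) / n
    ceiling_share = sum(1 for s in scores if s == highest) / n
    return {
        "floor": floor_share >= threshold,
        "ceiling": ceiling_share >= threshold,
    }
```

A sample whose scores all fall strictly between the two extremes cannot show either effect under this definition.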

Results

Phase 1: subscale development

A total of 80 different items were generated, with 31% of items identified by 50% of respondents, and others only identified by one or two of the 72 respondents. Eleven items, including ‘increasing body weight associated with increasing pain/increasing pain with activity leads to weight gain’ and ‘my pain is increasing over time’, were considered to be features that were unlikely to change over time or with an intervention, and were not included in the draft subscale. Other items such as ‘pain aggravated by wearing high heels/aggravated by yoga’ were considered too specific and not applicable to the general population with patellofemoral pain and, hence, were not included in the draft subscale. In addition, items including ‘acceptance of pain’, ‘low mood’, ‘changes in emotional well-being’ and ‘not knowing the solution to my knee pain’ were not included in the draft subscale, as they were considered likely to be captured by existing generic PROMs such as the SF-36.69

As expected, many items (43 in total) were similar to existing KOOS items, and were mapped to these (online supplementary table S1). This process left 12 new items for the preliminary KOOS-PF subscale (table 1). For six items, the item represented a merging of a number of suggested items into a single item. For example, six different items described pain after different sports or recreational activities and they were merged to the item PF10: ‘Pain after sport and recreational activities’.

Table 1

Preliminary KOOS-PF items

Twenty patients completed the instrument to pilot the items. This sample size is consistent with recommended sizes (8 to 15) aimed to ‘sample to redundancy’.57 Cognitive debriefing followed, where participants were asked specific questions about how they interpreted and understood different aspects of the instrument. The wording of items was modified slightly following this feedback process.

Phases 2 and 3: item reduction


Study enrolment occurred from February 2012 to March 2015. For the Brisbane cohort, 54 volunteers participated in this phase of the study, and in the Melbourne study 84 volunteers participated, resulting in a total sample of 138 (figure 1). A description of participant characteristics is presented in table 2.

Figure 1

Number of completed KOOS questionnaires at each time point, by location (note number of questionnaires included in subsequent analyses varies slightly due to missing data or invalid subscales).

Table 2

Participant demographics and baseline scores. Values are mean (standard deviation) and (range) unless otherwise stated (n=138)

Using our a priori criteria, we excluded one item: ‘How often do you experience pain with cold weather’, which performed on the cusp with 56% of respondents indicating ‘no problem’. We removed this item given the possibility of variable geographical exposure to cold weather, and because seasonal fluctuations in weather might influence study responses/outcomes on account of recall bias. One additional item (‘How much pain do you have hopping/jumping?’) had several missing items, yet we kept it since the item performed well and the task is clinically important for athletes (non-athletes can choose not to answer this item if it is not relevant to them, without invalidating the subscale) (table 1).

Phase 3: evaluation of measurement properties

Mean score (SD) for the final KOOS-PF subscale for the full sample with valid scores (n=132) was 55 (19), with scores ranging from 16 to 95. Administrative error resulted in six invalid subscales for KOOS-PF, Pain, Sport/Rec and ADL. Scores were normally distributed (skew 0.99, kurtosis 0.0001).

Internal consistency

Internal consistency of the KOOS-PF was good, with Cronbach’s α of 0.86 and interitem covariance of 0.50 (table 3).

Table 3

Internal consistency for KOOS subscales

Test–retest reliability

For participants who completed both baseline and re-test questionnaires within 2 weeks (n=55), test–retest reliability revealed an ICC (2.1) of 0.86 and an SEM of 6.8 (table 4).

Table 4

Test–retest reliability and measurement error

Structural validity

The unrotated KOOS-PF loaded predominantly on one factor, with an eigenvalue of 4.29 (proportion of variance 82%), and item factor loadings ranged from 0.42 to 0.75. While parallel analysis suggested a possible second factor, the eigenvalue of the second factor was low at 0.82, and items loading on this factor loaded more highly on the first factor than the second. Furthermore, after rotation the second factor appeared to represent more intense activities (hopping, running/jumping) rather than a distinct construct, suggesting the introduction of simple structure bias with rotation.70 Thus, we concluded that the items in the subscale represented a single dimension70 and that this factor describes knee pain relating to activities that load the patellofemoral joint (online supplementary table S2).

Convergent and divergent validity

All a priori stated hypotheses were confirmed for validity:

(H1): As hypothesised, KOOS-PF demonstrated the strongest positive correlation with the AKPS (r=0.74), followed by the SF-36 physical component summary (r=0.45).

(H2): As hypothesised, KOOS-PF demonstrated the lowest correlation with the SF-36 mental component summary (r=0.07).

Known-groups validity

(H3): As hypothesised, participants who rated their baseline knee pain as ‘moderate’ or more severe (n=47) had mean (SD) baseline KOOS-PF scores of 43.9 (15.3), while those who reported ‘no problem’ or ‘mild’ pain (n=83) had higher mean (SD) KOOS-PF scores of 61.3 (18.1) (p<0.0001).

Responsiveness

(H4) As hypothesised, KOOS-PF change scores were moderately correlated with GROC scores (r=0.52) (figure 2, table 5).

Figure 2

KOOS-PF change scores (negative change represents worsening) vs GROC scores (0 = much worse; 1 = worse; 2 = no change; 3 = better; 4 = much better) demonstrated moderate correlation (r=0.52).

Table 5


Interpretability

The SDC90(individual) for KOOS-PF was 16, and SDC90(group) was 2.3. The MIC was 14.2 and MID was 11.8. There were no ceiling or floor effects (table 6).

Table 6


Performance of KOOS-PF compared with original KOOS subscales

For the majority of measurement properties evaluated, KOOS-PF performed within the range of the five original KOOS subscales (tables 2–6). Responsiveness of KOOS-PF measured with Pearson’s r was at the upper end of the range of the other subscales, and Cohen’s d for KOOS-PF was higher than that of any other subscale (table 5).

Discussion

Patients with patellofemoral pain and OA frequently present to health and medical practices, and represent important subgroups of people with knee pain and OA.1 5 71–73 To our knowledge, KOOS-PF is the first PROM for patellofemoral pain and/or OA to have been developed in consultation with the COSMIN checklist.


Our study demonstrated adequate construct validity. Additionally, this is the first PROM for patellofemoral pain and/or OA to incorporate patient perspective in its development, thus enhancing content validity. The KOOS-PF scores were lower than the KOOS Symptom, Pain and ADL subscales in our study, suggesting the KOOS-PF may be more sensitive than the original KOOS subscales in detecting self-reported limitations in a patellofemoral population.


Reliability of the KOOS-PF was adequate and similar to the other KOOS subscales in our study, as well as to published studies of the AKPS.15 16 22 24


The KOOS-PF had a larger effect size than the other KOOS subscales in our study, and thus it may be more responsive than the original KOOS subscales in a patellofemoral population. On this basis, KOOS-PF appears to be more useful than the existing KOOS subscales as an endpoint for clinical trials evaluating treatments for patellofemoral pain and/or OA.


The KOOS-PF SDC90(individual) of 16 is similar to the original KOOS subscales in our study, comparable to KOOS SDC95 values in a knee OA sample (range 13.4–21.1)74 and higher than AKPS SDC95 values in anterior knee pain (range 7–14).16 22 24 SDC reflects the instrument but also the individuals being assessed, and because these KOOS studies did not assess a patellofemoral pain and/or OA sample, and one AKPS study excluded people with knee OA,22 comparisons to these findings should be made cautiously.

The KOOS-PF MIC of 14.2 is at the upper end of the KOOS subscales in our study, and compares with values of 8 to 10 for the AKPS in younger adults with patellofemoral pain.16 Although the MIC for KOOS-PF was larger than the SDC90 (group), it was smaller than the SDC90 (individual). Therefore, for individuals with patellofemoral pain or OA, the amount of change in KOOS-PF required to be both clinically meaningful and outside the range of measurement error is at least 16. This may be problematic in situations where small but clinically meaningful individual changes are expected, such as with exercise or physical therapies.

Methodological limitations

We aimed to develop and evaluate a KOOS subscale that could be used in people with patellofemoral disorders over the spectrum of disease from pain to OA. This would enable the same scale to be used in longitudinal cohort studies, and facilitate comparisons between people with and without structural disease. Some items identified during item generation (eg, difficulty with stairs) are already contained within the KOOS-Pain and KOOS-ADL subscales, and thus were not included in the KOOS-PF. This did not detract from its psychometric performance, while enhancing the overall content validity of the KOOS for use with a population affected by patellofemoral pain or OA.

As with any PROM, there is a trade-off between generalisability and specificity. For example, some items included in the KOOS-PF (eg, pain with running/jogging; pain with hopping/jumping) may not be tasks that a patient typically participates in, on account of either their knee-related symptoms or simply their lifestyle. If patients omit an item, a valid score can still be calculated provided at least 50% of items in the subscale are completed.53 75 Thus, generalisability is still achievable with this subscale. At the other end of the spectrum, specific tasks that cause a patient pain may not be listed in this questionnaire, since they may not have been reported by enough of the population of interest to be included. For example, only one participant (2%) in our study reported cycling as a source of pain or difficulty. We do, however, know cycling may be related to PFP. These activities are likely to be captured in other KOOS-PF items, namely PF10 (‘How much pain do you have after sport and recreational activities?’) or PF11 (‘Have you modified your sport or recreational activities due to your knee pain?’). If less common items like this are of interest to a clinician, we suggest including a patient-specific outcome measure such as the Patient-Specific Functional Scale (PSFS),76 where items are generated by the patient. These types of questionnaires are less useful in research settings where generalisability is important. To our knowledge, there are no existing knee pain questionnaires that ask about cycling.32

This development paper used classical test theory to evaluate the measurement properties of KOOS-PF. This was largely due to the sample size, which was not sufficient to use item response theory. Further validation of KOOS-PF is required, using item response theory where possible.

An additional limitation is that an intervention was not administered in this study. Responsiveness and interpretability (MIC, MID) will be refined with future intervention trials using the KOOS-PF. Based on debate in the literature67 77 regarding the optimal method to calculate MIC and MID and whether they represent truly important change, future studies may also use revised methods to interpret change. The values reported in this study, however, provide good starting points for estimating sample sizes required for such studies.

Clinical application and conclusion

PROMs are vital to evaluate status and change in constructs that are important to patients. Until now, there has been no PROM designed for patellofemoral OA, and PROMs available for patellofemoral pain have had substantial methodological limitations. In addition, isolated patellofemoral OA is a prevalent and distinct phenotype of knee OA that often precedes the onset of generalised knee OA.4 78 Since patellofemoral OA tends to present clinically quite differently from tibiofemoral OA,51 including the KOOS-PF in samples with or at risk for knee OA may enhance content validity of the KOOS in this subpopulation where symptoms may differ. The 11-item KOOS-PF subscale is valid, reliable and responsive for people with patellofemoral pain and/or OA. Clinicians and researchers can use KOOS-PF in conjunction with KOOS to gain additional information about symptoms relevant to people with patellofemoral pain and/or OA, and to evaluate change over time.

What are the new findings?

  • We developed an 11-item patellofemoral pain and osteoarthritis subscale of the KOOS, which can be used alongside the KOOS to evaluate patellofemoral-specific symptoms.

  • KOOS-PF is reliable, valid and responsive.

How might it impact on clinical practice in the future?

  • KOOS-PF has utility for both research and clinical practice.

  • In clinical practice, patients with patellofemoral pain or OA need to change by 16 points or more on the KOOS-PF for the change to be both clinically meaningful and outside the range of measurement error.

Supplementary table S1.

Supplementary table S2.

Supplementary Appendix 1.

Supplementary Appendix 2.


The authors thank the clinicians and researchers, including those attending the International Patellofemoral Pain Research Retreat, who provided valuable input to the development of the Knee injury and Osteoarthritis Outcome Score for patellofemoral pain and osteoarthritis subscale. The authors also wish to thank the clinical study participants from both Queensland and Victoria, Australia. EMM is completing her PhD under the supervision of Professor K Khan. Finally, thank you to Sofie Siden and Kerry Mellifont for their assistance with data collection and research administrative support.



  • Funding The authors gratefully acknowledge funding support for EMM from the Australian Endeavor Award Research Fellowship, and Vanier Canada Graduate Scholarship (CIHR). NJC was supported by a National Health and Medical Research Council (Australia) Research Training (Post-Doctoral) Fellowship (#628918).

  • Competing interests EMR is deputy editor of Osteoarthritis and Cartilage and the developer of Knee injury and Osteoarthritis Outcome Score (KOOS).

  • Provenance and peer review Not commissioned; externally peer reviewed
