Comparative efficacy of exercise therapy and oral non-steroidal anti-inflammatory drugs and paracetamol for knee or hip osteoarthritis: a network meta-analysis of randomised controlled trials
  Qianlin Weng (1),
  Siew-Li Goh (2,3),
  Jing Wu (4),
  Monica S M Persson (5,6),
  Jie Wei (4,7),
  Aliya Sarmanova (5,6),
  Xiaoxiao Li (4),
  Michelle Hall (6,8,9),
  Michael Doherty (5,6,9),
  Ting Jiang (1,5,6,10),
  Chao Zeng (1,4,11),
  Guanghua Lei (1,4,11,12),
  Weiya Zhang (5,6,9)
  1. Department of Orthopaedics, Xiangya Hospital, Central South University, Changsha, Hunan, China
  2. Centre for Epidemiology and Evidence-Based Practice, University of Malaya, Kuala Lumpur, Malaysia
  3. Sports and Exercise Medicine Research and Education Group, Faculty of Medicine, University of Malaya, Kuala Lumpur, Malaysia
  4. Hunan Key Laboratory of Joint Degeneration and Injury, Changsha, Hunan, China
  5. Academic Rheumatology, School of Medicine, University of Nottingham, Nottingham, UK
  6. Pain Centre Versus Arthritis, University of Nottingham, Nottingham, UK
  7. Health Management Center, Xiangya Hospital, Central South University, Changsha, Hunan, China
  8. Division of Physiotherapy Rehabilitation Sciences Education, University of Nottingham, Nottingham, UK
  9. Versus Arthritis Centre for Sport, Exercise and Osteoarthritis Research, University of Nottingham, Nottingham, UK
  10. Department of Ultrasonography, Xiangya Hospital, Central South University, Changsha, Hunan, China
  11. National Clinical Research Center for Geriatric Disorders, Xiangya Hospital, Central South University, Changsha, Hunan, China
  12. Hunan Engineering Research Center of Osteoarthritis, Changsha, Hunan, China
  Correspondence to Professor Guanghua Lei, Department of Orthopaedics, Xiangya Hospital, Central South University, 87 Xiangya Road, Changsha, Hunan, China; lei_guanghua{at}csu.edu.cn; Professor Chao Zeng, Department of Orthopaedics, Xiangya Hospital, Central South University, 87 Xiangya Road, Changsha, Hunan, China; zengchao{at}csu.edu.cn; Professor Weiya Zhang, Academic Rheumatology, Clinical Sciences Building, University of Nottingham, City Hospital, NG7 2RD, Nottingham, UK; weiya.zhang{at}nottingham.ac.uk

Abstract

Objective Clinical guidelines recommend exercise as a core treatment for knee or hip osteoarthritis (OA). However, how its analgesic effect compares with that of analgesics such as oral non-steroidal anti-inflammatory drugs (NSAIDs) and paracetamol, the most commonly used analgesics for OA, remains unknown.

Design Network meta-analysis.

Data sources PubMed, Embase, Scopus, Cochrane Library and Web of Science from database inception to January 2022.

Eligibility criteria for selecting studies Randomised controlled trials (RCTs) comparing exercise therapy with oral NSAIDs and paracetamol directly or indirectly in knee or hip OA.

Results A total of 152 RCTs (17 431 participants) were included. For pain relief, there was no difference between exercise and oral NSAIDs and paracetamol at or nearest to 4 (standardised mean difference (SMD)=−0.12, 95% credible interval (CrI) −1.74 to 1.50; n=47 RCTs), 8 (SMD=0.22, 95% CrI −0.05 to 0.49; n=2 RCTs) and 24 weeks (SMD=0.17, 95% CrI −0.77 to 1.12; n=9 RCTs). Similarly, there was no difference between exercise and oral NSAIDs and paracetamol in functional improvement at or nearest to 4 (SMD=0.09, 95% CrI −1.69 to 1.85; n=40 RCTs), 8 (SMD=0.06, 95% CrI −0.20 to 0.33; n=2 RCTs) and 24 weeks (SMD=0.05, 95% CrI −1.15 to 1.24; n=9 RCTs).

Conclusions Exercise has effects on pain and function similar to those of oral NSAIDs and paracetamol. Given its excellent safety profile, exercise should be given more prominence in clinical care, especially in older people with comorbidity or at higher risk of adverse events related to NSAIDs and paracetamol.

PROSPERO registration number CRD42019135166

  • Exercise
  • Osteoarthritis
  • Medicine
  • Meta-analysis

This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/.

WHAT IS ALREADY KNOWN ON THIS TOPIC

  • Exercise is an effective treatment for osteoarthritis. It has been recommended as a core therapy by the National Institute for Health and Care Excellence and other treatment guidelines for osteoarthritis because of its favourable safety profile.

  • However, whether the analgesic effect of exercise is comparable to that of oral non-steroidal anti-inflammatory drugs (NSAIDs) and paracetamol, the most common analgesics given for osteoarthritis, remains unknown.

WHAT ARE THE NEW FINDINGS

  • Exercise has been compared with oral NSAIDs and paracetamol through a network meta-analysis of 152 randomised controlled trials (17 431 participants) for knee or hip osteoarthritis.

  • Exercise is indeed a medicine and its analgesic effect is similar to that of oral NSAIDs and paracetamol.

Introduction

Osteoarthritis (OA) is the most common form of joint disease and the leading cause of pain in older people.1 Pain symptoms associated with knee or hip OA result in increased physical and walking disability and an increased risk of all-cause mortality.2 The main management goal in OA is to relieve pain without increasing treatment-related adverse effects.

Oral non-steroidal anti-inflammatory drugs (NSAIDs) and paracetamol (or acetaminophen) are the most frequently prescribed analgesics to control pain and improve physical function in OA,3 4 with 10% to 35% of the OA population reporting the use of oral NSAIDs or paracetamol.5 6 However, oral NSAIDs and paracetamol are associated with gastrointestinal or cardiovascular complications and even increased risk of death, especially in older people with comorbidities.6–8 Current National Institute for Health and Care Excellence and international guidelines strongly recommend exercise as a core therapy for management of knee or hip OA.9–12 However, implementation of exercise in clinical practice remains limited and suboptimal, in part due to the time commitment required by health practitioners, absence of agreed standard protocols, lack of confidence in capability to exercise and concerns about joint overloading among people with OA.13–15 Also, it is still unclear whether exercise has an analgesic effect equivalent to that of analgesics such as oral NSAIDs and paracetamol. The current evidence on the direct comparison between these treatments is sparse.16–19 The majority of exercise trials used usual care as a control whereas the majority of drug trials used placebo as a control; therefore, the effect size from exercise trials is not comparable to that from oral NSAIDs and paracetamol trials. This may further hinder the uptake of exercise as a core therapy in clinical practice.20–22 Establishing the comparative efficacy of exercise and oral NSAIDs and paracetamol will help to confirm the analgesic benefit of exercise. Such information may enhance public awareness of exercise as an effective treatment for OA rather than just a physical activity for general health. It will inform patient–practitioner discussion and shared decision-making and encourage patients to be more proactive about including exercise in their individualised OA management plan.23

However, few head-to-head randomised controlled trials (RCTs) are available,16–19 and their results were discordant. Of these four previous RCTs, one study (n=48) reported a more beneficial effect of oral NSAIDs,16 two studies (n=141 and 142, respectively) showed no difference between the effect of exercise and oral NSAIDs and paracetamol17 18 and one study found more benefit from exercise.19 We, therefore, undertook this network meta-analysis (NMA) to gather all RCTs that either directly compared exercise with oral NSAIDs and paracetamol, or allowed these two treatments to be compared indirectly via a common comparator (e.g., usual care). We estimated the comparative effect size for the two common outcomes in OA trials, that is, pain and function in people with knee or hip OA.

Methods

Protocol and registration

The reporting of this NMA followed the Preferred Reporting Items for Systematic Reviews and Meta-Analyses24 and the NMA protocol was registered with PROSPERO.

Eligibility criteria

The inclusion criteria were as follows: (1) RCTs; (2) studies on participants with knee or hip OA; (3) studies comparing exercise therapy with oral NSAIDs and paracetamol, or studies comparing exercise therapy with any common comparator that may be shared with oral NSAIDs and paracetamol (usual care/no treatment/waiting list control, glucosamine sulphate/chondroitin sulphate, intra-articular hyaluronic acid, topical NSAIDs, acupuncture); (4) studies reporting pain or function; (5) studies published in any language.

The following studies were excluded: (1) secondary analyses, such as combined data analyses of published RCTs; (2) studies with less than 1 week of follow-up; (3) studies using a cross-over design; (4) exercise therapy or oral NSAIDs and paracetamol or any common comparator combined with other active interventions; (5) studies for postoperative pain; (6) abstract only (insufficient data).

Throughout this text, the term ‘exercise’ was used to refer to ‘exercise therapy’. Exercise therapy is defined as a planned, structured, repetitive and purposeful physical activity for the improvement or maintenance of a specific health condition (or disease).25 Exercise therapy encompasses aerobic, muscle strengthening, flexibility/neuromotor skills training (flexibility/skill) or mind–body exercise (e.g., tai chi, yoga).26 Studies were classified as mixed exercise when they included more than one exercise type mentioned above, or when the authors did not specify it as a single component exercise. Any form of exercise therapy was eligible, regardless of content, duration, frequency or intensity.

‘Usual care’ control was classified based on the report. In ‘usual care’, participants were expected to continue their routine general care. Control groups that were not given any specific intervention such as ‘waiting list’, usual physical activity or no treatment, or where the authors did not specify the nature of the control, were also classified as ‘usual care’. ‘Waiting-list’ controls were given an active intervention after the trial period, with no new intervention being delivered during the trial period.27

Literature search

Systematic literature searches were conducted using PubMed, Embase, Scopus, Cochrane Library and Web of Science up to January 2022 (online supplemental appendix 1). Additionally, the references of relevant reviews and selected articles were examined for potentially relevant trials.

Study selection

After removing duplicates, all titles and abstracts were screened independently for potentially eligible studies by two reviewers (QW and JW), and relevant RCTs were identified. Reports of studies considered potentially relevant by at least one reviewer were retrieved in full text. The eligibility of the retrieved full-text articles for final inclusion was assessed independently by two reviewers. Disagreement was resolved through discussion, and if no consensus was reached, a third reviewer (CZ) was involved to make the final decision.

Data extraction

Baseline characteristics and outcome data were extracted into a standardised form by two independent assessors (QW and JW). The outcome measures of interest were pain and function scores reported at baseline, 4, 8 and 24 weeks. When 4-week data were not available, we used the data reported at the closest time point from 2 to 6 weeks of follow-up. All types of exercise and oral NSAIDs and paracetamol were included during data extraction. Change-from-baseline pain scores were extracted or calculated. If a study reported multiple pain scales, the scale with the highest sensitivity to change was used.28 The function subscale of the Western Ontario and McMaster Universities Arthritis Index (WOMAC) was used for the assessment of functional improvement. If a study did not measure or report WOMAC function, the Lequesne Index or one of the other functional measurement scales was used instead. Corresponding authors of studies with missing data were contacted through ResearchGate or email with a request for the data.

Quality assessment

The Cochrane risk of bias assessment tool29 was used to assess the methodological quality of RCTs. The following seven domains were evaluated in each included study: random sequence generation; allocation concealment; blinding of participant and personnel; blinding of outcome assessment; incomplete outcome data; selective reporting and other biases. Each domain was assigned a judgement of low risk, high risk or unclear risk of bias. Because it is not possible to truly blind health practitioners and participants in any study related to exercise treatment, this was not included in the overall risk of bias assessment of each study. In addition, the quality level of this NMA was evaluated by the Grading of Recommendations Assessment, Development, and Evaluation (GRADE) approach.30 31 The results were evaluated for evidence quality, which was classified into high, moderate, low or very low levels (online supplemental appendix 2). The summary of findings of outcome was presented using the template of the GRADE NMA summary of findings (NMA-SoF) table for exercise compared with oral NSAIDs and paracetamol (online supplemental appendix 2).32

Statistical analysis

For pain and functional outcomes, the standardised mean difference (SMD) was used to standardise the results to a uniform scale when the studies assessed the same outcome with different instruments.29 Two SMDs were calculated, one for exercise versus usual care to confirm whether exercise was effective (better than usual care) and the other for exercise versus oral NSAIDs and paracetamol to examine whether exercise was as effective as oral NSAIDs and paracetamol. For both SMDs, exercise was the intervention and usual care or oral NSAIDs and paracetamol were the control; hence, a positive value favours exercise, whereas a negative value favours usual care or oral NSAIDs and paracetamol. Each SMD was calculated by dividing the difference in mean values between treatment groups by the pooled standard deviation (SD) of change-from-baseline. If SDs were not reported, we calculated them from standard errors (SEs) or confidence intervals (CIs). When change-from-baseline SDs were not presented, we calculated the missing SDs using the formula: $SD_{\text{change}} = \sqrt{SD_{\text{baseline}}^{2} + SD_{\text{final}}^{2} - 2 \times Corr \times SD_{\text{baseline}} \times SD_{\text{final}}}$, where $Corr$ is the correlation between baseline and final measurements.29 Those studies which did not report mean scores at baseline and endpoint or change-from-baseline scores with SDs, SEs or CIs were not included in the analyses. For studies with multiple intervention groups, we combined all relevant groups to create a single pairwise comparison.29 Clinically, an SMD of 0.20 was considered as a small effect, 0.50 a moderate effect and 0.80 a large effect, according to Cohen.33
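The conversions described in this paragraph can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the change-from-baseline SD uses the Cochrane Handbook formula with a correlation value that the text does not report (0.5 below is a placeholder), the CI conversion uses a large-sample normal approximation, and inputs are expressed as improvements so that a positive SMD favours exercise.

```python
import math


def sd_from_se(se, n):
    """SD of a group's scores from the standard error of its mean."""
    return se * math.sqrt(n)


def sd_from_ci(lower, upper, n, z=1.96):
    """SD from a 95% CI of a group mean (normal approximation, assumed)."""
    return math.sqrt(n) * (upper - lower) / (2 * z)


def sd_change(sd_baseline, sd_final, corr):
    """Change-from-baseline SD; corr is an assumed baseline-final correlation."""
    return math.sqrt(
        sd_baseline**2 + sd_final**2 - 2 * corr * sd_baseline * sd_final
    )


def pooled_sd(sd1, n1, sd2, n2):
    """Pooled SD of two groups, weighted by degrees of freedom."""
    return math.sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2))


def smd(improve_ex, sd_ex, n_ex, improve_ctrl, sd_ctrl, n_ctrl):
    """SMD of mean improvements; positive values favour exercise."""
    return (improve_ex - improve_ctrl) / pooled_sd(sd_ex, n_ex, sd_ctrl, n_ctrl)


# Hypothetical trial: pain improves by 2.0 points with exercise and by
# 1.0 point with usual care, with a change SD of 2.0 in both arms (n=50 each).
example_smd = smd(2.0, 2.0, 50, 1.0, 2.0, 50)
print(f"SMD = {example_smd:.2f}")
```

On Cohen's scale quoted above, the hypothetical trial would count as a moderate effect in favour of exercise.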

Bayesian NMA methods were used to assess the comparative efficacy of exercise versus oral NSAIDs and paracetamol. Bayesian NMA methods pool direct and indirect evidence on relative treatment effects.34 35 The Markov Chain Monte Carlo method was used to estimate posterior densities for unknown variables.35 36 A random effects model was adopted as the most appropriate and conservative method to account for differences between RCTs. Two Markov chains ran simultaneously with different initial values.36 37 The Bayesian random effects model was based on the Dias model and uninformative prior probability distributions were used for all parameters.38 39 A total of 50 000 simulations were generated for each of the two sets of initial values and the first 10 000 simulations were discarded as the burn-in period. The WinBUGS code is available at http://www.bristol.ac.uk/social-community-medicine/projects/mpes/.
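To illustrate what indirect evidence means here, the sketch below computes an adjusted indirect comparison of exercise versus a drug through a shared usual-care comparator. It is deliberately not the Bayesian random effects model used in this study: it is the simpler frequentist calculation (often attributed to Bucher) in which the two effects are subtracted and their variances added. The function name and all input numbers are hypothetical.

```python
import math


def indirect_comparison(d_ex_uc, se_ex_uc, d_drug_uc, se_drug_uc, z=1.96):
    """Exercise vs drug via usual care.

    d_ex_uc   : SMD of exercise vs usual care (positive favours exercise)
    d_drug_uc : SMD of drug vs usual care (positive favours drug)
    Returns the indirect SMD (positive favours exercise), its SE and 95% CI.
    """
    d = d_ex_uc - d_drug_uc
    se = math.sqrt(se_ex_uc**2 + se_drug_uc**2)  # variances add
    return d, se, (d - z * se, d + z * se)


# Hypothetical pooled inputs, chosen only to show the arithmetic.
d, se, (low, high) = indirect_comparison(0.8, 0.3, 0.6, 0.4)
print(f"indirect SMD = {d:.2f} (95% CI {low:.2f} to {high:.2f})")
```

Because the variances add, an indirect estimate is always less precise than either of the comparisons it is built from, which is one reason wide intervals can arise when most of the evidence is indirect.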

The pooled SMDs were generated from the median of the posterior distribution.37 40 The 2.5th and 97.5th percentiles of the posterior distribution were considered the lower and upper limit, respectively, of the 95% credible interval (CrI). A 95% CrI may be interpreted as there being a 95% probability that the parameter takes a true value in the specified range.41 Heterogeneity was defined as the variability of results across trials, with τ2<0.04 indicating a low level and τ2>0.4 a high level.42–44 The fit of the model to the data was measured by calculating the posterior mean residual deviance.38 If the mean of the residual deviance is similar to the number of data points of the model, it indicates that the model fits the data adequately.38 We estimated the inconsistency between the direct and indirect evidence. The global inconsistency of the entire network was assessed with the design-by-treatment interaction model,45 and the local inconsistency in the network was estimated with the node-splitting method.46 In order to facilitate the clinical interpretation, we assessed the probability that the exercise intervention would be likely to reach the minimum clinically important difference (MCID). We prespecified a MCID of 0.37 SD units, corresponding to 0.9 cm on a 10 cm visual analogue scale.2 This threshold of 0.37 SD units was based on the median MCID found in recent studies of people with OA.2 47
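The posterior summaries described above (median, 2.5th and 97.5th percentiles, and the probability of reaching the MCID) can be sketched as follows. The draws here are simulated from a normal distribution purely as a stand-in for MCMC output; they are not WinBUGS results, and the mean and SD of the simulated draws are assumptions chosen for illustration. The 80 000 draws correspond to two chains of 50 000 iterations with 10 000 burn-in iterations discarded from each.

```python
import random
import statistics

MCID = 0.37  # prespecified threshold in SD units, taken from the text


def summarise_posterior(draws, mcid=MCID):
    """Return (median, (2.5th, 97.5th percentile), P(effect >= mcid))."""
    cuts = statistics.quantiles(draws, n=40)  # 39 cut points, 2.5% apart
    median = statistics.median(draws)
    p_mcid = sum(d >= mcid for d in draws) / len(draws)
    return median, (cuts[0], cuts[-1]), p_mcid


# Hypothetical posterior sample: 2 chains x (50 000 - 10 000) retained draws.
rng = random.Random(0)
draws = [rng.gauss(0.78, 0.10) for _ in range(80_000)]

median, (lower, upper), p_mcid = summarise_posterior(draws)
print(f"SMD={median:.2f}, 95% CrI {lower:.2f} to {upper:.2f}, "
      f"P(SMD>=MCID)={p_mcid:.3f}")
```

A probability above 0.95 from such a summary is the kind of quantity that would support a statement that the effect reaches the MCID.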

All statistical analyses were conducted using WinBUGS software (V.1.4.3, MRC Biostatistics Unit, Cambridge, UK) and STATA software (V.11.0, Stata, College Station, Texas, USA).

Additional analysis

The transitivity assumption was met as we used the same definition for the common comparator, for example, usual care or glucosamine sulphate/chondroitin sulphate, for both exercise and oral NSAIDs and paracetamol. It was also assessed by comparing the distribution of trial characteristics (publication year, percentage female, mean age, baseline pain and function score) across studies grouped by comparison. To assess the robustness of the results obtained by the primary model, we performed several sensitivity analyses on the primary outcomes of pain and function to explore potential causes for heterogeneity. Four sensitivity analyses were conducted according to sample size (≥ 30/arm), low risk of allocation concealment, intervention without prescribing paracetamol and studies without outliers (effect size >5).27 We estimated publication bias by visual assessment of funnel plot asymmetry (online supplemental appendix 3).48 We then assessed the probability that each intervention could be ranked as the most effective treatment for pain relief or functional improvement. We obtained a hierarchy of the competing interventions using the surface under the cumulative ranking curve (SUCRA) and mean ranks. SUCRA values were expressed as percentages, the higher value representing the higher probability of being the best option (online supplemental appendix 4).49
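As an illustration of the ranking step, SUCRA for one treatment can be computed from its rank probabilities as the average of the cumulative probabilities of being among the top k treatments, for k from 1 to the number of treatments minus one. The sketch below uses hypothetical rank probabilities and hypothetical function names; it also shows the equivalent expression in terms of the mean rank.

```python
def sucra(rank_probs):
    """SUCRA from rank probabilities.

    rank_probs[k] is the probability that the treatment is ranked
    (k + 1)-th best; the probabilities are assumed to sum to 1.
    """
    a = len(rank_probs)
    cumulative = 0.0
    total = 0.0
    for p in rank_probs[:-1]:  # cumulative probabilities for ranks 1..a-1
        cumulative += p
        total += cumulative
    return total / (a - 1)


def mean_rank(rank_probs):
    """Expected rank, where 1 is best."""
    return sum((k + 1) * p for k, p in enumerate(rank_probs))


# Hypothetical rank probabilities for one treatment among three.
probs = [0.5, 0.3, 0.2]
a = len(probs)
print(f"SUCRA = {100 * sucra(probs):.0f}%")
print(f"mean rank = {mean_rank(probs):.1f}")
# Equivalent formulation: SUCRA = (a - mean rank) / (a - 1)
print(f"check = {100 * (a - mean_rank(probs)) / (a - 1):.0f}%")
```

A treatment that is certain to be best has a SUCRA of 100% and one that is certain to be worst has 0%.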

Results

Study selection and characteristics

The search strategy retrieved 46 635 related articles after duplicates were removed. After title or abstract screening, the full texts of 2738 potentially eligible articles were reviewed (figure 1). Of these, 152 studies (17 431 participants) met the inclusion criteria for NMA. The network of all treatment comparisons analysed for pain and function is presented in figure 2 and online supplemental appendix 5. The assumption of transitivity was met as we used the same definition of the common comparator. It was confirmed by comparing distributions of baseline characteristics where no variability was observed in the study (online supplemental appendix 6).

Figure 1

Summary of studies identification and selection according to the PRISMA flow diagram. NSAIDs, non-steroidal anti-inflammatory drugs; OA, osteoarthritis; PRISMA, Preferred Reporting Items for Systematic Reviews and Meta-Analyses.

Figure 2

Structure of network formed by interventions (A) pain relief at or nearest to 4 weeks; (B) functional improvement at or nearest to 4 weeks. NSAIDs, non-steroidal anti-inflammatory drugs; IAHA, intra-articular hyaluronic acid.

The characteristics of the included studies are shown in table 1 and online supplemental appendix 7. Of the 152 trials, 132 (15 005 participants) reported pain-related outcomes and 125 (12 929 participants) reported physical function outcomes. Apart from four trials comparing exercise with oral NSAIDs and paracetamol directly, 49 studies had data available at or nearest to 4 weeks, two studies had data available at 8 weeks and nine studies had data available at 24 weeks. Most of the trials recruited participants with knee OA (n=127, 83.6%), 12 studies investigated hip OA (7.9%) and 13 studies (8.6%) recruited a mix of participants with knee or hip OA. In the current study, a total of 95 articles were finally extracted for the analysis: 59 articles were included for the comparative efficacy of exercise versus oral NSAIDs and paracetamol and 83 were used for the comparative efficacy of exercise versus usual care, each contributing outcomes at different time points.

Table 1

Basic characteristics of included randomised controlled trials (n=152 studies)

The methodological quality was evaluated for all included trials (online supplemental appendix 8). The generation of the allocation sequence was adequate in most trials (n=105, 69.1%). Allocation concealment was adequate in almost half of the trials (n=73, 48.0%) and 95 trials (62.5%) masked outcome assessors to treatment allocation. The potential risk of bias likely to be introduced by incomplete data was high in 16 trials (10.5%). The risk of selective reporting bias was low in most trials (n=144, 94.7%) and 19 (12.5%) trials were commercially funded.

Efficacy of exercise over usual care

For pain relief, exercise was more effective than usual care at or nearest to 4 (SMD=1.31, 95% CrI 0.61 to 2.01), 8 (SMD=0.78, 95% CrI 0.58 to 0.98) and 24 weeks (SMD=0.19, 95% CrI 0.00 to 0.38). For functional improvement, exercise was also more effective than usual care at or nearest to 4 (SMD=1.08, 95% CrI 0.29 to 1.88), 8 (SMD=0.94, 95% CrI 0.70 to 1.18) and 24 weeks (SMD=0.20, 95% CrI 0.04 to 0.37). The effect size for function exceeded the prespecified MCID of 0.37 at or nearest to 4 weeks. Similarly, there was enough evidence to support a MCID treatment effect at or nearest to 4 (pain) and 8 weeks (pain and function), with the probability that the effect size for exercise compared with usual care was 0.37 or higher being >95%.

Comparative efficacy between exercise versus oral NSAIDs and paracetamol

As shown in table 2, there was no statistical difference between exercise and oral NSAIDs and paracetamol in pain relief at or nearest to 4 (SMD=−0.12, 95% CrI −1.74 to 1.50), 8 (SMD=0.22, 95% CrI −0.05 to 0.49) and 24 weeks (SMD=0.17, 95% CrI −0.77 to 1.12). For functional improvement, no statistical difference between exercise and oral NSAIDs and paracetamol was evident at or nearest to 4 (SMD=0.09, 95% CrI −1.69 to 1.85), 8 (SMD=0.06, 95% CrI −0.20 to 0.33) and 24 weeks (SMD=0.05, 95% CrI −1.15 to 1.24). The league table presenting all results of the NMA is available in online supplemental appendix 9.

Table 2

Comparative efficacy for exercise versus oral NSAIDs and paracetamol among knee or hip OA

Model fit, heterogeneity and sensitivity analysis in NMA

Evaluation of the goodness of fit demonstrated a good fit with a posterior mean residual deviance of 94.6 (94 data points) for pain at or nearest to 4 weeks, 3.8 (4 data points) for pain at 8 weeks, 18.1 (18 data points) for pain at 24 weeks, 80.1 (80 data points) for function at or nearest to 4 weeks, 3.9 (4 data points) for function at 8 weeks and 18.2 (18 data points) for function at 24 weeks. At or nearest to 4 weeks, we did not find any inconsistency between evidence derived from direct and indirect comparisons on exercise and oral NSAIDs and paracetamol for pain and function using the design-by-treatment inconsistency model (p>0.05, online supplemental appendix 10); also, the node-splitting analysis did not show any significant difference between direct and indirect evidence (p>0.05, figure 3). Test for inconsistency was not available at 8 weeks and 24 weeks due to the absence of indirect evidence at these two time points. τ2 estimates suggested low statistical heterogeneity for both pain (τ2=0.00) and physical function (τ2=0.00) at 8 weeks. However, there was significant heterogeneity across the trials at the other time points (pain at or nearest to 4 weeks: τ2=1.84; pain at 24 weeks: τ2=0.45; function at or nearest to 4 weeks: τ2=2.04; function at 24 weeks: τ2=0.60). As summarised in online supplemental appendix 11, several sensitivity analyses for pain and function suggested that the results obtained were robust.

Figure 3

Forest plots depicting estimates from direct and indirect comparison for exercise to oral NSAIDs and paracetamol. CrI, credible interval; NSAIDs, non-steroidal anti-inflammatory drugs; SMD, standardised mean difference.

Discussion

Principal findings

This NMA is based on 152 RCTs, which included 17 431 participants, to compare the efficacy between exercise and oral NSAIDs and paracetamol for knee or hip OA. The results showed that exercise was indeed a clinically effective treatment (better than usual care) in reducing pain and improving physical function in people with knee or hip OA. While these effect sizes of exercise gradually decreased over a period of time, they were not different from those obtained from oral NSAIDs and paracetamol at short (4 weeks), medium (8 weeks) or long (24 weeks) term periods of treatment.

Comparisons with previous studies

To date, evidence is limited to an NMA and a conventional meta-analysis comparing the relative effect of exercise versus analgesics (opioids and NSAIDs) on knee pain.20 21 A recent NMA integrated two sources of direct evidence and 91 sources of indirect evidence and concluded that exercise provides superior analgesia compared with NSAIDs (0.54, 95% CI 0.19 to 0.89).21 Nevertheless, in that NMA, the NSAIDs group consisted of both oral and topical interventions, the control group for analgesics contained placebo and the control group for exercise included usual care, education, ultrasound therapy and physiotherapy.21 In that NMA, placebo was analysed in the same way as the other control groups listed above, such as usual care. These design choices may affect the structure of the network (due to inconsistency and heterogeneity) and, therefore, affect the intervention outcomes. The conventional meta-analysis included six Cochrane reviews (four pharmacology, two exercise) with 9806 participants (5627 pharmacology and 4179 exercise). The pooled effect size was 0.41 (95% CI 0.23 to 0.59) for pharmacological treatments and 0.46 (95% CI 0.34 to 0.59) for exercise. The authors concluded that the effects of exercise on knee pain were similar to the effects of analgesics.20 However, the effect size of analgesics in both meta-analyses is not comparable with the effect size of exercise as the former was the effect over placebo, whereas the latter was the effect over usual care, education, ultrasound therapy and physiotherapy.20 21 In addition, it is well known that placebo is more effective than usual care or no treatment,50 hence an intervention may not be superior to placebo, for example, acupuncture versus sham acupuncture, but may still be superior to usual care.51 The only way to examine the difference between exercise and oral NSAIDs and paracetamol is to run a head-to-head comparison, either through an RCT or an NMA of RCTs, where both interventions are placed in the same context, for example, no blinding. Our NMA fulfils this context, where exercise was compared with oral NSAIDs and paracetamol directly within a trial or indirectly through a common comparator between trials, both being randomised but not blinded.

Only a few direct comparison RCTs have assessed the efficacy of exercise versus oral NSAIDs and paracetamol for OA and the results were inconsistent.16–19 In a recent RCT, oral NSAIDs (n=48) were reported to be more beneficial than exercise (n=46) in people with knee OA at or nearest to 4 weeks.16 Additionally, two randomised trials (n=141 and 142, respectively) in people with knee OA compared 8 weeks of exercise and oral NSAIDs and paracetamol and found reduction in pain and improvement in function from baseline in each group but no differences between groups.17 18 However, another RCT (n=166) showed that exercise was more effective than oral NSAIDs on pain in people with knee OA over a 12-week follow-up period.19 These studies may not provide sufficient evidence because of the small sample size of the great majority of them and the differences in methodological quality. The use of an NMA, representing the most comprehensive RCT evidence, allows for greater power and greater precision to confirm the comparative effect between exercise and oral NSAIDs and paracetamol.52

Limitations

Several limitations of this study should be noted. First, we were unable to fully explore the reasons for heterogeneity because many covariates for exercise effect in OA are normally not recorded in trials.27 41 53 Second, the inconsistency between direct and indirect evidence at 8 and 24 weeks could not be examined due to the lack of head-to-head comparisons, so caution must be taken when interpreting the results at these two time points. Moreover, the numerical difference between the direct and indirect evidence was large at or nearest to 4 weeks, which may be in part due to fewer trials and a relatively small sample size included in direct evidence (one RCT with 94 participants for pain, two RCTs with 210 participants for function), and to the lack of allocation concealment.54 However, there was no local or global inconsistency between the indirect estimates and the available direct evidence; thus, the pooled results in the current study seem to be reliable.35 37 Third, the included RCTs and participants in this study were restricted to knee or hip OA. The conclusions may not be generalisable to OA at other joints. Fourth, there were insufficient data to perform a subgroup analysis according to the type of exercise. Therefore, we could not clarify which types, frequency or intensity of exercise are comparable to oral NSAIDs and paracetamol. Fifth, the effect sizes for exercise at or nearest to 4 weeks compared with usual care are very large. This may be because we used a more stringent definition for ‘usual care’ control, and we pooled both direct and indirect evidence which increased the variations between trials and amplified the effect sizes through contextual effects.55 Sixth, the lack of placebo and blinding in this NMA may lead to overestimation of the true efficacy of both exercise and oral NSAIDs and paracetamol. However, as discussed above, it did make the two treatments comparable in the same context without placebo and blinding and allowed us to estimate the relative efficacy between these two treatments. Furthermore, we included all trials in the main analysis irrespective of their risk of bias. The inclusion of studies with small sample size may lead to the small study effect, that is, a smaller study will inevitably result in a larger SMD.56 57 However, we undertook two sensitivity analyses, one based on the allocation concealment and the other based on sample size (≥30 per arm, online supplemental appendix 11), and the results did not differ between these and the main analysis. As the sample size in exercise trials was in general smaller than in drug trials, we were unable to perform sensitivity analysis for studies with sample size of ≥100/arm. The small study effect remains a caveat for this NMA. However, this may not affect the conclusion as the difference would become even smaller if the small study effect was removed. Finally, like other types of meta-analysis, this NMA is prone to publication bias and other risk of bias and is limited to the information reported in the included papers. The methodology of combining direct and indirect evidence may increase heterogeneity. Moreover, due to the large number of comparisons in the NMA, multiplicity may have increased the rate of false positives for the statistically significant results (type I error).58

Clinical and research implications

This study has confirmed that exercise is a medicine. Its analgesic effect is similar to that obtained from the most commonly used analgesics, oral NSAIDs and paracetamol, without the serious side effects associated with those drugs.41 The findings support the current recommendation of using exercise as a core therapy for OA. They also suggest that exercise may be used as an analgesic replacement therapy for older people with comorbidity or multimorbidity and for people at higher risk of adverse events related to NSAIDs and paracetamol.59 However, it is worth emphasising that, although there is no direct evidence that exercise has significant side effects in the treatment of knee or hip OA, an inappropriate type or intensity of exercise might aggravate the symptoms and progression of OA.60

Conclusions

Exercise is effective for pain and function in knee and hip OA. Its effect is similar to that of oral NSAIDs and paracetamol at short-term (4 weeks), medium-term (8 weeks) and long-term (24 weeks) follow-up. However, this conclusion is based mainly on indirect comparisons. Further direct evidence on OA outcomes, and on the long-term benefits of exercise over oral NSAIDs and paracetamol for other outcomes such as comorbidities, is still needed.

Ethics statements

Patient consent for publication

Ethics approval

Not applicable.

Acknowledgments

Everyone who contributed significantly to the work has been listed.

Footnotes

  • QW and S-LG contributed equally.

  • Contributors GL, CZ and WZ are joint corresponding authors. WZ, CZ and GL had full access to the data in the study and take responsibility for the integrity of the data and the accuracy of the data analysis. Concept and design: all authors. Acquisition and interpretation of data: all authors. Statistical analysis: QW, JW, JW and XL. Drafting of the manuscript: QW. Critical revision of the manuscript for important intellectual content: all authors. Study supervision: WZ, CZ and GL. The corresponding authors attest that all listed authors meet authorship criteria and that no others meeting the criteria have been omitted.

  • Funding This work was supported by the National Natural Science Foundation of China (81930071, 82072502, U21A20352), Project Program of National Clinical Research Center for Geriatric Disorders (Xiangya Hospital, 2020LNJJ03) and the Science and Technology Program of Hunan Province (2019RS2010).

  • Disclaimer The funding source had no role in the design and conduct of the study; collection, management, analysis, and interpretation of the data; preparation, review, or approval of the manuscript; and the decision to submit the manuscript for publication.

  • Competing interests None declared.

  • Provenance and peer review Not commissioned; externally peer reviewed.
