Comparative effectiveness of physical exercise interventions for chronic non-specific neck pain: a systematic review with network meta-analysis of 40 randomised controlled trials
  1. Rutger MJ de Zoete1,2,
  2. Nigel R Armfield1,
  3. James H McAuley3,
  4. Kenneth Chen1,4,
  5. Michele Sterling1
  1. 1 RECOVER Injury Research Centre, NHMRC Centre of Research Excellence in Recovery Following Road Traffic Injuries, The University of Queensland, Herston, Queensland, Australia
  2. 2 School of Allied Health Science and Practice, The University of Adelaide, Adelaide, South Australia, Australia
  3. 3 Neuroscience Research Australia and School of Medical Sciences, University of New South Wales, Sydney, New South Wales, Australia
  4. 4 Geriatric Education and Research Institute, Singapore
  1. Correspondence to Dr Rutger MJ de Zoete, School of Allied Health Science and Practice, The University of Adelaide, Adelaide, South Australia, Australia; rutger.dezoete{at}adelaide.edu.au

Abstract

Objective To compare the effectiveness of different physical exercise interventions for chronic non-specific neck pain.

Design Systematic review and network meta-analysis.

Data sources Electronic databases: AMED, CINAHL, Cochrane Central Register of Controlled Trials, Embase, MEDLINE, Physiotherapy Evidence Database, PsycINFO, Scopus and SPORTDiscus.

Eligibility criteria Randomised controlled trials (RCTs) describing the effects of any physical exercise intervention in adults with chronic non-specific neck pain.

Results The search returned 6549 records; 40 studies were included. Two networks of pairwise comparisons were constructed, one for pain intensity (n=38 RCTs, n=3151 participants) and one for disability (n=29 RCTs, n=2336 participants), and direct and indirect evidence was obtained. Compared with no treatment, three exercise interventions were found to be effective for pain and disability: motor control (Hedges’ g, pain −1.32, 95% CI: −1.99 to −0.65; disability −0.87, 95% CI: −1.45 to −0.29), yoga/Pilates/Tai Chi/Qigong (pain −1.25, 95% CI: −1.85 to −0.65; disability −1.16, 95% CI: −1.75 to −0.57) and strengthening (pain −1.21, 95% CI: −1.63 to −0.78; disability −0.75, 95% CI: −1.28 to −0.22). Other interventions, including range of motion (pain −0.98, 95% CI: −2.51 to 0.56), balance (pain −0.38, 95% CI: −2.10 to 1.33) and multimodal (three or more exercise types combined) (pain −0.08, 95% CI: −1.70 to 1.53) exercises showed uncertain or negligible effects. The quality of evidence was very low according to the GRADE (Grading of Recommendations Assessment, Development and Evaluation) criteria.

Conclusion There is not one superior type of physical exercise for people with chronic non-specific neck pain. Rather, there is very low quality evidence that motor control, yoga/Pilates/Tai Chi/Qigong and strengthening exercises are equally effective. These findings may assist clinicians to select exercises for people with chronic non-specific neck pain.

PROSPERO registration number CRD42019126523.

  • neck
  • meta-analysis
  • exercise
  • chronic

Introduction

Neck and back pain are the leading cause of years-lived-with-disability,1 and neck pain is responsible for a substantial burden to society.2 Up to 70% of the global population experiences neck pain at least once in their lives,3 4 and 50% to 85% of these cases are expected to recur within 1 to 5 years of the initial onset.4 This makes neck pain a burdensome global problem,5 contributing to a rapidly increasing trend in spinal pain-related healthcare expenditures.5

Different types of physical exercise, including strengthening, range of motion, motor control, stretching and proprioceptive training, are recommended in clinical guidelines6 and are commonly used as a management strategy in the first-line treatment of neck pain.7 However, review articles investigating the effectiveness of different physical exercise interventions for people with chronic neck pain report modest effect sizes at best, on pain intensity and pain-related disability.8–12 Together with the patient, clinicians are therefore required to choose the type of exercise they prefer or expect to most effectively improve clinical outcomes.

While randomised controlled trials (RCTs), systematic reviews and meta-analyses allow for pairwise comparisons of two types of exercise, they are not suitable to compare the effectiveness of all types of physical exercise. Furthermore, as study interventions often incorporate different types of exercise, it is difficult to draw conclusions regarding the effectiveness of separate types of physical exercise. Investigating the effectiveness of these exercise interventions in separate pairwise meta-analyses is not possible, as insufficient data are available to address each type of physical exercise.

To investigate the relative effectiveness of different physical exercise interventions, network meta-analysis (NMA)13 14 can be used, as it enables the interpretation of an entire body of evidence,15 even though some interventions may not have been directly compared with others.16 By using direct and indirect evidence from a network of pairwise RCTs, the effectiveness of interventions can be estimated.17 This approach was recently used to investigate the effectiveness of pharmacological management for depression18 and physical exercise for chronic low back pain.19 By generating a hierarchy of interventions, these NMAs were able to provide valuable information for clinical decision-making.

While evidence indicates that exercise therapy has modest effects on pain and disability in individuals with chronic neck pain, there are currently no treatment options that demonstrate medium or large effect sizes.12 Meta-analyses generally provide ambiguous evidence with modest effect sizes for exercise therapy versus ‘no treatment’, whereas NMA can add valuable information by combining evidence from both direct and indirect comparisons. In this NMA, we aimed to systematically investigate the effectiveness of different types of physical exercise interventions in people with chronic non-specific neck pain. The primary research question is: What is the effectiveness of different types of physical exercise on neck pain intensity and pain-related disability? The secondary research question is: What is the effectiveness of different durations, frequencies and intensities of physical exercise interventions on neck pain intensity and pain-related disability?

Methods

Protocol and registration

This network meta-analysis is reported according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement and PRISMA extension for NMA (PRISMA-NMA).20 21 This study was prospectively registered on PROSPERO and a detailed protocol was published elsewhere.22

Information sources

We searched nine electronic databases: AMED, CINAHL, Cochrane Central Register of Controlled Trials, Embase, MEDLINE, Physiotherapy Evidence Database, PsycINFO, Scopus and SPORTDiscus. The database search was conducted on 12 March 2019.

Search strategy

We developed a search strategy with a medical librarian consisting of three parts, including terms for (1) physical exercise, (2) chronic neck pain and (3) RCTs. The neck pain search terms were consistent with those recommended by the Cochrane Back and Neck review group.23 The search strategy was initially developed for the MEDLINE database (online supplemental file A); we subsequently adjusted this strategy to the requirements of the other databases. The electronic searches were complemented with manual searches for prospectively identified systematic reviews and meta-analyses.

Eligibility criteria

We included RCTs describing the effects of any physical exercise intervention in adults (age ≥18 years) with chronic non-specific neck pain (symptoms persisting for ≥12 weeks). Neck pain was defined as pain between the occiput and the first thoracic vertebra as primary complaint. Other terms for non-specific neck pain that may be used are idiopathic neck pain, non-traumatic neck pain, insidious onset neck pain, mechanical neck pain and work-related neck pain. As comparator, we included any physical exercise intervention, or a control group, sham group, placebo group or no-treatment group. We excluded studies if they included participants younger than 18 years, non-human participants, participants with traumatic neck pain (eg, whiplash associated disorder) or participants with specific pathology (eg, cancer). Studies reporting primary complaints other than neck pain, such as post-concussion syndrome, headache and migraine, were excluded. Full-text papers published in the English language were included, and no date limits were applied.

Study selection

Two reviewers (RMJdZ and KC) independently screened titles and abstracts to identify potentially eligible studies. For each identified study, two reviewers independently reviewed the full-text papers. In either stage, a third reviewer (MS) resolved any disagreements on study inclusion as necessary. Inter-rater agreement was calculated using Cohen’s kappa coefficient (k). Where studies were reported in multiple papers, only the paper reporting the most complete analysis of effectiveness was included (ie, reports of subgroups or secondary analyses were discarded).
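As an illustration of the agreement statistic named above, the following is a minimal Python sketch of unweighted Cohen's kappa for two reviewers' screening decisions. The function name `cohens_kappa` and the decision lists are hypothetical and are not taken from the review, whose own analysis was run in R; the weighted kappa reported in the Results is not covered here.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two raters' categorical decisions."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Observed agreement: proportion of records with identical decisions.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of each rater's marginal proportions per category.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1.0 - expected)


# Hypothetical title/abstract screening decisions for ten records.
reviewer_1 = ["include", "exclude", "exclude", "include", "exclude",
              "exclude", "include", "exclude", "exclude", "exclude"]
reviewer_2 = ["include", "exclude", "exclude", "exclude", "exclude",
              "exclude", "include", "exclude", "exclude", "exclude"]

kappa = cohens_kappa(reviewer_1, reviewer_2)
print(f"kappa = {kappa:.3f}")
```

In this made-up example the reviewers agree on nine of ten records, but kappa is lower than 0.9 because some of that agreement is expected by chance given how often each reviewer excludes.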

Data extraction

Two reviewers extracted and recorded data from the included studies using a standardised extraction table agreed on by all authors.

Extracted data comprised: study characteristics (author and year), participant characteristics (sample size, age and sex), type of exercise intervention, duration, frequency and intensity and timing of follow-up assessment. Means and SDs for primary outcome measures at baseline and the follow-up time point closest to the end of the treatment period were extracted. Data were converted where necessary as described in our protocol.22 Where studies reported more than two physical exercise interventions which independently could be included in this NMA, data from all study arms were extracted.

Primary outcomes

The primary outcome measures were pain intensity (eg, Visual Analogue Scale (VAS), Numeric Rating Scale (NRS)) and pain-related disability (eg, Neck Disability Index (NDI), Northwick Park Neck Pain Questionnaire (NPNPQ), Neck Pain and Disability scale (NPAD)), measured at the time point after, and closest to, the end of the treatment.

Categorisation of studies

We identified 10 categories of physical exercise interventions through an iterative process of reviewing relevant RCTs.22 The definitions of these physical exercise interventions are provided in table 1.

Table 1

Definitions of physical exercise interventions and non-exercise comparators

Risk of bias assessment

Methodological quality of the included studies was assessed independently by two reviewers using the PEDro (Physiotherapy Evidence Database) scale, which is a validated tool24 for assessing the risk of bias of RCTs and commonly used to assess physiotherapeutic interventions. A third reviewer was available to resolve disagreements as required. Inter-rater agreement was calculated using Cohen’s kappa coefficient (k). All studies were included in the systematic review; however, only those with a summary score ≥5 (cut-off for moderate-quality evidence) were included in the statistical analysis.25

GRADE assessment

The Grading of Recommendations Assessment, Development and Evaluation (GRADE) approach was used to assess study limitations, indirectness and transitivity, statistical heterogeneity and inconsistency, imprecision and publication bias.26 Considering the certainty of evidence across paired comparisons might differ, the GRADE approach was used for each pairwise comparison, in line with the GRADE framework which has been adapted for NMA.27 28 As all studies included in this NMA were RCTs, consistent with the Cochrane Handbook we assumed the highest quality rating for each comparison.29 Based on the assessment of each of the above-mentioned factors, the certainty of evidence was downgraded to moderate, low or very low quality where appropriate.

Methods of analysis

The characteristics of the included trials (details of the physical exercise intervention, outcomes) were summarised and tabulated.

The available evidence was summarised visually to depict the comparative relationships between the different exercise interventions and no treatment. A network diagram was created for each outcome (pain, disability), in which nodes represent a class of intervention (as categorised in the inclusion criteria). Pairwise comparisons of two interventions are shown as edges interconnecting the nodes, where the thickness of the edge lines represents the weight of pairwise comparisons. The number of studies contributing to each pairwise comparison is also shown on each edge.

NMA assumptions

The presence of inconsistency was assessed from indirect and direct evidence using node-splitting.30 31 Disagreement was tested statistically and reported using z-scores and p values.

To assess transitivity, we made the assumption that all physical exercise interventions included in the NMA are in-principle jointly randomisable and we inspected the distribution of measures that could potentially modify effects (age and sex) across the comparisons in the network.

Model heterogeneity was quantified using I². Forest plots were visually examined to identify any obvious inconsistency between direct and indirect treatment effects (loop consistency); further, Cochran’s Q statistic was calculated (Qtotal) and decomposed to describe within-design heterogeneity (Qwithin) and between-design inconsistency (Qbetween). Comparison-adjusted funnel plots were used to visually inspect and assess for small study effects, and assess potential publication bias.32
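To make the heterogeneity measures concrete, the sketch below computes the Q statistic and I² for a single pairwise comparison using inverse-variance weights. It is a simplified illustration only: the network-level Qtotal, Qwithin and Qbetween described above generalise this idea across designs and are not reproduced here. The function name and the three effect sizes are hypothetical.

```python
def q_and_i_squared(effects, variances):
    """Q statistic and I-squared (%) for one pairwise meta-analysis."""
    if len(effects) != len(variances) or len(effects) < 2:
        raise ValueError("need at least two studies with matching variances")
    # Inverse-variance weights and the fixed-effect pooled estimate.
    weights = [1.0 / v for v in variances]
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    # Q: weighted sum of squared deviations from the pooled estimate.
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    # I-squared: share of total variation beyond chance, floored at zero.
    i_squared = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return q, df, i_squared


# Hypothetical standardised effect sizes and variances from three trials.
study_effects = [-1.0, -0.5, -1.5]
study_variances = [0.04, 0.04, 0.04]

q_stat, q_df, i2 = q_and_i_squared(study_effects, study_variances)
print(f"Q = {q_stat:.2f} on {q_df} df, I2 = {i2:.0f}%")
```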

Statistical models

Two frequentist NMAs were conducted: one for pain intensity and one for pain-related disability. Pairwise effect sizes were calculated by including all evidence available in the network.33 Effect measures for treatments that have not been compared in a pairwise RCT were compared indirectly by contrasting effect sizes of comparisons with a common comparator.15 34 35 Because previous systematic reviews of exercise for neck pain have shown varying effects, a random effects (DerSimonian and Laird 36) model was used to generate pooled standardised effect sizes. Corrected effect sizes (Hedges’ g) were used to allow for the inclusion of smaller studies. While Cohen’s d and Hedges’ g are similar, we used Hedges’ g as it has better performance over Cohen’s d with inclusion of small samples.37 Network forest plots, interval plots, league tables and P-scores were used to present the ranking of mixed (direct and indirect) effect sizes and 95% CIs for all combinations of treatments in the network. For a frequentist analysis, the P-score may be interpreted in a comparable way to the SUCRA.38 Contribution matrices were used to demonstrate the influence of individual comparisons, and the influence of direct and indirect evidence on the overall summary of effects.
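The two building blocks described above, the small-sample corrected effect size and the indirect comparison through a common comparator, can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the authors' code (the analysis used the netmeta R package). The function names and all input numbers are hypothetical; the variance formula is the commonly used large-sample approximation, and the indirect estimate assumes the two direct estimates come from independent trials.

```python
import math


def hedges_g(mean_1, sd_1, n_1, mean_2, sd_2, n_2):
    """Hedges' g (group 1 minus group 2) and its approximate variance."""
    df = n_1 + n_2 - 2
    pooled_sd = math.sqrt(((n_1 - 1) * sd_1 ** 2 + (n_2 - 1) * sd_2 ** 2) / df)
    d = (mean_1 - mean_2) / pooled_sd              # Cohen's d
    correction = 1.0 - 3.0 / (4.0 * df - 1.0)      # small-sample correction factor
    var_d = (n_1 + n_2) / (n_1 * n_2) + d ** 2 / (2.0 * (n_1 + n_2))
    return correction * d, correction ** 2 * var_d


def indirect_comparison(g_ac, var_ac, g_bc, var_bc):
    """Indirect estimate of A versus B via common comparator C."""
    estimate = g_ac - g_bc
    variance = var_ac + var_bc                     # variances add for independent estimates
    half_width = 1.96 * math.sqrt(variance)
    return estimate, variance, (estimate - half_width, estimate + half_width)


# Hypothetical trial: exercise (mean pain 3.0) versus no treatment (mean pain 5.0).
g, var_g = hedges_g(3.0, 2.0, 20, 5.0, 2.0, 20)

# Hypothetical direct estimates of exercise A and exercise B versus no treatment.
est_ab, var_ab, ci_ab = indirect_comparison(-1.2, 0.05, -0.8, 0.07)
print(f"g = {g:.3f}, indirect A vs B = {est_ab:.2f} "
      f"(95% CI {ci_ab[0]:.2f} to {ci_ab[1]:.2f})")
```

Because the variances add, an indirect estimate is always less precise than either of the direct estimates it is built from, which is one reason comparisons supported only by indirect evidence tend to have wide CIs.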

Statistical package R39 was used for all statistical analyses. The netmeta R-package (V.1.2.0, https://cran.r-project.org/web/packages/netmeta/netmeta.pdf) was used to conduct the NMA. The netmeta package function forest.netmeta was used to create a visual network of nodes and connections.

Patient and public involvement

There was no patient or public involvement in this review.

Results

Study selection

The search strategy returned 6549 records, and the flow of studies through the review is presented in figure 1. Records were excluded based on the included participants (not chronic non-specific neck pain), the intervention or comparator (not a physical exercise intervention) or the study design (not an RCT). Ineligible studies were excluded for not reporting pain intensity or neck disability outcomes, not reporting follow-up assessment for either pain intensity or neck disability, or for reporting physical exercise interventions that were delivered in addition to other treatments (such as manual therapy or electrophysical agents). Forty RCTs were included in this network meta-analysis. The inter-rater agreement for the title and abstract screening was κ=0.992 (for chance-corrected, weighted kappa κw=0.916), and for the full-text screening κ=0.886 (κw=0.772).

Figure 1

Flow of studies through the review. RCT, randomised controlled trial.

Outcomes of several trials were reported in multiple articles, which we identified by authors, study dates, number of participants, demographics, and baseline characteristics. For the purpose of this NMA, only one of these articles was cited; however, relevant data from all publications of the same trial were combined to extract the most complete data. This was the case for the following publications: Falla et al (2008) was cited for three articles,40–42 Lansinger et al (2007) was cited for two articles,43 44 and Ylinen et al (2003) was cited for seven articles.45–51

Risk of bias assessment

Inter-rater agreement for the risk of bias assessment was κ=0.964 (κw=0.919). Overall, the risk of bias within individual studies assessed using the PEDro scale ranged between 5/11 and 9/11 (table 2). One of the included studies52 was found to be of poor methodological quality (PEDro score=4) and was therefore not included in the statistical part of the network meta-analysis.29 Due to the nature of included interventions being exercise and clinician-delivered, most studies were unable to blind participants. One study53 indicated subjects were unaware of the different intervention groups within the trial, however, as subjects were aware they received an exercise intervention, this was marked as high risk consistent with the assessment of other studies.

Table 2

Risk of bias assessment using the Physiotherapy Evidence Database (PEDro) scale

Study characteristics

Study characteristics for all 40 included studies are presented in table 3. Ambiguity around the sample (possible inclusion of traumatic neck pain when only ‘non-specific neck pain’ was reported) was resolved through contacting the original authors. Studies including a mixed sample (participants with non-traumatic and traumatic neck pain) were included if the proportion of traumatic neck pain was ≤25%. Details on participant allocation into treatment arms are provided in online supplemental files B and C. Pain intensity was reported in 38 studies (97%); 22 studies assessed pain intensity using VAS and 16 used NRS. Disability was reported in 29 studies (72%), assessed by NDI (n=24), NPNPQ (n=3) and NPAD (n=2).

Table 3

Overview of included studies (n=40)

The following sections (ie, presentation of network structure, summary of network geometry, synthesis of results) are reported separately for each of the two networks (ie, pain intensity and pain-related disability). Table 4 provides a summary of study characteristics, heterogeneity and inconsistency, as well as treatment rankings.

Table 4

Summary of study characteristics, heterogeneity and inconsistency, and treatment rankings

Pain intensity

Presentation of network structure

Thirty-eight studies were included in the NMA for pain intensity (figure 2A), including a total of 3151 participants with chronic non-specific neck pain. The NMA demonstrated the following ranking of P-scores for the interventions: strengthening and motor control (p=0.705), proprioceptive (p=0.704), strengthening and stretching (p=0.686), motor control (p=0.664), yoga/Pilates/Tai Chi/Qigong (p=0.627), stretching (p=0.600), strengthening (p=0.592), range of motion (p=0.485), prescribed physical activity (p=0.379), balance (p=0.282) and multimodal (p=0.186) exercise training. The P-score indicates the likelihood that an intervention is more effective than the other interventions in the network.38
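For readers unfamiliar with P-scores, the following sketch shows how such a ranking can be derived from pairwise network estimates: for each intervention, the one-sided probability that it is better than each competitor is computed under a normal approximation and then averaged. The intervention names, effect sizes and standard errors below are hypothetical and chosen only to keep the arithmetic transparent; they are not the review's data. Lower values are treated as better, as for pain intensity.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def p_scores(names, diff, se, smaller_is_better=True):
    """Mean probability that each treatment is better than each competitor.

    diff[i][j] is the estimated effect of treatment i minus treatment j,
    and se[i][j] is the standard error of that difference.
    """
    scores = {}
    for i, name in enumerate(names):
        probabilities = []
        for j in range(len(names)):
            if i == j:
                continue
            z = diff[i][j] / se[i][j]
            if smaller_is_better:
                z = -z  # a more negative difference favours treatment i
            probabilities.append(normal_cdf(z))
        scores[name] = sum(probabilities) / len(probabilities)
    return scores


# Hypothetical effects versus no treatment (negative = less pain).
names = ["exercise A", "exercise B", "no treatment"]
effects = [-1.2, -0.8, 0.0]
diff = [[e_i - e_j for e_j in effects] for e_i in effects]
se = [[0.4] * 3 for _ in range(3)]  # assumed equal SEs; diagonal is unused

scores = p_scores(names, diff, se)
for name, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(f"{name}: {score:.3f}")
```

By construction the P-scores in a network average 0.5, so a value is informative only relative to the other interventions in the same network, and it says nothing about the size or certainty of the underlying effect.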

Figure 2

Network constructed for pain intensity. The number of studies contributing to each comparison is shown as label on each edge. PA, physical activity; ROM, range of motion; YPTCQ, yoga/Pilates/Tai Chi/Qigong.

Summary of network geometry

As illustrated in figure 2, most evidence of the pain intensity network comes from pairwise comparisons that have been reported by several studies, for example, yoga/Pilates/Tai Chi/Qigong exercises versus no treatment and strengthening exercises versus no treatment. It is important to note that pairwise comparisons in the network were not only between different exercise interventions and no treatment, but also between two separate exercise interventions, for example, strengthening versus motor control exercises and yoga/Pilates/Tai Chi/Qigong exercises versus prescribed physical activity. The network plot further demonstrates that several multi-arm studies with more than two exercise interventions were included, indicated by the coloured areas between multiple nodes.

Synthesis of results

A ranked forest plot of the intervention effects demonstrates that, compared with no treatment, combined strengthening and stretching had the largest Hedges’ g effect size (−1.53, 95% CI: −3.47 to 0.41; only indirect evidence, GRADE=very low). Several other interventions demonstrated similar effect sizes: proprioceptive (−1.47, 95% CI: −2.76 to −0.18; only indirect evidence, GRADE=very low), combined strengthening and motor control (−1.44, 95% CI: −2.42 to −0.47; direct evidence: n=4, GRADE=moderate), motor control (−1.32, 95% CI: −1.99 to −0.65; direct evidence: n=3, GRADE=very low to moderate), yoga/Pilates/Tai Chi/Qigong (−1.25, 95% CI: −1.85 to −0.65; direct evidence: n=5, GRADE=low to moderate), stretching (−1.23, 95% CI: −2.23 to −0.24; direct evidence: n=1, GRADE=very low to low) and strengthening (−1.21, 95% CI: −1.63 to −0.78; direct evidence: n=9, GRADE=low to moderate) exercises (figure 3). Motor control, yoga/Pilates/Tai Chi/Qigong and strengthening exercises had the narrowest CIs, providing some certainty in the results for these exercise types. In contrast, the wide CIs of strengthening and stretching, proprioceptive, prescribed physical activity (−0.84, 95% CI: −1.47 to −0.20; direct evidence: n=5, GRADE=very low to low), balance (−0.38, 95% CI: −2.10 to 1.33; direct evidence: n=1, GRADE=very low), multimodal (−0.08, 95% CI: −1.70 to 1.53; direct evidence: n=1, GRADE=very low) and range of movement (−0.98, 95% CI: −2.51 to 0.56; no direct studies, GRADE=very low) exercises indicate greater uncertainty in the effects of these exercise types. The effect sizes for balance and combined exercise approaches were small with wide CIs. All comparative effects for all interventions in the pain intensity network are presented in a league table (table 5).

Figure 3

Forest plot for pain intensity, ranked by treatment effectiveness. PA, physical activity; ROM, range of motion; YPTCQ, yoga/Pilates/Tai Chi/Qigong.

Table 5

League table reporting the comparative effects for all interventions for the pain intensity network

Pain-related disability

Presentation of network structure

The NMA for pain-related disability included 28 studies and a total of 2336 participants with chronic non-specific neck pain (figure 4). The NMA demonstrated the following ranking of P-scores for the interventions: yoga/Pilates/Tai Chi/Qigong (p=0.888), motor control (p=0.739), strengthening (p=0.656), stretching (p=0.622), strengthening and stretching (p=0.602), proprioceptive (p=0.485), strengthening and motor control (p=0.479), range of motion (p=0.488) and multimodal (p=0.286). Notably, prescribed physical activity (p=0.156) was ranked below ‘no treatment’.

Figure 4

Network constructed for pain-related disability. The number of studies contributing to each comparison is shown as label on each edge. PA, physical activity; ROM, range of motion; YPTCQ, yoga/Pilates/Tai Chi/Qigong.

Summary of network geometry

The network geometry for the disability network, as illustrated in figure 4, is very similar to the network for pain intensity since most studies reported both pain intensity and disability outcome measures. While most evidence in the network comes from pairwise comparisons, several multi-arm studies were also included in this network. Notably, similar to the pain intensity network, several pairwise comparisons were included between separate exercise interventions (that were not ‘no treatment’), for example, strengthening versus motor control exercises and yoga/Pilates/Tai Chi/Qigong exercises versus prescribed physical activity.

Synthesis of results

A ranked forest plot of the intervention effects demonstrates that, compared with no treatment, yoga/Pilates/Tai Chi/Qigong exercise training had the largest Hedges’ g effect size (−1.16, 95% CI: −1.75 to −0.57; direct evidence: n=4, GRADE=very low to moderate). Several other interventions demonstrated similar effect sizes: motor control (−0.87, 95% CI: −1.45 to −0.29; direct evidence: n=4, GRADE=very low to low), strengthening (−0.75, 95% CI: −1.28 to −0.22; direct evidence: n=3, GRADE=very low) and stretching (−0.70, 95% CI: −1.52 to −0.11; direct evidence: n=1, GRADE=very low) exercises (figure 5). Yoga/Pilates/Tai Chi/Qigong, motor control and strengthening exercises had the narrowest CIs, providing some certainty in the results for these exercise types. In contrast, the wide CIs of strengthening and stretching (−0.69, 95% CI: −1.83 to 0.46; only indirect evidence, GRADE=very low), proprioceptive (−0.47, 95% CI: −1.62 to 0.67; only indirect evidence, GRADE=very low), combined strengthening and motor control (−0.45, 95% CI: −1.25 to 0.35; direct evidence: n=3, GRADE=very low), range of movement (−0.26, 95% CI: −1.62 to 1.09; only indirect evidence, GRADE=very low), multimodal (−0.01, 95% CI: −1.37 to 1.34; direct evidence: n=1, GRADE=very low) and prescribed physical activity (0.13, 95% CI: −0.53 to 0.78; direct evidence: n=3, GRADE=very low) exercises indicate greater uncertainty in the effects of these exercise types. The effect sizes for balance and combined exercise approaches were small with wide CIs. All comparative effects for all interventions in the disability network are presented in a league table (table 6).

Figure 5

Forest plot for pain-related disability, ranked by treatment effectiveness. PA, physical activity; ROM, range of motion; YPTCQ, yoga/Pilates/Tai Chi/Qigong.

Table 6

League table reporting the comparative effects for all interventions for the pain-related disability network

Of the included studies, one study with a high risk of bias (Duray et al 52) was not included in the NMA. This study investigated the effects of proprioceptive exercises versus ‘no treatment’ in 40 individuals. While both groups significantly improved in terms of both pain intensity and pain-related disability, the proprioceptive exercise group had a significantly greater improvement. This is consistent with the results of the pain intensity NMA. The effect on disability is less consistent in this NMA, reflected by a wide CI that crosses ‘0’.

Exploration for inconsistency

Node splitting for both the pain intensity and disability outcome networks showed no statistically significant inconsistency between direct and indirect estimates (refer to online supplemental file F for the p values for inconsistency). Forest plots (online supplemental files H and I) demonstrate direct, indirect and network estimates for both the pain and disability network.

GRADE assessment

The GRADE approach was used to assess study limitations, indirectness and transitivity, inconsistency, imprecision and publication bias. Table 7 presents a summary of the certainty of evidence for the two networks (all details on the GRADE assessment for all pairwise comparisons are provided in online supplemental files D and E). Reasons for downgrading were imprecision, severe imprecision and risk of bias. Funnel plots were visually inspected and the p value for the Egger test was considered in order to assess the symmetry of both networks (online supplemental files F and G). The funnel plot for the pain intensity network was considerably asymmetrical, and the p value for its Egger test (p=0.0006) indicated statistically significant asymmetry. This means that publication bias may be present in the pain intensity network, which was not the case for the disability network’s funnel plot.

Table 7

Summary of certainty of evidence (GRADE approach) for network meta-analysis in studies examining the effects of physical exercise in individuals with chronic non-specific neck pain, for pain intensity and pain-related disability.

Discussion

This is the first study using NMA to investigate the comparative effectiveness of different physical exercise interventions for people with chronic non-specific neck pain. While none of the interventions was superior, three types of exercise (ie, motor control, yoga/Pilates/Tai Chi/Qigong and strengthening) were found to be most effective in improving both pain intensity and disability, when compared with no treatment. According to the GRADE criteria, the quality of the evidence was very low, indicating that the findings should be interpreted with caution. ‘No treatment’ was least effective followed by combined exercise approaches, range of movement exercise and prescribed physical activity (for disability).

Summary of findings

Two networks were constructed investigating the effects of physical exercise interventions on pain intensity (n=38 RCTs, n=3151 subjects) and on pain-related disability (n=29 RCTs, n=2336 subjects). In terms of improving pain intensity, several exercise interventions showed similar effect sizes. However, the effects of some of these interventions (combined strengthening and stretching, proprioceptive exercise) had wide CIs and were based on only a few studies and small sample sizes. In contrast, motor control, strengthening and yoga/Pilates/Tai Chi/Qigong exercise showed similar large effect sizes with narrower CIs and a greater number of studies and participants. These exercise types would seem to be promising approaches for neck pain. Similar results were found for improving pain-related disability. For other exercise types, including physical activity, range of motion, balance, strengthening and stretching combined, and multimodal exercises, effects were small with wide CIs, indicating inconsistency and uncertainty in the interpretation of findings.

We were not able to address the secondary research question because insufficient data were available within separate nodes to allow for an investigation of the differential effects of different exercise durations, intensities and frequencies. Therefore, we cannot provide guidance on the dose-response relationship between exercise therapy and treatment effectiveness. Clinicians can refer to table 3 for details around the parameters of the exercise interventions.

Clinical implications of findings

The results of the NMA indicate that there is not one superior type of exercise that should be prescribed for chronic neck pain. However, three interventions (ie, motor control, yoga/Pilates/Tai Chi/Qigong and strengthening) showed large effect sizes, narrow CIs and a relatively large number of studies and participant numbers for both pain and disability outcomes. Using the recommended method in the Cochrane handbook29 and SD of 2.0 for VAS pain intensity54 and 13% for the NDI,55 the effect sizes can be re-expressed in the units of measurement of these commonly used tools. Compared with no treatment, motor control exercises would reduce pain intensity by 2.6 points, yoga/Pilates/Tai Chi/Qigong by 2.5 points and strengthening exercises by 2.4 points, all clinically relevant effects (usually defined by a 2.0 point change on a VAS).56 For disability and using the NDI as the example, motor control exercises would improve disability by 11%, yoga/Pilates/Tai Chi/Qigong by 15% and strengthening exercises by 9.8%, close to or exceeding a clinically relevant difference of 10% on the NDI.57 These results indicate that these three exercise approaches are the most promising and could be clinically useful in the management of chronic neck pain. As such, a clinician could choose any of these three physical exercise interventions with the highest confidence of improving patient-reported outcomes. Until further high-quality evidence becomes available, the selection of motor control, yoga/Pilates/Tai Chi/Qigong or strengthening exercises may be aligned with clinician and/or patient personal preference.
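The re-expression described above amounts to multiplying each Hedges' g by the assumed SD of the target instrument. The short sketch below reproduces that arithmetic using the effect sizes and SDs quoted in this article (2.0 points for VAS, 13 percentage points for the NDI); the function and variable names are illustrative only.

```python
def re_express(effect_size, instrument_sd):
    """Convert a standardised effect size into the units of one instrument."""
    return effect_size * instrument_sd


VAS_SD = 2.0    # assumed SD for VAS pain intensity, in points
NDI_SD = 13.0   # assumed SD for the NDI, in percentage points

# Hedges' g versus no treatment, as reported for the three interventions.
pain_g = {
    "motor control": -1.32,
    "yoga/Pilates/Tai Chi/Qigong": -1.25,
    "strengthening": -1.21,
}
disability_g = {
    "motor control": -0.87,
    "yoga/Pilates/Tai Chi/Qigong": -1.16,
    "strengthening": -0.75,
}

pain_points = {name: re_express(g, VAS_SD) for name, g in pain_g.items()}
ndi_percent = {name: re_express(g, NDI_SD) for name, g in disability_g.items()}

for name in pain_g:
    print(f"{name}: {pain_points[name]:.2f} VAS points, "
          f"{ndi_percent[name]:.2f} NDI percentage points")
```

The products (about 2.6, 2.5 and 2.4 VAS points, and about 11, 15 and 9.8 NDI percentage points) match the values quoted in the text; they inherit the uncertainty of the underlying effect sizes and depend on the SDs assumed.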

Although the majority of the exercise interventions showed positive effects on pain intensity, most of these are associated with wide CIs. In particular, the CIs for combined strengthening and stretching, range of motion, balance and multimodal exercises cross ‘0’, indicating the effectiveness of these interventions is uncertain. A similar pattern of results was found for disability outcomes; aside from motor control, yoga/Pilates/Tai Chi/Qigong and strengthening exercises, the effects of all other exercise types had wide CIs that cross ‘0’ indicating uncertainty of treatment effectiveness. Small effect sizes and wide CIs for range of motion, multimodal and prescribed physical activity indicate that these interventions may not be effective in improving pain-related disability. Range of motion, balance and multimodal exercises are commonly recommended in clinical guidelines for the management of neck pain.6 Based on the results of the NMA, these recommendations should be revisited in the development of future clinical guidelines.
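The "crosses ‘0’" criterion used above can be made explicit with a small check. The sketch below, in Python, applies it to the 95% CIs for pain intensity that are reported in the abstract; intervals for nodes not reported there (eg, combined strengthening and stretching) are deliberately left out rather than guessed, and the helper names are illustrative assumptions.

```python
# Flag interventions whose 95% CI for pain intensity (Hedges' g versus
# no treatment) includes zero, and report the CI width as a rough
# indicator of precision. Intervals are those given in the abstract;
# helper names are illustrative only.

PAIN_CI = {
    "motor control": (-1.99, -0.65),
    "yoga/Pilates/Tai Chi/Qigong": (-1.85, -0.65),
    "strengthening": (-1.63, -0.78),
    "range of motion": (-2.51, 0.56),
    "balance": (-2.10, 1.33),
    "multimodal": (-1.70, 1.53),
}


def crosses_zero(lower: float, upper: float) -> bool:
    """True when the interval contains zero (effect direction uncertain)."""
    return lower <= 0.0 <= upper


def ci_width(lower: float, upper: float) -> float:
    """Width of the interval; wider means less precise."""
    return upper - lower


uncertain = [
    name for name, (lower, upper) in PAIN_CI.items()
    if crosses_zero(lower, upper)
]

for name, (lower, upper) in PAIN_CI.items():
    status = "crosses 0" if name in uncertain else "excludes 0"
    print(f"{name}: width {ci_width(lower, upper):.2f}, {status}")
```

With these inputs, the flagged interventions are range of motion, balance and multimodal exercises, and their intervals are also the widest of the six listed.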

It should be noted that some of the evidence was obtained from a small number of pairwise comparisons with small sample sizes. This may have led to some unlikely results. For example, strengthening and stretching as separate interventions were effective in improving pain and disability (compared with ‘no treatment’: nine direct comparisons for strengthening and one for stretching), but when combined (no direct comparison) appeared to lose their effectiveness. Similarly, the finding that prescribed physical activity did not improve disability is surprising, but perhaps also reflects imprecision due to the small number of studies (online supplemental file E). We found that the signs (ie, positive or negative) of some effect estimates point in opposite directions, which is unsurprising in cases where the estimates are close to zero or where precision is low because of a lack of data. In the evaluation of precision, we therefore did not focus on sign alone.

Other network meta-analyses

No other NMA investigating the effectiveness of different interventions for chronic neck pain is available for comparison, though some other groups have published protocols indicating they are conducting similar work.58 59 These NMAs are substantially different to the current NMA, however, as they (1) include any intervention, not only exercise, for neck pain, (2) do not differentiate between the numerous different types of exercise and (3) include not only non-specific neck pain but also traumatic neck pain.

In contrast to the two above-mentioned NMAs, our NMA only included physical exercise interventions. By focussing on these exercise interventions, we were able to differentiate between specific, separate types of physical exercise. This can provide more nuanced information for clinicians who prescribe therapeutic exercise. International guidelines for the treatment of neck pain state that different types of exercise should be incorporated for the best treatment effects.60 These include strengthening and motor control exercises, and the current NMA demonstrates that these types of exercise are expected to exert similar effects.

Including different types of chronic neck pain in one NMA affects the transitivity assumption of NMA, as all interventions should be jointly randomisable across all participants.26 From a clinical point of view, there may be differences in the therapeutic exercise prescription between individuals with chronic non-specific neck pain and those with chronic traumatic neck pain, for example, whiplash-associated disorder.6 Through a preliminary literature search we identified that it was not possible to conduct separate NMAs for non-specific neck pain and traumatic neck pain due to an insufficient number of RCTs for traumatic neck pain. For this reason, we have included only chronic non-specific neck pain, a group that is more common than any other type of neck pain.61 While not all potential effect modifiers were investigated, we compared the descriptive statistics of the participant characteristics across the included studies. Based on this investigation, we made the assumption of joint randomisability.

Two recent NMAs investigated the effectiveness of exercise interventions19 and exercise and education62 in individuals with chronic non-specific low back pain. The findings of the current NMA are similar to these two NMAs, both concluding that several types of exercise are equally effective. Owen and colleagues19 found that Pilates, aerobic, stabilisation/motor control, multimodal, resistance and ‘other’ exercise training were effective in improving pain. Stabilisation and motor control, resistance, water-based, Pilates, yoga, multimodal, aerobic and ‘other’ exercises were most effective in improving disability. While slightly different exercise nodes were identified compared with the two aforementioned NMAs, results of our NMA on neck pain were broadly similar to those conducted in low back pain.

Limitations

An inherent limitation of NMA is deriving indirect evidence from a limited number of direct pairwise comparisons. For the pain intensity NMA, 19 direct pairwise comparisons were available between the 10 nodes of the network, from which indirect evidence for a total of 63 pairwise comparisons was derived. For the pain-related disability NMA, 18 direct comparisons were available between the 10 nodes, from which 59 indirect comparisons were derived. Despite NMA allowing for combining direct and indirect evidence, some of the results may be derived from small-sample studies, further affecting the certainty of evidence. In the current study, this is apparent from the forest plots for both NMAs, demonstrating uncertainty in the point measure as well as wide CIs for many of the exercise interventions. Although these limitations are inevitable, the proportion of indirect evidence and the precision of the effects have been taken into consideration in the GRADE assessment. From this assessment, we determined that the certainty of evidence for both networks is very low. Furthermore, the inconsistency measure was not statistically significant, indicating that evidence from direct and indirect comparisons is not different. While PEDro is commonly used to assess methodological quality, compared with the Cochrane Risk of Bias 2 tool it may underestimate risk of bias of included studies.63

Measures of treatment effectiveness for pain intensity and pain-related disability were extracted from the included studies at the follow-up time point closest to the completion of the treatment period. From the data provided in table 3 it is apparent that treatments of varying durations, ranging from 1 to 12 months, were included in the analyses.

Our grading of the evidence quality as very low is consistent with a recent Cochrane review investigating therapeutic exercise for neck disorders, which found that no high-quality evidence was available.64 Reasons for downgrading the quality of evidence included in our NMA were substantial risk of bias and imprecision. Although imprecision may be caused by variance in treatment outcomes, that is, some patients will respond well to a treatment and others will not, the quality of evidence appears to be primarily limited by small sample sizes, few studies for some comparisons and (for future studies preventable) flaws in study designs. Performing more small-sized RCTs, and systematic reviews and meta-analyses of existing literature, will inevitably show similar low-quality evidence for modest effects of exercise. Therefore, it appears to be more beneficial to conduct well-designed, large trials investigating the comparative effectiveness of different types of physical exercise for people with neck pain.

Compared with the forest plot for pain intensity (figure 3), visual inspection of the forest plot for pain-related disability (figure 5) reveals that the CIs are considerably wider. A possible reason for this is that the tools used for the assessment of pain-related disability (ie, NDI, NPAD and NPNPI) are multifactorial questionnaires that comprise different aspects of disability. Rather than assessing only one aspect, as in the pain intensity NMA, the pain-related disability NMA assesses effectiveness on aspects such as pain interference, physical function, psychological function and social status. While this is inherent to these types of tools, it is worth noting that combining pairwise comparisons with a high level of uncertainty is likely to have led to wide CIs in the NMA.

Although we acknowledge there are differences between yoga, Pilates, Tai Chi and Qigong exercises, for the purpose of this NMA we combined these regimes into one exercise ‘type’ (node). Considering these are all whole-body, non-specific exercise regimes (ie, the exercises do not specifically target cervical function), they did not suit combining with any of the other types of exercise. It was not possible to separate these interventions into four separate nodes due to the low number of studies. Although it is possible that they may have different effects on pain intensity and pain-related disability, the forest plots would suggest otherwise. The 95% CIs were narrow, indicating precision in the outcome measure and consistent results across the exercise regimes within this group.

Lastly, due to a variety of mixed strengthening regimes, we were not able to separate out strengthening exercises. While Gross et al 64 reported upper limb and cervical exercises separately, both interventions also included scapulothoracic and shoulder stabilisation exercises. Since clinically there is often no clear distinction between just cervical exercises and exercises for the cervical, scapulothoracic and shoulder girdle regions, exercises commonly target multiple areas. While this reflects common clinical practice, it does not allow for further investigation of the effectiveness of isolated strengthening exercises.

Conclusion

The findings of this NMA indicate that there is not one superior type of exercise for people with chronic non-specific neck pain. Rather, compared with ‘no treatment’ some exercise types have positive effects on pain intensity and pain-related disability, including motor control, yoga/Pilates/Tai Chi/Qigong and strengthening exercises. Other types of physical exercise were less consistently effective, and some interventions (range of motion, balance and multimodal exercises) were found to be not effective. Comparing all interventions against each other did not identify which types of exercise were superior to others. These novel findings may assist clinicians to choose an appropriate exercise intervention for individuals with chronic non-specific neck pain, while recognising that the certainty of evidence included in the two NMAs is very low.

What is already known, what are the new findings?

  • Exercise therapy is commonly prescribed for people with chronic non-specific neck pain.

  • The most effective exercise type for people with neck pain is not known.

  • Motor control, yoga/Pilates/Tai Chi/Qigong and strengthening exercises were found to have large effects on pain and disability. There was no one form of exercise that was superior to others.

  • Some interventions (range of motion, balance and multimodal exercises) had uncertain or negligible effects.

  • The certainty of evidence for physical exercise interventions to manage chronic non-specific neck pain was graded as very low.

Footnotes

  • Twitter @DrRdeZoete, @MicheleSterlin7

  • Contributors RMJdZ conceived the content, wrote the paper and approved the final version. RMJdZ and NRA conducted the analyses. NRA, JM and KC contributed and approved the final version of the paper. MS conceived the design of the study, contributed to writing the paper and approved the final version of the manuscript.

  • Funding The authors have not declared a specific grant for this research from any funding agency in the public, commercial or not-for-profit sectors.

  • Competing interests None declared.

  • Provenance and peer review Not commissioned; externally peer reviewed.

  • Supplemental material This content has been supplied by the author(s). It has not been vetted by BMJ Publishing Group Limited (BMJ) and may not have been peer-reviewed. Any opinions or recommendations discussed are solely those of the author(s) and are not endorsed by BMJ. BMJ disclaims all liability and responsibility arising from any reliance placed on the content. Where the content includes any translated material, BMJ does not warrant the accuracy and reliability of the translations (including but not limited to local regulations, clinical guidelines, terminology, drug names and drug dosages), and is not responsible for any error and/or omissions arising from translation and adaptation or otherwise.
