| Outcomes | No of studies | Comparisons: average estimate/assumed risk in exercise (PT±HEP) group | Comparisons: average estimate/assumed risk in wait-and-see group | Effect estimate (95% CI), p value | Certainty (GRADE) | Heterogeneity |
|---|---|---|---|---|---|---|
| GROC, short-term follow-up | 4 | 120 of 201 (59.7%) participants reported satisfactory recovery | 102 of 197 (51.8%) participants reported satisfactory recovery | RR 1.18 (1.00 to 1.40), p=0.05 | ⊕⊕ Low§§¶¶ | χ²=2.25, I²=0%, p=0.52 |
| GROC, mid-term follow-up | 3 | 146 of 181 (80.7%) participants reported satisfactory recovery | 135 of 177 (76.3%) participants reported satisfactory recovery | RR 1.06 (0.95 to 1.17), p=0.29 | ⊕⊕ Low§§¶¶ | χ²=0.11, I²=0%, p=0.94 |
| GROC, long-term follow-up | 3 | 162 of 185 (87.6%) participants reported satisfactory recovery | 152 of 181 (84.0%) participants reported satisfactory recovery | RR 1.05 (0.97 to 1.13), p=0.26 | ⊕⊕ Low§§¶¶ | χ²=0.70, I²=0%, p=0.70 |
| Mean change in pain rating, short-term follow-up | 4 | Mean pain score was 25.8 (range 18.5 to 42.0) in 197 participants | Mean pain score was 33.4 (range 27.3 to 63.0) in 193 participants | SMD −0.33†‡ (−0.60 to −0.05), p=0.02 | ⊕ Very low§§*¶¶*** | χ²=5.44, I²=45%, p=0.14 |
| Mean change in pain rating, mid-term follow-up | 3 | Mean pain score was 15.8 (range 13.3 to 21.0) in 176 participants | Mean pain score was 18.7 (range 17.4 to 19.8) in 174 participants | SMD −0.13‡ (−0.34 to 0.09), p=0.24 | ⊕ Very low§§¶¶*** | χ²=1.86, I²=0%, p=0.39 |
| Mean change in pain rating, long-term follow-up | 3 | Mean pain score was 7.8 (range 6.6 to 9.0) in 180 participants | Mean pain score was 13.3 (range 13.0 to 13.9) in 175 participants | SMD −0.30†‡ (−0.51 to −0.09), p=0.005 | ⊕ Very low§§¶¶*** | χ²=0.30, I²=0%, p=0.86 |
| PFGS ratio, short-term follow-up | 3 | Mean ratio was 67.6 (range 55.0 to 80.8) in 176 participants | Mean ratio was 66.7 (range 63.0 to 72.1) in 173 participants | MD 1.37§ (−8.10 to 10.84), p=0.78 | ⊕ Very low§§*¶¶*** | χ²=5.38, I²=63%, p=0.07 |
| PFGS ratio, mid-term follow-up | 3 | Mean ratio was 84.1 (range 76.0 to 96.3) in 175 participants | Mean ratio was 79.4 (range 74.0 to 86.5) in 170 participants | MD 5.03§ (−0.69 to 10.74), p=0.08 | ⊕ Very low§§*¶¶*** | χ²=1.66, I²=0%, p=0.44 |
| PFGS ratio, long-term follow-up | 3 | Mean ratio was 94.7 (range 90.0 to 100.9) in 179 participants | Mean ratio was 91.3 (range 85.9 to 96.5) in 172 participants | MD 3.77§ (−1.01 to 8.55), p=0.12 | ⊕ Very low§§*¶¶*** | χ²=1.75, I²=0%, p=0.42 |
| MGS ratio, short-term follow-up | 2 | Mean ratio was 90.6 (range 80.8 to 92.9) in 118 participants | Mean ratio was 88.5 (range 88.1 to 89.9) in 114 participants | MD 2.72¶ (−3.00 to 8.44), p=0.35 | ⊕⊕ Low§§¶¶ | χ²=0.91, I²=0%, p=0.34 |
| MGS ratio, mid-term follow-up | 2 | Mean ratio was 98.8 (range 98.7 to 99.0) in 116 participants | Mean ratio was 97.7 (range 96.5 to 99.0) in 112 participants | MD 1.32¶ (−3.34 to 5.98), p=0.58 | ⊕⊕ Low§§¶¶ | χ²=0.21, I²=0%, p=0.65 |
| MGS ratio, long-term follow-up | 2 | Mean ratio was 102.4 (range 102.0 to 102.8) in 116 participants | Mean ratio was 101.9 (range 100.0 to 104.4) in 110 participants | MD 0.32¶ (−4.38 to 5.02), p=0.89 | ⊕⊕ Low§§¶¶ | χ²=1.13, I²=12%, p=0.29 |
| Elbow disability, short-term follow-up | 5 | Mean score was 28.8 (range 17.1 to 45.2) in 234 participants | Mean score was 35.3 (range 18.7 to 53.6) in 233 participants | SMD −0.31** (−0.64 to 0.02), p=0.07 | ⊕ Very low§§*¶¶*** | χ²=12.20, I²=67%, p=0.02 |
| Elbow disability, mid-term follow-up | 3 | Mean score was 20.4 (range 12.8 to 26.5) in 176 participants | Mean score was 23.7 (range 13.5 to 32.8) in 172 participants | SMD −0.11†† (−0.32 to 0.10), p=0.30 | ⊕ Very low§§¶¶*** | χ²=0.47, I²=0%, p=0.79 |
| Elbow disability, long-term follow-up | 3 | Mean score was 11.5 (range 8.9 to 12.9) in 180 participants | Mean score was 18.2 (range 12.0 to 24.6) in 173 participants | SMD −0.27††† (−0.47 to −0.06), p=0.01 | ⊕ Very low§§¶¶*** | χ²=0.75, I²=0%, p=0.69 |

Data for one study (44) were requested from the authors and provided as means and SDs.

*Inconsistent results between included studies.

†In favour of exercise (PT±HEP) compared with WSP.

‡Pooled weighted baseline pain was 56.3; the clinically significant difference was a change of 16.9 points on the pain scale (0–100). Pooled weighted SDs at short-term, mid-term and long-term follow-up were 24.7, 21.9 and 17.5 points, respectively.

§Pooled weighted baseline PFGS ratio was 38.3; the clinically significant difference was a change of 7.7 points.

¶Pooled weighted baseline MGS ratio was 74.1; the clinically significant difference was a change of 14.8 points.

**Pooled weighted disability score at baseline was 55.5; the clinically significant difference was a change of 37%, or 20.5 points. The pooled weighted SD at short-term follow-up was 22.6 points.

††Pooled weighted disability score at baseline was 64.0; the clinically significant difference was a change of 37%, or 23.7 points. Pooled weighted SDs at mid-term and long-term follow-up were 23.5 and 22.8 points, respectively.

‡‡Clinically significant differences.

§§High risk of bias.

¶¶Indirect comparison in one or more included studies.

***Substantial differences at baseline pain scores among included studies.
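
The pooled SDs in footnote ‡ make it possible to re-express the pain SMDs on the 0–100 scale and compare them with the 16.9-point threshold. The sketch below is an illustration only: it assumes the common approximation of multiplying each SMD by the pooled weighted SD for that follow-up. The table does not state that the authors did this, and the names `PAIN`, `PAIN_THRESHOLD` and `smd_to_points` are hypothetical.

```python
# Pooled SMDs (table) and pooled weighted SDs (footnote ‡) for pain rating.
# Assumption: SMD x pooled SD approximates the difference in scale points.
PAIN = {
    "short-term": (-0.33, 24.7),
    "mid-term": (-0.13, 21.9),
    "long-term": (-0.30, 17.5),
}
PAIN_THRESHOLD = 16.9  # clinically significant change on the 0-100 pain scale


def smd_to_points(smd: float, pooled_sd: float) -> float:
    """Approximate a standardised mean difference in original scale points."""
    return smd * pooled_sd


for follow_up, (smd, sd) in PAIN.items():
    points = smd_to_points(smd, sd)
    reaches = abs(points) >= PAIN_THRESHOLD
    print(f"{follow_up}: {points:.1f} points, reaches threshold: {reaches}")
```

Under this approximation, the pooled pain differences are roughly 8, 3 and 5 points at short-term, mid-term and long-term follow-up, all below the 16.9-point threshold. The same conversion could be applied to the elbow disability SMDs using the SDs in footnotes ** and ††.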

GRADE, Grading of Recommendations Assessment, Development and Evaluation; GROC, Global Rating of Change Scale; HEP, home exercise programme; MD, mean difference; MGS, maximum grip strength; PFGS, Pain Free Grip Strength; PRFQ, Pain Free Function Questionnaire; PRTEE, Patient-rated Tennis Elbow Evaluation; PT, physiotherapy; RR, risk ratio; SMD, standardised mean difference; WSP, wait-and-see policy.
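
The I² values in the Heterogeneity column can be cross-checked against the reported χ² statistics. The sketch below assumes the standard Higgins formula, I² = max(0, (Q − df)/Q) × 100, with Q taken as the reported χ² and df as the number of studies minus one. That each study contributes exactly one comparison is an assumption, and the function name `i_squared` and the `ROWS` list are illustrative.

```python
def i_squared(q: float, n_studies: int) -> float:
    """Higgins' I-squared (%) from Cochran's Q (the chi-squared statistic)."""
    df = n_studies - 1  # assumption: one comparison per included study
    if q <= 0 or df <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100.0


# (outcome, reported chi-squared, number of studies, reported I-squared in %)
ROWS = [
    ("GROC, short-term", 2.25, 4, 0),
    ("Pain, short-term", 5.44, 4, 45),
    ("PFGS, short-term", 5.38, 3, 63),
    ("MGS, long-term", 1.13, 2, 12),
    ("Elbow disability, short-term", 12.20, 5, 67),
]

for label, q, k, reported in ROWS:
    print(f"{label}: computed {i_squared(q, k):.0f}%, reported {reported}%")
```

For these five rows, the computed values round to the reported I² figures, which suggests the χ² values and study counts in the table are internally consistent.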