Some methodological issues in the design and analysis of cluster randomised trials
  1. Mohammad A Mansournia1,2,
  2. Douglas G Altman3
  1. 1 Department of Epidemiology and Biostatistics, School of Public Health, Tehran University of Medical Sciences, Tehran, Iran
  2. 2 Sports Medicine Research Center, Neuroscience Institute, Tehran University of Medical Sciences, Tehran, Iran
  3. 3 Nuffield Department of Orthopaedics, Rheumatology and Musculoskeletal Sciences, Centre for Statistics in Medicine, University of Oxford, Oxford, UK
  1. Correspondence to Professor Mohammad A Mansournia, Department of Epidemiology and Biostatistics, School of Public Health, Tehran University of Medical Sciences, Tehran, Iran; mansournia_ma{at}

Randomised trials are widely used for assessing the effect of interventions on outcomes, because on average randomisation balances covariates between treatment groups, even if those covariates are unobserved.1 2 In some situations, it is more convenient to assign treatments at random to clusters (eg, clubs, schools or teams) into which individuals fall naturally. This design, which minimises the risk of contamination that would occur if individuals from the same cluster were randomised to different treatment groups, is a cluster randomised controlled trial.3 Several recent BJSM papers have reported cluster randomised trials.4–6 Among these, two studies assessed the effect of movement control exercise programmes on musculoskeletal injury and concussion risk in schoolboy and adult rugby players4 5 and a third assessed the effect of an exercise programme on the prevalence of shoulder problems in elite handball players.6 Here we review some important methodological aspects of cluster randomised controlled trials in the context of these studies. Some of these issues also affect individually randomised trials.

Design effect

The most important methodological aspect of a cluster randomised trial is that the effective sample size is less than the number of recruited individuals, because the responses of individuals within the same cluster are likely to be positively correlated. As a result, the variance of the effect estimate in a cluster randomised trial will be inflated compared with an individually randomised trial with the same sample size. This variance inflation factor, sometimes known as the design effect, increases with both the intracluster correlation coefficient (ie, the proportion of the total variance of the outcome which can be explained by the variation between clusters) and the average cluster size.
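For clusters of equal size, the design effect described above is commonly written as follows. This is the standard textbook form rather than a formula given in this article; the symbols m (cluster size), ρ (intracluster correlation coefficient), and σ_b² and σ_w² (between-cluster and within-cluster variance components of the outcome) are notation assumed here.

```latex
\mathrm{DE} = 1 + (m - 1)\,\rho,
\qquad
\rho = \frac{\sigma_b^2}{\sigma_b^2 + \sigma_w^2},
\qquad
n_{\text{effective}} = \frac{n}{\mathrm{DE}}
```

For example, with m = 21 and ρ = 0.05 the design effect is 2, so 600 recruited players would carry roughly as much information as 300 individually randomised players.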

Although the value of the intracluster correlation coefficient in cluster randomised trials tends to be small (typically <0.1), the resulting design effect can be quite substantial if the clusters are large. The design effect should be accounted for in the sample size calculation; otherwise the study power will be lower than the nominal level (eg, 80%). Likewise, if the analysis is at the individual level, clustering should be adjusted for to avoid spurious precision of the estimates (CIs which are too narrow) and optimistic hypothesis testing (p values which are too small). Sometimes it is sensible to analyse at the cluster level (analysing a summary of the individual responses in each cluster).3
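As a minimal sketch of how clustering enters a sample size calculation, the snippet below inflates a sample size computed for individual randomisation by the usual equal-cluster-size design effect, 1 + (m − 1) × ICC. The function names and the input values (300 players, 21 players per cluster, ICC of 0.05) are hypothetical and are not taken from any of the three trials.

```python
import math


def design_effect(mean_cluster_size: float, icc: float) -> float:
    """Variance inflation factor for equal-sized clusters: 1 + (m - 1) * ICC."""
    if mean_cluster_size < 1 or not 0.0 <= icc <= 1.0:
        raise ValueError("need cluster size >= 1 and 0 <= ICC <= 1")
    return 1.0 + (mean_cluster_size - 1.0) * icc


def inflate_sample_size(n_individual: int, mean_cluster_size: float, icc: float) -> int:
    """Total individuals needed under cluster randomisation.

    n_individual is the total sample size that would suffice under
    individual randomisation (hypothetical input, computed elsewhere).
    """
    inflated = n_individual * design_effect(mean_cluster_size, icc)
    # Round first to guard against floating-point noise, then round up.
    return math.ceil(round(inflated, 6))


if __name__ == "__main__":
    # Hypothetical planning values, for illustration only.
    print(design_effect(21, 0.05))             # design effect of about 2
    print(inflate_sample_size(300, 21, 0.05))  # 600 individuals in total
```

Dividing the inflated total by the expected cluster size gives the number of clusters to recruit. Unequal cluster sizes and expected attrition would call for further inflation that this sketch does not cover.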

Cluster randomised trials are strongest (in terms of power and precision of the estimates) with a large number of small clusters and weakest with a small number of large clusters. As a general rule of thumb, power does not increase much once the number of subjects per cluster exceeds the inverse of the intracluster correlation coefficient.7 Of course, in sports medicine, cluster size is determined by the context, not by the investigators.
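The diminishing return from larger clusters can be seen with a little arithmetic. Under the equal-cluster-size design effect (an assumption of this sketch), a cluster of m individuals contributes an effective sample size of m / (1 + (m − 1) × ICC), which can never exceed 1 / ICC however large the cluster becomes. The function name and the ICC of 0.05 are hypothetical.

```python
def effective_size_per_cluster(cluster_size: int, icc: float) -> float:
    """Effective number of independent observations contributed by one cluster."""
    return cluster_size / (1.0 + (cluster_size - 1) * icc)


if __name__ == "__main__":
    icc = 0.05  # hypothetical value; its inverse is 20
    for m in (5, 10, 20, 40, 200):
        print(f"m = {m:3d}: effective size per cluster = "
              f"{effective_size_per_cluster(m, icc):.1f} (ceiling {1 / icc:.0f})")
```

With these values, going from 20 to 200 players per cluster raises the effective size per cluster from about 10 to about 18, whereas adding a second cluster of 20 would double it. This is why adding clusters is usually more efficient than enlarging them.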

Hislop and colleagues4 do not mention the design effect in relation to either the sample size calculation or the data analysis. Attwood et al,5 however, mention that the sample size calculation was adjusted for a cluster coefficient (between-cluster coefficient of variation, which is closely related to the intracluster correlation coefficient) of 0.26. They also used generalised estimating equations along with a Pearson χ2 scaling parameter to account for the design effect. Andersson et al 6 say that their sample size calculation was adjusted for an intracluster correlation of 0.1. They used generalised estimating equations too, but it is unclear whether this analysis accounted for both sources of clustering: team (the unit of randomisation) and repeated outcome measurements over months. None of the three papers clarified the level of analysis (cluster vs individual) or reported the intracluster correlation coefficient or k statistic for the primary outcomes, which can help in planning future cluster randomised trials.3

Adjustment for baseline imbalance

Randomised trials are subject to random (chance) confounding,8 because randomisation does not prevent confounding by baseline risk factors; it only makes the confounding random.1 2 The risk of random confounding is generally greater in cluster randomised trials than in individually randomised trials, as the number of clusters is often small.3 Both Hislop et al 4 and Attwood et al 5 presented the baseline characteristics of participants in table 1, but they did not adjust for non-negligible imbalances in some baseline characteristics in their analysis.9 Confusingly, Hislop et al 4 reported p values for baseline comparisons (some of which were significant at the 5% level), a common misuse of p values in the randomised trial literature. Andersson et al 6 used p values for comparisons at baseline as well as a forward selection procedure for confounding adjustment,10 both of which are likely to bias the treatment effect estimate.1 10

Attrition bias

Attrition due to loss-to-follow-up can occur in both individual and cluster randomised trials, leading to reduced sample size and consequent loss of precision. More importantly, attrition will result in biased estimates if attrition is associated with both treatment and risk factors of the outcome.11 The modified flow diagram in the CONSORT extension for cluster randomised trials should show attrition in both individuals and clusters.3

In cluster randomised trials, there is the specific possibility that not all clusters will provide data at the end of the trial. An extreme example is the study of Attwood et al,5 who reported that 19 out of 41 clubs in the intervention group and 21 out of 40 clubs in the control group did not provide outcome data. Hislop et al 4 reported more moderate dropout: 3 out of 20 schools in the intervention group and 6 out of 20 schools in the control group. Andersson et al 6 reported moderate exclusions: 67 out of 331 players in the intervention group and 59 out of 329 players in the control group, but there is no statement about whether any clusters (teams) were lost.
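Expressed as percentages, the attrition figures quoted above are easier to compare across arms and studies. The short script below only does this arithmetic on the counts given in the text; the dictionary layout and function name are illustrative choices.

```python
# Counts as quoted in the text: (units without outcome data, units randomised).
# Units are clubs (Attwood), schools (Hislop) and players (Andersson).
attrition = {
    "Attwood et al (clubs)": {"intervention": (19, 41), "control": (21, 40)},
    "Hislop et al (schools)": {"intervention": (3, 20), "control": (6, 20)},
    "Andersson et al (players)": {"intervention": (67, 331), "control": (59, 329)},
}


def percent_lost(lost: int, randomised: int) -> float:
    """Percentage of randomised units that did not contribute outcome data."""
    return 100.0 * lost / randomised


if __name__ == "__main__":
    for study, arms in attrition.items():
        for arm, (lost, randomised) in arms.items():
            print(f"{study:26s} {arm:12s} {lost:3d}/{randomised:<3d} "
                  f"= {percent_lost(lost, randomised):.1f}%")
```

This gives roughly 46% and 53% of clubs for Attwood et al, 15% and 30% of schools for Hislop et al, and 20% and 18% of players for Andersson et al.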

Non-adherence to treatment and intention-to-treat analysis

In practice, both individual and cluster randomised trials can suffer from imperfect adherence to the assigned treatment. Three approaches to the analysis of randomised trials with non-adherence are intention-to-treat (ITT) analysis (analyse according to assigned treatments regardless of adherence), as-treated analysis (analyse treatments actually received), and per-protocol analysis (analyse received treatments among those who adhered to the assigned treatment).11 The ITT analysis is generally preferred to the other two as it provides a valid statistical test of the null hypothesis of no treatment effect. On the other hand, ITT analysis will often underestimate the effect of a treatment among those who adhere to it.

The true ITT effect can be estimated only in the absence of censoring and other forms of missing outcome, which is not the case for these studies. In the presence of a non-negligible amount of missing outcomes, a multiple regression model can be used to adjust for baseline risk factors affecting censoring in an available case analysis. Other statistical adjustment methods for selection bias due to missing outcomes include multiple imputation and inverse probability weighting.1 12 13 Hislop et al 4 and Attwood et al 5 accounted for attrition in the sample size calculations, but they (and also Andersson et al 6) failed to adjust appropriately for selection bias due to attrition in their analysis. Attwood et al 5 mention using the last observation carried forward, a method well known to produce biased results.14
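To illustrate the idea behind inverse probability weighting, the toy example below uses entirely hypothetical data for one trial arm: 200 players in two baseline strata, with dropout more common in the higher-risk stratum. Each observed player is weighted by the inverse of the observed fraction in their stratum. The sketch assumes that outcomes are missing at random given the stratum and ignores clustering; a real analysis would model the probability of being observed from several baseline covariates.

```python
from collections import defaultdict

# Each record: (baseline stratum, outcome observed?, outcome or None).
# Hypothetical data: stratum A has true risk 0.4 but only half observed;
# stratum B has true risk 0.1 and is fully observed. True overall risk is 0.25.
records = (
    [("A", True, 1)] * 20 + [("A", True, 0)] * 30 + [("A", False, None)] * 50
    + [("B", True, 1)] * 10 + [("B", True, 0)] * 90
)


def complete_case_risk(rows):
    """Risk among those with observed outcomes, ignoring why others are missing."""
    observed = [y for _, seen, y in rows if seen]
    return sum(observed) / len(observed)


def ipw_risk(rows):
    """Risk with each observed row weighted by 1 / P(observed | stratum)."""
    n_total, n_seen = defaultdict(int), defaultdict(int)
    for stratum, seen, _ in rows:
        n_total[stratum] += 1
        n_seen[stratum] += int(seen)
    numerator = denominator = 0.0
    for stratum, seen, y in rows:
        if seen:
            weight = n_total[stratum] / n_seen[stratum]
            numerator += weight * y
            denominator += weight
    return numerator / denominator


if __name__ == "__main__":
    print(complete_case_risk(records))  # 0.20: biased towards the low-risk stratum
    print(ipw_risk(records))            # 0.25: recovers the risk in all 200 players
```

In this constructed example the complete-case estimate understates the risk because the players who dropped out came from the higher-risk stratum, and the weighting corrects for that.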

Hislop et al 4 reported non-adherence to exercise programmes as 69% and 83% in the intervention and control groups, respectively. Attwood et al 5 reported the median (IQR across clusters) non-adherence percentage to exercise programmes for different weeks as 85% (62%–90%) and 83% (65%–92%) in the intervention and control groups, respectively. Andersson et al 6 reported a compliance rate of 53% in the intervention group. While Hislop et al 4 and Attwood et al 5 reported the results of both ITT analysis and per-protocol analysis, Andersson et al 6 did not clarify whether their analysis was indeed ITT.

Moreover, an ITT analysis need not be conservative in the context of a trial comparing two active interventions such as two exercise programmes contrasted in the Hislop et al 4 and Attwood et al 5 papers.11 More advanced methods such as instrumental variable analysis and G-estimation can be used to adjust for imperfect adherence to the assigned treatment.15
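The simplest instrumental variable estimator, often called the Wald or ratio estimator, gives a feel for how such methods work: the ITT effect on the risk difference scale is divided by the between-arm difference in the proportion actually receiving the intervention. The sketch below uses hypothetical numbers and relies on strong assumptions that are not checked here (randomisation affects the outcome only through the treatment received, and nobody does the opposite of their assignment). It also ignores clustering, and it does not carry over directly to a comparison of two active programmes.

```python
def wald_iv_estimate(itt_risk_difference: float,
                     received_in_intervention: float,
                     received_in_control: float) -> float:
    """Ratio (Wald) instrumental variable estimate of the effect among adherers.

    itt_risk_difference: ITT risk difference (intervention minus control).
    received_in_*: proportion in each arm actually receiving the intervention.
    """
    difference_in_receipt = received_in_intervention - received_in_control
    if difference_in_receipt <= 0:
        raise ValueError("assignment must increase receipt of the intervention")
    return itt_risk_difference / difference_in_receipt


if __name__ == "__main__":
    # Hypothetical: ITT risk difference of -0.05, 50% of the intervention arm
    # and none of the control arm received the programme.
    print(wald_iv_estimate(-0.05, 0.50, 0.0))  # about -0.10
```

The estimate is larger in magnitude than the ITT effect whenever adherence is imperfect, which is the sense in which ITT analysis tends to underestimate the effect among those who adhere.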

Interpretation of the results

The effect sizes contrasting intervention and control groups should be presented for all pre-specified outcomes along with the corresponding 95% CIs and the results should be interpreted in the light of these estimates and not solely on whether p<0.050.16 It has been suggested that the results of randomised trials should be interpreted in relation to clinically important (eg, a relative risk of 0.5) and null values of the effect measure, as definitely important (CI is wholly below the important value), possibly important (CI is below the null value and includes the important value), not important (CI is below the null value and above the important value), inconclusive (CI includes both important and null values) and negative result (CI is above the important value and includes the null value).16 The difficulty here is how to define the important value of the effect measure, which will depend on the context.17 Of note, the same clinically important value should be used in the sample size calculation.1
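The classification described above can be written as a small decision rule. The sketch below assumes a ratio measure for which values below the null are beneficial, and uses the example important value of 0.5 from the text; the function name, the handling of boundary cases and the example intervals are illustrative assumptions. Intervals lying wholly above the null are not covered by the scheme and are labelled as such.

```python
def classify_ci(lower: float, upper: float,
                important: float = 0.5, null: float = 1.0) -> str:
    """Classify a CI for a ratio measure where smaller values are better."""
    if lower > upper:
        raise ValueError("lower limit must not exceed upper limit")
    if important >= null:
        raise ValueError("important value must lie below the null value")

    if upper < important:
        return "definitely important"  # CI wholly below the important value

    includes_important = lower <= important <= upper
    includes_null = lower <= null <= upper

    if includes_important and includes_null:
        return "inconclusive"
    if includes_important and upper < null:
        return "possibly important"
    if lower > important and upper < null:
        return "not important"
    if lower > important and includes_null:
        return "negative"
    return "not covered by the scheme"  # eg, CI wholly above the null


if __name__ == "__main__":
    # Hypothetical 95% CIs for a relative risk.
    for ci in [(0.30, 0.45), (0.40, 0.90), (0.60, 0.90), (0.40, 1.20), (0.60, 1.20)]:
        print(ci, "->", classify_ci(*ci))
```

The outcome of the rule depends entirely on the chosen important value, which is why that value should be justified in advance and match the one used in the sample size calculation.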

Both Hislop et al 4 and Attwood et al 5 present rate ratios with 90% CIs and interpret the results using a form of magnitude-based inference,18 which has been criticised for greatly inflating the type I error rate of hypothesis testing in small-to-moderate sample sizes.19 We note that CIs represent only uncertainty from random variation, and take no account of systematic errors such as attrition bias, ascertainment bias and performance bias. Unfortunately, none of the three papers discussed the possible sources of bias.

Moreover, 90% CIs are narrower than 95% CIs, and in general, the level of confidence should be decided in the trial protocol before examining the data. Hislop et al 4 state ‘However, clear effects favouring the intervention programme were noted for head/neck injuries (incidence RR=0.72, 0.51 to 1.01), upper limb injuries (burden RR=0.66, 0.40 to 1.10) and concussion (incidence RR=0.71, 0.48 to 1.05)’. Labelling these results as ‘clear effects’ looks problematic, given that neither clustering nor attrition was adjusted for in the analysis and all 90% CIs included the null value.
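To see how much narrower a 90% CI is, an interval for a ratio can be rescaled to another confidence level on the log scale. The sketch below assumes the original interval was a normal-approximation (Wald) interval that is symmetric on the log scale, which may not match how the published intervals were computed, so the output is an approximation only. It does nothing to correct for unadjusted clustering or attrition. The function name is hypothetical.

```python
import math
from statistics import NormalDist


def rescale_ratio_ci(lower: float, upper: float,
                     from_level: float = 0.90, to_level: float = 0.95):
    """Rescale a CI for a ratio measure from one confidence level to another."""
    if not 0 < lower <= upper:
        raise ValueError("limits must be positive with lower <= upper")
    # Two-sided standard normal quantiles for the two confidence levels.
    z_from = NormalDist().inv_cdf(0.5 + from_level / 2)
    z_to = NormalDist().inv_cdf(0.5 + to_level / 2)
    centre = (math.log(lower) + math.log(upper)) / 2
    half_width = (math.log(upper) - math.log(lower)) / 2
    new_half_width = half_width * z_to / z_from
    return math.exp(centre - new_half_width), math.exp(centre + new_half_width)


if __name__ == "__main__":
    # 90% CI quoted in the text for head/neck injury incidence.
    low, high = rescale_ratio_ci(0.51, 1.01)
    print(f"approximate 95% CI: {low:.2f} to {high:.2f}")
```

Under these assumptions, the quoted 90% CI of 0.51 to 1.01 corresponds to a 95% CI of approximately 0.48 to 1.08, which extends further beyond the null value.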

Reporting cluster randomised trials

Authors should report cluster randomised trials according to the Consolidated Standards of Reporting Trials (CONSORT) extension to cluster randomised trials.3 There are many items specific to cluster randomised trials in this extension, some of which were mentioned above. Of the three studies, Andersson et al 6 and Attwood et al 5 stated that their studies were in accordance with the CONSORT extension to cluster randomised trials, though this educational review suggests that their adherence was not perfect.

We have addressed only a few main issues in this educational review. There are many other ways in which researchers need to take care in how they design, analyse and interpret cluster randomised trials.3 We hope that future cluster randomised trials in the BJSM will address the critical issues mentioned in this educational review.



  • Competing interests None declared.

  • Provenance and peer review Commissioned; internally peer reviewed.
