Identifying the ‘incredible’! Part 1: assessing the risk of bias in outcomes included in systematic reviews
  1. Fionn Büttner1,
  2. Marinus Winters2,
  3. Eamonn Delahunt1,3,
  4. Roy Elbers4,
  5. Carolina B Lura2,
  6. Karim M Khan5,
  7. Adam Weir6,7,8,
  8. Clare L Ardern9,10,11
  1. School of Public Health, Physiotherapy and Sports Science, University College Dublin, Dublin, Ireland
  2. Center for General Practice at Aalborg University, Aalborg, Denmark
  3. Institute for Sport & Health, University College Dublin, Dublin, Ireland
  4. Population Health Sciences, Bristol Medical School, University of Bristol, Bristol, UK
  5. Department of Family Practice, The University of British Columbia, Vancouver, British Columbia, Canada
  6. Sports Groin Pain Centre, Aspetar Orthopaedic and Sports Medicine Hospital, Doha, Qatar
  7. Department of Orthopaedics, Erasmus MC University Medical Center for Groin Injuries, Rotterdam, The Netherlands
  8. Sport Medicine and Exercise Clinic Haarlem (SBK), Haarlem, The Netherlands
  9. Sport and Exercise Medicine Research Centre, La Trobe University, Bundoora, Victoria, Australia
  10. Department of Medicine and Health Sciences, Division of Physiotherapy, Linköping University, Linköping, Sweden
  11. Division of Physiotherapy, Karolinska Institute, Stockholm, Sweden

Correspondence to Fionn Büttner, School of Public Health, Physiotherapy and Sports Science, University College Dublin, Dublin 4, Ireland; fionn.cleirigh-buttner{at}


Systematic reviews fulfil a vital role in modern medicine.1 However, the results of systematic reviews are only as valid as the studies they include.2 Pooling flawed, or biased, results from different studies can compromise the credibility of systematic review findings. Bias is a systematic deviation from the truth in the results of a research study that can manifest due to limitations in study design, conduct, or analysis.3

The results of sport and exercise medicine research, like results in other fields, are vulnerable to bias.4 It is important that systematic review authors assess for bias in a way that enables a judgement about whether a review outcome is at risk of bias due to methodological limitations in included studies. This two-part education primer focuses on how systematic review authors can perform and interpret risk of bias assessments to avoid misleading systematic review conclusions. In this editorial, we introduce the concept of risk of bias, and the principles of assessing risk of bias.

Bias: the basics

Different biases have effects that vary in direction and magnitude.3 5 It is challenging to determine precisely whether, and by how much, bias causes a study’s findings to overestimate or underestimate the truth. In fact, bias does not always distort study findings, and one can never be certain that bias is present when a study has methodological limitations. However, methodological limitations in study design, conduct, or analysis are consistently associated with inflated research findings.5 Because of this uncertainty, study outcomes are considered to be at risk of bias rather than ‘biased’.

Studies with ‘some concerns’ or at ‘high’ risk of bias in design, conduct, analysis, or reporting are at greater risk of inflated findings compared with studies at ‘low’ risk of bias, negatively affecting the probability that study findings accurately reflect reality.5 6 Assessing the risk of bias of study outcomes that are included in a systematic review allows readers to interpret the credibility of review findings.

Do not confuse risk of bias with study quality

Risk of bias is a clearly defined term and refers to the perceived risk that the results of a research study deviate from the truth.3 Unfortunately, risk of bias is often conflated with study quality, although the two are distinct constructs (table 1).

Table 1

Key terms relating to risk of bias and critical appraisal

Study quality is a vague and multidimensional term that loosely indicates how closely the conduct of a research study meets the highest possible methodological standards.3 Quality refers to several areas of study methodology, with each area having different implications for how one should interpret a study’s methodological rigour (table 1).7 A risk of bias assessment should not be replaced by an assessment of study quality.8 When critically appraising a research study, assessors should prioritise how closely a study’s findings may approximate the truth (ie, risk of bias) over how well the study was conducted given the capabilities of the study investigators (ie, quality) (table 1).

Use domain-based risk of bias assessment tools instead of quality scales and checklists

A plethora of assessment tools are available to critically appraise a research study.9 However, not all of these tools are appropriate to assess risk of bias, which can leave researchers unsure which tool is most suitable. Broadly, three types of tools exist to assist researchers and readers in critically appraising a study: (1) quality scales, (2) quality checklists, and (3) domain-based risk of bias tools.3 We explain why domain-based risk of bias tools are preferred over quality scales and checklists when assessing risk of bias.

Quality scales and quality checklists vary substantially in content, complexity, and rating criteria, and often include items that are not related to bias.10 Quality scales assign numeric values to scale items and combine information about several methodological features in a study to produce a summary score.9 For example, the PEDro scale includes items related to internal validity (eg, random allocation) and reporting (eg, clear description of participant eligibility criteria). A lack of random allocation undermines the credibility of a study’s findings (ie, there is a ‘high risk’ of selection bias). Unclear eligibility criteria challenge a study’s reproducibility and make it difficult to judge to whom the study findings are applicable (ie, external validity). In the presence of good reporting but poor methodological conduct, such a quality assessment may overestimate the credibility of study findings.11 12

Quality checklists contain items that relate to study quality without assigning numeric values or producing a summary score.3 9 For example, the Quality Assessment Tool for Observational Cohort and Cross-Sectional Studies contains items relating to reporting, sample size, statistical power, precision, external validity (applicability), and internal validity (bias), each requiring a ‘yes’, ‘no’ or ‘other’ response. Such quality checklist items do not solely address risk of bias and are not intended to be summed to produce one numeric score. However, review authors frequently modify quality checklists (by assigning arbitrary numeric values) to generate summary scores and summarise study quality. Summary scores do not inform the reader which biases might be present.12

Using quality scales and quality checklists is discouraged because different scales tend to generate conflicting conclusions when applied to the same studies.11 Quality scales are also prone to producing misleading conclusions when cut-off thresholds are used to arbitrarily categorise study quality as ‘high’, ‘moderate’, or ‘low’.13
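The two problems described above (summary scores that hide which biases are present, and arbitrary cut-off thresholds) can be illustrated with a minimal Python sketch. Everything in it is hypothetical: the six item names, the two studies, and both sets of cut-offs are invented for illustration and do not reproduce the items or scoring rules of the PEDro scale or any other published tool.

```python
from dataclasses import dataclass


@dataclass
class ScaleRating:
    """Ratings of one study on a hypothetical six-item quality scale."""
    study: str
    items: dict[str, bool]  # item name -> whether the criterion was met

    def summary_score(self) -> int:
        # One point per satisfied item, as a typical quality scale would do.
        return sum(self.items.values())


def categorise(score: int, high_cutoff: int, moderate_cutoff: int) -> str:
    """Map a summary score to a quality label using arbitrary cut-offs."""
    if score >= high_cutoff:
        return "high"
    if score >= moderate_cutoff:
        return "moderate"
    return "low"


# Study A reports well but was not randomised (hypothetical data).
study_a = ScaleRating("Study A", {
    "random_allocation": False,
    "concealed_allocation": False,
    "eligibility_criteria_reported": True,
    "between_group_statistics_reported": True,
    "point_estimates_reported": True,
    "baseline_data_reported": True,
})

# Study B was randomised with concealment but reports less (hypothetical data).
study_b = ScaleRating("Study B", {
    "random_allocation": True,
    "concealed_allocation": True,
    "eligibility_criteria_reported": True,
    "between_group_statistics_reported": True,
    "point_estimates_reported": False,
    "baseline_data_reported": False,
})

for rating in (study_a, study_b):
    score = rating.summary_score()
    print(
        rating.study,
        "score:", score,
        "| cut-offs 4/2:", categorise(score, high_cutoff=4, moderate_cutoff=2),
        "| cut-offs 5/3:", categorise(score, high_cutoff=5, moderate_cutoff=3),
    )
```

Both hypothetical studies receive the same summary score even though only one of them was randomised, and the same score is labelled ‘high’ or ‘moderate’ quality depending solely on which cut-offs the review authors happen to choose.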

Domain-based risk of bias assessment tools are currently the commonly accepted and preferred method to judge the credibility of study findings.3 Domain-based tools evaluate study limitations in specific domains that represent different biases (eg, bias arising from the randomisation process).3 5 Domain-based tools overcome many shortcomings of quality scales, as they evaluate individual components that relate to study design, conduct, and analysis rather than producing a single summary score.14 Several study design-specific, domain-based risk of bias assessment tools have been developed.15–20 The Cochrane Risk of Bias tool 2 is a rigorously developed, domain-based risk of bias assessment tool that assesses the limitations of randomised controlled trials across five bias domains.21 Each bias domain is supported by strong empirical evidence that limitations in that domain may distort study findings.5
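As a rough illustration of what a domain-based judgement looks like in practice, the Python sketch below records one judgement per domain and keeps the domain-level detail visible. The five domain labels follow the Cochrane Risk of Bias tool 2; the `overall_judgement` rule (take the worst domain) is a deliberate simplification assumed here for illustration, because the tool itself leaves room for assessor judgement when several domains raise concerns.

```python
from enum import IntEnum


class Judgement(IntEnum):
    """Ordered so that a larger value means greater risk of bias."""
    LOW = 0
    SOME_CONCERNS = 1
    HIGH = 2


# The five bias domains assessed for randomised controlled trials.
ROB2_DOMAINS = (
    "randomisation process",
    "deviations from intended interventions",
    "missing outcome data",
    "measurement of the outcome",
    "selection of the reported result",
)


def overall_judgement(domains: dict[str, Judgement]) -> Judgement:
    """Simplified overall judgement: the worst judgement across domains.

    This is an assumption made for the sketch, not the tool's full algorithm.
    """
    missing = set(ROB2_DOMAINS) - set(domains)
    if missing:
        raise ValueError(f"No judgement recorded for: {sorted(missing)}")
    return max(domains.values())


# Hypothetical assessment of one outcome in one trial.
example = {domain: Judgement.LOW for domain in ROB2_DOMAINS}
example["missing outcome data"] = Judgement.SOME_CONCERNS

print(overall_judgement(example).name)
for domain, judgement in example.items():
    print(f"  {domain}: {judgement.name}")
```

Unlike a summary score, the per-domain output tells the reader where the concern arises (here, missing outcome data), which is the information needed to interpret the credibility of a finding.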

Risk of bias assessment method

Risk of bias assessments should be performed for each outcome of interest rather than as one general assessment for each study.22 If a study includes multiple outcomes and time-points, separate risk of bias assessments should be undertaken for each included outcome (table 2). Bias can impact review outcomes differently,5 22 underscoring the need for separate risk of bias assessments when multiple outcomes are reported (table 2). Cochrane recommends two approaches to risk of bias assessments.3 Both approaches involve a domain-based risk of bias assessment of separate outcomes, assessing:

  1. Individual review outcomes, in each individual study, based on individual risk of bias domains.

  2. Individual review outcomes, across included studies (ie, meta-analysis level), based on individual risk of bias domains.

Table 2

Concepts in the assessment of risk of bias

In part 2, we demonstrate both risk of bias assessment methods.
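The difference between the two approaches listed above is largely one of how the same outcome-level judgements are organised. The Python sketch below assumes a simple record per study, outcome, time-point, and domain; the trial names, outcomes, and judgements are invented for illustration and are not drawn from any review.

```python
from collections import defaultdict
from typing import NamedTuple


class Assessment(NamedTuple):
    study: str
    outcome: str
    timepoint: str
    domain: str
    judgement: str  # "low", "some concerns", or "high"


# Hypothetical records: the same trial can differ in risk of bias by outcome.
assessments = [
    Assessment("Trial X", "pain", "12 weeks", "measurement of the outcome", "high"),
    Assessment("Trial X", "re-injury", "12 weeks", "measurement of the outcome", "low"),
    Assessment("Trial Y", "pain", "12 weeks", "measurement of the outcome", "some concerns"),
]


def by_study_outcome(records):
    """Approach 1: each outcome within each individual study, per domain."""
    grouped = defaultdict(dict)
    for r in records:
        grouped[(r.study, r.outcome, r.timepoint)][r.domain] = r.judgement
    return dict(grouped)


def by_outcome_across_studies(records):
    """Approach 2: each outcome across studies (meta-analysis level), per domain."""
    grouped = defaultdict(lambda: defaultdict(list))
    for r in records:
        grouped[(r.outcome, r.timepoint)][r.domain].append((r.study, r.judgement))
    return {key: dict(domains) for key, domains in grouped.items()}


print(by_study_outcome(assessments))
print(by_outcome_across_studies(assessments))
```

Keying every judgement by outcome and time-point, rather than by study alone, is what allows one trial to be at high risk of bias for a self-reported outcome and at low risk for another outcome measured in the same participants.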


In this editorial, we introduced risk of bias as the perceived risk that the results of a research study may underestimate or overestimate the truth. Systematic review authors should perform a domain-based risk of bias assessment that reflects risk of bias instead of assessing study quality. If a research study reports multiple outcome measures, separate risk of bias assessments should be performed for each outcome measure.

In part 2 of this risk of bias education primer, we:

  1. Evaluate the prevalence and methods of risk of bias assessments in systematic reviews published in BJSM.

  2. Perform a risk of bias assessment on a sample of RCTs in a systematic review.

  3. Illustrate the impact that different critical assessment tools have on risk of bias assessment findings, and ultimately, systematic review findings.

  4. Provide recommendations for systematic review authors undertaking risk of bias assessments.



  • Twitter @peanutbuttner, @marinuswinters, @EamonnDelahunt, @clare_ardern

  • Contributors MW and CLA conceived the original idea. FCB, MW, ED, and CLA developed the original idea. FCB composed the initial manuscript draft. MW, ED, and CLA provided comments on and contributed towards the writing of the initial manuscript draft. FCB, MW, ED, RE, CBL, KMK, AW, and CLA provided comments on and contributed towards the writing and editing of the final manuscript draft.

  • Funding The authors have not declared a specific grant for this research from any funding agency in the public, commercial or not-for-profit sectors.

  • Competing interests Karim M Khan is BJSM Editor-in-Chief. Adam Weir is a BJSM Deputy Editor. Eamonn Delahunt and Marinus Winters are BJSM Senior Associate Editors. Clare L Ardern was a BJSM Deputy Editor until July 2018. Roy Elbers is a member of the ROB2 Development Group.

  • Patient consent for publication Not required.

  • Provenance and peer review Not commissioned; externally peer reviewed.
