Sharing data–taming the beast: barriers to meta-analyses of individual patient data (IPD) and solutions
Marienke van Middelkoop¹, Stefan Lohmander², Sita M A Bierma-Zeinstra¹

¹ Department of General Practice, Erasmus MC Medical University Center, Rotterdam, The Netherlands
² Faculty of Medicine, Department of Clinical Sciences Lund, Orthopaedics, Lunds Universitet, Lund, Sweden

Correspondence to Dr Marienke van Middelkoop, General Practice, Erasmus Medical Center, Rotterdam 3015 GD, The Netherlands; m.vanmiddelkoop{at}

Systematic reviews aim to collate all empirical evidence that fits prespecified eligibility criteria to answer a specific research question. Some systematic reviews undertake a meta-analysis to statistically combine study results and provide a more precise estimate of treatment effects. These meta-analyses are commonly based on aggregate data, extracted from publications or obtained from the original authors of these papers,1 but aggregated data limit the options for in-depth analysis.

Meta-analysis of individual patient data (IPD) was introduced in the 1990s to address this limitation.2 A great advantage of IPD analysis is that it allows investigators to examine whether an intervention is differentially effective for different types of participants. Quantifying interaction effects using IPD increases the power and generalisability of results and is considered the gold standard for subgroup analyses.3
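To make the idea of quantifying an interaction effect from IPD concrete, the following is a minimal sketch of one possible approach, a two-stage analysis: the treatment-by-modifier interaction is estimated within each trial and the trial estimates are then pooled with inverse-variance weights. This is not the method of any paper cited here. The data are simulated, the effect modifier is assumed to be binary, and all function names are hypothetical.

```python
import random
from statistics import mean, variance


def trial_interaction(rows):
    """Estimate the treatment-by-modifier interaction within one trial.

    rows: list of (treated, modifier, outcome) with treated/modifier in {0, 1}.
    Returns (estimate, variance), where the estimate is the treatment effect
    in participants with the modifier minus the effect in those without it.
    """
    cells = {
        (t, m): [y for tt, mm, y in rows if tt == t and mm == m]
        for t in (0, 1)
        for m in (0, 1)
    }

    def effect(m):
        return mean(cells[(1, m)]) - mean(cells[(0, m)])

    estimate = effect(1) - effect(0)
    # Variance of a sum/difference of four independent cell means.
    var = sum(variance(ys) / len(ys) for ys in cells.values())
    return estimate, var


def pool_fixed_effect(estimates):
    """Inverse-variance (fixed-effect) pooling of (estimate, variance) pairs."""
    weights = [1.0 / v for _, v in estimates]
    pooled = sum(w * e for w, (e, _) in zip(weights, estimates)) / sum(weights)
    return pooled, 1.0 / sum(weights)


def simulate_trial(rng, n, interaction):
    """Simulated participants (assumption: purely illustrative data)."""
    rows = []
    for _ in range(n):
        t, m = rng.randint(0, 1), rng.randint(0, 1)
        y = 1.0 * t + 0.5 * m + interaction * t * m + rng.gauss(0, 1)
        rows.append((t, m, y))
    return rows


rng = random.Random(0)
trials = [simulate_trial(rng, 400, interaction=2.0) for _ in range(5)]
pooled, pooled_var = pool_fixed_effect([trial_interaction(t) for t in trials])
print(f"pooled interaction: {pooled:.2f} (SE {pooled_var ** 0.5:.2f})")
```

Estimating the interaction within each trial before pooling keeps between-trial differences from being mistaken for participant-level effect modification. A real analysis would more likely use regression models and consider random effects.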

We congratulate Hayden and colleagues (linked paper BJSM 2020, bjsports-2019-101205), who undertook an IPD meta-analysis to identify subgroups of patients who particularly benefit from exercise therapy for low back pain.4 From a total pool of 56 eligible trials, the authors retrieved data from 27 studies (48% of trials; 3514 participants). This reflects the major challenge when performing IPD analyses: retrieving raw data from multiple trials.

Let us drill down on some specific barriers to successfully obtaining the ‘D’ in IPD: the data (figure 1). The first step, contacting original authors, can be challenging, and some authors who are reached do not respond.5 6 Moreover, authors often refuse to participate, or report that the data are no longer available at their institute or that they do not hold the intellectual property rights to the original data. The latter is particularly common when pharmaceutical companies own the data.5–7

Figure 1

Steps to be taken in individual patient data approach and potential barriers and challenges.

Even when authors are willing to share data, the taming of the beast is just beginning. Data regulations and governing laws can be quite complex. As they differ markedly between countries, and most IPD meta-analyses include data from many countries, it can be extremely difficult to obtain a data delivery agreement signed by all parties. In some countries such as Canada and Australia, new analyses (ie, the IPD meta-analysis) require new ethics committee approval.

Once researchers have retrieved data, they face further challenges. Hayden and colleagues tried to verify the data and were able to replicate the main study outcomes for fewer than 50% of the trials. The authors were able to analyse 18 potential effect modifiers, but many of these had up to 75% missing data (eg, ‘history of low back pain’). Potential effect modifiers are often not measured in the original studies or are inconsistently available. This greatly limits researchers’ ability to analyse treatment effect modification and is a frequent problem in IPD analyses.
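The extent of missing data per potential effect modifier can be checked before any analysis is planned. The sketch below assumes pooled records stored as dictionaries, with a variable counted as missing when it is absent or recorded as `None`. The records, field names and the `missing_fraction` helper are hypothetical and only illustrate the check.

```python
def missing_fraction(records, fields):
    """Share of participants with no recorded value for each field."""
    total = len(records)
    return {
        field: sum(r.get(field) is None for r in records) / total
        for field in fields
    }


# Hypothetical pooled records from two trials. Trial B never collected
# 'history_lbp', so all of its participants are missing that variable.
pooled_records = [
    {"trial": "A", "age": 41, "history_lbp": True},
    {"trial": "A", "age": 55, "history_lbp": False},
    {"trial": "B", "age": 38},
    {"trial": "B", "age": None},
]

report = missing_fraction(pooled_records, ["age", "history_lbp"])
for field, share in report.items():
    print(f"{field}: {share:.0%} missing")
```

A variable that is missing for a whole trial, as in the second trial here, cannot be recovered by imputation within that trial, which is one reason inconsistent measurement across studies is so limiting.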

System-wide efforts to overcome some of these barriers

What is being done to stimulate open access to research data? Funding agencies increasingly require that data be shared after a project is finished, and in 2016 the Council of the European Union encouraged member countries to transition to an open science system. In the Netherlands, ZonMW (The Netherlands Organisation for Health Research and Development) supports FAIR (findable, accessible, interoperable, reusable) research data, requiring researchers to share their data to contribute to future research.8 The US National Institutes of Health has a similar requirement.

Scientific journals increasingly encourage open access to data. BMJ has adopted different policies on data sharing, depending on the specific journal. These range from encouraging to requiring authors to make generated data publicly available on publication. Nevertheless, data sharing is still voluntary in many of these journals. BMJ also accepts DataCite DOIs, which make it possible to cite publicly available datasets in reference lists.

Successful examples of studies with open data by design include the Osteoarthritis (OA) Initiative and the Cohort Hip and Cohort Knee (CHECK) study, two multicentre, longitudinal, prospective observational studies of knee and hip OA.9 10 All collected individual data are openly accessible or available on application, resulting in more than 600 publications, attesting to the power of open data sharing.

Multiple initiatives have now been launched to build collaborations for the development of IPD banks to facilitate data accessibility, such as the OA Trial Bank for clinical OA research and the World COACH study for morphological data of the hip.11 These initiatives provide consistent and transparent rules of collaboration and agreements for sustainability and accessible sharing of data.

Sport Data Valley in the Netherlands aims to connect sport with science, government and companies. All sport science and medicine related data can be uploaded into the repository. Access rights are adjustable per dataset, and data ownership remains with the principal investigator. Such repositories make data accessible to a broad audience.12 Other examples of controlled access repositories of data include the and the Yale University Open Data Access (YODA) project.

Although many challenges remain, the time investment and barriers facing IPD analyses should decrease in the coming years. We expect that data will be richer and more consistent given the disease-specific reporting standards and core data sets launched in many fields of research. However, researchers and pharmaceutical companies must be willing to share data so that the potential value of IPD analyses is realised. Maximising the use of individual participant data collected in clinical studies also fulfils the ethical contract with the study participants (table 1).

Table 1

Critical elements to improve future individual patient data analyses



  • Twitter @mvanmiddelkoop

  • Contributors Work was initially conceived by MvM, SL, SMAB-Z. Substantial contributions to the conception of the work were made by all authors. Drafting and revising the work critically was done by all authors. Final approval of the version published was given by all authors.

  • Funding The authors have not declared a specific grant for this research from any funding agency in the public, commercial or not-for-profit sectors.

  • Competing interests None declared.

  • Patient consent for publication Not required.

  • Provenance and peer review Commissioned; externally peer reviewed.

Linked Articles

  • Review
    Jill A Hayden, Maria N Wilson, Samuel Stewart, Jennifer L Cartwright, Andrea O Smith, Richard D Riley, Maurits van Tulder, Tom Bendix, Francesca Cecchi, Leonardo O P Costa, Ninna Dufour, Manuela L Ferreira, Nadine E Foster, Maruti R Gudavalli, Jan Hartvigsen, Pieter Helmhout, Jan Kool, George A Koumantakis, Francisco M Kovacs, Tiina Kuukkanen, Audrey Long, Luciana G Macedo, Luciana A C Machado, Chris G Maher, Wolf Mehling, Giovanni Morone, Tom Peterson, Eva Rasmussen-Barr, Cormac G Ryan, Tuulikki Sjögren, Rob Smeets, J Bart Staal, Monica Unsgaard-Tøndel, Henry Wajswelner, Ella W Yeung