Statistical errors are common in many biomedical fields.1–5 We believe the nature and impact of these errors to be great enough in sports science and medicine to warrant special attention.6–14 Poor methodological and statistical practices have led to calls for change in other fields, such as psychology.15–18 We believe that a similar call to action is needed in sports science and medicine. Specifically, we see two pressing needs: (1) to increase collaboration between researchers and statisticians, and (2) to increase statistical training within the exercise science/medicine/physiotherapy (PT) discipline. Our call to action extends the work of those who have previously called for increased statistical collaboration in sports medicine and sports injury research.19–21
Though some academic sports science and medicine studies employ statisticians, such collaborations are the exception rather than the norm. To determine the extent of collaboration, we performed a systematic review of articles published in quartile one sports science journals in 2019 (see online supplementary file 1 for methods and online supplementary file 2 for data). The initial extraction included 8970 articles; of the 400 articles selected at random, 299 were deemed eligible and included in the review (figure 1). We found that only 13.3% (95% CI: 9.5% to 17.2%) of papers had at least one coauthor affiliated with a biostatistics, statistics, data science, data analytics, epidemiology, maths, computer science or economics department (figure 2). It should be noted that we included a broad set of methodological departments because we recognise that individuals from these fields may possess considerable statistical expertise. When we use the term ‘statistician’ in this paper, we broadly include individuals from other methods-focussed disciplines if they have extensive statistical training and experience.
The shortage of statisticians working in the field means that sports science and medicine researchers are often designing studies and running analyses by themselves. Some of these researchers undertake in-depth training in statistics and are well-equipped to handle these tasks. However—as with other applied disciplines—sports science and medicine researchers often lack adequate training in study design and statistics, which can lead to errors.22–24 This is especially problematic as study designs and data sets become more complex.
We are also concerned by a troubling phenomenon in sports science and medicine: scientists in these fields are developing statistical methods and introducing them into the literature without adequate peer review from the statistics community.25–27 Many of these methods are statistically and mathematically flawed.28 29 While advances in statistics have sometimes come from applied disciplines (eg, work on measurement done in education and psychology), those novel statistical methods were presented, critiqued and evaluated in the statistical literature before they were introduced and used in an applied context.
In this commentary, we present two series of case studies that illustrate the importance of effective collaboration between sports science and medicine researchers and statisticians. We then discuss barriers that have hindered such collaboration and recommend steps forward.
Case studies: avoidable statistical errors
Statistical errors can occur during study design, data analysis or reporting. The case studies described below do not provide an exhaustive list of possible errors. Rather, we highlight several instances where an error may have been avoided with more statistical knowledge or greater collaboration with statisticians. Other references provide further examples of common statistical errors in sports science and medicine.6–11
Errors in study design—exercise physiology
A study of 14 active men aimed to establish the reliability of a biomarker test used to measure gastrointestinal (GI) integrity during conditions of heat stress.30 Participants performed two intermittent exertional heat stress tests, and GI integrity was measured with several biomarker tests, including the intestinal fatty-acid-binding protein (I-FABP). The authors reported that the I-FABP test at rest ‘displayed moderate-to-strong relative and acceptable absolute reliability between repetitions.’ However, this conclusion was based solely on finding a significant correlation between the repeat measurements (Pearson correlation coefficient of 0.75; p<0.01).
This case illustrates two issues: (1) an intraclass correlation coefficient (ICC) is a more appropriate measure of reliability, and (2) when we extracted data from figure 3 of that paper to roughly estimate the ICC(2,1), we found a value and 95% CI of 0.72 (0.32 to 0.89). This estimate is too imprecise to draw useful conclusions; reliability may plausibly be anywhere from insufficient to excellent. In this case, the authors failed to perform an a priori sample size calculation, leading to a study that was too small to adequately answer the question of interest.
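To see why a high Pearson correlation need not imply good absolute reliability, here is a minimal simulation sketch (all numbers are invented for illustration, not the study's data). A systematic shift between the two trials leaves r high, because r is insensitive to additive offsets, while it lowers the ICC(2,1), computed here from the standard two-way ANOVA mean squares:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 14                                         # subjects, mirroring the study size
true = rng.normal(100.0, 10.0, n)              # hypothetical latent biomarker level
trial1 = true + rng.normal(0.0, 3.0, n)
trial2 = true + 15.0 + rng.normal(0.0, 3.0, n) # systematic upward shift on retest

def icc_2_1(data):
    """ICC(2,1): two-way random effects, absolute agreement, single measurement."""
    n_sub, k = data.shape
    grand = data.mean()
    ms_r = k * ((data.mean(axis=1) - grand) ** 2).sum() / (n_sub - 1)
    ms_c = n_sub * ((data.mean(axis=0) - grand) ** 2).sum() / (k - 1)
    resid = data - data.mean(axis=1, keepdims=True) - data.mean(axis=0) + grand
    ms_e = (resid ** 2).sum() / ((n_sub - 1) * (k - 1))
    return (ms_r - ms_e) / (ms_r + (k - 1) * ms_e + k * (ms_c - ms_e) / n_sub)

r = np.corrcoef(trial1, trial2)[0, 1]
icc = icc_2_1(np.column_stack([trial1, trial2]))
print(f"Pearson r = {r:.2f}; ICC(2,1) = {icc:.2f}")
```

The Pearson coefficient stays high because every participant moved up by roughly the same amount, whereas the absolute-agreement ICC is penalised by the between-trial shift, which is the behaviour a reliability study usually needs to detect.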
Errors in data analysis—nutrition and endocrinology in exercise
A study of vitamin D levels and menstrual status in 77 college-aged women concluded that ‘Women who did not meet the recommended level of 30 ng/mL of 25(OH)D had almost five times the odds of having menstrual cycle disorders as women who were above the recommended vitamin D level’.31 The study has important implications for women athletes, who frequently experience menstrual cycle irregularities.
A closer inspection of the analysis revealed problems. While a higher proportion of the low vitamin D group (40% of 60) had menstrual disturbances compared with the high vitamin D group (12% of 17), the analysis failed to account for important differences between the groups. The low vitamin D group was 17% heavier than the high vitamin D group (average body mass of 66.7 vs 57.0 kg). Body mass was also strongly related to menstrual disturbances: women with menstrual disturbances had an average body mass of 77.6 kg vs 57.9 kg in women without. Thus, the apparent relationship between low vitamin D and menstrual disturbances may be caused entirely by strong confounding by body mass. The authors should have undertaken a multivariable analysis that accounted for body mass.
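To show how strong confounding can manufacture an exposure-outcome association, here is a simulated sketch (all probabilities are invented; vitamin D is given no effect at all within body-mass strata). The crude OR comes out well above 1, while a Mantel-Haenszel OR stratified on body mass, one simple stand-in for the multivariable adjustment we recommend, is near 1:

```python
import random

random.seed(0)
N = 20000
rows = []
for _ in range(N):
    heavy = random.random() < 0.5                           # higher body mass
    low_vitd = random.random() < (0.7 if heavy else 0.3)    # mass -> exposure
    disturbed = random.random() < (0.4 if heavy else 0.1)   # mass -> outcome only
    rows.append((heavy, low_vitd, disturbed))

def cells(data):
    """2x2 table counts for exposure (low vitamin D) vs outcome (disturbance)."""
    a = sum(1 for _, e, d in data if e and d)
    b = sum(1 for _, e, d in data if e and not d)
    c = sum(1 for _, e, d in data if not e and d)
    dd = sum(1 for _, e, d in data if not e and not d)
    return a, b, c, dd

a, b, c, d = cells(rows)
crude = (a * d) / (b * c)                 # ignores body mass entirely

num = den = 0.0                           # Mantel-Haenszel OR across mass strata
for stratum in (True, False):
    sub = [r for r in rows if r[0] == stratum]
    a, b, c, d = cells(sub)
    num += a * d / len(sub)
    den += b * c / len(sub)
adjusted = num / den

print(f"crude OR = {crude:.2f}; mass-adjusted OR = {adjusted:.2f}")
```

Because body mass drives both low vitamin D status and menstrual disturbance in this toy world, the unadjusted analysis finds a spurious association that vanishes once the comparison is made within body-mass strata.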
Errors in statistical reporting—sports medicine/orthopaedics/rehabilitation
A large study was undertaken to understand factors that predict athlete recovery 2 years after an ACL reconstruction.32 The manuscript reports that: ‘Multivariable regression analyses were constructed to examine which baseline risk factors were independently associated with each outcome variable…primary outcome variables were all treated as continuous’, but the manuscript reports ORs. ORs are typically reported for binary, not continuous, outcomes. This discrepancy caught the eye of an author of the present commentary, and a series of letters to the editor33 34 determined that a highly nuanced, thoughtful and appropriate analysis was performed on the data. However, the modelling approach was poorly described—which makes it difficult to judge the validity of the study and also hampers reproducibility. In this case, the research team included individuals with statistical expertise who were involved in study planning and data analysis; however, these individuals may have been insufficiently involved in drafting the paper.
Case studies: inventing new statistics
Introducing new statistical methods into the literature typically involves several steps: (1) writing down mathematical equations that explicitly formulate the method; (2) establishing the method’s behaviour through mathematical proofs, simulation studies or both; and (3) publishing in a statistics journal or in a general interest journal following peer review by statisticians. Given the technical expertise required, statisticians or mathematicians are integral to the process.
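As a sketch of step (2), the snippet below (our own invented example, not tied to any cited method) estimates a test's Type I error rate by simulation: generate many data sets under the null hypothesis and count how often the test rejects. Here a simple two-sided z-test is checked against its nominal alpha of 0.05:

```python
import random
from statistics import NormalDist, mean, stdev

random.seed(7)
alpha, reps, n = 0.05, 4000, 50
z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # two-sided critical value, ~1.96

rejections = 0
for _ in range(reps):
    # Null world: the true mean is exactly zero, so every rejection is an error
    sample = [random.gauss(0.0, 1.0) for _ in range(n)]
    z = mean(sample) / (stdev(sample) / n ** 0.5)
    rejections += abs(z) > z_crit

rate = rejections / reps
print(f"empirical Type I error at alpha={alpha}: {rate:.3f}")
```

A well-behaved test yields an empirical rejection rate close to the nominal alpha; simulation studies of exactly this form, run over a grid of sample sizes and data-generating assumptions, are how the statistical community vets a proposed method before it is recommended for applied use.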
A classic example is the significance analysis of microarrays (or SAM) statistical technique, which was introduced in 2001.35 SAM arose from a collaboration between the statistician Robert Tibshirani and biologists Virginia Goss Tusher and Gilbert Chu, who were trying to develop better ways to analyse microarray data. The initial paper on SAM was published in PNAS (Proceedings of the National Academy of Sciences of the United States of America); it contained mathematical equations and proofs, and formally compared the performance of SAM with other methods that were popular for analysing microarray data at that time. This is just one example; statistical journals publish numerous papers each year introducing new statistical approaches. Here, we highlight three cases where statistical methods were introduced into the sports science and medicine literature without proper statistical vetting.
Methods for identifying responders and non-responders
Sports science and medicine researchers are interested in identifying ‘responders’ and ‘non-responders’ to exercise interventions. While response heterogeneity has been covered at great length in the applied statistics literature,36 these guidelines have largely been overlooked in sports science and medicine. Authors in these fields have employed a variety of analytical techniques for identifying differential response, including k-means cluster analysis followed by analysis of variance (ANOVA),37 grouping response based on the SE of measurement,38 and, more recently, a novel analytical algorithm.27 However, none of these approaches are statistically or philosophically grounded, and indeed, they have poor statistical properties, such as high Type I error rates.29 39
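To illustrate why such ad hoc responder rules inflate Type I error, here is a minimal simulation (the typical error and threshold are invented for illustration) in a null world where the intervention has exactly the same effect for everyone. A rule that labels anyone whose observed change score exceeds twice the typical error a 'responder' still flags a sizeable fraction of subjects, purely from measurement noise:

```python
import random

random.seed(42)
TE = 5.0          # assumed within-subject 'typical error' of the outcome measure
N = 10000

# Null world: the intervention has exactly the same (zero) true effect for
# everyone, so a pre-post change score is pure measurement noise with
# standard deviation sqrt(2) * TE.
changes = [random.gauss(0.0, TE * 2 ** 0.5) for _ in range(N)]

# A common flawed rule: label anyone whose observed change exceeds 2 x TE
# a 'responder', even though no true response heterogeneity exists here.
false_responders = sum(c > 2 * TE for c in changes) / N
print(f"falsely labelled responders: {false_responders:.1%}")
```

Under these assumptions, roughly 8% of subjects are declared responders (and a similar fraction 'adverse responders') despite every individual having an identical true response, which is exactly the kind of misleading conclusion that a formal evaluation of the rule's statistical properties would expose.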
Modifications of principal components analysis
In another example,26 modifications to the application of principal components analysis (PCA) applied within functional data analysis40 were proposed for the context of exploring high-dimensional kinematic sports science data.41–44 PCA estimates the principal components of a set of curves whose measured values are stored in a data matrix such that each row holds the data for an individual curve. In a recent pre-print on SportRxiv,26 the author argues that estimating the principal components of the data matrix, such that each column holds the measured values of an individual curve, is more appropriate. However, this alternative approach violates the independence assumption of PCA, does not centre the data conventionally, interprets the resulting scores as loadings, and has been criticised by an expert in the field.45 Such modifications to techniques like PCA should be carefully reviewed by the statistics community before being promoted as more appropriate than their conventional application. This is necessary to avoid confusion in the application of PCA by scientists in applied fields.
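For readers unfamiliar with the conventional orientation, the following sketch (simulated curves; all parameters invented) applies PCA the standard way: each row of the data matrix holds one curve, and each column (time point) is centred before decomposition. The first principal component then recovers the true underlying mode of variation:

```python
import numpy as np

rng = np.random.default_rng(0)
n_curves, n_time = 200, 50
t = np.linspace(0.0, 1.0, n_time)
phi = np.sin(2 * np.pi * t)               # assumed true mode of variation
scores = rng.normal(0.0, 1.0, n_curves)   # per-curve score on that mode

# Conventional layout: each ROW is one curve sampled on the common time grid.
X = 5.0 * t + np.outer(scores, phi) + rng.normal(0.0, 0.05, (n_curves, n_time))

Xc = X - X.mean(axis=0)                   # centre each time point (column means)
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
pc1 = Vt[0]                               # first principal component curve

corr = abs(np.corrcoef(pc1, phi)[0, 1])
print(f"|corr(PC1, true mode)| = {corr:.3f}")
```

With curves in the rows and centring over curves at each time point, the estimated component is (up to sign) the generating mode; transposing the matrix changes what is being averaged and decomposed, which is why the modified orientation criticised above does not estimate the same quantity.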
Magnitude-based inference
The case of Magnitude-Based Inference (MBI) is a cautionary tale of what can happen when a novel statistical approach is widely adopted before being vetted. MBI appeared in the sports science literature in 200625 in a way that was highly unusual for a statistical method. The introductory paper contained no mathematical formulas and no mathematical proofs or simulations demonstrating the method’s empirical behaviour. The paper was published as a commentary in the sports science literature, not a methodological journal, and it was not peer-reviewed by statisticians. The method has been criticised by the statistics community (and authors of this paper) for over a decade.28 46–52 In addition to lacking a sound mathematical foundation,28 46 51 the method leads to high Type I error rates that are not transparent25 37 38 42 and frequently leads researchers to reach overly optimistic conclusions.48 52 These critiques of MBI have even garnered negative attention for sports science in the popular media.12–14
In all these cases, effective collaboration with statisticians could have: (1) pointed researchers to existing methods that accomplish the same analytical goals or (2) helped researchers to mathematically formalise new methods and assess their statistical properties. Indeed, theoretical breakthroughs are often inspired by practical needs.
Barriers to collaboration
The numerous barriers to collaboration between statisticians and sports scientists are comparable to those that hinder collaboration between statisticians and many other applied disciplines. Universities and research institutes are often spatially organised by discipline, which offers little opportunity for sports scientists and statisticians to interact.53 Many scientific disciplines employ intermediate methodological specialists to help bridge this gap—for example, psychology has mathematical psychologists and chemistry has instrumental chemists. Unfortunately, methodological specialists are less common in sports science. Sports analytics is a rapidly growing sub-discipline of sports science,19 but most sports analysts are currently employed by professional sports teams and have a narrow focus on performance metrics; sports scientists with a high level of statistical training are lacking in academia.
Sports science and medicine researchers bring subject matter expertise across a range of disciplines, including physiology, biomechanics, nutrition and psychology. However, some of these researchers may only receive limited training in study design and statistics. Like many applied researchers, they may try to learn applied statistics through self-study or from statistical ‘cookbook’ guides,22–24 and may fail to sufficiently appreciate the complexity of statistics and the in-depth expertise that statisticians, or those pursuing a formal statistics education, can bring to the table.54 The problem may be compounded when poor statistical techniques are passed from mentors to students, thus propagating poor practices to the next generation of researchers.55 Finally, a lack of statistical expertise in the journal peer review process means that many papers are published using suboptimal statistical methods;52 this often creates a positive feedback loop as methods are copied from paper to paper. BJSM (British Journal of Sports Medicine) only began having statistical Deputy Editors in 2019.
Though many sports science and medicine researchers would welcome statistical support for their projects, such support is often unavailable. Mathematical statisticians receive greater academic recognition for theoretical than applied work, and thus may be uninterested in collaborating with applied researchers.56 57 Applied statisticians may be interested in collaborating but often require substantial financial support for their time, which may put them out of reach of the budgets of many sports science and medicine projects. Furthermore, difficulties can arise in communication since statisticians are generally not domain experts in sports science or medicine.57–59 This ‘language’ barrier can lead to a misunderstanding of the domain-specific research problem, the data, or the subsequent analyses.
Differences in culture may also be a barrier to collaboration. What counts as genuine knowledge, and what scientific claims are allowable, differs between disciplines.53 Statisticians tend to be cautious when interpreting data; such caution is often justified when evaluating knowledge claims about biology or drug therapy for a serious disease, but it may be overly stringent when applied to the sort of practical questions common in sports science, such as which of two training regimens is more likely to improve performance. Consequently, the sports scientist may not agree with the interpretation provided by the statistician and may therefore reconsider seeking statistical assistance on future projects. Statisticians, in turn, may be reluctant to work with sports scientists if their advice is routinely ignored.
Recommendations
We encourage the sport science/medicine/physiotherapy community to seek more interaction with statisticians, including involving statisticians in conferences, departmental talks and events, and departmental teaching. This greater interaction will help the applied science and clinical community gain a greater awareness that statistics is itself a science—with old techniques discarded and new techniques adopted as data become available—rather than a set of recipes. The engaged statisticians will likely gain a better appreciation of the wealth of interesting data and analytical problems that sports science and medicine has to offer.
The applied science and clinical community should be in the habit of formally involving statisticians in research projects from the planning stages. Early involvement prevents costly study design errors and also allows statisticians’ time to be factored into budgets. Where financially feasible, we encourage university-based programmes to hire full-time applied statisticians for their departments. This will lead to higher quality, more reproducible research and more efficient study designs. Where hiring an applied statistician is not currently possible, an alternative is to hire a sports scientist with in-depth, formal training in statistics (eg, a Master’s degree) and established links with statisticians. This person could act as an intermediate methodologist for other researchers, while also improving statistical education through their teaching. Importantly, both statisticians and sports scientists/clinicians need to recognise that effective collaborations take time and mutual respect to develop.57
Sports science and medicine departments should improve statistical education. This could involve expanding statistical curricula, involving statisticians in teaching, or taking advantage of the wealth of high quality but inexpensive online training programmes available, such as the Johns Hopkins data science programme on Coursera.60 Given the easy access to such online training programmes,19 it is no longer necessary for every institution to design its own statistical curriculum. Rather, online courses can provide didactic training, which can be paired with a local instructor who provides practical, hands-on reinforcement of the content. In particular, online lectures could be supplemented with guided data analysis exercises involving sports science and clinically relevant data sets. We further recommend that such courses focus on conceptual issues rather than mathematical proofs and computation; it is more important for a sports scientist to understand, say, a 95% CI than to calculate one.
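As one concrete exercise of the kind we have in mind, a short simulation (a sketch with invented parameters) can convey what the '95%' in a 95% CI actually means: across repeated samples, roughly 95% of intervals constructed this way cover the true mean.

```python
import random
from statistics import NormalDist, mean, stdev

random.seed(3)
true_mu, sigma, n, reps = 50.0, 10.0, 30, 2000
z = NormalDist().inv_cdf(0.975)            # ~1.96

covered = 0
for _ in range(reps):
    sample = [random.gauss(true_mu, sigma) for _ in range(n)]
    half = z * stdev(sample) / n ** 0.5    # normal-approximation half-width
    covered += abs(mean(sample) - true_mu) <= half

coverage = covered / reps
print(f"coverage of nominal 95% CIs: {coverage:.3f}")
```

Students who run and modify a loop like this (changing n, sigma, or the critical value) tend to internalise that a CI is a statement about the long-run behaviour of the procedure, not a 95% probability statement about any single interval.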
We reiterate previous calls to promote a sports biostatistician specialisation within sports science and sport and exercise medicine/sports physiotherapy, similar to other domain-specific quantitative specialisations, such as psychometrics or geostatistics.19 A sports biostatistician concentration could exist within a larger health/kinesiology/sport science department: a student would take the majority of their coursework in statistics (eg, 50% to 70% of credits),61 with elective coursework in health/sports science/kinesiology (eg, 15% to 25%), and finish with a thesis that could be published in a statistical or methodological journal. These students would be trained not only to use advanced methods in research, but also to make methodological advancements.
Summary
Statistical and methods errors are common in sports science and sports medicine/physiotherapy research.
Collaboration between researchers and statisticians can reduce errors.
Only about 13% of papers published in quartile one sports science journals include a coauthor affiliated with a statistics or other methods-oriented department.
We identify barriers to collaboration: a lack of appreciation among scientists of the importance of statistical expertise, a lack of understanding among statisticians of the value of sport and exercise research, a lack of resources to hire statisticians, a dearth of statisticians available to collaborate, and communication and cultural barriers between fields.
We recommend that sports science and medicine programmes increase formal and informal interactions with statisticians and expand their statistical curricula; we also call for the development of a quantitative specialisation within the field.
Twitter @exphysstudent, @TenanATC
Contributors All authors contributed to the intellectual content and drafting of the manuscript. KLS and DNB conducted the systematic review. All authors have read and approved the final manuscript. ARC, MLB, MST, AJV, ADV, JW, RN, KRL, EJK and NB contributed equally to the manuscript; authorship order for these authors was determined by a random number generator.
Funding Dr Norma Bargary is supported in part by grants from Science Foundation Ireland (Grant No. 12/RC/2289-P2, 16/RC/3918, 12/RC/2275_P2, 18/CRT/6049) and co-funded under the European Regional Development Fund. Andrew Vickers is funded by a P30-CA008748 Cancer Center Support Grant from the National Cancer Institute to Memorial Sloan Kettering Cancer Center.
Disclaimer The opinions or assertions contained herein are the private views of the author(s) and are not to be construed as official or reflecting the views of the Army or the Department of Defense. Any citations of commercial organisations and trade names in this report do not constitute an official Department of the Army endorsement or approval of the products or services of these organisations. No authors have any conflicts of interest to disclose. Approved for public release; distribution is unlimited.
Competing interests None declared.
Patient consent for publication Not required.
Ethics approval The study was approved by the Hospital Clínico San Carlos Institutional Ethics Committee (approval number 20/268-E-BS).
Provenance and peer review Not commissioned; internally peer-reviewed.
Data availability statement Data from the review are available in the supplemental materials of the paper.