Improving collaboration between statisticians and sports scientists

Daniel J Cleather, Reader, St Mary's University Twickenham, UK/Institute for Globally Distributed Open Research and Education (IGDORE)
Will Hopkins, Professor, Institute for Health and Sport, Victoria University, Melbourne, Australia
Eric J Drinkwater, Senior Lecturer, Centre for Sport Research, School of Exercise and Nutrition Science, Deakin University, Geelong, Australia
Petr Stastny, Docent, Charles University, Faculty of Physical Education and Sport, Prague, Czech Republic
Janet Aisbett, Professor, Meraglim Holdings Corporation, West Palm Beach, Florida, USA

December 29, 2020


We welcome the call of Sainani et al.[1] for greater involvement of statisticians with researchers in sports science. However, effective collaboration requires an understanding of context, and sports science research is often exploratory, concerned with small samples, or predicated on the need to make practical decisions of relatively low risk. We argue for a collaborative approach that recognises the special needs of sports scientists and the end-users of their research.

Where should statistical methods be published?

Sainani et al.[1] suggest that statistical methods should be vetted in statistics or general-interest journals before appearing in discipline-specific journals, implying that statistical methods can be evaluated independently of their context. While the mathematical core of statistics may be invariant across disciplines, connecting these abstractions to the real world requires theoretical and practical assumptions that often depend on discipline-specific knowledge. Beyond that, wide philosophical divides separate statisticians of frequentist and Bayesian persuasion, and pragmatic considerations arise, such as the degree of uncertainty we can accept when making decisions.

Statistical methods are sometimes developed to answer practical questions to which statisticians are blind. The chemist William Gosset studied the statistics of small samples because of an interest in barley cultivation arising from his employment as a brewer at Guinness[2]. This work did not interest contemporary statisticians, who almost always worked with large samples. Similarly, Ronald Fisher developed analysis of variance in the Journal of Agricultural Science, where the method was presented side by side with the results of its application[3–5]. Few of the new algorithmic approaches that dominate data science were published in statistical journals; rather, they were developed by computer scientists tackling real-world problems such as speech recognition[6].

Magnitude-based inference (MBI) is presented by Sainani et al.[1] as an example of why new methods should not be published in discipline-specific journals. Yet the procedure was developed to address discipline-specific challenges. The criticisms that statisticians have levelled against MBI make no reference to the context in which it is applied[7,8]. Equally, statisticians' modelling of the method has been incomplete and has relied on assumptions that are not always valid[9]. If publication of the method in sports science journals had engendered a supportive collaboration, a long-standing and unhelpful dispute would have been avoided. Instead, statisticians have tended to take an uncompromising stance on MBI that fails to engage thoughtfully with the arguments in support of the method[10,11].

Should statistical debate be privileged?

Sainani et al.[1] present several studies as cases supporting their contention that research quality can suffer when statisticians are not involved. They define a statistician as someone from a "methodologically-oriented" academic department, but most of the supposedly unsound studies include authors who by any reasonable definition are experienced applied statisticians. In any case, who qualifies as a statistician should be immaterial, as science is an open society. One of Robert Merton’s norms of scientific behaviour[12] is universalism – ideas should be judged on their merits alone, without reference to their originator. Despite this, some of Sainani et al.’s arguments are appeals to authority rather than genuine engagement in scholarly debate. Thus, an alternative approach to principal component analysis, disseminated only as a preprint[13], is described as flawed because it “does not interpret the data conventionally, interprets the resulting scores as loadings, and has been criticised by an expert in the field”. Preprints are used to solicit feedback prior to journal submission – they are an open invitation to collaborate. Rather than accept this invitation, Sainani et al. publicly castigate the work using two appeals to tradition and a reference to a Twitter conversation.

Collaboration will fail if any party grants themselves the authority to decide truth, especially when there is still debate within the literature. Sainani et al.[1] suggest that sports scientists have overlooked statisticians’ guidelines on response heterogeneity, but do not cite reviews on this topic by exercise scientists[14–16] and by exercise scientists collaborating with a statistician[17]. Similarly, meta-analytical and replicated crossover studies by sports scientists[18–20] (informed by statisticians[21–23]) are not cited. Sainani et al.’s criticism of Dankel and Loenneke[24] refers to non-peer-reviewed simulations by Tenan et al.[25,26] (one of Sainani’s co-authors) without considering a rebuttal of those simulations[27]. Moreover, Tenan[25] proclaims an intent to write to the editor of any journal that publishes work using the method, to notify them that it is incorrect. This is akin to what Gieryn[28] called ‘boundary work’ – ideological activity that circumvents normal scientific discourse.

Statistics is not a unique, higher-value skill. Rather, researchers draw on a range of specialised knowledge and skills (e.g., coding, research design, experimental technique). Deficiencies in any of these areas can lead to flawed research. Science is error-correcting through the organised scepticism of the scientific community (another of Merton’s norms), not through the imposition of barriers to participation.


Sports scientists and statisticians have different criteria for what makes a method appropriate, and statistical or mathematical concerns don’t automatically trump experimental, philosophical or practical considerations. Instead, effective collaborations involve consensus-building in a spirit of mutual respect. To progress the field effectively, we need genuine partnerships, not authoritarian edicts.


1 Sainani KL, Borg DN, Caldwell AR, et al. Call to increase statistical collaboration in sports science, sport and exercise medicine and sports physiotherapy. Br J Sports Med Published Online First: 19 August 2020. doi:10.1136/bjsports-2020-102607
2 Box JF. Guinness, Gosset, Fisher, and small samples. Stat Sci 1987;2:45–52. doi:10.1214/ss/1177013437
3 Fisher RA. Studies in crop variation. I. An examination of the yield of dressed grain from Broadbalk. J Agric Sci 1921;11:107–35. doi:10.1017/S0021859600003750
4 Box JF. RA Fisher and the design of experiments, 1922–1926. Am Stat 1980;34:1–7. doi:10.2307/2682986
5 Fisher RA, Mackenzie WA. Studies in crop variation. II. The manurial response of different potato varieties. J Agric Sci 1923;13:311–20. doi:10.1017/S0021859600003592
6 Jelinek F. Statistical Methods for Speech Recognition. MIT Press 1997.
7 Welsh AH, Knight EJ. “Magnitude-based inference”: a statistical review. Med Sci Sports Exerc 2015;47:874. doi:10.1249/MSS.0000000000000451
8 Sainani KL. The problem with “magnitude-based inference”. Med Sci Sports Exerc 2018;50:2166–2176. doi:10.1249/mss.0000000000001645
9 Aisbett J, Drinkwater EJ, Quarrie KL, et al. Advancing statistical decision-making in sports science. arXiv 2020. arXiv:2010.13375
10 Hopkins WG, Batterham AM. Error rates, decisive outcomes and publication bias with several inferential methods. Sports Med 2016;46:1563–1573. doi:10.1007/s40279-016-0517-x
11 Batterham AM, Hopkins WG. The Problems with “The Problem with ‘Magnitude-Based Inference.’” Med Sci Sports Exerc 2019;51:599. doi:10.1249/MSS.0000000000001823
12 Merton RK. The Sociology of Science: Theoretical and Empirical Investigations. University of Chicago Press 1973.
13 Cleather D. On the use and abuse of principal component analysis in biomechanics. SportRxiv 2019. doi:10.31236/
14 Hopkins WG. Individual responses made easy. J Appl Physiol Bethesda Md 1985 2015;118:1444–6. doi:10.1152/japplphysiol.00098.2015
15 Atkinson G, Williamson P, Batterham AM. Issues in the determination of ‘responders’ and ‘non-responders’ in physiological research. Exp Physiol 2019;104:1215–25. doi:10.1113/EP087712
16 Atkinson G, Batterham AM. True and false interindividual differences in the physiological response to an intervention. Exp Physiol 2015;100:577–88. doi:10.1113/EP085070
17 Hecksteden A, Kraushaar J, Scharhag-Rosenberger F, et al. Individual response to exercise training - a statistical perspective. J Appl Physiol Bethesda Md 1985 2015;118:1450–9. doi:10.1152/japplphysiol.00714.2014
18 Williamson PJ, Atkinson G, Batterham AM. Inter-individual differences in weight change following exercise interventions: a systematic review and meta-analysis of randomized controlled trials. Obes Rev 2018;19:960–75. doi:10.1111/obr.12682
19 Goltz FR, Thackray AE, King JA, et al. Interindividual responses of appetite to acute exercise: A replicated crossover study. Med Sci Sports Exerc 2018;50:758–68. doi:10.1249/MSS.0000000000001504
20 Goltz FR, Thackray AE, Atkinson G, et al. True interindividual variability exists in postprandial appetite responses in healthy men but is not moderated by the FTO genotype. J Nutr 2019;149:1159–69. doi:10.1093/jn/nxz062
21 Senn S. Mastering variation: Variance components and personalised medicine. Stat Med 2016;35:966–77. doi:10.1002/sim.6739
22 Cortés J, González JA, Medina MN, et al. Does evidence support the high expectations placed in precision medicine? A bibliographic review. F1000Research 2019;7:30. doi:10.12688/f1000research.13490.5
23 Mills HL, Higgins JP, Morris RW, et al. Detecting heterogeneity of intervention effects using analysis and meta-analysis of differences in variance between arms of a trial. medRxiv 2020;:2020.03.07.20032516. doi:10.1101/2020.03.07.20032516
24 Dankel SJ, Loenneke JP. A method to stop analyzing random error and start analyzing differential responders to exercise. Sports Med 2020;50:231–238. doi:10.1007/s40279-019-01147-0
25 Tenan MS. ‘Harming the scientific process’: An attempt to correct the sports science literature, part 3. Retraction Watch 2020. (accessed 29 Sep 2020).
26 Tenan MS, Vigotsky AD, Caldwell AR. Comment on: “A method to stop analyzing random error and start analyzing differential responders to exercise.” Sports Med 2020;50:431–434. doi:10.1007/s40279-019-01249-9
27 Dankel SJ, Loenneke JP. Authors’ reply to Tenan et al.: “A method to stop analyzing random error and start analyzing differential responders to exercise.” Sports Med 2020;50:435–437. doi:10.1007/s40279-019-01250-2
28 Gieryn TF. Boundary-work and the demarcation of science from non-science: Strains and interests in professional ideologies of scientists. Am Sociol Rev 1983;48:781–795. doi:10.2307/2095325

Conflict of Interest

None declared