Accounting for Multiple Testing Calls into Question the Significance of these Results
Published on: 8 May 2019
In this article the authors discuss their analysis of 21 female and 22 male athletic events. Testing all 43, they find 3 events significant at p < 0.05. But when 43 tests are run, a well-calibrated statistical test is expected to produce about two false positives on average with random data (43 × 0.05 ≈ 2.15), by the definition of the p-value. The odds of producing 3 false positives are also rather high: for normally distributed simulated data under the null, I found 3 or more false positives in approximately 1/3 of such analyses; see here for a simulation notebook: https://github.com/davidasiegel/False-Positive-Rate-for-Multiple-Tests-i....
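A rough check of that 1/3 figure (this is a sketch, not the author's linked notebook): under the null, each p-value is uniform on (0, 1), so each of the 43 tests is a false positive with probability 0.05, independently. The sample sizes below are illustrative choices.

```python
import random

random.seed(0)
n_events, alpha, n_trials = 43, 0.05, 20000

# Count simulated analyses in which 3 or more of the 43 null tests
# come out "significant" at alpha = 0.05 purely by chance.
count = sum(
    1 for _ in range(n_trials)
    if sum(random.random() < alpha for _ in range(n_events)) >= 3
)
print(count / n_trials)  # about 1/3 (exact binomial tail is ~0.365)
```

The exact value is the binomial tail P(X ≥ 3) for X ~ Binomial(43, 0.05), which is about 0.365, consistent with the simulation.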
This is why adjustments for multiple comparisons need to be performed. They were neglected in the initial study and neglected again in this one. In the 2017 study the authors state, "These different athletic events were considered as distinct independent analyses and adjustment for multiple comparisons was not required." This does not make sense to me; if the analyses are distinct and independent, that is all the more reason to correct for multiple comparisons. If a Bonferroni correction were applied, none of the p-values would be significant at the level of the study (p < 0.05/43 ≈ 0.001). I therefore see no reason to reject the null hypothesis for any of these results.
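The Bonferroni threshold quoted above is simply the study-wide significance level divided by the number of tests:

```python
# Bonferroni-adjusted per-test threshold for a family-wise alpha of 0.05
# spread over 43 tests.
alpha, n_tests = 0.05, 43
bonferroni_threshold = alpha / n_tests
print(round(bonferroni_threshold, 5))  # prints 0.00116
```

Each individual p-value would have to fall below roughly 0.001 to remain significant after correction.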
Perfor...
Conflict of Interest:
None declared.