Kristin L Sainani. Division of Epidemiology, Department of Health Research and Policy, Stanford University, Stanford, CA.
Abstract
PURPOSE: A statistical method called "magnitude-based inference" (MBI) has gained a following in the sports science literature, despite concerns voiced by statisticians. Its proponents have claimed that MBI exhibits superior type I and type II error rates compared with standard null hypothesis testing for most cases. I have performed a reanalysis to evaluate this claim.
METHODS: Using simulation code provided by MBI's proponents, I estimated type I and type II error rates for clinical and nonclinical MBI for a range of effect sizes, sample sizes, and smallest important effects. I plotted these results in a way that makes the empirical behavior of MBI transparent. I also reran the simulations after correcting mistakes in the definitions of type I and type II error provided by MBI's proponents. Finally, I confirmed the findings mathematically, and I provide general equations for calculating MBI's error rates without the need for simulation.
RESULTS: Contrary to what MBI's proponents have claimed, MBI does not exhibit "superior" type I and type II error rates to standard null hypothesis testing. As expected, there is a tradeoff between type I and type II error. At precisely the small-to-moderate sample sizes that MBI's proponents deem "optimal," MBI reduces the type II error rate at the cost of greatly inflating the type I error rate, to two to six times that of standard hypothesis testing.
CONCLUSIONS: Magnitude-based inference exhibits worrisome empirical behavior. In contrast to standard null hypothesis testing, which has predictable type I error rates, the type I error rates for MBI vary widely depending on the sample size and choice of smallest important effect, and are often unacceptably high. Magnitude-based inference should not be used.
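The kind of check described in METHODS can be illustrated with a minimal Python sketch. It is not the simulation code supplied by MBI's proponents, and it does not reproduce the paper's equations. It rests on several assumptions that the abstract does not state: two equal groups with a known standard deviation of 1 (so a normal approximation replaces the t distribution), a true effect of exactly zero, and a clinical-MBI-style decision rule with hypothetical thresholds (chance of benefit at least 25%, risk of harm below 0.5%). The function names `simulate_type1` and `analytic_mbi_type1` are illustrative only.

```python
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, used for all probabilities below


def simulate_type1(n_per_group, smallest_important, n_trials=20000, seed=0,
                   min_chance_benefit=0.25, max_risk_harm=0.005, alpha=0.05):
    """Estimate false-positive rates when the true effect is exactly zero.

    Returns (nhst_rate, mbi_rate). The MBI-style rule and its thresholds are
    assumptions for illustration, not the definitions used in the paper.
    """
    rng = random.Random(seed)
    # Standard error of a difference in means, two groups, known SD = 1.
    se = (2.0 / n_per_group) ** 0.5
    z_crit = STD_NORMAL.inv_cdf(1 - alpha / 2)
    nhst_hits = 0
    mbi_hits = 0
    for _ in range(n_trials):
        # Draw the observed difference directly from its null sampling distribution.
        diff = rng.gauss(0.0, se)
        # Standard two-sided null hypothesis test.
        if abs(diff / se) > z_crit:
            nhst_hits += 1
        # MBI-style "chances" that the true effect lies beyond +/- the
        # smallest important effect, given the observed difference.
        p_benefit = 1 - STD_NORMAL.cdf((smallest_important - diff) / se)
        p_harm = STD_NORMAL.cdf((-smallest_important - diff) / se)
        if p_benefit >= min_chance_benefit and p_harm < max_risk_harm:
            mbi_hits += 1
    return nhst_hits / n_trials, mbi_hits / n_trials


def analytic_mbi_type1(n_per_group, smallest_important,
                       min_chance_benefit=0.25, max_risk_harm=0.005):
    """Closed-form rate for the same assumed rule, with no simulation."""
    se = (2.0 / n_per_group) ** 0.5
    d = smallest_important / se
    # Both conditions reduce to lower bounds on the z statistic diff / se.
    bound_benefit = d - STD_NORMAL.inv_cdf(1 - min_chance_benefit)
    bound_harm = STD_NORMAL.inv_cdf(1 - max_risk_harm) - d
    return 1 - STD_NORMAL.cdf(max(bound_benefit, bound_harm))


if __name__ == "__main__":
    for n in (10, 50, 100):
        nhst, mbi = simulate_type1(n, smallest_important=0.2)
        exact = analytic_mbi_type1(n, smallest_important=0.2)
        print(f"n={n}: NHST={nhst:.3f}  MBI-style={mbi:.3f}  analytic={exact:.3f}")
```

Under these assumptions the NHST rate stays near the nominal alpha for every sample size, while the rate for the MBI-style rule depends on the ratio of the smallest important effect to the standard error, and so changes with the sample size. The rule here is one-sided (benefit only) while the NHST test is two-sided, so the two columns are not a like-for-like comparison of the methods.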
Authors: Luiz Henrique Palucci Vieira; Christopher Carling; Fabio Augusto Barbieri; Rodrigo Aquino; Paulo Roberto Pereira Santiago Journal: Sports Med Date: 2019-02 Impact factor: 11.136
Authors: Stephen W West; Sean Williams; Simon P T Kemp; Robin Eager; Matthew J Cross; Keith A Stokes Journal: J Athl Train Date: 2020-09-01 Impact factor: 2.860
Authors: Andrew W Brown; Douglas G Altman; Tom Baranowski; J Martin Bland; John A Dawson; Nikhil V Dhurandhar; Shima Dowla; Kevin R Fontaine; Andrew Gelman; Steven B Heymsfield; Wasantha Jayawardene; Scott W Keith; Theodore K Kyle; Eric Loken; J Michael Oakes; June Stevens; Diana M Thomas; David B Allison Journal: Obes Rev Date: 2019-08-19 Impact factor: 9.213
Authors: Keith R Lohse; Kristin L Sainani; J Andrew Taylor; Michael L Butson; Emma J Knight; Andrew J Vickers Journal: PLoS One Date: 2020-06-26 Impact factor: 3.240