Hao Wang1, Gary L Rosner1, Steven N Goodman2,3. 1. Division of Biostatistics & Bioinformatics, Sidney Kimmel Comprehensive Cancer Center at Johns Hopkins, Johns Hopkins University School of Medicine, Baltimore, MD, USA. 2. Stanford University School of Medicine, Stanford, CA, USA. Steve.goodman@stanford.edu. 3. Meta-Research Innovation Center at Stanford (METRICS), Stanford, CA, USA.
Abstract
BACKGROUND: Despite the wide use of designs with statistical stopping guidelines that allow a randomized clinical trial to stop early for efficacy, debate continues over the potentially harmful consequences of such designs. These concerns include the possible over-estimation of treatment effects in early-stopped trials and a newer argument of a "freezing effect": because an early-stopped trial amounts to a declaration that randomization to the unfavored arm is unethical, it will halt future randomized clinical trials of the same comparison. The purpose of this study is to determine the degree of bias in designs that allow early stopping and to assess the impact on estimation if future experimentation is indeed "frozen" by an early-stopped trial. METHODS: We perform simulations to study the effect of early stopping. We simulate a collection of trials and contrast the treatment-effect estimates (risk differences and risk ratios) with the simulation truth. The simulations consider various scenarios of between-study variation, including an empirically derived distribution of effects from the clinical literature. RESULTS: Across trials whose true effects are sampled from a uniform distribution, estimates from trials that stop early for efficacy deviate minimally from the simulation truth (median bias of the risk-difference estimate, 0.005). Over-estimation becomes appreciable only when the true effect is close to the null value of 0 (median bias of the risk-difference estimate, 0.04) or when stopping occurs with 40% of the information or less; however, stopping under these conditions is rare. We also find a slight reverse bias (median bias of the risk-difference estimate, -0.002) among trials that do not cross the early-stopping boundaries and continue to the final analysis. Similar results hold for relative-risk estimates.
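The simulation described in METHODS can be sketched in miniature as follows. This is an illustrative reconstruction, not the authors' code: the sample size, control-arm event rate, true risk difference, and the Lan-DeMets O'Brien-Fleming-type boundaries (z ≈ 2.96 at 50% information, z ≈ 1.97 at the final look) are all assumptions made for the sketch.

```python
import math
import random

def simulate_trial(p_control, true_rd, n_per_arm=500,
                   looks=(0.5, 1.0), z_bounds=(2.963, 1.969), rng=None):
    """Two-arm binary-outcome trial with one interim efficacy look.

    Returns (estimated risk difference, stopped_early flag). The boundary
    values approximate a Lan-DeMets O'Brien-Fleming-type design; the
    paper's exact design is not given in the abstract.
    """
    rng = rng or random.Random()
    p_treat = p_control + true_rd
    succ_c = succ_t = n_done = 0
    for frac, zb in zip(looks, z_bounds):
        n_target = int(frac * n_per_arm)
        for _ in range(n_target - n_done):
            succ_c += rng.random() < p_control
            succ_t += rng.random() < p_treat
        n_done = n_target
        pc, pt = succ_c / n_done, succ_t / n_done
        se = math.sqrt(pc * (1 - pc) / n_done + pt * (1 - pt) / n_done)
        if se > 0 and (pt - pc) / se > zb:
            return pt - pc, frac < 1.0  # crossed the efficacy boundary
    return pt - pc, False  # reached the final analysis

# Contrast estimates from early-stopped vs completed trials with the truth.
rng = random.Random(1)
results = [simulate_trial(0.3, 0.10, rng=rng) for _ in range(2000)]
early = [rd for rd, stopped in results if stopped]
completed = [rd for rd, stopped in results if not stopped]
print(f"stopped early: {len(early)} / {len(results)}")
print(f"mean estimate | early stop: {sum(early) / len(early):.3f}")
print(f"mean estimate | completed:  {sum(completed) / len(completed):.3f}")
```

Run over many replications, the conditional mean among early stoppers sits above the true risk difference of 0.10, while completers show the slight reverse bias the abstract reports; averaged over all trials, the deviation from the truth is small.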
In contrast, Bayesian estimation shrinks the estimates from trials that stop early and pulls up the under-estimates from completed trials, largely rectifying any over-estimation among trials that terminate early. Regarding the so-called freezing effect, pooled effects from meta-analyses that include truncated randomized clinical trials deviate negligibly from the true value, even when no subsequent trials are conducted after a truncated trial. CONCLUSION: Group sequential designs with stopping rules seek to minimize patients' exposure to a disfavored therapy and to speed the dissemination of results, and such designs do not lead to materially biased estimates. The likelihood and magnitude of a "freezing effect" are minimal. Superiority demonstrated in a randomized clinical trial that stops early under an appropriate statistical stopping rule is likely a valid inference, even if the estimate may be slightly inflated.
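The Bayesian shrinkage described above can be illustrated with a simple normal-normal conjugate sketch. The abstract does not specify the authors' actual model, so the skeptical prior centered at no effect and its scale (prior_sd = 0.1) are purely illustrative assumptions.

```python
def shrunken_rd(rd_hat, se, prior_sd=0.1):
    """Posterior mean of the risk difference under a skeptical
    N(0, prior_sd**2) prior and an approximate N(rd, se**2) likelihood.
    The prior scale is an illustrative assumption, not from the paper."""
    w = prior_sd ** 2 / (prior_sd ** 2 + se ** 2)  # weight given to the data
    return w * rd_hat

# An inflated early-stopping estimate is pulled back toward the null;
# the noisier the estimate (larger se), the stronger the shrinkage.
print(shrunken_rd(0.12, 0.042))  # ~0.102: shrunk toward 0
print(shrunken_rd(0.12, 0.020))  # ~0.115: less shrinkage with a smaller se
```

Because trials stopped at an interim look have larger standard errors than completed trials, this kind of shrinkage acts most strongly exactly where the conditional over-estimation arises.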