George C M Siontis 1, Ioanna Tzoulaki 2, Peter J Castaldi 3, John P A Ioannidis 4
1. Department of Hygiene and Epidemiology, University of Ioannina School of Medicine, University Campus, P.O. Box 1186, 45110 Ioannina, Greece.
2. Department of Hygiene and Epidemiology, University of Ioannina School of Medicine, University Campus, P.O. Box 1186, 45110 Ioannina, Greece; Department of Epidemiology and Biostatistics, Imperial College London, Norfolk Place W2 1PG, London, United Kingdom.
3. Channing Division of Network Medicine, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, 181 Longwood Avenue, Boston, MA 02115, USA.
4. Department of Medicine, Stanford Prevention Research Center, Stanford University School of Medicine, 1265 Welch Rd, MSOB X306, Stanford, CA 94305, USA; Department of Health Research and Policy, Stanford University School of Medicine, Stanford, CA 94305, USA; Department of Statistics, Stanford University School of Humanities and Sciences, Stanford, CA 94305, USA. Electronic address: jioannid@stanford.edu.
Abstract
OBJECTIVES: To evaluate how often newly developed risk prediction models undergo external validation and how well they perform in such validations. STUDY DESIGN AND SETTING: We reviewed derivation studies of newly proposed risk models and their subsequent external validations. We extracted study characteristics, outcome(s), and the models' discriminatory performance [area under the curve (AUC)] in derivation and validation studies. We estimated the probability of a model having an external validation and the change in discriminatory performance, relative to the derivation estimates, under increasingly stringent external validation (by overlapping or by entirely different authors). RESULTS: We evaluated 127 new prediction models. For 32 models (25%), at least one external validation study was identified; for 22 models (17%), the validation had been done by entirely different authors. The probability of having an external validation by different authors within 5 years was 16%. AUC estimates decreased significantly on external validation compared with the derivation study [median AUC change: -0.05 (P < 0.001) overall; -0.04 (P = 0.009) for validation by overlapping authors; -0.05 (P < 0.001) for validation by different authors]. On external validation, AUC decreased by at least 0.03 in 19 models and never increased by at least 0.03 (P < 0.001). CONCLUSION: External independent validation of predictive models in different studies is uncommon. Predictive performance may worsen substantially on external validation.
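The comparison of 19 models whose AUC fell by at least 0.03 against none whose AUC rose by at least 0.03 can be checked with simple arithmetic. The sketch below assumes an exact two-sided sign test; the abstract does not state which test the authors used, so the function name `sign_test_two_sided` and the choice of test are illustrative assumptions, not the study's method.

```python
from math import comb


def sign_test_two_sided(n_decrease: int, n_increase: int) -> float:
    """Exact two-sided sign test p-value.

    Null hypothesis: a change of at least the threshold is equally
    likely to be a decrease or an increase (probability 0.5 each).
    ASSUMPTION: the sign test is used here for illustration only; the
    abstract does not name the test behind its reported P value.
    """
    n = n_decrease + n_increase
    k = min(n_decrease, n_increase)
    # One-tail probability of a split at least as lopsided as observed.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    # Double for a two-sided test, capped at 1.
    return min(1.0, 2 * tail)


# Counts reported in the abstract: 19 decreases of >= 0.03, 0 increases.
p_value = sign_test_two_sided(19, 0)
print(f"{p_value:.2e}")  # 3.81e-06
```

Under this assumption the p-value is 2/2^19, well below the reported threshold of 0.001, so the direction of the reported result is reproduced by the counts alone.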
Keywords:
Area under the receiver operating characteristics curve; Derivation study; Discrimination; External validation; Prognostic models; Risk prediction model