Sharon E Davis (1), Thomas A Lasko (1), Guanhua Chen (2), Edward D Siew (3,4), Michael E Matheny (1,2,3,5).
1. Department of Biomedical Informatics, Vanderbilt University School of Medicine, Nashville, TN, USA.
2. Department of Biostatistics, Vanderbilt University School of Medicine.
3. Geriatric Research Education and Clinical Care Service, VA Tennessee Valley Healthcare System, Nashville, TN, USA.
4. Division of Nephrology, Vanderbilt University School of Medicine, Vanderbilt Center for Kidney Disease and Integrated Program for AKI, Nashville, TN, USA.
5. Division of General Internal Medicine, Vanderbilt University School of Medicine.
Abstract
OBJECTIVE: Predictive analytics create opportunities to incorporate personalized risk estimates into clinical decision support. Models must be well calibrated to support decision-making, yet calibration deteriorates over time. This study explored the influence of modeling methods on performance drift and connected observed drift with data shifts in the patient population.

MATERIALS AND METHODS: Using 2003 admissions to Department of Veterans Affairs hospitals nationwide, we developed 7 parallel models for hospital-acquired acute kidney injury using common regression and machine learning methods, validating each over 9 subsequent years.

RESULTS: Discrimination was maintained for all models. Calibration declined as all models increasingly overpredicted risk. However, the random forest and neural network models maintained calibration across ranges of probability, capturing more admissions than did the regression models. The magnitude of overprediction increased over time for the regression models while remaining stable and small for the machine learning models. Changes in the rate of acute kidney injury were strongly linked to increasing overprediction, while changes in predictor-outcome associations corresponded with diverging patterns of calibration drift across methods.

CONCLUSIONS: Efficient and effective updating protocols will be essential for maintaining accuracy of, user confidence in, and safety of personalized risk predictions to support decision-making. Model updating protocols should be tailored to account for variations in calibration drift across methods and respond to periods of rapid performance drift rather than be limited to regularly scheduled annual or biannual intervals.
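One common way to quantify the kind of overprediction described above is the observed-to-expected (O/E) ratio, computed separately for each validation period. The sketch below is illustrative only and is not taken from the study: the function names, the period labels, and the toy data are hypothetical, and the abstract does not state which calibration metrics the authors used. An O/E ratio near 1 suggests predicted risk matches the observed event rate on average, while a ratio below 1 suggests the model overpredicts.

```python
def observed_expected_ratio(outcomes, predicted):
    """Return observed events divided by the sum of predicted risks.

    outcomes:  iterable of 0/1 event indicators
    predicted: iterable of predicted probabilities, same length
    """
    outcomes = list(outcomes)
    predicted = list(predicted)
    if len(outcomes) != len(predicted):
        raise ValueError("outcomes and predicted must have the same length")
    expected = sum(predicted)
    if expected <= 0:
        raise ValueError("sum of predicted risks must be positive")
    return sum(outcomes) / expected


def oe_by_period(periods):
    """Compute the O/E ratio for each period in a {label: (outcomes, predicted)} dict."""
    return {label: observed_expected_ratio(y, p) for label, (y, p) in periods.items()}


# Hypothetical toy data: the model keeps predicting the same risks,
# but fewer events occur in the later period (a falling event rate).
toy_periods = {
    "development_period": ([1, 0, 1, 0], [0.5, 0.5, 0.5, 0.5]),
    "later_period": ([1, 0, 0, 0], [0.5, 0.5, 0.5, 0.5]),
}

ratios = oe_by_period(toy_periods)
# development_period has O/E = 1.0 (calibrated on average)
# later_period has O/E = 0.5 (predicted risk is twice the observed rate)
```

Tracking a summary like this per period is one simple way to notice when overprediction is growing, although it only captures average calibration and not calibration across ranges of predicted probability.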