Colin Walsh (1), George Hripcsak (2)
1. Department of Biomedical Informatics, Columbia University, United States; Department of Medicine, Columbia University, United States. Electronic address: cgw2106@columbia.edu.
2. Department of Biomedical Informatics, Columbia University, United States.
Abstract
BACKGROUND: Hospital readmission risk prediction remains an active area of investigation and operations in light of the Hospital Readmissions Reduction Program of the Centers for Medicare & Medicaid Services (CMS). Multiple risk models have been reported with variable discriminatory performance, and it remains unclear how design factors affect performance.
OBJECTIVES: To study the effects of varying three factors of model development in the prediction of readmission risk from health record data: (1) reason for readmission (primary readmission diagnosis); (2) available data and data types (e.g., visit history, laboratory results); and (3) cohort selection.
METHODS: Regularized regression (LASSO) was used to generate predictions of readmission risk using prevalence sampling. A support vector machine (SVM) was used for comparison in cohort selection testing. Models were calibrated by refitting to outcome prevalence.
RESULTS: Predicting readmission risk across multiple reasons for readmission yielded ROC areas ranging from 0.92 for congestive heart failure to 0.71 for syncope and 0.68 for all-cause readmission. Visit history and laboratory tests contributed the most predictive value; contributions varied by readmission diagnosis. Cohort definition affected performance for both parametric and nonparametric algorithms. Compared with using all patients, limiting the cohort to patients whose index admission and readmission diagnoses matched decreased the average ROC area from 0.78 to 0.55 (difference in ROC area 0.23, p = 0.01). Calibration plots demonstrated good calibration with low mean squared error.
CONCLUSION: Targeting the reason for readmission in risk prediction affected discriminatory performance. In general, laboratory data and visit history data contributed the most to prediction; data source contributions varied by reason for readmission. Cohort selection had a large impact on model performance, and these results demonstrate the difficulty of comparing results across different studies of predictive risk modeling.
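
The two performance measures reported above, ROC area (discrimination) and mean squared error (calibration), can be illustrated with a minimal sketch. This is not the authors' code: the function names `roc_auc` and `mean_squared_error` and the toy values in `readmitted` and `predicted_risk` are hypothetical and chosen only for illustration. The ROC area is computed here by the pairwise (Mann-Whitney) definition, which is one standard way to obtain it.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """ROC area via pairwise comparison (Mann-Whitney definition).

    Returns the fraction of (readmitted, not readmitted) pairs in which the
    readmitted patient received the higher predicted risk; ties count as 0.5.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def mean_squared_error(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Mean squared difference between predicted risk and observed outcome."""
    return sum((s - y) ** 2 for y, s in zip(labels, scores)) / len(labels)


# Hypothetical toy data: 1 = readmitted, 0 = not readmitted.
readmitted = [1, 0, 1, 0, 0, 1, 0, 0]
predicted_risk = [0.9, 0.2, 0.6, 0.4, 0.1, 0.3, 0.35, 0.05]

auc = roc_auc(readmitted, predicted_risk)
mse = mean_squared_error(readmitted, predicted_risk)
print(f"ROC area: {auc:.3f}")
print(f"Mean squared error: {mse:.3f}")
```

A ROC area of 0.5 corresponds to chance-level ranking and 1.0 to perfect ranking, which gives some context for the reported drop from 0.78 to 0.55 when the cohort was restricted. Because the ROC area depends on which patients enter the positive and negative groups, the same scoring rule can yield different values under different cohort definitions.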