Fereshteh S Bashiri1, John R Caskey1, Anoop Mayampurath2, Nicole Dussault3, Jay Dumanian3, Sivasubramanium V Bhavani4, Kyle A Carey5, Emily R Gilbert6, Christopher J Winslow7, Nirav S Shah5,7, Dana P Edelson5, Majid Afshar1,2, Matthew M Churpek1,2. 1. Department of Medicine, University of Wisconsin-Madison, Madison, Wisconsin, USA. 2. Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, Wisconsin, USA. 3. Pritzker School of Medicine, University of Chicago, Chicago, Illinois, USA. 4. Department of Medicine, Emory University, Atlanta, Georgia, USA. 5. Department of Medicine, University of Chicago, Chicago, Illinois, USA. 6. Department of Medicine, Loyola University, Chicago, Illinois, USA. 7. Department of Medicine, NorthShore University HealthSystem, Evanston, Illinois, USA.
Abstract
OBJECTIVES: Early identification of infection improves outcomes, but developing models for early identification requires determining infection status with manual chart review, limiting sample size. Therefore, we aimed to compare semi-supervised and transfer learning algorithms with algorithms based solely on manual chart review for identifying infection in hospitalized patients. MATERIALS AND METHODS: This multicenter retrospective study of admissions to 6 hospitals included "gold-standard" labels of infection from manual chart review and "silver-standard" labels for non-chart-reviewed patients derived from the Sepsis-3 infection criteria based on antibiotic and culture orders. "Gold-standard" labeled admissions were randomly allocated to training (70%) and testing (30%) datasets. Using patient characteristics, vital signs, and laboratory data from the first 24 hours of admission, we derived deep learning and non-deep learning models using transfer learning and semi-supervised methods. Performance was compared in the gold-standard test set using discrimination and calibration metrics. RESULTS: The study comprised 432 965 admissions, of which 2724 underwent chart review. In the test set, deep learning and non-deep learning approaches had similar discrimination (area under the receiver operating characteristic curve of 0.82). Semi-supervised and transfer learning approaches did not improve discrimination over models fit using only silver- or gold-standard data. Transfer learning had the best calibration (unreliability index P value: .997; Brier score: 0.173), followed by the self-learning gradient boosted machine (P value: .67; Brier score: 0.170). DISCUSSION: Deep learning and non-deep learning models performed similarly for identifying infection, as did models developed using Sepsis-3 criteria and manual chart review labels.
CONCLUSION: In a multicenter study of almost 3000 chart-reviewed patients, semi-supervised and transfer learning models showed discrimination similar to that of a baseline XGBoost model, while transfer learning improved calibration.
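The abstract describes a self-learning (semi-supervised) workflow: fit a gradient boosted model on a small gold-standard labeled set, pseudo-label a large unlabeled pool, refit, and then evaluate discrimination (AUC) and calibration (Brier score) on a held-out test set. The sketch below is not the authors' pipeline; it is a minimal illustration of that general pattern on synthetic data, with illustrative confidence thresholds and scikit-learn's `GradientBoostingClassifier` standing in for the study's models.

```python
# Hedged sketch of self-learning + discrimination/calibration evaluation.
# All data, thresholds, and model choices here are illustrative assumptions.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import roc_auc_score, brier_score_loss

rng = np.random.default_rng(0)

def simulate(n):
    """Synthetic stand-in for 24-hour clinical features and infection labels."""
    X = rng.normal(size=(n, 5))
    p = 1.0 / (1.0 + np.exp(-(X[:, 0] + 0.5 * X[:, 1])))
    return X, (rng.random(n) < p).astype(int)

# Small "gold-standard" labeled set, large unlabeled pool, held-out test set
X_gold, y_gold = simulate(500)
X_pool, _ = simulate(5000)
X_test, y_test = simulate(1000)

# Step 1: fit on gold-standard labels only
clf = GradientBoostingClassifier(random_state=0).fit(X_gold, y_gold)

# Step 2: self-learning -- pseudo-label confidently predicted pool
# admissions (illustrative cutoffs of 0.1 / 0.9) and refit on the union
p_pool = clf.predict_proba(X_pool)[:, 1]
confident = (p_pool < 0.1) | (p_pool > 0.9)
X_aug = np.vstack([X_gold, X_pool[confident]])
y_aug = np.concatenate([y_gold, (p_pool[confident] > 0.5).astype(int)])
clf_ssl = GradientBoostingClassifier(random_state=0).fit(X_aug, y_aug)

# Step 3: evaluate discrimination (AUC) and calibration (Brier score)
p_test = clf_ssl.predict_proba(X_test)[:, 1]
auc = roc_auc_score(y_test, p_test)
brier = brier_score_loss(y_test, p_test)
print(f"AUC={auc:.3f}  Brier={brier:.3f}")
```

In the study itself, calibration was additionally assessed with the unreliability index, and the self-learned model was compared against baselines fit on gold- or silver-standard labels alone; this sketch only shows the mechanics of the pseudo-labeling loop and the two headline metrics.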