Caitlin E Coombes (1), Kevin R Coombes (2), Naleef Fareed (3, 4)
1. College of Medicine, The Ohio State University, Columbus, OH, 43210, USA.
2. Department of Biomedical Informatics, The Ohio State University College of Medicine, 460 Medical Center Dr., 512 Institute of Behavioral Medicine Research, Columbus, OH, 43210, USA.
3. Department of Biomedical Informatics, The Ohio State University College of Medicine, 460 Medical Center Dr., 512 Institute of Behavioral Medicine Research, Columbus, OH, 43210, USA. naleef.fareed@osumc.edu.
4. Center for the Advancement of Team Science, Analytics, and Systems Thinking, College of Medicine, The Ohio State University, Columbus, OH, 43210, USA. naleef.fareed@osumc.edu.
Abstract
BACKGROUND: In the intensive care unit (ICU), delirium is a common, acute confusional state associated with high risk of short- and long-term morbidity and mortality. Machine learning (ML) shows promise for addressing research priorities and improving delirium outcomes. However, because of clinical and billing conventions, delirium is often inconsistently or incompletely labeled in electronic health record (EHR) datasets. Here, we identify clinical actions, abstracted from clinical guidelines, in EHR data that indicate risk of delirium among ICU patients. We develop a novel prediction model to label patients with delirium based on a large data set and assess model performance.
METHODS: EHR data on 48,451 admissions from 2001 to 2012, available through the Medical Information Mart for Intensive Care-III (MIMIC-III) database, were used to identify features for our prediction models. Five binary ML classification models (Logistic Regression; Classification and Regression Trees; Random Forests; Naïve Bayes; and Support Vector Machines) were fit and ranked by Area Under the Curve (AUC) scores. We compared our best model with two models previously proposed in the literature on goodness of fit, precision, and biological validation.
RESULTS: Our best-performing model for predicting delirium, with threshold reclassification, was a multiple logistic regression using the 31 clinical actions (AUC 0.83). Our model outperformed the other proposed models in biological validation on clinically meaningful, delirium-associated outcomes.
CONCLUSIONS: Hurdles in identifying accurate labels in large-scale datasets limit clinical applications of ML in delirium. We developed a novel labeling model for delirium in the ICU using a large, public data set. By using guideline-directed clinical actions independent of risk factors, treatments, and outcomes as model predictors, our classifier could be used as a delirium label for future clinically targeted models.
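The two evaluation steps named in the abstract, ranking candidate classifiers by AUC and then applying a decision threshold to turn predicted probabilities into labels, can be illustrated with a minimal sketch. This is not the authors' code: the function names (auc, rank_by_auc, classify), the model names, the scores, and the thresholds are all hypothetical toy values chosen for illustration, and AUC is computed here with the rank-based (Mann-Whitney) definition rather than any particular library.

```python
from typing import Dict, List, Sequence, Tuple


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via the Mann-Whitney statistic.

    Equals the probability that a randomly chosen positive case scores
    higher than a randomly chosen negative case (ties count as 0.5).
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def rank_by_auc(
    labels: Sequence[int], model_scores: Dict[str, Sequence[float]]
) -> List[Tuple[str, float]]:
    """Return (model name, AUC) pairs sorted from best to worst AUC."""
    results = [(name, auc(labels, s)) for name, s in model_scores.items()]
    return sorted(results, key=lambda r: r[1], reverse=True)


def classify(scores: Sequence[float], threshold: float) -> List[int]:
    """Label a case positive when its score meets or exceeds the threshold."""
    return [1 if s >= threshold else 0 for s in scores]


# Hypothetical toy data: 1 = delirium label present, 0 = absent.
labels = [1, 1, 1, 0, 0, 0, 0, 0]
model_scores = {
    "logistic_regression": [0.9, 0.7, 0.4, 0.5, 0.3, 0.2, 0.1, 0.1],
    "naive_bayes": [0.6, 0.4, 0.3, 0.5, 0.4, 0.2, 0.3, 0.1],
}

ranking = rank_by_auc(labels, model_scores)
best_name, best_auc = ranking[0]

# Moving the threshold changes which cases are labeled positive
# without changing the AUC of the underlying scores.
default_labels = classify(model_scores[best_name], threshold=0.5)
lowered_labels = classify(model_scores[best_name], threshold=0.4)

for name, value in ranking:
    print(f"{name}: AUC = {value:.3f}")
```

Because AUC depends only on how the scores order the cases, it is unaffected by the threshold; the threshold is a separate choice that trades sensitivity against specificity once a model has been selected.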
Keywords:
Delirium; Electronic health records; Intensive care unit; Predictive model; Risk factors