Annie M Racine1,2, Douglas Tommet3, Madeline L D'Aquila1, Tamara G Fong1,2,4, Yun Gou1, Patricia A Tabloski5, Eran D Metzger2,6, Tammy T Hshieh2,7, Eva M Schmitt1, Sarinnapha M Vasunilashorn2,7, Lisa Kunze2,8, Kamen Vlassakov2,5, Ayesha Abdeen2,9, Jeffrey Lange2,10, Brandon Earp2,11, Bradford C Dickerson12, Edward R Marcantonio1,2,7, Jon Steingrimsson13, Thomas G Travison1,2, Sharon K Inouye1,2,7, Richard N Jones14. 1. Aging Brain Center, Institute for Aging Research, Boston, MA, USA. 2. Harvard Medical School, Boston, MA, USA. 3. Department of Psychiatry & Human Behavior, and Neurology, Brown University Warren Alpert Medical School, Providence, RI, USA. 4. Department of Neurology, Beth Israel Deaconess Medical Center, Boston, MA, USA. 5. William F Connell School of Nursing at Boston College, Boston, MA, USA. 6. Department of Psychiatry, Beth Israel Deaconess Medical Center, Boston, MA, USA. 7. Department of Medicine, Beth Israel Deaconess Medical Center, Boston, MA, USA. 8. Department of Anesthesia, Beth Israel Deaconess Medical Center, Boston, MA, USA. 9. Department of Orthopedic Surgery, Beth Israel Deaconess Medical Center, Boston, MA, USA. 10. Department of Orthopedic Surgery, Brigham and Women's Hospital, Boston, MA, USA. 11. Department of Orthopedics, Brigham and Women's Faulkner Hospital, Boston, MA, USA. 12. Department of Neurology and Massachusetts Alzheimer's Disease Research Center, Massachusetts General Hospital, Boston, MA, USA. 13. Department of Biostatistics, Brown University, Providence, RI, USA. 14. Department of Psychiatry & Human Behavior, and Neurology, Brown University Warren Alpert Medical School, Providence, RI, USA. Richard_Jones@Brown.edu.
Abstract
BACKGROUND: Our objective was to assess the performance of machine learning methods to predict post-operative delirium using a prospective clinical cohort.
METHODS: We analyzed data from an observational cohort study of 560 older adults (≥ 70 years) without dementia undergoing major elective non-cardiac surgery. Post-operative delirium was determined by the Confusion Assessment Method supplemented by a medical chart review (N = 134, 24%). Five machine learning algorithms and a standard stepwise logistic regression model were developed in a training sample (80% of participants) and evaluated in the remaining hold-out testing sample. We evaluated three overlapping feature sets, restricted to variables that are readily available or minimally burdensome to collect in clinical settings, including interview and medical record data. A large feature set included 71 potential predictors. A smaller set of 18 features was selected by an expert panel using a consensus process, and this smaller feature set was considered with and without a measure of pre-operative mental status.
RESULTS: The area under the receiver operating characteristic curve (AUC) was higher in the large feature set conditions (range of AUC, 0.62-0.71 across algorithms) versus the selected feature set conditions (AUC range, 0.53-0.57). The restricted feature set with mental status had intermediate AUC values (range, 0.53-0.68). In the full feature set condition, algorithms such as gradient boosting, cross-validated logistic regression, and neural network (AUC = 0.71, 95% CI 0.58-0.83) were comparable with a model developed using traditional stepwise logistic regression (AUC = 0.69, 95% CI 0.57-0.82). Calibration for all models and feature sets was poor.
CONCLUSIONS: We developed machine learning prediction models for post-operative delirium that performed better than chance and are comparable with traditional stepwise logistic regression. Delirium proved to be a phenotype that was difficult to predict with appreciable accuracy.
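To make the reported metric concrete, the following is a minimal sketch of how an AUC and a 95% confidence interval could be computed on a hold-out sample. It is not the authors' code. The abstract does not say how the intervals were obtained, so the percentile bootstrap used here is an assumption, and the data are simulated: `simulate_holdout`, its `shift` parameter, and the seeds are hypothetical names and values chosen only for illustration. The hold-out size of 112 is 20% of the 560 participants, and the 24% prevalence matches the reported delirium rate.

```python
import random


def auc(scores, labels):
    """AUC as the probability that a random case scores above a random
    non-case, with ties counted as one half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one non-case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(scores, labels, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the AUC (an assumed method,
    not one stated in the abstract)."""
    rng = random.Random(seed)
    n = len(scores)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # skip resamples that contain only one class
            stats.append(auc([scores[i] for i in idx], ys))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


def simulate_holdout(n=112, prevalence=0.24, shift=0.8, seed=1):
    """Hypothetical hold-out sample: binary outcomes plus a noisy risk
    score whose mean is higher for cases. Purely synthetic data."""
    rng = random.Random(seed)
    labels = [1 if rng.random() < prevalence else 0 for _ in range(n)]
    scores = [rng.gauss(shift * y, 1.0) for y in labels]
    return scores, labels


scores, labels = simulate_holdout()
point = auc(scores, labels)
low, high = bootstrap_auc_ci(scores, labels)
print(f"AUC = {point:.2f}, 95% CI {low:.2f}-{high:.2f}")
```

With a hold-out sample of roughly this size, the interval is wide, which is in line with the broad confidence intervals reported in the abstract (for example, 0.58-0.83).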
Keywords:
delirium; machine learning; model prediction; post-operative; statistical learning