Majid Afshar (1,2,3), Andrew Phillips (4), Niranjan Karnik (5), Jeanne Mueller (6), Daniel To (1), Richard Gonzalez (6), Ron Price (2), Richard Cooper (4), Cara Joyce (2,4), Dmitriy Dligach (2,3).
1. Health Sciences Division, Burn and Shock Trauma Research Institute, Stritch School of Medicine, Loyola University, Maywood, Illinois, USA.
2. Health Sciences Division, Center for Health Outcomes and Informatics Research, Loyola University, Maywood, Illinois, USA.
3. Department of Public Health Sciences, Stritch School of Medicine, Loyola University, Maywood, Illinois, USA.
4. Department of Computer Science, Loyola University, Chicago, Illinois, USA.
5. Department of Psychiatry, Rush University Medical Center, Chicago, Illinois, USA.
6. Department of Surgery, Loyola University Medical Center, Maywood, Illinois, USA.
Abstract
Objective: Alcohol misuse is present in over a quarter of trauma patients. Information in the clinical notes of the electronic health record of trauma patients may be used for phenotyping tasks with natural language processing (NLP) and supervised machine learning. The objective of this study was to train and validate an NLP classifier for identifying patients with alcohol misuse.
Materials and Methods: This was an observational cohort study of 1422 adult patients admitted to a trauma center between April 2013 and November 2016. Linguistic processing of clinical notes was performed using the clinical Text Analysis and Knowledge Extraction System. The primary analysis was the binary classification of alcohol misuse. The Alcohol Use Disorders Identification Test served as the reference standard.
Results: The data corpus comprised 91 045 electronic health record notes and 16 091 features. In the final machine learning classifier, 16 features were selected from the first 24 hours of notes for identifying alcohol misuse. In the validation cohort, the classifier had an area under the receiver-operating characteristic curve of 0.78 (95% confidence interval [CI], 0.72 to 0.85). Sensitivity and specificity were 56.0% (95% CI, 44.1% to 68.0%) and 88.9% (95% CI, 84.4% to 92.8%), respectively. The Hosmer-Lemeshow goodness-of-fit test indicated that the classifier fit the data well (P = .17). A simpler rule-based keyword approach had lower sensitivity than the NLP classifier (18.2% vs 56.0%).
Conclusions: The NLP classifier has adequate predictive validity for identifying alcohol misuse in trauma centers. External validation is needed before it is applied to augment screening.
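For readers less familiar with the reported metrics, the following is a minimal sketch of how sensitivity, specificity, and a 95% CI for a proportion can be computed from binary reference labels and classifier predictions. The labels below are made-up toy values, not study data, and the Wilson score interval is an assumption chosen for illustration; the abstract does not state which interval method the authors used. The function names are hypothetical.

```python
import math

def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels (1 = misuse)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp, fn, tn, fp

def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a proportion (assumed method, z=1.96 for 95%)."""
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half

# Toy data only: 1 = alcohol misuse per the reference standard, 0 = no misuse.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 0, 0, 1]

tp, fn, tn, fp = confusion_counts(y_true, y_pred)
sensitivity = tp / (tp + fn)   # share of true misuse cases that were flagged
specificity = tn / (tn + fp)   # share of non-misuse cases correctly not flagged
sens_ci = wilson_interval(tp, tp + fn)
spec_ci = wilson_interval(tn, tn + fp)

print(f"sensitivity={sensitivity:.3f} CI=({sens_ci[0]:.3f}, {sens_ci[1]:.3f})")
print(f"specificity={specificity:.3f} CI=({spec_ci[0]:.3f}, {spec_ci[1]:.3f})")
```

With a cohort-sized sample the intervals narrow considerably; the wide intervals here simply reflect the ten toy observations.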