Jonathan Badger (1), Eric LaRose (2), John Mayer (2), Fereshteh Bashiri (2), David Page (3), Peggy Peissig (2). (1) Marshfield Clinic Research Institute, Marshfield, WI, USA; Department of Biostatistics and Medical Informatics, University of Wisconsin, Madison, WI, USA. Electronic address: badger.jonathan@marshfieldresearch.org. (2) Marshfield Clinic Research Institute, Marshfield, WI, USA. (3) Department of Computer Sciences, University of Wisconsin, Madison, WI, USA; Department of Biostatistics and Medical Informatics, University of Wisconsin, Madison, WI, USA.
Abstract
OBJECTIVE: To develop machine learning models for classifying the severity of opioid overdose events from clinical data.
MATERIALS AND METHODS: Opioid overdoses were identified by diagnosis codes in the Marshfield Clinic population, and each event was assigned a severity score via chart review to form a gold standard set of labels. Three primary feature sets were constructed from disparate data sources surrounding each event and used to train machine learning models for phenotyping.
RESULTS: Random forest and penalized logistic regression models gave the best performance, with cross-validated mean areas under the ROC curve (AUCs) across all severity classes of 0.893 and 0.882, respectively. Features derived from a common data model outperformed features collected from disparate data sources for the same cohort of patients (AUC 0.893 versus 0.837, p = 0.002). Adding features extracted from free text to the machine learning models also increased the AUC from 0.827 to 0.893 (p < 0.0001). Keyword features extracted using natural language processing (NLP), such as 'Narcan' and 'Endotracheal Tube', were important for classifying overdose event severity.
CONCLUSION: Random forest models using features derived from a common data model and free text can be effective for classifying the severity of opioid overdose events.
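The abstract reports a mean AUC across severity classes but does not say how that mean was formed. The sketch below shows one common reading, a one-vs-rest macro average, and is not a statement of the authors' method. The function names, the three severity labels, and the toy scores are all hypothetical; the AUC itself is computed with the standard rank (Mann-Whitney) formulation, counting ties as one half.

```python
from typing import Dict, Hashable, List, Sequence


def binary_auc(labels: Sequence[bool], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outscores a random negative.

    Ties between a positive and a negative score count as 0.5.
    """
    pos = [s for y, s in zip(labels, scores) if y]
    neg = [s for y, s in zip(labels, scores) if not y]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def macro_ovr_auc(
    y_true: Sequence[Hashable],
    class_scores: Dict[Hashable, List[float]],
) -> float:
    """Unweighted mean of one-vs-rest AUCs, one per class (an assumed averaging scheme)."""
    aucs = []
    for cls, scores in class_scores.items():
        is_cls = [y == cls for y in y_true]  # this class versus all others
        aucs.append(binary_auc(is_cls, scores))
    return sum(aucs) / len(aucs)


# Hypothetical toy data: six events, three made-up severity classes,
# and per-class scores such as a classifier's predicted probabilities.
y_true = ["mild", "mild", "moderate", "moderate", "severe", "severe"]
class_scores = {
    "mild":     [0.8, 0.6, 0.3, 0.1, 0.1, 0.2],
    "moderate": [0.1, 0.3, 0.5, 0.2, 0.3, 0.1],
    "severe":   [0.1, 0.1, 0.2, 0.7, 0.6, 0.7],
}

print(round(macro_ovr_auc(y_true, class_scores), 3))  # prints 0.854
```

In a cross-validated setting, this quantity would typically be computed on each held-out fold and then averaged over folds; the pairwise loop here is quadratic and meant for illustration only.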