Daphne Liu (1), Mia Yu (2), Jeffrey Duncan (3), Anna Fondario (3), Hadi Kharrazi (4, 5), Paul S Nestadt (6, 7)
1. West High School, Salt Lake City, UT, USA.
2. The College, University of Chicago, Chicago, IL, USA.
3. Utah Department of Health, Salt Lake City, UT, USA.
4. Center for Population Health IT, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, USA.
5. Division of Health Sciences Informatics, Johns Hopkins School of Medicine, Baltimore, MD, USA.
6. Department of Psychiatry and Behavioral Sciences, Johns Hopkins School of Medicine, Baltimore, MD, USA.
7. Department of Mental Health, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, USA.
Abstract
OBJECTIVE: The Centers for Disease Control and Prevention (CDC) monitor accidental and intentional deaths to answer questions that are critical for the development of effective prevention and resource allocation. CDC's National Violent Death Reporting System (NVDRS) is a major innovation in surveillance linking individual-level data from multiple sources. However, suicide underreporting is common, particularly from drug overdose deaths. This study sought to assess machine learning (ML) techniques in quantifying drug overdose suicide underreporting rates.
METHODS: Clinical, sociodemographic, toxicological, and proximal stressor data on overdose decedents (n = 2,665) were extracted from Utah's NVDRS from 2012 to 2015. The existing well-determined cases were used to train and test our ML models. We assessed and compared multiple machine learning methods including Logistic Regression, Random Forest Classifier, Support Vector Machines, and Artificial Neural Networks. We applied a majority voting methodology to classify undetermined drug overdose deaths.
RESULTS: Overdose suicide rates were estimated to be underreported by 33% across all years, increasing yearly from 29% in 2012 to 37% in 2015. The overall test accuracies for all models ranged from 92.3% to 94.6%.
CONCLUSIONS: This research identifies a cost-effective, replicable, and expandable ML-based methodology to estimate the true rates of suicide which may be partially masked during the opioid epidemic.
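The majority-voting step described in METHODS can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how ties among the four models are resolved or exactly how the underreporting percentage is defined, so the tie policy (leave the case undetermined), the definition used in `underreporting_pct` (reclassified suicides as a percentage of recorded suicides), the function names, and the toy votes below are all assumptions made for illustration.

```python
from collections import Counter


def majority_vote(votes, tie_label="undetermined"):
    """Return the label predicted by most models.

    Assumption: with an even number of models a tie is possible; this
    sketch leaves tied cases as `tie_label` rather than forcing a class.
    """
    ranked = Counter(votes).most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return tie_label
    return ranked[0][0]


def underreporting_pct(recorded_suicides, reclassified_suicides):
    """Assumed definition: newly classified suicides as a percentage of
    the suicides already recorded. The abstract does not give the formula."""
    if recorded_suicides <= 0:
        raise ValueError("recorded_suicides must be positive")
    return 100.0 * reclassified_suicides / recorded_suicides


# Hypothetical per-case predictions from four trained models
# (order: logistic regression, random forest, SVM, neural network).
model_votes = {
    "case_a": ["suicide", "suicide", "accident", "suicide"],
    "case_b": ["accident", "accident", "accident", "suicide"],
    "case_c": ["suicide", "accident", "suicide", "accident"],  # 2-2 tie
}

labels = {case: majority_vote(votes) for case, votes in model_votes.items()}
n_reclassified = sum(1 for label in labels.values() if label == "suicide")
```

With these made-up votes, one undetermined case is reclassified as a suicide, one as an accident, and the tied case stays undetermined. A different tie policy (for example, weighting models by test accuracy) would change the counts, which is why the choice should be stated explicitly.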