Murtaza Nasir (1), Nichalin S Summerfield (1), Asil Oztekin (1), Margaret Knight (2), Leland K Ackerson (3), Stephanie Carreiro (4)
1. Department of Operations and Information Systems, Manning School of Business, University of Massachusetts Lowell, Lowell, Massachusetts, USA.
2. Susan and Alan Solomont School of Nursing, Zuckerberg College of Health Sciences, University of Massachusetts Lowell, Lowell, Massachusetts, USA.
3. Department of Public Health, Zuckerberg College of Health Sciences, University of Massachusetts Lowell, Lowell, Massachusetts, USA.
4. Division of Medical Toxicology, Department of Emergency Medicine, UMass Memorial Healthcare, University of Massachusetts Medical School, Worcester, Massachusetts, USA.
Abstract
OBJECTIVE: Substance use disorder is a critical public health issue. Discovering the synergies among factors impacting treatment program success can help governments and treatment facilities develop effective policies. In this work, we propose a novel data analytics approach using machine learning models to discover interaction effects that might be neglected by traditional hypothesis-generating approaches.
MATERIALS AND METHODS: A patient-episode-level substance use treatment discharge dataset and a Federal Bureau of Investigation crime dataset were joined using core-based statistical area codes. Random forests, artificial neural networks, and extreme gradient boosting were applied with a nested cross-validation methodology. Interaction effects were identified based on the machine learning model with the best performance. These interaction effects were analyzed and tested using traditional logistic regression models on unseen data.
RESULTS: In predicting patient completion of a treatment program, extreme gradient boosting performed the best with an area under the curve of 89.31%. Based on our procedure, 73 interaction effects were identified. Among these, 14 were tested using traditional logistic regression models where 12 were statistically significant (P<.05).
CONCLUSIONS: We identified new interaction effects among the length of stay, frequency of substance use, changes in self-help group attendance frequency, and other factors. This work provides insights into the interactions between factors impacting treatment completion. Further traditional statistical analysis can be employed by practitioners and policy makers to test the effects discovered by our novel machine learning approach.
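The nested cross-validation mentioned in the methods can be illustrated with a small sketch. This is not the authors' pipeline: the abstract does not describe fold counts, tuning grids, or code, so everything below is an assumption made for illustration. A toy one-feature k-nearest-neighbour classifier and plain accuracy stand in for the random forest, neural network, and extreme gradient boosting models and the area under the curve reported in the study; the data are synthetic, and the names `make_folds`, `knn_predict`, `accuracy`, and `nested_cv` are hypothetical. The point is only the structure: an inner loop picks a hyperparameter using the training portion alone, and an outer loop scores that choice on data the inner loop never saw.

```python
import random
from statistics import mean


def make_folds(indices, k):
    """Partition a list of indices into k disjoint folds of near-equal size."""
    return [indices[i::k] for i in range(k)]


def knn_predict(train, x, k):
    """Toy stand-in model: majority vote among the k nearest (feature, label) pairs."""
    nearest = sorted(train, key=lambda pair: abs(pair[0] - x))[:k]
    votes = sum(label for _, label in nearest)
    return 1 if 2 * votes > len(nearest) else 0


def accuracy(train, test, k):
    """Fraction of test points whose label the k-NN vote reproduces."""
    return mean(1.0 if knn_predict(train, x, k) == y else 0.0 for x, y in test)


def nested_cv(data, candidate_ks, outer_k=5, inner_k=3, seed=0):
    """Return one held-out score per outer fold.

    The hyperparameter is chosen inside each outer training set only,
    so the outer score is not contaminated by the tuning step.
    """
    rng = random.Random(seed)
    indices = list(range(len(data)))
    rng.shuffle(indices)

    outer_scores = []
    for held_out in make_folds(indices, outer_k):
        held = set(held_out)
        train_idx = [i for i in indices if i not in held]
        inner_folds = make_folds(train_idx, inner_k)

        def inner_score(k):
            # Mean validation accuracy of k across the inner folds.
            scores = []
            for val_idx in inner_folds:
                val = set(val_idx)
                fit = [data[i] for i in train_idx if i not in val]
                scores.append(accuracy(fit, [data[i] for i in val_idx], k))
            return mean(scores)

        best_k = max(candidate_ks, key=inner_score)
        outer_train = [data[i] for i in train_idx]
        outer_test = [data[i] for i in held_out]
        outer_scores.append(accuracy(outer_train, outer_test, best_k))
    return outer_scores


# Synthetic two-class data (an assumption, unrelated to the study's datasets).
_rng = random.Random(42)
data = [(_rng.gauss(label * 2.0, 1.0), label) for label in (0, 1) for _ in range(60)]
scores = nested_cv(data, candidate_ks=[1, 3, 5, 7])
print("outer-fold accuracies:", [round(s, 3) for s in scores])
```

Averaging `scores` gives a performance estimate for the whole tune-then-fit procedure rather than for a single tuned model, which is the usual reason for nesting the two loops.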