Michael Sanderson (1), Andrew G M Bulloch (2), JianLi Wang (3), Tyler Williamson (4), Scott B Patten (5).
1. Department of Community Health Sciences, Cumming School of Medicine, University of Calgary, Canada. Electronic address: michael.sanderson@gov.ab.ca.
2. Hotchkiss Brain Institute, Department of Psychiatry, Cumming School of Medicine, University of Calgary, Canada.
3. School of Epidemiology, Public Health and Preventive Medicine, Department of Psychiatry, Faculty of Medicine, University of Ottawa, Canada.
4. Department of Community Health Sciences, Cumming School of Medicine, University of Calgary, Canada.
5. Department of Community Health Sciences, Cumming School of Medicine, University of Calgary, Canada; Department of Community Health Sciences, Department of Psychiatry, Cumming School of Medicine, University of Calgary, Canada.
Abstract
BACKGROUND: Suicide is a leading cause of death worldwide. The increasing volume of administrative health care data creates an opportunity to evaluate whether machine learning models can improve upon statistical models for quantifying suicide risk.
OBJECTIVE: To compare the performance of logistic regression and single hidden layer feedforward neural network models that quantify suicide risk using predictors available in administrative health care system data.
METHODS: The modeling dataset contained 3,548 persons who died by suicide and 35,480 persons who did not die by suicide between 2000 and 2016. A total of 101 predictors were selected and assembled for each of the 40 quarters (10 years) prior to the quarter of death, resulting in 4,040 predictors (101 × 40) for each person. Logistic regression and single hidden layer feedforward neural network model configurations were evaluated using 10-fold cross-validation.
RESULTS: The optimal feedforward neural network model configuration (AUC: 0.8352) outperformed logistic regression (AUC: 0.8179).
LIMITATIONS: Many important predictors are not available in administrative data, which likely limits how well prediction models developed with administrative data can perform.
CONCLUSIONS: Although the models developed in this study showed promise, further research is needed to determine the performance limits of statistical and machine learning models that quantify suicide risk, and to develop prediction models optimized for implementation in clinical settings.
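The quantities in the METHODS section can be made concrete with a short sketch. The Python below is illustrative only and is not the authors' code: it derives the per-person predictor count, assigns records to 10 folds, and computes AUC with the rank-based (Mann-Whitney) formula. The helper names `stratified_folds` and `auc` are hypothetical, and stratifying the folds by outcome is an assumption, since the abstract does not say how the folds were formed.

```python
import random

# Counts taken from the abstract.
N_CASES = 3548            # persons who died by suicide
N_CONTROLS = 35480        # persons who did not die by suicide
N_BASE_PREDICTORS = 101   # predictors selected
N_QUARTERS = 40           # quarters (10 years) of history per person

# One input column per (predictor, quarter) pair.
n_features = N_BASE_PREDICTORS * N_QUARTERS


def stratified_folds(labels, k=10, seed=0):
    """Assign each record to one of k folds, balancing each outcome class.

    Assumption: stratification by outcome; the abstract only says "10-fold".
    """
    rng = random.Random(seed)
    folds = [0] * len(labels)
    for value in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == value]
        rng.shuffle(idx)
        for pos, i in enumerate(idx):
            folds[i] = pos % k
    return folds


def auc(labels, scores):
    """AUC: probability that a random case outscores a random control.

    Uses average ranks so that tied scores count as half.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # average 1-based rank of the tied group
        for t in range(i, j + 1):
            ranks[order[t]] = avg_rank
        i = j + 1
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    rank_sum = sum(r for r, y in zip(ranks, labels) if y == 1)
    return (rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


labels = [1] * N_CASES + [0] * N_CONTROLS
folds = stratified_folds(labels, k=10)
cases_per_fold = [
    sum(1 for f, y in zip(folds, labels) if f == fold and y == 1)
    for fold in range(10)
]
```

In a full evaluation, each candidate model would be fit on nine folds and scored with `auc` on the held-out fold, and the ten fold-level AUCs would then be summarized. How the study aggregated AUC across folds is not stated in the abstract.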
Authors: Michael Sanderson; Andrew G M Bulloch; JianLi Wang; Kimberly G Williams; Tyler Williamson; Scott B Patten
Journal: EClinicalMedicine
Date: 2020-02-18