Marcel Miché (1), Erich Studerus (2), Andrea Hans Meyer (1), Andrew Thomas Gloster (3), Katja Beesdo-Baum (4), Hans-Ulrich Wittchen (5), Roselind Lieb (6).
1. University of Basel, Department of Psychology, Division of Clinical Psychology and Epidemiology, Basel, Switzerland.
2. University of Basel, Department of Psychology, Division of Personality and Developmental Psychology, Basel, Switzerland.
3. University of Basel, Department of Psychology, Division of Clinical Psychology and Intervention Science, Basel, Switzerland.
4. Technische Universitaet Dresden, Behavioral Epidemiology, Dresden, Germany; Technische Universitaet Dresden, Institute of Clinical Psychology and Psychotherapy, Dresden, Germany.
5. Technische Universitaet Dresden, Institute of Clinical Psychology and Psychotherapy, Dresden, Germany; Ludwig Maximilians University Munich, Department of Psychiatry and Psychotherapy, Munich, Germany.
6. University of Basel, Department of Psychology, Division of Clinical Psychology and Epidemiology, Basel, Switzerland. Electronic address: roselind.lieb@unibas.ch.
Abstract
BACKGROUND: The use of machine learning (ML) algorithms to study suicidality has recently been recommended. Our aim was to explore whether ML approaches have the potential to improve the prediction of suicide attempt (SA) risk. Using the epidemiological multiwave prospective-longitudinal Early Developmental Stages of Psychopathology (EDSP) data set, we compared four algorithms (logistic regression, lasso, ridge, and random forest) in predicting a future SA in a community sample of adolescents and young adults.
METHODS: The EDSP Study prospectively assessed, over the course of 10 years, adolescents and young adults aged 14-24 years at baseline. Of 3021 subjects, 2797 were eligible for prospective analyses because they participated in at least one of the three follow-up assessments. Sixteen baseline predictors, all selected a priori from the literature, were used to predict follow-up SAs. Model performance was assessed using repeated nested 10-fold cross-validation. The main measure of predictive performance was the area under the curve (AUC).
RESULTS: The mean AUCs of the four predictive models (logistic regression, lasso, ridge, and random forest) were 0.828, 0.826, 0.829, and 0.824, respectively.
CONCLUSIONS: Based on our comparison, the four algorithms performed equally well in distinguishing between future SA cases and non-SA cases among adolescents and young adults in the community. Other considerations, such as ease of implementation, might nevertheless lead to one algorithm being prioritized over another in some instances. Further research and replication studies are required in this regard.
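As a rough illustration of the evaluation scheme named in the METHODS (repeated nested 10-fold cross-validation scored by AUC), the following minimal Python sketch may help. It is not the authors' code and does not reproduce any of the four models. The scorer `fit_thresholded_means`, the tuning grid, the helper names, and the synthetic data are all hypothetical stand-ins chosen only so the example runs with the standard library.

```python
import random
from statistics import mean

def auc(labels, scores):
    """AUC as the share of (case, non-case) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one case and one non-case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def stratified_folds(labels, k, rng):
    """Split positions 0..n-1 into k folds with a similar case ratio in each."""
    folds = [[] for _ in range(k)]
    for cls in (0, 1):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        for j, i in enumerate(idx):
            folds[j % k].append(i)
    return folds

def fit_thresholded_means(X, y, lam):
    """Hypothetical toy scorer: per-feature mean difference, soft-thresholded by lam."""
    pos = [x for x, t in zip(X, y) if t == 1]
    neg = [x for x, t in zip(X, y) if t == 0]
    w = []
    for j in range(len(X[0])):
        d = mean(x[j] for x in pos) - mean(x[j] for x in neg)
        w.append(max(abs(d) - lam, 0.0) * (1 if d > 0 else -1))
    return lambda x: sum(wj * xj for wj, xj in zip(w, x))

def nested_cv_auc(X, y, fit, params, k=10, repeats=2, seed=0):
    """Mean outer-fold AUC; the tuning parameter is chosen on training data only."""
    rng = random.Random(seed)
    outer_aucs = []
    for _ in range(repeats):
        for test_idx in stratified_folds(y, k, rng):
            held_out = set(test_idx)
            train_idx = [i for i in range(len(y)) if i not in held_out]
            y_train = [y[i] for i in train_idx]
            inner = stratified_folds(y_train, k, rng)  # positions within train_idx

            def inner_score(p):
                vals = []
                for val_pos in inner:
                    val_set = set(val_pos)
                    fit_idx = [i for j, i in enumerate(train_idx) if j not in val_set]
                    val_idx = [train_idx[j] for j in val_pos]
                    model = fit([X[i] for i in fit_idx], [y[i] for i in fit_idx], p)
                    vals.append(auc([y[i] for i in val_idx],
                                    [model(X[i]) for i in val_idx]))
                return mean(vals)

            best = max(params, key=inner_score)
            model = fit([X[i] for i in train_idx], y_train, best)
            outer_aucs.append(auc([y[i] for i in test_idx],
                                  [model(X[i]) for i in test_idx]))
    return mean(outer_aucs)

# Synthetic data (assumption): 40 cases, 160 non-cases, one informative feature.
data_rng = random.Random(1)
y = [1] * 40 + [0] * 160
X = [[data_rng.gauss(1.5 * t, 1.0), data_rng.gauss(0, 1), data_rng.gauss(0, 1)] for t in y]
result = nested_cv_auc(X, y, fit_thresholded_means, params=[0.0, 0.2, 0.5])
print(f"mean outer-fold AUC: {result:.3f}")
```

The point of the nesting is that the outer test fold never influences the choice of tuning parameter, so the averaged outer AUC is a less optimistic estimate than tuning and testing on the same folds. With a rare outcome, stratifying the folds keeps at least some cases in every fold, which the AUC needs in order to be defined.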
Authors: Robert B Penfold; Eric Johnson; Susan M Shortreed; Rebecca A Ziebell; Frances L Lynch; Greg N Clarke; Karen J Coleman; Beth E Waitzfelder; Arne L Beck; Rebecca C Rossom; Brian K Ahmedani; Gregory E Simon Journal: J Affect Disord Date: 2021-07-01 Impact factor: 4.839