Mehri Sajjadian 1, Raymond W Lam 2, Roumen Milev 3, Susan Rotzinger 4,5, Benicio N Frey 6,7, Claudio N Soares 8, Sagar V Parikh 9, Jane A Foster 10, Gustavo Turecki 11, Daniel J Müller 12,4, Stephen C Strother 13, Faranak Farzan 14, Sidney H Kennedy 4,5,15,16, Rudolf Uher 1.
1. Department of Psychiatry, Dalhousie University, Halifax, NS, Canada.
2. Department of Psychiatry, University of British Columbia, Vancouver, BC, Canada.
3. Department of Psychiatry and Psychology, Queen's University, Providence Care Hospital, Kingston, ON, Canada.
4. Department of Psychiatry, University of Toronto, Toronto, ON, Canada.
5. Department of Psychiatry, St. Michael's Hospital, University of Toronto, Toronto, Ontario, Canada.
6. Department of Psychiatry and Behavioural Neurosciences, McMaster University, Hamilton, ON, Canada.
7. Mood Disorders Program and Women's Health Concerns Clinic, St. Joseph's Healthcare Hamilton, Hamilton, ON, Canada.
8. Department of Psychiatry, Queen's University School of Medicine, Kingston, ON, Canada.
9. Department of Psychiatry, University of Michigan, Ann Arbor, MI, USA.
10. Department of Psychiatry & Behavioural Neurosciences, St. Joseph's Healthcare, Hamilton, ON, Canada.
11. Department of Psychiatry, Douglas Institute, McGill University, Montreal, QC, Canada.
12. Campbell Family Mental Health Research Institute, Center for Addiction and Mental Health, Toronto, ON, Canada.
13. Baycrest and Department of Medical Biophysics, Rotman Research Center, University of Toronto, Toronto, ON, Canada.
14. eBrain Lab, School of Mechatronic Systems Engineering, Simon Fraser University, Surrey, BC, Canada.
15. Department of Psychiatry, University Health Network, Toronto, ON, Canada.
16. Krembil Research Centre, University Health Network, University of Toronto, Toronto, ON, Canada.
Abstract
BACKGROUND: Multiple treatments are effective for major depressive disorder (MDD), but the outcomes of each treatment vary broadly among individuals. Accurate prediction of outcomes is needed to help select a treatment that is likely to work for a given person. We aim to examine the performance of machine learning methods in delivering replicable predictions of treatment outcomes.

METHODS: Of 7732 non-duplicate records identified through literature search, we retained 59 eligible reports and extracted data on sample, treatment, predictors, machine learning method, and treatment outcome prediction. A minimum sample size of 100 and an adequate validation method were used to identify adequate-quality studies. The effects of study features on prediction accuracy were tested with mixed-effects models. Fifty-four of the studies provided accuracy estimates or other estimates that allowed calculation of balanced accuracy of predicting outcomes of treatment.

RESULTS: Eight adequate-quality studies reported a mean accuracy of 0.63 [95% confidence interval (CI) 0.56-0.71], which was significantly lower than a mean accuracy of 0.75 (95% CI 0.72-0.78) in the other 46 studies. Among the adequate-quality studies, accuracies were higher when predicting treatment resistance (0.69) and lower when predicting remission (0.60) or response (0.56). The choice of machine learning method, feature selection, and the ratio of features to individuals were not associated with reported accuracy.

CONCLUSIONS: The negative relationship between study quality and prediction accuracy, combined with a lack of independent replication, invites caution when evaluating the potential of machine learning applications for personalizing the treatment of depression.
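For readers unfamiliar with the metric, balanced accuracy is conventionally defined as the mean of sensitivity and specificity. The abstract does not state the exact procedure used to derive it from each study's reported estimates, so the following is only a minimal sketch of the standard definition; the function name and the counts are hypothetical and not taken from any reviewed study.

```python
def balanced_accuracy(tp: int, fn: int, tn: int, fp: int) -> float:
    """Mean of sensitivity and specificity from confusion-matrix counts.

    Standard textbook definition; an assumption about how a balanced
    accuracy could be recovered from reported counts, not the authors' code.
    """
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both outcome classes need at least one case")
    sensitivity = tp / (tp + fn)  # proportion of true positives detected
    specificity = tn / (tn + fp)  # proportion of true negatives detected
    return (sensitivity + specificity) / 2


# Hypothetical counts: 50 remitters (25 correctly predicted) and
# 100 non-remitters (75 correctly predicted).
print(balanced_accuracy(tp=25, fn=25, tn=75, fp=25))  # 0.625
```

Unlike raw accuracy, this measure equals 0.5 for a classifier that always predicts the majority class, which makes estimates comparable across studies with different outcome base rates.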