Ilkin Bayramli (1,2), Victor Castro (3,4), Yuval Barak-Corren (1), Emily M Madsen (5,6), Matthew K Nock (4,7,8), Jordan W Smoller (5,6,9), Ben Y Reis (1,9)
1. Predictive Medicine Group, Computational Health Informatics Program, Boston Children's Hospital, Boston, Massachusetts, USA.
2. Harvard University, Cambridge, Massachusetts, USA.
3. Mass General Brigham Research Information Science and Computing, Boston, Massachusetts, USA.
4. Department of Psychiatry, Massachusetts General Hospital, Boston, Massachusetts, USA.
5. Psychiatric and Neurodevelopmental Genetics Unit, Center for Genomic Medicine, Massachusetts General Hospital, Boston, Massachusetts, USA.
6. Department of Psychiatry, Center for Precision Psychiatry, Massachusetts General Hospital, Boston, Massachusetts, USA.
7. Department of Psychology, Harvard University, Cambridge, Massachusetts, USA.
8. Mental Health Research Program, Franciscan Children's, Brighton, Massachusetts, USA.
9. Harvard Medical School, Boston, Massachusetts, USA.
Abstract
OBJECTIVE: Suicide is one of the leading causes of death worldwide, yet clinicians find it difficult to reliably identify individuals at high risk for suicide. Algorithmic approaches for suicide risk detection have been developed in recent years, mostly based on data from electronic health records (EHRs). Significant room for improvement remains in the way these models take advantage of temporal information to improve predictions.

MATERIALS AND METHODS: We propose a temporally enhanced variant of the random forest (RF) model, Omni-Temporal Balanced Random Forests (OT-BRFs), that incorporates temporal information in every tree within the forest. We develop and validate this model using longitudinal EHRs and clinician notes from the Mass General Brigham Health System recorded between 1998 and 2018, and compare its performance to a baseline Naive Bayes classifier and 2 standard versions of balanced RFs.

RESULTS: Temporal variables were found to be associated with suicide risk: Elevated suicide risk was observed in individuals with a higher total number of visits as well as those with a low rate of visits over time, while lower suicide risk was observed in individuals with a longer period of EHR coverage. RF models were more accurate than Naive Bayes classifiers at predicting suicide risk in advance (area under the receiver operating characteristic curve = 0.824 vs. 0.754, respectively). The proposed OT-BRF model performed best among all RF approaches, yielding a sensitivity of 0.339 at 95% specificity, compared to 0.290 and 0.286 for the other 2 RF models. Temporal variables were assigned high importance by the models that incorporated them.

DISCUSSION: We demonstrate that temporal variables have an important role to play in suicide risk detection and that requiring their inclusion in all RF trees leads to increased predictive performance.
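The results above are reported as sensitivity at 95% specificity. As a minimal sketch of how such a figure can be computed from model scores, the following Python function picks the lowest threshold that keeps at least the target share of negatives below it, then measures the share of positives above it. The function name, the "score strictly above threshold means positive" convention, and the toy data are assumptions made for illustration; the abstract does not describe the authors' evaluation code.

```python
import math


def sensitivity_at_specificity(scores, labels, target_specificity=0.95):
    """Return (threshold, sensitivity, specificity) for a fixed specificity target.

    A case is predicted positive when its score is strictly greater than
    the threshold (an assumed convention for this sketch).
    """
    negatives = sorted(s for s, y in zip(scores, labels) if y == 0)
    positives = [s for s, y in zip(scores, labels) if y == 1]
    if not negatives or not positives:
        raise ValueError("need at least one positive and one negative example")

    # Smallest number of negatives that must fall at or below the threshold.
    # The small tolerance guards against floating-point round-off in the product.
    k = math.ceil(target_specificity * len(negatives) - 1e-9)
    threshold = negatives[k - 1] if k > 0 else float("-inf")

    true_negatives = sum(1 for s in negatives if s <= threshold)
    true_positives = sum(1 for s in positives if s > threshold)
    return (
        threshold,
        true_positives / len(positives),
        true_negatives / len(negatives),
    )


# Toy data (hypothetical): 20 negatives scored 0..19 and 4 positives.
toy_scores = list(range(20)) + [5, 19, 50, 90]
toy_labels = [0] * 20 + [1] * 4
toy_result = sensitivity_at_specificity(toy_scores, toy_labels, 0.95)
```

Fixing specificity first and then reading off sensitivity makes models comparable at the same false-positive burden, which is why the comparison between the RF variants is stated this way.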
CONCLUSION: Integrating temporal information into risk prediction models helps the models interpret patient data in temporal context, improving predictive performance.
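To make the idea of including temporal variables in every tree concrete, here is a rough Python sketch of one possible reading: each tree trains on a class-balanced bootstrap sample and on a feature subset that always contains the temporal variables plus a random draw from the remaining features. The abstract does not say how OT-BRF enforces inclusion, so the per-tree feature subset, the function names, and the non-temporal feature names below are all hypothetical. The three temporal names simply mirror the variables mentioned in the results.

```python
import random


def balanced_bootstrap(labels, rng):
    """Draw equally many positive and negative row indices, with replacement."""
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    n = min(len(pos), len(neg))  # size of the minority class
    return rng.choices(pos, k=n) + rng.choices(neg, k=n)


def tree_feature_subset(temporal, other, n_random, rng):
    """Always keep the temporal features; sample the rest without replacement."""
    n_random = min(n_random, len(other))
    return list(temporal) + rng.sample(list(other), n_random)


def plan_forest(labels, temporal, other, n_trees, n_random, seed=0):
    """Return, for each tree, the rows and features it would be trained on."""
    rng = random.Random(seed)
    return [
        {
            "rows": balanced_bootstrap(labels, rng),
            "features": tree_feature_subset(temporal, other, n_random, rng),
        }
        for _ in range(n_trees)
    ]


# Hypothetical feature names, for illustration only.
TEMPORAL = ["total_visits", "visit_rate", "ehr_coverage_days"]
OTHER = ["dx_code_a", "dx_code_b", "rx_code_a", "lab_code_a", "note_term_a"]

toy_labels = [1] * 3 + [0] * 17  # imbalanced toy cohort
toy_plan = plan_forest(toy_labels, TEMPORAL, OTHER, n_trees=5, n_random=2)
```

Each entry in `toy_plan` could then be handed to any decision-tree learner. Standard random forests usually resample features at every split rather than once per tree, so this sketch illustrates the inclusion constraint only, not the full algorithm.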