Maxwell Levis1,2, Christine Leonard Westgate1, Jiang Gui2, Bradley V Watts2,3, Brian Shiner1,2,3,4. 1. White River Junction VA Medical Center, White River Junction, VT, USA. 2. Geisel School of Medicine at Dartmouth, Hanover, NH, USA. 3. VA Office of Systems Redesign and Improvement, White River Junction, VT, USA. 4. National Center for PTSD Executive Division, White River Junction, VT, USA.
Abstract
BACKGROUND: This study evaluated whether natural language processing (NLP) of psychotherapy note text provides additional accuracy over and above currently used suicide prediction models. METHODS: We used a cohort of Veterans Health Administration (VHA) users diagnosed with post-traumatic stress disorder (PTSD) between 2004 and 2013. Using a case-control design, cases (those who died by suicide during the year following diagnosis) were matched to controls (those who remained alive). After selecting conditional matches based on having shared mental health providers, we chose controls using a 5:1 nearest-neighbor propensity match based on the VHA's structured Electronic Medical Records (EMR)-based suicide prediction model. For cases, psychotherapist notes were collected from diagnosis until death. For controls, psychotherapist notes were collected from diagnosis until the matched case's date of death. After ensuring similar numbers of notes, the final sample included 246 cases and 986 controls. Notes were analyzed using Sentiment Analysis and Cognition Engine, a Python-based NLP package. The output was evaluated using machine-learning algorithms. The area under the curve (AUC) was calculated to determine the models' predictive accuracy. RESULTS: NLP-derived variables offered small but significant predictive improvement (AUC = 0.58) for patients who had longer treatment duration. A small sample size limited predictive accuracy. CONCLUSIONS: This study identifies a novel method for measuring suicide risk over time and potentially categorizing patient subgroups with distinct risk sensitivities. Findings suggest that leveraging NLP-derived variables from psychotherapy notes offers additional predictive value over and above the VHA's state-of-the-art structured EMR-based suicide prediction model. Replication with a larger, non-PTSD-specific sample is required.
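The two quantitative steps described in the abstract can be sketched as follows. This is a minimal illustration only: the abstract does not specify the classifier, the matching implementation, or the software stack, so scikit-learn, logistic regression, and all synthetic data here are assumptions for demonstration (the paper's final sample of 246 cases and 986 controls was reached only after further note-count filtering).

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)

# (1) 5:1 nearest-neighbor propensity match: for each case, find the five
# controls closest on the structured EMR-based suicide risk score.
# (Matching without replacement is not handled in this sketch.)
case_scores = rng.uniform(size=(246, 1))       # stand-in risk scores, cases
control_scores = rng.uniform(size=(5000, 1))   # stand-in risk scores, pool
nn = NearestNeighbors(n_neighbors=5).fit(control_scores)
_, matched_idx = nn.kneighbors(case_scores)    # shape (246, 5): control indices

# (2) Fit a classifier on NLP-derived note features and report the AUC.
n = 246 + 246 * 5                              # cases + matched controls
X = rng.normal(size=(n, 20))                   # stand-in NLP feature matrix
y = np.r_[np.ones(246), np.zeros(246 * 5)]     # 1 = case, 0 = control
clf = LogisticRegression(max_iter=1000).fit(X, y)
auc = roc_auc_score(y, clf.predict_proba(X)[:, 1])
```

With random features as here, the AUC hovers near chance (0.5); the study's reported AUC of 0.58 reflects the modest signal the NLP-derived variables added for longer-treatment patients.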
Keywords:
Electronic medical records; natural language processing; suicide prediction; veterans mental health