Vlada Rozova (1,2), Katrina Witt (3,4), Jo Robinson (3,4), Yan Li (2), Karin Verspoor (1,2)
1. School of Computing Technologies, RMIT University, Melbourne, Victoria, Australia.
2. School of Computing and Information Systems, The University of Melbourne, Melbourne, Victoria, Australia.
3. Orygen, Melbourne, Victoria, Australia.
4. Centre for Youth Mental Health, The University of Melbourne, Melbourne, Victoria, Australia.
Abstract
OBJECTIVE: Accurate identification of self-harm presentations to Emergency Departments (ED) can lead to more timely mental health support, aid in understanding the burden of suicidal intent in a population, and support impact evaluation of public health initiatives related to suicide prevention. Given the lack of manual self-harm reporting in ED, we aim to develop an automated system for detecting self-harm presentations directly from ED triage notes.

MATERIALS AND METHODS: We frame this as a supervised classification task using natural language processing (NLP), drawing on a large data set of 477 627 free-text triage notes from ED presentations to The Royal Melbourne Hospital, Australia, in 2012-2018. The data were highly imbalanced, with only 1.4% of triage notes relating to self-harm. We explored several machine learning methods and a range of preprocessing techniques, including spelling correction, negation detection, bigram replacement, and clinical concept recognition.

RESULTS: Machine learning methods dramatically outperformed keyword-based methods. We achieved the best results with a calibrated Gradient Boosting model, which showed 90% Precision and 90% Recall (PR-AUC 0.87) on blind test data. Prospective validation of the model achieved similar results (88% Precision; 89% Recall).

DISCUSSION: ED notes are noisy texts, and simple token-based models work best. Negation detection and concept recognition did not change the results, while bigram replacement significantly impaired model performance.

CONCLUSION: This first NLP-based classifier for self-harm in ED notes has practical value for identifying patients who would benefit from mental health follow-up in ED, and for supporting surveillance of self-harm and suicide prevention efforts in the population.
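As an illustration of the keyword-based baseline and of the Precision and Recall figures reported above, the following is a minimal Python sketch. It is not the authors' implementation: the keyword list (`KEYWORDS`), the helper names, and the four triage notes are all invented for illustration and do not come from the study data.

```python
# Hypothetical keyword list; the abstract does not specify the keywords
# used by the study's keyword-based baseline.
KEYWORDS = ("self harm", "self-harm", "overdose", "suicid")


def keyword_flag(note: str) -> bool:
    """Flag a note if any keyword occurs as a substring of the lowercased text."""
    text = note.lower()
    return any(keyword in text for keyword in KEYWORDS)


def precision_recall(y_true, y_pred):
    """Compute Precision and Recall for binary labels (True = self-harm)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t and p)
    fp = sum(1 for t, p in pairs if not t and p)
    fn = sum(1 for t, p in pairs if t and not p)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Invented toy notes with invented labels, for illustration only.
NOTES = [
    ("intentional overdose of paracetamol, low mood", True),
    ("pt denies suicidal ideation, chest pain 2 hrs", False),
    ("lacerations to left forearm, self inflicted", True),
    ("fall from ladder, wrist pain", False),
]

y_true = [label for _, label in NOTES]
y_pred = [keyword_flag(text) for text, _ in NOTES]
precision, recall = precision_recall(y_true, y_pred)
print(f"Precision: {precision:.2f}, Recall: {recall:.2f}")
```

On these toy notes the keyword matcher flags the negated mention ("denies suicidal ideation") and misses the self-inflicted laceration, which is the kind of error a classifier learned from labelled notes may be able to avoid.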