Zfania Tom Korach (1), Jie Yang (2), Sarah Collins Rossetti (3), Kenrick D Cato (4), Min-Jeoung Kang (2), Christopher Knaplund (4), Kumiko O Schnock (2), Jose P Garcia (5), Haomiao Jia (4), Jessica M Schwartz (4), Li Zhou (2).
1. Division of General Internal Medicine and Primary Care, Brigham and Women's Hospital, Boston, MA, United States; Harvard Medical School, Boston, MA, United States. Electronic address: zkorach@bwh.harvard.edu.
2. Division of General Internal Medicine and Primary Care, Brigham and Women's Hospital, Boston, MA, United States; Harvard Medical School, Boston, MA, United States.
3. Department of Biomedical Informatics, Columbia University, New York, NY, United States; School of Nursing, Columbia University, New York, NY, United States.
4. School of Nursing, Columbia University, New York, NY, United States.
5. Division of General Internal Medicine and Primary Care, Brigham and Women's Hospital, Boston, MA, United States.
Abstract
OBJECTIVE: Early identification and treatment of patient deterioration is crucial to improving clinical outcomes. To act, hospital rapid response (RR) teams often rely on nurses' clinical judgement, which is typically documented narratively in the electronic health record (EHR). We developed a data-driven, unsupervised method to discover potential risk factors of RR events from nursing notes. METHODS: We applied multiple natural language processing methods, including language modelling, word embeddings, and two phrase mining methods (TextRank and NC-Value), to identify quality phrases that represent clinical entities from unannotated nursing notes. TextRank was used to determine the important word sequences in each note. NC-Value was then used to globally rank the locally important sequences across the whole corpus. We evaluated our method both on its accuracy compared to human judgement and on the ability of the mined phrases to predict a clinical outcome, RR event hazard. RESULTS: When applied to 61,740 hospital encounters with 1,067 RR events and 778,955 notes, our method achieved an average precision of 0.590 to 0.764 (when excluding numeric tokens). A Cox model with time-dependent covariates using the phrases achieved a concordance index of 0.739. Clustering the phrases revealed clinical concepts significantly associated with RR event hazard. DISCUSSION: Our findings demonstrate that our minimal-annotation, unsupervised method can rapidly mine quality phrases from a large amount of nursing notes, and that these identified phrases are useful for downstream tasks, such as clinical outcome prediction and risk factor identification.
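To make the per-note ranking step more concrete, here is a minimal sketch of the general TextRank idea: words become nodes in a co-occurrence graph and are scored with a PageRank-style update. It is not the authors' implementation. The abstract says TextRank was applied to word sequences, whereas this sketch scores single words only, and the function name, the window size, the damping factor, and the example note are all assumptions chosen for illustration.

```python
from collections import defaultdict


def textrank_scores(tokens, window=2, damping=0.85, iterations=50):
    """Score words with PageRank over an undirected co-occurrence graph.

    Illustrative only: `window`, `damping` and `iterations` are assumed
    defaults, not values reported in the abstract.
    """
    # Link each token to the tokens that follow it within `window` positions.
    neighbors = defaultdict(set)
    for i, word in enumerate(tokens):
        for other in tokens[i + 1 : i + 1 + window]:
            if other != word:
                neighbors[word].add(other)
                neighbors[other].add(word)

    # Power iteration of the unweighted PageRank update: every word shares
    # its current score equally among the words it co-occurs with.
    scores = {word: 1.0 for word in neighbors}
    for _ in range(iterations):
        scores = {
            word: (1 - damping)
            + damping * sum(scores[n] / len(neighbors[n]) for n in neighbors[word])
            for word in neighbors
        }
    return scores


# Hypothetical, pre-tokenized note with stop words already removed.
note = "patient short breath patient anxious breath sounds diminished".split()
ranked = sorted(textrank_scores(note).items(), key=lambda kv: -kv[1])
```

In this sketch, `ranked` holds the words of one note ordered from most to least central. A corpus-level step such as NC-Value would then compare these locally important candidates across all notes, which is outside what the sketch covers.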