Yaoyun Zhang (1), Olivia Zhang (2), Yonghui Wu (1), Hee-Jin Lee (1), Jun Xu (1), Hua Xu (3), Kirk Roberts (4).
1. School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA.
2. St. John's School, Houston, TX 77019, USA.
3. School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA. Electronic address: hua.xu@uth.tmc.edu.
4. School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA. Electronic address: kirk.roberts@uth.tmc.edu.
Abstract
OBJECTIVE: Mental health is becoming an increasingly important topic in healthcare. Psychiatric symptoms, which consist of subjective descriptions of the patient's experience as well as the nature and severity of mental disorders, are critical to the phenotypic classification that supports personalized prevention, diagnosis, and intervention for mental disorders. However, few automated approaches have been proposed to extract psychiatric symptoms from clinical text, mainly because of (a) the lack of annotated corpora, which are time-consuming and costly to build, and (b) the inherent linguistic difficulty of symptoms, which are not well-defined clinical concepts in the way diseases are. The goal of this study is to investigate techniques for recognizing psychiatric symptoms in clinical text without labeled data. Instead, external knowledge in the form of publicly available "seed" lists of symptoms is leveraged using unsupervised distributional representations.
MATERIALS AND METHODS: First, psychiatric symptoms are collected from three online repositories of consumer healthcare knowledge (MedlinePlus, Mayo Clinic, and the American Psychiatric Association) for use as seed terms. Candidate symptoms in psychiatric notes are automatically extracted using phrasal syntax patterns; the 2016 CEGS N-GRID challenge data serves as the psychiatric note corpus. Second, three corpora (psychiatric notes, psychiatric forum data, and MIMIC II) are used to generate distributional representations with paragraph2vec. Finally, semantic similarity between the distributional representations of the seed symptoms and those of the candidate symptoms is calculated to assess the relevance of each candidate phrase. Experiments were performed on a set of psychiatric notes from the 2016 CEGS N-GRID challenge.
RESULTS & CONCLUSION: Our method demonstrates good performance at extracting symptoms from an unseen corpus, including symptoms that share no words with the provided seed terms. Semantic similarity based on the distributional representation outperformed baseline methods. Our experiment yielded two interesting results. First, distributional representations built from social media data outperformed those built from clinical data. Second, the distributional representation model built from sentences produced better representations of phrases than the model built from phrases alone.
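The final step of the method, scoring candidate phrases by their semantic similarity to seed symptoms, can be illustrated with a minimal sketch. It is not the authors' implementation. The vectors below are hand-made toy values standing in for paragraph2vec representations, cosine similarity is assumed as the similarity measure, and both taking the maximum similarity over all seeds and the `threshold` value are illustrative assumptions, since the abstract does not state how similarity is computed, aggregated, or cut off. The function names are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def score_candidates(seed_vectors, candidate_vectors):
    """Score each candidate phrase by its highest similarity to any seed.

    Assumption: max-over-seeds aggregation; the abstract does not specify this.
    """
    return {
        phrase: max(cosine(vec, seed) for seed in seed_vectors.values())
        for phrase, vec in candidate_vectors.items()
    }


def select_symptoms(seed_vectors, candidate_vectors, threshold=0.8):
    """Return (phrase, score) pairs at or above the threshold, best first.

    Assumption: the threshold is a hypothetical cutoff chosen for illustration.
    """
    scores = score_candidates(seed_vectors, candidate_vectors)
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return [(phrase, score) for phrase, score in ranked if score >= threshold]


# Toy 3-dimensional vectors; real representations would come from a trained model.
seeds = {
    "feeling hopeless": [0.9, 0.1, 0.0],
    "trouble sleeping": [0.1, 0.9, 0.1],
}
candidates = {
    "sense of despair": [0.8, 0.2, 0.1],
    "difficulty falling asleep": [0.2, 0.8, 0.0],
    "follow-up appointment": [0.0, 0.1, 0.9],
}

selected = select_symptoms(seeds, candidates, threshold=0.8)
for phrase, score in selected:
    print(f"{phrase}: {score:.3f}")
```

In this toy setup, "sense of despair" and "difficulty falling asleep" are kept even though neither shares a word with any seed, while "follow-up appointment" falls below the cutoff. This mirrors the kind of no-word-overlap match the abstract reports, without reproducing its results.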