Speculation detection for Chinese clinical notes: Impacts of word segmentation and embedding models.


Shaodian Zhang¹, Tian Kang¹, Xingting Zhang², Dong Wen², Noémie Elhadad¹, Jianbo Lei³.

Abstract

Speculations express uncertainty about stated facts. In clinical texts, identifying speculations is a critical step in natural language processing (NLP). While the task is nontrivial in many languages, detecting speculations in Chinese clinical notes can be particularly challenging because word segmentation may be necessary as an upstream operation. The objective of this paper is to construct a state-of-the-art speculation detection system for Chinese clinical notes and to investigate whether embedding features and word segmentation are worth exploiting for this task. We propose a sequence-labeling-based system for speculation detection that relies on bag-of-characters, bag-of-words, character-embedding, and word-embedding features. We experiment on a novel dataset of 36,828 clinical notes, 2,000 of which carry 5,103 gold-standard speculation annotations, and compare systems in which word embeddings are computed from word segmentations produced by a general-purpose segmenter and by a domain-specific segmenter, respectively. Our systems reach an F-score as high as 92.2%. We demonstrate that word segmentation is critical for producing high-quality word embeddings that facilitate downstream information extraction applications, and suggest that a domain-dependent word segmenter can be vital to such a clinical NLP task in Chinese.
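To make the feature design in the abstract concrete, here is a minimal sketch of how per-character features for a sequence labeler might be derived from a segmented sentence. It is not the authors' implementation: the function names, the B/I/E/S position tags, the context window, and both example segmentations are assumptions made for illustration. Embedding features are omitted; in a fuller version they would be vectors looked up for each character and for its containing word.

```python
from typing import Dict, List, Tuple


def char_word_info(words: List[str]) -> List[Tuple[str, str]]:
    """For every character, return (containing word, position tag).

    Position tags (an illustrative convention): S = single-character word,
    B = first character, E = last character, I = any character in between.
    """
    info = []
    for word in words:
        if len(word) == 1:
            info.append((word, "S"))
            continue
        for i in range(len(word)):
            tag = "B" if i == 0 else ("E" if i == len(word) - 1 else "I")
            info.append((word, tag))
    return info


def extract_features(words: List[str], window: int = 1) -> List[Dict[str, str]]:
    """Build one feature dict per character of the segmented sentence."""
    chars = "".join(words)
    info = char_word_info(words)
    features = []
    for i, ch in enumerate(chars):
        feats = {"char": ch, "word": info[i][0], "pos_in_word": info[i][1]}
        # Neighbouring characters within the window (bag-of-characters context).
        for off in range(-window, window + 1):
            if off == 0:
                continue
            j = i + off
            feats[f"char[{off:+d}]"] = chars[j] if 0 <= j < len(chars) else "<PAD>"
        features.append(feats)
    return features


# Two hypothetical segmentations of the same made-up phrase
# (roughly "pneumonia is considered possible").
general_seg = ["考虑", "肺", "炎", "可能"]   # assumed output of a general segmenter
domain_seg = ["考虑", "肺炎", "可能"]        # assumed output of a clinical segmenter

feats_general = extract_features(general_seg)
feats_domain = extract_features(domain_seg)
```

In this toy case the character-level features are identical for both segmentations, while the word-level features for 肺 and 炎 differ. This is one way segmentation quality could propagate into word-based features and word embeddings. The resulting feature dicts could be passed to any sequence labeler; the abstract does not name the model used, so that choice is left open here.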


Keywords: Chinese NLP; Clinical NLP; Natural language processing; Speculation detection; Word embedding; Word segmentation


Year: 2016 PMID: 26923634 PMCID: PMC5282586 DOI: 10.1016/j.jbi.2016.02.011

Source DB: PubMed Journal: J Biomed Inform ISSN: 1532-0464 Impact factor: 6.317
