A comprehensive study of named entity recognition in Chinese clinical text.

Jianbo Lei, Buzhou Tang, Xueqin Lu, Kaihua Gao, Min Jiang, Hua Xu.

Abstract

OBJECTIVE: Named entity recognition (NER) is one of the fundamental tasks in natural language processing. In the medical domain, there have been a number of studies on NER in English clinical notes; however, very limited NER research has been carried out on clinical notes written in Chinese. The goal of this study was to systematically investigate features and machine learning algorithms for NER in Chinese clinical text.
MATERIALS AND METHODS: We randomly selected 400 admission notes and 400 discharge summaries from Peking Union Medical College Hospital in China. For each note, four types of entity (clinical problems, procedures, laboratory tests, and medications) were annotated according to a predefined guideline. Two-thirds of the 400 notes were used to train the NER systems and one-third for testing. We investigated the effects of different feature types, including bag-of-characters, word segmentation, part-of-speech, and section information, and of different machine learning algorithms, including conditional random fields (CRF), support vector machines (SVM), maximum entropy (ME), and structural SVM (SSVM), on the Chinese clinical NER task. All classifiers were trained on the training dataset and evaluated on the test set, and micro-averaged precision, recall, and F-measure were reported.
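The feature types named above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the B/I/E/S encoding of word segmentation, the one-character context window, and the "<BOS>"/"<EOS>" padding symbols are all assumptions made for illustration, and part-of-speech features are omitted. The sketch builds one feature dictionary per character, which is the usual input shape for sequence labelers such as CRF.

```python
from typing import Dict, List


def segmentation_tags(words: List[str]) -> List[str]:
    """Map a segmented sentence to per-character position tags.

    Assumed encoding (not taken from the paper): S = single-character word,
    B/I/E = beginning/inside/end of a multi-character word.
    """
    tags: List[str] = []
    for word in words:
        if len(word) == 1:
            tags.append("S")
        else:
            tags.extend(["B"] + ["I"] * (len(word) - 2) + ["E"])
    return tags


def char_features(words: List[str], section: str, window: int = 1) -> List[Dict[str, str]]:
    """Build one feature dict per character of the sentence.

    Combines the character itself, its neighbours within `window`
    (a bag-of-characters style context), the word-segmentation tag,
    and the name of the note section the sentence came from.
    """
    chars = [ch for word in words for ch in word]
    seg = segmentation_tags(words)
    n = len(chars)
    features: List[Dict[str, str]] = []
    for i, ch in enumerate(chars):
        feat = {"char": ch, "seg": seg[i], "section": section}
        for off in range(1, window + 1):
            # Pad with hypothetical boundary symbols at the sentence edges.
            feat[f"char-{off}"] = chars[i - off] if i - off >= 0 else "<BOS>"
            feat[f"char+{off}"] = chars[i + off] if i + off < n else "<EOS>"
        features.append(feat)
    return features


# Hypothetical example sentence, already segmented into words.
example = char_features(["患者", "有", "高血压"], section="history")
```

Each dictionary in `example` could then be passed to a sequence labeler together with per-character entity labels; the choice of labeler and label scheme is left open here.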
RESULTS: Our evaluation on the independent test set showed that most feature types were beneficial to Chinese NER systems, although the improvements were limited. The system achieved its highest performance when word segmentation and section information were combined, indicating that these two feature types complement each other. When the same optimized feature types were used, CRF and SSVM outperformed SVM and ME. More specifically, SSVM achieved the highest performance of the four algorithms, with F-measures of 93.51% and 90.01% for admission notes and discharge summaries, respectively.
Published by the BMJ Publishing Group Limited.
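For readers unfamiliar with the reported metric, the following Python sketch shows how micro-averaged precision, recall, and F-measure are commonly computed for NER. It assumes exact-match scoring, in which a predicted entity counts as correct only if its start offset, end offset, and type all match a gold entity; the abstract does not state the matching criterion, so this is an assumption, and the function name and sample data are hypothetical.

```python
from typing import List, Set, Tuple

# An entity is (start offset, end offset, entity type).
Entity = Tuple[int, int, str]


def micro_prf(gold_docs: List[Set[Entity]],
              pred_docs: List[Set[Entity]]) -> Tuple[float, float, float]:
    """Micro-averaged precision, recall, and F-measure.

    Counts are pooled over all documents and entity types before the
    ratios are taken, so frequent entity types weigh more than rare ones.
    """
    true_pos = n_pred = n_gold = 0
    for gold, pred in zip(gold_docs, pred_docs):
        true_pos += len(gold & pred)  # exact matches only (assumption)
        n_pred += len(pred)
        n_gold += len(gold)
    precision = true_pos / n_pred if n_pred else 0.0
    recall = true_pos / n_gold if n_gold else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_measure


# Hypothetical gold and predicted annotations for two notes.
gold = [{(0, 2, "problem"), (5, 7, "medication")}, {(1, 3, "test")}]
pred = [{(0, 2, "problem"), (5, 8, "medication")}, {(1, 3, "test")}]
p, r, f = micro_prf(gold, pred)
```

In this toy example the second medication span has a wrong end offset, so two of the three predictions match and all three values come out to 2/3.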

Keywords:  Chinese clinic notes; Machine learning algorithm; Medical concept recognition; Natural language processing

Year:  2013        PMID: 24347408      PMCID: PMC4147609          DOI: 10.1136/amiajnl-2013-002381

Source DB:  PubMed          Journal:  J Am Med Inform Assoc        ISSN: 1067-5027            Impact factor:   4.497
