MCN: A comprehensive corpus for medical concept normalization.

Yen-Fu Luo, Weiyi Sun, Anna Rumshisky.

Abstract

Normalization of clinical text involves linking different ways of talking about the same clinical concept to the same term in a standardized vocabulary. To date, very few annotated corpora for normalization have been available, and existing corpora have been limited in scope, dealing only with the normalization of diseases and disorders. In this paper, we describe the annotation methodology we developed to create a new manually annotated, wide-coverage corpus for clinical concept normalization, the Medical Concept Normalization (MCN) corpus. To ensure wider coverage, we applied normalization to the text spans corresponding to the medical problems, treatments, and tests in the named entity corpus released for the fourth i2b2/VA shared task. In contrast to previous annotation efforts, we do not assign multiple concept labels to named entities that do not map to a unique concept in the controlled vocabulary, nor do we leave such named entities without a concept label. Instead, our normalization method splits such named entities, resolving some of the core ambiguity issues. Lastly, we supply a sieve-based normalization baseline for MCN which combines MetaMap with multiple exact match components. The resulting corpus consists of 100 discharge summaries and provides normalization for a total of 10,919 concept mentions, using 3,792 unique concepts from two controlled vocabularies. Our inter-annotator agreement is 67.69% pre-adjudication and 74.20% post-adjudication. Our sieve-based normalization baseline for MCN achieves 77% accuracy in cross-validation. We also detail the challenges of creating a normalization corpus, including the limitations deriving from both the mention span selection and the ambiguity and inconsistency within the current standardized terminologies.
In order to facilitate the development of improved concept normalization methods, the MCN corpus will be publicly released to the research community in a shared task in 2019.
Copyright © 2019 Elsevier Inc. All rights reserved.
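
For readers unfamiliar with the term, the general idea of a sieve-based normalizer can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' baseline: the abstract does not specify the sieves or their order, so the two exact-match lexicons, the concept identifiers ("C-0001", "C-0002"), the `NO_MATCH` label, and the `external_mapper_stub` function (a stand-in for a tool such as MetaMap, which is not actually called here) are all hypothetical.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical toy lexicons; the concept identifiers are made up for illustration.
TRAIN_LEXICON: Dict[str, str] = {"heart attack": "C-0001"}
VOCAB_LEXICON: Dict[str, str] = {
    "myocardial infarction": "C-0001",
    "chest x-ray": "C-0002",
}

NO_MATCH = "NO_MATCH"  # hypothetical label for mentions no sieve can resolve

Sieve = Callable[[str], Optional[str]]


def canonical(mention: str) -> str:
    """Lowercase the mention and collapse whitespace before lookup."""
    return " ".join(mention.lower().split())


def make_exact_sieve(lexicon: Dict[str, str]) -> Sieve:
    """Build a sieve that returns a concept only on an exact string match."""
    def sieve(mention: str) -> Optional[str]:
        return lexicon.get(canonical(mention))
    return sieve


def external_mapper_stub(mention: str) -> Optional[str]:
    """Placeholder for an external mapping tool; always abstains in this sketch."""
    return None


def run_sieves(mention: str, sieves: List[Sieve]) -> str:
    """Apply sieves in order; the first one that returns a concept wins."""
    for sieve in sieves:
        concept = sieve(mention)
        if concept is not None:
            return concept
    return NO_MATCH


def accuracy(pairs: List[Tuple[str, str]], sieves: List[Sieve]) -> float:
    """Fraction of (mention, gold concept) pairs that the sieves get right."""
    if not pairs:
        return 0.0
    correct = sum(run_sieves(mention, sieves) == gold for mention, gold in pairs)
    return correct / len(pairs)


# Assumed ordering: training-data matches first, then vocabulary, then the stub.
SIEVES: List[Sieve] = [
    make_exact_sieve(TRAIN_LEXICON),
    make_exact_sieve(VOCAB_LEXICON),
    external_mapper_stub,
]
```

Calling `run_sieves("Heart  Attack", SIEVES)` resolves the mention in the first sieve, while a mention absent from both lexicons falls through every sieve and receives the fallback label. Ordering the sieves from most to least precise is the usual design choice, since an earlier match prevents later sieves from overriding it.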

Keywords:  Annotation; Clinical concept normalization; Medical informatics; Natural language processing

Year: 2019    PMID: 30802545    DOI: 10.1016/j.jbi.2019.103132

Source DB: PubMed    Journal: J Biomed Inform    ISSN: 1532-0464    Impact factor: 6.317

