Normalizing clinical terms using learned edit distance patterns.

Rohit J Kate.

Abstract

BACKGROUND: Clinical texts frequently contain variations of clinical terms. Normalization methods that match clinical terms to standard terminologies using similarity measures or hand-coded approximation rules have limited accuracy and coverage.
MATERIALS AND METHODS: In this paper, a novel method is presented that automatically learns patterns of variation in clinical terms from the known variations listed in a resource such as the Unified Medical Language System (UMLS). The patterns are first learned by computing edit distances between the known variations and are then generalized so that they can normalize previously unseen terms. The method was applied to the disease and disorder mention normalization task, evaluated on the SemEval 2014 dataset, and compared with the normalization ability of the MetaMap system and of a method based on cosine similarity.
RESULTS: Excluding the mentions that already had an exact match in UMLS and the training dataset, the proposed method obtained 64.7% accuracy on the rest of the test dataset. Accuracy was calculated as the proportion of mentions that were either correctly matched to the gold-standard concept unique identifiers (CUIs) or correctly identified as having no CUI. In comparison, MetaMap's accuracy was 41.9% and cosine similarity's accuracy was 44.6%. When only the output CUIs were evaluated, the proposed method obtained a best F-measure of 54.4% (at 92.1% precision and 38.6% recall), MetaMap obtained a best F-measure of 19.4% (at 38.0% precision and 13.0% recall), and cosine similarity obtained a best F-measure of 38.1% (at 70.3% precision and 26.1% recall).
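As a quick consistency check on the figures above, the reported F-measures can be recomputed from the reported precision and recall. This is a minimal sketch that assumes "F-measure" means the balanced F1 score (the harmonic mean of precision and recall); the abstract does not state the weighting, and the function name `f_measure` is chosen here for illustration only.

```python
def f_measure(precision: float, recall: float) -> float:
    """Balanced F-measure (F1): harmonic mean of precision and recall.

    Assumption: the abstract's "F-measure" is the balanced F1 score.
    """
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision %, recall %, reported best F-measure %) as given in the abstract
reported = {
    "proposed method": (92.1, 38.6, 54.4),
    "MetaMap": (38.0, 13.0, 19.4),
    "cosine similarity": (70.3, 26.1, 38.1),
}

for name, (p, r, f) in reported.items():
    print(f"{name}: computed {f_measure(p, r):.1f}, reported {f}")
```

Under that assumption, all three recomputed values agree with the reported ones to one decimal place.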
CONCLUSIONS: The novel method was found to perform much better than the MetaMap system and the cosine-similarity-based method in normalizing disease mentions in clinical text that had no exact match in UMLS. The method is general and can also be used for normalizing clinical terms of other semantic types.
© The Author 2015. Published by Oxford University Press on behalf of the American Medical Informatics Association. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
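The idea in Materials and Methods of learning a pattern from the edit distance between two known variations and then generalizing it can be illustrated roughly as follows. This is only a sketch under stated assumptions, not the paper's algorithm: it aligns terms at the word level with a standard Levenshtein computation, generalizes by collapsing each run of unchanged words into a wildcard, and applies the result with a regular expression. The function names (`edit_script`, `generalize`, `apply_pattern`) and the example terms are hypothetical.

```python
import re
from typing import List, Optional, Tuple

Op = Tuple[str, str, str]  # (kind, source_token, target_token)


def edit_script(src: List[str], tgt: List[str]) -> List[Op]:
    """Levenshtein alignment of two token lists, returned as edit operations."""
    n, m = len(src), len(tgt)
    # dist[i][j] = edit distance between src[:i] and tgt[:j]
    dist = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        dist[i][0] = i
    for j in range(m + 1):
        dist[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if src[i - 1] == tgt[j - 1] else 1
            dist[i][j] = min(dist[i - 1][j] + 1,         # delete
                             dist[i][j - 1] + 1,         # insert
                             dist[i - 1][j - 1] + cost)  # keep or substitute
    # Walk back from the corner to recover one optimal sequence of operations.
    ops: List[Op] = []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            cost = 0 if src[i - 1] == tgt[j - 1] else 1
            if dist[i][j] == dist[i - 1][j - 1] + cost:
                ops.append(("same" if cost == 0 else "sub", src[i - 1], tgt[j - 1]))
                i, j = i - 1, j - 1
                continue
        if i > 0 and dist[i][j] == dist[i - 1][j] + 1:
            ops.append(("del", src[i - 1], ""))
            i -= 1
        else:
            ops.append(("ins", "", tgt[j - 1]))
            j -= 1
    ops.reverse()
    return ops


def generalize(ops: List[Op]) -> List[Op]:
    """Collapse each run of unchanged tokens into a single wildcard ("any") op."""
    pattern: List[Op] = []
    for kind, s, t in ops:
        if kind == "same":
            if not pattern or pattern[-1][0] != "any":
                pattern.append(("any", "", ""))
        else:
            pattern.append((kind, s, t))
    return pattern


def apply_pattern(pattern: List[Op], term: str) -> Optional[str]:
    """Rewrite `term` if it matches the source side of `pattern`, else None."""
    # Source side: literal tokens for "sub"/"del", a capture group per wildcard.
    parts = [r"(.+)" if kind == "any" else re.escape(s)
             for kind, s, _ in pattern if kind != "ins"]
    match = re.fullmatch(r"\s+".join(parts), term)
    if match is None:
        return None
    groups = iter(match.groups())
    out = []
    for kind, _, t in pattern:
        if kind == "any":
            out.append(next(groups))  # copy the unchanged words through
        elif kind in ("sub", "ins"):
            out.append(t)             # emit the target-side token
    return " ".join(out)


# Learn from one known pair of variations, then normalize an unseen term.
learned = generalize(edit_script("tumour of kidney".split(),
                                 "tumor of kidney".split()))
print(apply_pattern(learned, "tumour of liver"))  # tumor of liver
print(apply_pattern(learned, "kidney stone"))     # None
```

A practical system would need much more than this, for example character-level edits, ranking among competing patterns, and a lookup of the rewritten term in the terminology to obtain a CUI; none of those details are given in the abstract.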

Keywords:  clinical terms; edit distance; normalization

Year:  2015        PMID: 26232443     DOI: 10.1093/jamia/ocv108

Source DB:  PubMed          Journal:  J Am Med Inform Assoc        ISSN: 1067-5027            Impact factor:   4.497
