
A long journey to short abbreviations: developing an open-source framework for clinical abbreviation recognition and disambiguation (CARD).

Yonghui Wu, Joshua C Denny, S Trent Rosenbloom, Randolph A Miller, Dario A Giuse, Lulu Wang, Carmelo Blanquicett, Ergin Soysal, Jun Xu, Hua Xu.

Abstract

OBJECTIVE: The goal of this study was to develop a practical framework for recognizing and disambiguating clinical abbreviations, thereby improving current clinical natural language processing (NLP) systems' capability to handle abbreviations in clinical narratives.
METHODS: We developed an open-source framework for clinical abbreviation recognition and disambiguation (CARD) that leverages our previously developed methods, including: (1) machine learning based approaches to recognize abbreviations from a clinical corpus, (2) clustering-based semiautomated methods to generate possible senses of abbreviations, and (3) profile-based word sense disambiguation methods for clinical abbreviations. We applied CARD to clinical corpora from Vanderbilt University Medical Center (VUMC) and generated 2 comprehensive sense inventories for abbreviations in discharge summaries and clinic visit notes. Furthermore, we developed a wrapper that integrates CARD with MetaMap, a widely used general clinical NLP system.
RESULTS AND CONCLUSION: CARD detected 27 317 and 107 303 distinct abbreviations from discharge summaries and clinic visit notes, respectively. Two sense inventories were constructed for the 1000 most frequent abbreviations in these 2 corpora. Using the sense inventories created from discharge summaries, CARD achieved an F1 score of 0.755 for identifying and disambiguating all abbreviations in a corpus from the VUMC discharge summaries, which is superior to MetaMap and Apache's clinical Text Analysis and Knowledge Extraction System (cTAKES). Using additional external corpora, we also demonstrated that the MetaMap-CARD wrapper improved MetaMap's performance in recognizing disorder entities in clinical notes. The CARD framework, 2 sense inventories, and the wrapper for MetaMap are publicly available at https://sbmi.uth.edu/ccb/resources/abbreviation.htm. We believe the CARD framework can be a valuable resource for improving abbreviation identification in clinical NLP systems.
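
ILLUSTRATION: The profile-based word sense disambiguation named in METHODS can be sketched roughly as follows. This is a minimal illustration and not CARD's actual implementation: the abstract does not specify the features or the similarity measure, so the bag-of-words profiles, the cosine similarity, the function names, and the toy sentences for the abbreviation "pt" are all assumptions made for this example.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and keep purely alphabetic tokens (a simplifying assumption)."""
    return [tok for tok in text.lower().split() if tok.isalpha()]


def build_profiles(examples):
    """Build one bag-of-words profile per sense from (sense, context) pairs."""
    profiles = {}
    for sense, context in examples:
        profiles.setdefault(sense, Counter()).update(tokenize(context))
    return profiles


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[word] for word, count in a.items() if word in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def disambiguate(context, profiles):
    """Return the sense whose profile is most similar to the given context."""
    vec = Counter(tokenize(context))
    return max(profiles, key=lambda sense: cosine(vec, profiles[sense]))


# Hypothetical training contexts for the abbreviation "pt" (invented for
# illustration; not taken from the VUMC corpora).
EXAMPLES = [
    ("patient", "pt admitted with chest pain"),
    ("patient", "pt reports nausea and vomiting"),
    ("physical therapy", "pt consulted for gait training"),
    ("physical therapy", "continue pt and occupational therapy for strength"),
]
PROFILES = build_profiles(EXAMPLES)
```

Calling `disambiguate("pt evaluated for gait and strength training", PROFILES)` selects the sense whose training contexts share the most vocabulary with the new sentence. A practical system would presumably draw its profiles from a sense inventory built over a large corpus rather than a handful of sentences.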
© The Author 2016. Published by Oxford University Press on behalf of the American Medical Informatics Association. All rights reserved. For Permissions, please email: journals.permissions@oup.com

Keywords:  clinical abbreviation; clinical natural language processing; machine learning; sense clustering

Year:  2017        PMID: 27539197      PMCID: PMC7651947          DOI: 10.1093/jamia/ocw109

Source DB:  PubMed          Journal:  J Am Med Inform Assoc        ISSN: 1067-5027            Impact factor:   4.497


