Word sense disambiguation in the clinical domain: a comparison of knowledge-rich and knowledge-poor unsupervised methods.

Rachel Chasin, Anna Rumshisky, Ozlem Uzuner, Peter Szolovits.

Abstract

OBJECTIVE: To evaluate state-of-the-art unsupervised methods on the word sense disambiguation (WSD) task in the clinical domain. In particular, to compare graph-based approaches relying on a clinical knowledge base with bottom-up topic-modeling-based approaches. We investigate several enhancements to the topic-modeling techniques that use domain-specific knowledge sources.
MATERIALS AND METHODS: The graph-based methods use variations of PageRank and distance-based similarity metrics, operating over the Unified Medical Language System (UMLS). Topic-modeling methods use unlabeled data from the Multiparameter Intelligent Monitoring in Intensive Care (MIMIC II) database to derive models for each ambiguous word. We investigate the impact of using different linguistic features for topic models, including UMLS-based and syntactic features. We use a sense-tagged clinical dataset from the Mayo Clinic for evaluation.
RESULTS: The topic-modeling methods achieve 66.9% accuracy on a subset of the Mayo Clinic's data, while the graph-based methods only reach the 40-50% range, with a most-frequent-sense baseline of 56.5%. Features derived from the UMLS semantic type and concept hierarchies do not produce a gain over bag-of-words features in the topic models, but identifying phrases from UMLS and using syntax does help.
DISCUSSION: Although topic models outperform graph-based methods, semantic features derived from the UMLS prove too noisy to improve performance beyond bag-of-words.
CONCLUSIONS: Topic modeling for WSD provides superior results in the clinical domain; however, integration of knowledge remains to be effectively exploited.
Published by the BMJ Publishing Group Limited. For permission to use (where not already granted under a licence) please go to http://group.bmj.com/group/rights-licensing/permissions.
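
The graph-based idea named in MATERIALS AND METHODS (PageRank variations over a knowledge graph) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify which PageRank variants were used, so the sketch assumes a personalized PageRank whose restart mass sits on the concepts found in the context, and it picks the candidate sense with the highest rank. The graph, the concept names (`common_cold`, `cold_temperature`, and so on), and the helper names `personalized_pagerank` and `pick_sense` are all hypothetical; a real system would use UMLS concept identifiers and relations.

```python
from collections import defaultdict


def personalized_pagerank(edges, seeds, damping=0.85, iterations=50):
    """Power-iteration PageRank on an undirected graph.

    The restart (teleport) mass is spread only over the seed nodes,
    so rank concentrates on nodes close to the context concepts.
    """
    neighbors = defaultdict(set)
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)
    nodes = sorted(neighbors)

    # Ignore seeds that are not in the graph; fall back to uniform restart.
    seeds = [s for s in seeds if s in neighbors] or nodes
    teleport = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}

    rank = dict(teleport)
    for _ in range(iterations):
        new_rank = {}
        for n in nodes:
            # Each neighbor m passes an equal share of its rank to n.
            incoming = sum(rank[m] / len(neighbors[m]) for m in neighbors[n])
            new_rank[n] = (1 - damping) * teleport[n] + damping * incoming
        rank = new_rank
    return rank


def pick_sense(candidates, context_concepts, edges):
    """Return the candidate sense with the highest personalized rank."""
    rank = personalized_pagerank(edges, context_concepts)
    return max(candidates, key=lambda c: rank.get(c, 0.0))


# Hypothetical toy graph standing in for a clinical knowledge base.
EDGES = [
    ("common_cold", "cough"),
    ("common_cold", "fever"),
    ("common_cold", "viral_infection"),
    ("cold_temperature", "hypothermia"),
    ("cold_temperature", "frostbite"),
    ("body_temperature", "fever"),
    ("body_temperature", "hypothermia"),
]

# Disambiguate "cold" in a context that mentions cough and fever.
chosen = pick_sense(["common_cold", "cold_temperature"], ["cough", "fever"], EDGES)
print(chosen)
```

In this toy graph the illness sense is one hop from both context concepts, while the temperature sense is three hops from the nearest one, so the restart mass favors the illness sense. The abstract reports that such graph-based methods fell below the most-frequent-sense baseline on the clinical data, so the sketch shows the mechanism only, not a recommended approach.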

Keywords:  Clinical Language Processing; Medical Language Processing; Natural Language Processing; Word Sense Disambiguation

Year: 2014; PMID: 24441986; PMCID: PMC4147600; DOI: 10.1136/amiajnl-2013-002133

Source DB: PubMed; Journal: J Am Med Inform Assoc; ISSN: 1067-5027; Impact factor: 4.497


  11 in total

Review 1.  Clinical Natural Language Processing in 2014: Foundational Methods Supporting Efficient Healthcare.

Authors:  A Névéol; P Zweigenbaum
Journal:  Yearb Med Inform       Date:  2015-08-13

2.  Trends in biomedical informatics: automated topic analysis of JAMIA articles.

Authors:  Dong Han; Shuang Wang; Chao Jiang; Xiaoqian Jiang; Hyeon-Eui Kim; Jimeng Sun; Lucila Ohno-Machado
Journal:  J Am Med Inform Assoc       Date:  2015-11       Impact factor: 4.497

3.  Challenges in clinical natural language processing for automated disorder normalization.

Authors:  Robert Leaman; Ritu Khare; Zhiyong Lu
Journal:  J Biomed Inform       Date:  2015-07-14       Impact factor: 6.317

4.  Distinction between medical and non-medical usages of short forms in clinical narratives.

Authors:  Sungrim Moon; Donna Ihrke; Yuqun Zeng; Hongfang Liu
Journal:  AMIA Annu Symp Proc       Date:  2018-04-16

5.  Knowledge-Based Biomedical Word Sense Disambiguation with Neural Concept Embeddings.

Authors:  Akm Sabbir; Antonio Jimeno-Yepes; Ramakanth Kavuluru
Journal:  Proc IEEE Int Symp Bioinformatics Bioeng       Date:  2018-01-11

6.  Ambiguity in medical concept normalization: An analysis of types and coverage in electronic health record datasets.

Authors:  Denis Newman-Griffis; Guy Divita; Bart Desmet; Ayah Zirikly; Carolyn P Rosé; Eric Fosler-Lussier
Journal:  J Am Med Inform Assoc       Date:  2021-03-01       Impact factor: 4.497

7.  KnowLife: a versatile approach for constructing a large knowledge graph for biomedical sciences.

Authors:  Patrick Ernst; Amy Siu; Gerhard Weikum
Journal:  BMC Bioinformatics       Date:  2015-05-14       Impact factor: 3.169

Review 8.  Semantic annotation in biomedicine: the current landscape.

Authors:  Jelena Jovanović; Ebrahim Bagheri
Journal:  J Biomed Semantics       Date:  2017-09-22

9.  Concept Modeling-based Drug Repositioning.

Authors:  Jagadeesh Patchala; Anil G Jegga
Journal:  AMIA Jt Summits Transl Sci Proc       Date:  2015-03-23

10.  A bibliometric analysis of natural language processing in medical research.

Authors:  Xieling Chen; Haoran Xie; Fu Lee Wang; Ziqing Liu; Juan Xu; Tianyong Hao
Journal:  BMC Med Inform Decis Mak       Date:  2018-03-22       Impact factor: 2.796
