

Word sense disambiguation in the clinical domain: a comparison of knowledge-rich and knowledge-poor unsupervised methods.

Rachel Chasin¹, Anna Rumshisky², Ozlem Uzuner³, Peter Szolovits¹.

Abstract

OBJECTIVE: To evaluate state-of-the-art unsupervised methods on the word sense disambiguation (WSD) task in the clinical domain. In particular, to compare graph-based approaches relying on a clinical knowledge base with bottom-up topic-modeling-based approaches. We investigate several enhancements to the topic-modeling techniques that use domain-specific knowledge sources.
MATERIALS AND METHODS: The graph-based methods use variations of PageRank and distance-based similarity metrics, operating over the Unified Medical Language System (UMLS). Topic-modeling methods use unlabeled data from the Multiparameter Intelligent Monitoring in Intensive Care (MIMIC II) database to derive models for each ambiguous word. We investigate the impact of using different linguistic features for topic models, including UMLS-based and syntactic features. We use a sense-tagged clinical dataset from the Mayo Clinic for evaluation.
RESULTS: The topic-modeling methods achieve 66.9% accuracy on a subset of the Mayo Clinic's data, while the graph-based methods only reach the 40-50% range, with a most-frequent-sense baseline of 56.5%. Features derived from the UMLS semantic type and concept hierarchies do not produce a gain over bag-of-words features in the topic models, but identifying phrases from UMLS and using syntax does help.
DISCUSSION: Although topic models outperform graph-based methods, semantic features derived from the UMLS prove too noisy to improve performance beyond bag-of-words.
CONCLUSIONS: Topic modeling for WSD provides superior results in the clinical domain; however, integration of knowledge remains to be effectively exploited. Published by the BMJ Publishing Group Limited.
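
For readers unfamiliar with the most-frequent-sense baseline quoted in the results, the following is a minimal sketch of how such a baseline and a system's accuracy can be computed for a single ambiguous word. The sense labels and predictions are hypothetical toy data, the function names are illustrative, and the abstract does not state how the authors aggregated scores across words, so this should not be read as their evaluation procedure.

```python
from collections import Counter


def most_frequent_sense_accuracy(gold_senses):
    """Accuracy obtained by always predicting the most common gold sense."""
    if not gold_senses:
        raise ValueError("gold_senses must not be empty")
    counts = Counter(gold_senses)
    _, top_count = counts.most_common(1)[0]
    return top_count / len(gold_senses)


def accuracy(predicted, gold):
    """Fraction of instances whose predicted sense matches the gold sense."""
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must have the same length")
    if not gold:
        raise ValueError("gold must not be empty")
    correct = sum(p == g for p, g in zip(predicted, gold))
    return correct / len(gold)


# Hypothetical sense tags for five occurrences of the ambiguous word "cold".
gold = ["temperature", "temperature", "illness", "temperature", "illness"]
# Hypothetical output of some unsupervised WSD system on the same instances.
predicted = ["temperature", "illness", "illness", "temperature", "illness"]

mfs_baseline = most_frequent_sense_accuracy(gold)
system_accuracy = accuracy(predicted, gold)
print(f"MFS baseline: {mfs_baseline:.3f}, system: {system_accuracy:.3f}")
```

A system is only informative for a given word when its accuracy exceeds this baseline, which is why the abstract reports the baseline alongside the results for both families of methods.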


Keywords: Clinical Language Processing; Medical Language Processing; Natural Language Processing; Word Sense Disambiguation


Year: 2014 PMID: 24441986 PMCID: PMC4147600 DOI: 10.1136/amiajnl-2013-002133

Source DB: PubMed Journal: J Am Med Inform Assoc ISSN: 1067-5027 Impact factor: 4.497

