Using symbolic knowledge in the UMLS to disambiguate words in small datasets with a naïve Bayes classifier.

Gondy Leroy, Thomas C Rindflesch.

Abstract

Current approaches to word sense disambiguation use and combine various machine-learning techniques. Most refer to characteristics of the ambiguous word and surrounding words and are based on hundreds of examples. Unfortunately, developing large training sets is time-consuming. We investigate the use of symbolic knowledge to augment machine-learning techniques for small datasets. UMLS semantic types assigned to concepts found in the sentence and relationships between these semantic types form the knowledge base. A naïve Bayes classifier was trained for 15 words with 100 examples for each. The most frequent sense of a word served as the baseline. The effect of increasingly accurate symbolic knowledge was evaluated in eight experimental conditions. Performance was measured by accuracy based on 10-fold cross-validation. The best condition used only the semantic types of the words in the sentence. Accuracy was then on average 10% higher than the baseline; however, it varied from 8% deterioration to 29% improvement. In a follow-up evaluation, we noted a trend that the best disambiguation was found for words that were the least troublesome to the human evaluators.
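The evaluation setup described in the abstract (a naïve Bayes classifier over sentence-level features, a most-frequent-sense baseline, and accuracy under 10-fold cross-validation) can be illustrated with a minimal sketch. This is not the authors' implementation. The ambiguous word ("cold"), its two senses, the feature sets, and all function names below are hypothetical and made up for illustration; the strings such as "dsyn" merely stand in for UMLS semantic types found in a sentence, and the multinomial model with add-one smoothing is an assumption, since the abstract does not specify the variant used.

```python
import math
import random
from collections import Counter, defaultdict


def train_nb(examples):
    """Count senses and per-sense feature occurrences from (features, sense) pairs."""
    sense_counts = Counter(sense for _, sense in examples)
    feat_counts = defaultdict(Counter)
    vocab = set()
    for feats, sense in examples:
        feat_counts[sense].update(feats)
        vocab.update(feats)
    return sense_counts, feat_counts, vocab


def predict_nb(model, feats):
    """Return the sense with the highest log-posterior (add-one smoothing)."""
    sense_counts, feat_counts, vocab = model
    total = sum(sense_counts.values())
    best_sense, best_lp = None, -math.inf
    for sense in sorted(sense_counts):
        lp = math.log(sense_counts[sense] / total)
        denom = sum(feat_counts[sense].values()) + len(vocab)
        for f in feats:
            if f in vocab:  # ignore features never seen in training
                lp += math.log((feat_counts[sense][f] + 1) / denom)
        if lp > best_lp:
            best_sense, best_lp = sense, lp
    return best_sense


def cross_validate(examples, k=10, seed=0):
    """Return (naive Bayes accuracy, most-frequent-sense accuracy) over k folds."""
    data = list(examples)
    random.Random(seed).shuffle(data)
    nb_hits = base_hits = 0
    for i in range(k):
        test = data[i::k]
        train = [ex for j, ex in enumerate(data) if j % k != i]
        model = train_nb(train)
        # Baseline: always predict the most frequent sense in the training fold.
        majority = Counter(s for _, s in train).most_common(1)[0][0]
        for feats, sense in test:
            nb_hits += predict_nb(model, feats) == sense
            base_hits += majority == sense
    return nb_hits / len(data), base_hits / len(data)


def toy_examples():
    """Hypothetical, hand-made data; NOT taken from the paper."""
    illness = [(["dsyn", "sosy"], "illness")] * 6 + [(["dsyn", "phsu"], "illness")] * 6
    temperature = [(["npop", "qnco"], "temperature")] * 8
    return illness + temperature


if __name__ == "__main__":
    nb_acc, base_acc = cross_validate(toy_examples(), k=10)
    print(f"naive Bayes: {nb_acc:.2f}  baseline: {base_acc:.2f}")
```

Because the toy data are cleanly separable, the numbers this sketch prints say nothing about the results reported in the abstract; the point is only how the classifier is scored against the most-frequent-sense baseline within the same folds.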

Year:  2004        PMID: 15360839

Source DB:  PubMed          Journal:  Stud Health Technol Inform        ISSN: 0926-9630


  8 in total

1.  Tailoring vocabularies for NLP in sub-domains: a method to detect unused word sense.

Authors:  Rosa L Figueroa; Qing Zeng-Treitler; Sergey Goryachev; Eduardo P Wiechmann
Journal:  AMIA Annu Symp Proc       Date:  2009-11-14

2.  A sense inventory for clinical abbreviations and acronyms created using clinical notes and medical dictionary resources.

Authors:  Sungrim Moon; Serguei Pakhomov; Nathan Liu; James O Ryan; Genevieve B Melton
Journal:  J Am Med Inform Assoc       Date:  2013-06-27       Impact factor: 4.497

3.  Clinical Word Sense Disambiguation with Interactive Search and Classification.

Authors:  Yue Wang; Kai Zheng; Hua Xu; Qiaozhu Mei
Journal:  AMIA Annu Symp Proc       Date:  2017-02-10

4.  Automated disambiguation of acronyms and abbreviations in clinical texts: window and training size considerations.

Authors:  Sungrim Moon; Serguei Pakhomov; Genevieve B Melton
Journal:  AMIA Annu Symp Proc       Date:  2012-11-03

5.  Hyperdimensional computing approach to word sense disambiguation.

Authors:  Bjoern-Toby Berster; J Caleb Goodwin; Trevor Cohen
Journal:  AMIA Annu Symp Proc       Date:  2012-11-03

6.  Challenges and practical approaches with word sense disambiguation of acronyms and abbreviations in the clinical domain.

Authors:  Sungrim Moon; Bridget McInnes; Genevieve B Melton
Journal:  Healthc Inform Res       Date:  2015-01-31

7.  Word Sense Disambiguation by Selecting the Best Semantic Type Based on Journal Descriptor Indexing: Preliminary Experiment.

Authors:  Susanne M Humphrey; Willie J Rogers; Halil Kilicoglu; Dina Demner-Fushman; Thomas C Rindflesch
Journal:  J Am Soc Inf Sci Technol       Date:  2006-01-01

8.  Fast max-margin clustering for unsupervised word sense disambiguation in biomedical texts.

Authors:  Weisi Duan; Min Song; Alexander Yates
Journal:  BMC Bioinformatics       Date:  2009-03-19       Impact factor: 3.169
