
A multi-aspect comparison study of supervised word sense disambiguation.

Hongfang Liu, Virginia Teller, Carol Friedman.

Abstract

OBJECTIVE: The aim of this study was to investigate relations among different aspects in supervised word sense disambiguation (WSD; supervised machine learning for disambiguating the sense of a term in a context) and compare supervised WSD in the biomedical domain with that in the general English domain.
METHODS: The study involves three data sets (a biomedical abbreviation data set, a general biomedical term data set, and a general English data set). The authors implemented three machine-learning algorithms, including (1) naïve Bayes (NBL) and decision lists (TDLL), (2) their adaptation of decision lists (ODLL), and (3) their mixed supervised learning (MSL). There were six feature representations (various combinations of collocations, bag of words, oriented bag of words, etc.) and five window sizes (2, 4, 6, 8, and 10).
RESULTS: Supervised WSD is suitable only when there are enough sense-tagged instances, with at least a few dozen instances for each sense. Collocations combined with neighboring words are appropriate selections for the context. For terms with unrelated biomedical senses, a large window size such as the whole paragraph should be used, while for general English words a moderate window size between 4 and 10 should be used. The performance of the authors' implementation of decision list classifiers for abbreviations was better than that of traditional decision list classifiers. However, the opposite held for the other two sets. Also, the authors' mixed supervised learning was stable and generally better than others for all sets.
CONCLUSION: This study found that different aspects of supervised WSD depend on each other. The experimental method presented in the study can be used to select the best supervised WSD classifier for each ambiguous term.
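
The interplay of window size, feature representation, and classifier described above can be illustrated with a minimal sketch. This is not the authors' implementation: the function and class names (`window_features`, `NaiveBayesWSD`, `leave_one_out_accuracy`, `best_window`), the add-one smoothing, the leave-one-out evaluation, and the toy sentences are all assumptions made for illustration. Only the window sizes (2, 4, 6, 8, 10) and the ideas of a bag of words, an oriented bag of words, and a naive Bayes learner come from the abstract.

```python
import math
from collections import Counter, defaultdict


def window_features(tokens, target_index, window, oriented=False):
    """Collect the words within `window` positions of the target word.

    With oriented=True each word is prefixed with L_ or R_ to record
    which side of the target it appeared on (an assumed encoding).
    """
    feats = []
    lo = max(0, target_index - window)
    hi = min(len(tokens), target_index + window + 1)
    for i in range(lo, hi):
        if i == target_index:
            continue
        word = tokens[i].lower()
        if oriented:
            word = ("L_" if i < target_index else "R_") + word
        feats.append(word)
    return feats


class NaiveBayesWSD:
    """Multinomial naive Bayes over context words, with add-one smoothing."""

    def fit(self, feature_lists, senses):
        self.sense_counts = Counter(senses)
        self.feat_counts = defaultdict(Counter)
        self.vocab = set()
        for feats, sense in zip(feature_lists, senses):
            self.feat_counts[sense].update(feats)
            self.vocab.update(feats)
        return self

    def predict(self, feats):
        total = sum(self.sense_counts.values())
        v = len(self.vocab) + 1  # +1 reserves probability mass for unseen words
        best_sense, best_score = None, -math.inf
        for sense, n in self.sense_counts.items():
            denom = sum(self.feat_counts[sense].values()) + v
            score = math.log(n / total)
            for f in feats:
                score += math.log((self.feat_counts[sense][f] + 1) / denom)
            if score > best_score:
                best_sense, best_score = sense, score
        return best_sense


def leave_one_out_accuracy(instances, window, oriented=False):
    """instances: list of (tokens, target_index, sense) for ONE ambiguous term."""
    feats = [window_features(t, i, window, oriented) for t, i, _ in instances]
    senses = [s for _, _, s in instances]
    correct = 0
    for k in range(len(instances)):
        model = NaiveBayesWSD().fit(feats[:k] + feats[k + 1:],
                                    senses[:k] + senses[k + 1:])
        correct += model.predict(feats[k]) == senses[k]
    return correct / len(instances)


def best_window(instances, windows=(2, 4, 6, 8, 10)):
    """Pick the window size with the highest leave-one-out accuracy."""
    return max(windows, key=lambda w: leave_one_out_accuracy(instances, w))


# Toy, hand-written instances for the ambiguous word "cold" (illustrative only).
toy = [
    ("the patient had a cold and a mild fever".split(), 4, "illness"),
    ("she caught a cold with cough and fever".split(), 3, "illness"),
    ("samples were kept cold on ice overnight".split(), 3, "low_temperature"),
    ("store the reagent cold in the ice bath".split(), 3, "low_temperature"),
]
print(best_window(toy))
```

In practice the same loop would also range over feature representations and classifiers, and, as the results note, would need at least a few dozen tagged instances per sense rather than the handful used here.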

Year:  2004        PMID: 15064284      PMCID: PMC436083          DOI: 10.1197/jamia.M1533

Source DB:  PubMed          Journal:  J Am Med Inform Assoc        ISSN: 1067-5027            Impact factor:   4.497


  3 in total

1.  Disambiguating ambiguous biomedical terms in biomedical narrative text: an unsupervised method.

Authors:  H Liu; Y A Lussier; C Friedman
Journal:  J Biomed Inform       Date:  2001-08       Impact factor: 6.317

2.  Automatic resolution of ambiguous terms based on machine learning and conceptual relations in the UMLS.

Authors:  Hongfang Liu; Stephen B Johnson; Carol Friedman
Journal:  J Am Med Inform Assoc       Date:  2002 Nov-Dec       Impact factor: 4.497

3.  Developing a test collection for biomedical word sense disambiguation.

Authors:  M Weeber; J G Mork; A R Aronson
Journal:  Proc AMIA Symp       Date:  2001
  33 in total

1.  Knowledge-based method for determining the meaning of ambiguous biomedical terms using information content measures of similarity.

Authors:  Bridget T McInnes; Ted Pedersen; Ying Liu; Genevieve B Melton; Serguei V Pakhomov
Journal:  AMIA Annu Symp Proc       Date:  2011-10-22

2.  Tailoring vocabularies for NLP in sub-domains: a method to detect unused word sense.

Authors:  Rosa L Figueroa; Qing Zeng-Treitler; Sergey Goryachev; Eduardo P Wiechmann
Journal:  AMIA Annu Symp Proc       Date:  2009-11-14

3.  Comparison of vector space model methodologies to reconcile cross-species neuroanatomical concepts.

Authors:  P R Srinivas; Shang-Heng Wei; Nello Cristianini; E G Jones; F A Gorin
Journal:  Neuroinformatics       Date:  2005

4.  Quantitative assessment of dictionary-based protein named entity tagging.

Authors:  Hongfang Liu; Zhang-Zhi Hu; Manabu Torii; Cathy Wu; Carol Friedman
Journal:  J Am Med Inform Assoc       Date:  2006-06-23       Impact factor: 4.497

5.  Natural language processing of spoken diet records (SDRs).

Authors:  Ronilda Lacson; William Long
Journal:  AMIA Annu Symp Proc       Date:  2006

6.  Using UMLS Concept Unique Identifiers (CUIs) for word sense disambiguation in the biomedical domain.

Authors:  Bridget T McInnes; Ted Pedersen; John Carlis
Journal:  AMIA Annu Symp Proc       Date:  2007-10-11

7.  Methods for building sense inventories of abbreviations in clinical notes.

Authors:  Hua Xu; Peter D Stetson; Carol Friedman
Journal:  J Am Med Inform Assoc       Date:  2008-10-24       Impact factor: 4.497

8.  Word sense disambiguation via semantic type classification.

Authors:  Jung-Wei Fan; Carol Friedman
Journal:  AMIA Annu Symp Proc       Date:  2008-11-06

9.  Knowledge-Based Biomedical Word Sense Disambiguation with Neural Concept Embeddings

Authors:  Akm Sabbir; Antonio Jimeno-Yepes; Ramakanth Kavuluru
Journal:  Proc IEEE Int Symp Bioinformatics Bioeng       Date:  2018-01-11

10.  Fast max-margin clustering for unsupervised word sense disambiguation in biomedical texts.

Authors:  Weisi Duan; Min Song; Alexander Yates
Journal:  BMC Bioinformatics       Date:  2009-03-19       Impact factor: 3.169
