
Word Sense Disambiguation by Selecting the Best Semantic Type Based on Journal Descriptor Indexing: Preliminary Experiment.

Susanne M Humphrey, Willie J Rogers, Halil Kilicoglu, Dina Demner-Fushman, Thomas C Rindflesch.

Abstract

An experiment was performed at the National Library of Medicine® (NLM®) in word sense disambiguation (WSD) using the Journal Descriptor Indexing (JDI) methodology. The motivation is the need to solve the ambiguity problem confronting NLM's MetaMap system, which maps free text to terms corresponding to concepts in NLM's Unified Medical Language System® (UMLS®) Metathesaurus®. If the text maps to more than one Metathesaurus concept at the same high confidence score, MetaMap has no way of knowing which concept is the correct mapping. We describe the JDI methodology, which is ultimately based on statistical associations between words in a training set of MEDLINE® citations and a small set of journal descriptors (assigned by humans to journals per se) assumed to be inherited by the citations. JDI is the basis for selecting the best meaning that is correlated to UMLS semantic types (STs) assigned to ambiguous concepts in the Metathesaurus. For example, the ambiguity "transport" has two meanings: "Biological Transport" assigned the ST Cell Function and "Patient transport" assigned the ST Health Care Activity. A JDI-based methodology can analyze text containing "transport" and determine which ST receives a higher score for that text, which then returns the associated meaning, presumed to apply to the ambiguity itself. We then present an experiment in which a baseline disambiguation method was compared to four versions of JDI in disambiguating 45 ambiguous strings from NLM's WSD Test Collection. Overall average precision for the highest-scoring JDI version was 0.7873 compared to 0.2492 for the baseline method, and average precision for individual ambiguities was greater than 0.90 for 23 of them (51%), greater than 0.85 for 24 (53%), and greater than 0.65 for 35 (79%). On the basis of these results, we hope to improve performance of JDI and test its use in applications.
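The selection step described above (score each candidate ST against the surrounding text, then return the meaning tied to the higher-scoring ST) can be illustrated with a minimal Python sketch. It is not the JDI implementation: the word-to-ST scores in `WORD_ST_SCORES` are invented for illustration, the averaging rule is an assumption, and the names `SENSES`, `score_semantic_types`, and `disambiguate` are hypothetical. In JDI the word associations are derived from journal descriptors over MEDLINE training citations, which this sketch does not attempt.

```python
# Hypothetical word-to-semantic-type association scores.
# These numbers are invented for illustration; they are NOT JDI output.
WORD_ST_SCORES = {
    "ion": {"Cell Function": 0.80, "Health Care Activity": 0.10},
    "membrane": {"Cell Function": 0.90, "Health Care Activity": 0.10},
    "ambulance": {"Cell Function": 0.05, "Health Care Activity": 0.90},
    "hospital": {"Cell Function": 0.10, "Health Care Activity": 0.80},
    "patient": {"Cell Function": 0.10, "Health Care Activity": 0.85},
}

# Candidate meanings of each ambiguity, keyed by their semantic type (ST),
# taken from the "transport" example in the abstract.
SENSES = {
    "transport": {
        "Cell Function": "Biological Transport",
        "Health Care Activity": "Patient transport",
    }
}


def score_semantic_types(text, candidate_sts):
    """Average the per-word ST scores over the words of `text` that have scores."""
    totals = {st: 0.0 for st in candidate_sts}
    counted = 0
    for raw in text.lower().split():
        word = raw.strip(".,;:()")
        scores = WORD_ST_SCORES.get(word)
        if scores is None:
            continue  # word has no known association; skip it
        counted += 1
        for st in totals:
            totals[st] += scores.get(st, 0.0)
    if counted == 0:
        return totals  # no evidence: every ST keeps a score of 0.0
    return {st: total / counted for st, total in totals.items()}


def disambiguate(ambiguity, text):
    """Return the meaning whose ST scores highest for `text`, plus all ST scores."""
    senses = SENSES[ambiguity]
    scores = score_semantic_types(text, senses)
    best_st = max(scores, key=scores.get)
    return senses[best_st], scores


if __name__ == "__main__":
    for sentence in (
        "Ion transport across the cell membrane.",
        "Ambulance transport of the patient to the hospital.",
    ):
        meaning, scores = disambiguate("transport", sentence)
        print(sentence, "->", meaning, scores)
```

With these toy scores, the first sentence favors Cell Function and the second favors Health Care Activity, so the sketch returns "Biological Transport" and "Patient transport" respectively.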


Year:  2006        PMID: 19890434      PMCID: PMC2771948          DOI: 10.1002/asi.20257

Source DB:  PubMed          Journal:  J Am Soc Inf Sci Technol        ISSN: 1532-2882


