Tailoring vocabularies for NLP in sub-domains: a method to detect unused word sense.

Rosa L Figueroa, Qing Zeng-Treitler, Sergey Goryachev, Eduardo P Wiechmann.

Abstract

We developed a method to help tailor a comprehensive vocabulary system (e.g., the UMLS) to a sub-domain (e.g., clinical reports) in support of natural language processing (NLP). The method detects word senses that are unused in a sub-domain by comparing the relational neighborhood of a word or term in the vocabulary with its semantic neighborhood in the sub-domain, which is determined using latent semantic analysis (LSA). We trained and tested unused-sense detection on two clinical text corpora: one of discharge summaries and one of outpatient visit notes. The method detected unused senses with precision of 79% to 87%, recall of 48% to 74%, and an area under the receiver operating characteristic curve (AUC) of 72% to 87%.
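The comparison described in the abstract can be illustrated with a minimal sketch. The abstract does not give the scoring or decision rule, so everything below is an assumption made for illustration: the toy corpus `DOCS`, the sense inventory `SENSES` (standing in for relational neighborhoods drawn from a vocabulary such as the UMLS), the rank `k`, the neighborhood size `top_n`, and the rule that a sense is flagged as unused when none of its related terms appear in the LSA neighborhood. LSA is approximated here by a truncated SVD of a raw term-document count matrix using NumPy.

```python
import numpy as np

# Hypothetical toy corpus; a real run would use clinical notes.
DOCS = [
    "cold cough fever",
    "cold cough congestion",
    "cold fever congestion",
    "fracture cast splint",
    "fracture splint surgery",
    "cast surgery fracture",
]

# Hypothetical relational neighborhoods for two senses of "cold".
SENSES = {
    "common cold": {"cough", "fever", "rhinitis"},
    "cold temperature": {"ice", "freezing", "hypothermia"},
}


def lsa_term_vectors(docs, k=2):
    """Build a term-document count matrix and return rank-k LSA term vectors."""
    vocab = sorted({w for d in docs for w in d.split()})
    index = {w: i for i, w in enumerate(vocab)}
    counts = np.zeros((len(vocab), len(docs)))
    for j, d in enumerate(docs):
        for w in d.split():
            counts[index[w], j] += 1.0
    u, s, _ = np.linalg.svd(counts, full_matrices=False)
    return vocab, u[:, :k] * s[:k]


def semantic_neighborhood(word, vocab, vectors, top_n=3):
    """Return the top_n terms closest to `word` by cosine similarity."""
    norms = np.maximum(np.linalg.norm(vectors, axis=1, keepdims=True), 1e-12)
    unit = vectors / norms
    sims = unit @ unit[vocab.index(word)]
    ranked = [vocab[i] for i in np.argsort(-sims) if vocab[i] != word]
    return set(ranked[:top_n])


def unused_senses(word, sense_neighbors, docs, k=2, top_n=3):
    """Map each sense to True when it shares no term with the LSA neighborhood.

    The zero-overlap rule is an assumption, not the published criterion.
    """
    vocab, vectors = lsa_term_vectors(docs, k)
    neighborhood = semantic_neighborhood(word, vocab, vectors, top_n)
    return {sense: not (related & neighborhood)
            for sense, related in sense_neighbors.items()}


if __name__ == "__main__":
    print(unused_senses("cold", SENSES, DOCS))
```

In this toy setting the illness sense shares terms with the corpus-derived neighborhood of "cold", while the temperature sense shares none and is flagged. A practical version would likely need weighting (for example TF-IDF), a larger `k`, and a tuned overlap threshold rather than a strict zero-overlap test.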

Year:  2009        PMID: 20351847      PMCID: PMC2815465     

Source DB:  PubMed          Journal:  AMIA Annu Symp Proc        ISSN: 1559-4076


