
Disambiguation in the biomedical domain: the role of ambiguity type.

Mark Stevenson, Yikun Guo.

Abstract

Word Sense Disambiguation (WSD), the automatic identification of the meanings of ambiguous terms in a document, is an important stage in text processing. We describe a WSD system that has been developed specifically for the types of ambiguities found in biomedical documents. This system uses a range of knowledge sources. It employs both linguistic features, such as local collocations, and features derived from domain-specific knowledge sources, the Unified Medical Language System (UMLS) and Medical Subject Headings (MeSH). This system is applied to three types of ambiguities found in Medline abstracts: ambiguous terms, abbreviations with multiple expansions and names that are ambiguous between genes. The WSD system is applied to the standard NLM-WSD data set, which consists of ambiguous terms from Medline abstracts, and was found to perform well in comparison with previously reported results. The system's performance and the contribution of each knowledge source depend on the type of lexical ambiguity: 87.9% of the ambiguous terms are correctly disambiguated using a combination of linguistic features and MeSH terms, 99% of abbreviations are disambiguated by combining all knowledge sources, and 97.2% of ambiguous gene names are disambiguated using the MeSH terms alone. Analysis reveals that these differences are caused by the nature of each ambiguity type. These results should be taken into account when deciding which information to use for WSD and the level of performance that can be expected.
Copyright © 2010 Elsevier Inc. All rights reserved.
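
As a rough illustration of the feature combination described in the abstract (local collocations plus MeSH terms), the following Python sketch may help. It is not the authors' system: the function and class names, the window size, the overlap-based scoring, and the toy training examples are all assumptions made up for this sketch, and the abstract does not state which learning algorithm was used.

```python
from collections import Counter, defaultdict


def extract_features(tokens, target_index, mesh_terms, window=2):
    """Combine local collocations around the target with document-level MeSH terms.

    Hypothetical helper: the feature naming scheme is an assumption.
    """
    features = []
    for offset in range(-window, window + 1):
        if offset == 0:
            continue  # skip the ambiguous term itself
        pos = target_index + offset
        if 0 <= pos < len(tokens):
            features.append(f"colloc[{offset}]={tokens[pos].lower()}")
    features.extend(f"mesh={term}" for term in mesh_terms)
    return features


class OverlapDisambiguator:
    """Toy classifier (an assumption): choose the sense whose training
    features overlap most with the features of the new instance."""

    def __init__(self):
        self.sense_features = defaultdict(Counter)

    def train(self, examples):
        # examples: iterable of (feature_list, sense_label) pairs
        for features, sense in examples:
            self.sense_features[sense].update(features)

    def predict(self, features):
        def score(sense):
            counts = self.sense_features[sense]
            total = sum(counts.values()) or 1
            # Normalise so senses with more training data are not favoured
            return sum(counts[f] for f in features) / total

        return max(self.sense_features, key=score)


# Made-up training instances for the ambiguous term "cold"
training = [
    (extract_features("patients with a cold and cough".split(), 3,
                      ["Common Cold", "Cough"]), "common_cold"),
    (extract_features("exposure to cold water immersion".split(), 2,
                      ["Cold Temperature", "Immersion"]), "cold_temperature"),
]

model = OverlapDisambiguator()
model.train(training)

new_features = extract_features("children with a cold were treated".split(), 3,
                                ["Common Cold"])
predicted_sense = model.predict(new_features)
```

Dropping the `mesh=` features or the `colloc[...]` features from `extract_features` gives a simple way to compare knowledge sources per ambiguity type, which is the kind of comparison the abstract reports.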

Year: 2010   PMID: 20816855   DOI: 10.1016/j.jbi.2010.08.009

Source DB: PubMed   Journal: J Biomed Inform   ISSN: 1532-0464   Impact factor: 6.317


  10 in total

1.  A long journey to short abbreviations: developing an open-source framework for clinical abbreviation recognition and disambiguation (CARD).

Authors:  Yonghui Wu; Joshua C Denny; S Trent Rosenbloom; Randolph A Miller; Dario A Giuse; Lulu Wang; Carmelo Blanquicett; Ergin Soysal; Jun Xu; Hua Xu
Journal:  J Am Med Inform Assoc       Date:  2017-04-01       Impact factor: 4.497

2.  Ambiguity in medical concept normalization: An analysis of types and coverage in electronic health record datasets.

Authors:  Denis Newman-Griffis; Guy Divita; Bart Desmet; Ayah Zirikly; Carolyn P Rosé; Eric Fosler-Lussier
Journal:  J Am Med Inform Assoc       Date:  2021-03-01       Impact factor: 4.497

3.  Identifying the status of genetic lesions in cancer clinical trial documents using machine learning.

Authors:  Yonghui Wu; Mia A Levy; Christine M Micheel; Paul Yeh; Buzhou Tang; Michael J Cantrell; Stacy M Cooreman; Hua Xu
Journal:  BMC Genomics       Date:  2012-12-17       Impact factor: 3.969

4.  Exploiting domain information for Word Sense Disambiguation of medical documents.

Authors:  Mark Stevenson; Eneko Agirre; Aitor Soroa
Journal:  J Am Med Inform Assoc       Date:  2011-09-07       Impact factor: 4.497

5.  Adapting a natural language processing tool to facilitate clinical trial curation for personalized cancer therapy.

Authors:  Jia Zeng; Yonghui Wu; Ann Bailey; Amber Johnson; Vijaykumar Holla; Elmer V Bernstam; Hua Xu; Funda Meric-Bernstam
Journal:  AMIA Jt Summits Transl Sci Proc       Date:  2014-04-07

6.  Reengineering of MeSH thesauri for term selection to optimize literature retrieval and knowledge reconstruction in support of stem cell research.

Authors:  Yan Su; James Andrews; Hong Huang; Yue Wang; Liangliang Kong; Peter Cannon; Ping Xu
Journal:  BMC Med Inform Decis Mak       Date:  2016-05-23       Impact factor: 2.796

7.  Evaluation of research in biomedical ontologies.

Authors:  Robert Hoehndorf; Michel Dumontier; Georgios V Gkoutos
Journal:  Brief Bioinform       Date:  2012-09-08       Impact factor: 11.622

8.  The effect of word sense disambiguation accuracy on literature based discovery.

Authors:  Judita Preiss; Mark Stevenson
Journal:  BMC Med Inform Decis Mak       Date:  2016-07-18       Impact factor: 2.796

9.  Adeft: Acromine-based Disambiguation of Entities from Text with applications to the biomedical literature.

Authors:  Albert Steppi; Benjamin M Gyori; John A Bachman
Journal:  J Open Source Softw       Date:  2020-01-16

10.  Relation extraction between bacteria and biotopes from biomedical texts with attention mechanisms and domain-specific contextual representations.

Authors:  Amarin Jettakul; Duangdao Wichadakul; Peerapon Vateekul
Journal:  BMC Bioinformatics       Date:  2019-12-03       Impact factor: 3.169

