Determining the difficulty of Word Sense Disambiguation.

Abstract

Automatic processing of biomedical documents is made difficult by the fact that many of the terms they contain are ambiguous. Word Sense Disambiguation (WSD) systems attempt to resolve these ambiguities and identify the correct meaning. However, the published literature on WSD systems for biomedical documents reports considerable differences in performance for different terms. The development of WSD systems is often expensive with respect to acquiring the necessary training data. It would therefore be useful to be able to predict in advance which terms WSD systems are likely to perform well or badly on. This paper explores various methods for estimating the performance of WSD systems on a wide range of ambiguous biomedical terms (including ambiguous words/phrases and abbreviations). The methods include both supervised and unsupervised approaches. The supervised approaches make use of information from labeled training data, while the unsupervised ones rely on the UMLS Metathesaurus. The approaches are evaluated by comparing their predictions about how difficult disambiguation will be for ambiguous terms against the output of two WSD systems. We find the supervised methods are the best predictors of WSD difficulty, but are limited by their dependence on labeled training data. The unsupervised methods all perform well in some situations and can be applied more widely.
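
The evaluation idea in the abstract (score each ambiguous term for expected difficulty, then compare those scores with how a WSD system actually performs) can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration only: the abstract does not name its measures, so the entropy of the labeled sense distribution stands in as one plausible supervised difficulty score, Spearman rank correlation stands in as the comparison, and the terms, sense labels, and accuracy figures are invented toy values, not data or results from the paper.

```python
import math
from collections import Counter


def sense_entropy(labels):
    """Shannon entropy (in bits) of the sense labels observed for one term.

    Assumption: a more evenly spread sense distribution suggests a harder term.
    """
    counts = Counter(labels)
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def ranks(values):
    """1-based ranks; tied values share the mean of their rank positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the two rank vectors."""
    rx, ry = ranks(xs), ranks(ys)
    n = len(xs)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


# Hypothetical labeled instances per ambiguous term (invented toy data).
labeled = {
    "cold": ["temperature"] * 5 + ["illness"] * 5,
    "BSA": ["bovine serum albumin"] * 9 + ["body surface area"] * 1,
    "culture": ["laboratory"] * 7 + ["anthropological"] * 3,
}
# Hypothetical per-term accuracy of some WSD system (invented toy data).
wsd_accuracy = {"cold": 0.70, "BSA": 0.95, "culture": 0.80}

terms = sorted(labeled)
difficulty = [sense_entropy(labeled[t]) for t in terms]
accuracy = [wsd_accuracy[t] for t in terms]

# A useful difficulty score should correlate negatively with accuracy.
print("Spearman correlation:", spearman(difficulty, accuracy))
```

A strongly negative correlation would indicate that the difficulty score ranks terms in roughly the reverse order of system accuracy, which is the behavior one would want from a predictor of this kind. An unsupervised variant would replace `sense_entropy` with a score that needs no labeled instances.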

Entities: Disease

Keywords: Ambiguity; Biomedical documents; NLP; Natural Language Processing; WSD; Word Sense Disambiguation

Year: 2013 PMID: 24076369 DOI: 10.1016/j.jbi.2013.09.009

Source DB: PubMed Journal: J Biomed Inform ISSN: 1532-0464 Impact factor: 6.317

Cited

5 in total

1. deepBioWSD: effective deep neural word sense disambiguation of biomedical text data.

Authors: Ahmad Pesaranghader; Stan Matwin; Marina Sokolova; Ali Pesaranghader
Journal: J Am Med Inform Assoc Date: 2019-05-01 Impact factor: 4.497

2. SIFR annotator: ontology-based semantic annotation of French biomedical text and clinical notes.

Authors: Andon Tchechmedjiev; Amine Abdaoui; Vincent Emonet; Stella Zevio; Clement Jonquet
Journal: BMC Bioinformatics Date: 2018-11-06 Impact factor: 3.169

3. The Implicitome: A Resource for Rationalizing Gene-Disease Associations.

Authors: Kristina M Hettne; Mark Thompson; Herman H H B M van Haagen; Eelke van der Horst; Rajaram Kaliyaperumal; Eleni Mina; Zuotian Tatum; Jeroen F J Laros; Erik M van Mulligen; Martijn Schuemie; Emmelien Aten; Tong Shu Li; Richard Bruskiewich; Benjamin M Good; Andrew I Su; Jan A Kors; Johan den Dunnen; Gert-Jan B van Ommen; Marco Roos; Peter A C 't Hoen; Barend Mons; Erik A Schultes
Journal: PLoS One Date: 2016-02-26 Impact factor: 3.240

4. Complexities, variations, and errors of numbering within clinical notes: the potential impact on information extraction and cohort-identification.

Authors: David A Hanauer; Qiaozhu Mei; V G Vinod Vydiswaran; Karandeep Singh; Zach Landis-Lewis; Chunhua Weng
Journal: BMC Med Inform Decis Mak Date: 2019-04-04 Impact factor: 2.796

5. Adeft: Acromine-based Disambiguation of Entities from Text with applications to the biomedical literature.

Authors: Albert Steppi; Benjamin M Gyori; John A Bachman
Journal: J Open Source Softw Date: 2020-01-16