Resolving abbreviations to their senses in Medline.

S Gaudan, H Kirsch, D Rebholz-Schuhmann.

Abstract

MOTIVATION: Biological literature contains many abbreviations with one particular sense in each document. However, most abbreviations do not have a unique sense across the literature. Furthermore, many documents do not contain the long forms of the abbreviations. Resolving an abbreviation in a document consists of retrieving its sense in use. Abbreviation resolution improves accuracy of document retrieval engines and of information extraction systems.
RESULTS: We combine an automatic analysis of Medline abstracts and linguistic methods to build a dictionary of abbreviation/sense pairs. The dictionary is used for the resolution of abbreviations occurring with their long forms. Ambiguous global abbreviations are resolved using support vector machines that have been trained on the context of each instance of the abbreviation/sense pairs, previously extracted for the dictionary set-up. The system disambiguates abbreviations with a precision of 98.9% for a recall of 98.2% (98.5% accuracy). This performance is superior in comparison with previously reported research work.
AVAILABILITY: The abbreviation resolution module is available at http://www.ebi.ac.uk/Rebholz/software.html.
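
The two-stage idea described in the abstract (use the long form when the document states it, otherwise pick a sense from the surrounding context) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the toy dictionary, the function names, and the example sentences are all hypothetical, and a simple weighted word-overlap score stands in for the trained support vector machines.

```python
import re
from collections import Counter

# Hypothetical toy dictionary: abbreviation -> {sense: weighted context words}.
# In the paper such pairs are extracted from Medline; these entries are made up.
SENSE_DICTIONARY = {
    "PCR": {
        "polymerase chain reaction": Counter(
            {"dna": 3, "amplification": 2, "primer": 2}),
        "pathologic complete response": Counter(
            {"chemotherapy": 3, "tumor": 2, "breast": 2}),
    },
}


def find_local_long_form(text, abbreviation):
    """Return the long form if the text defines it as 'long form (ABBR)'."""
    position = text.find("(" + abbreviation + ")")
    if position == -1:
        return None
    # Take as many preceding words as the abbreviation has letters.
    preceding = re.findall(r"[A-Za-z]+", text[:position])
    candidate = preceding[-len(abbreviation):]
    initials = "".join(word[0] for word in candidate)
    # Accept the candidate only if its initials spell the abbreviation.
    if initials.lower() == abbreviation.lower():
        return " ".join(candidate).lower()
    return None


def resolve_global(text, abbreviation, dictionary):
    """Pick the dictionary sense whose context profile best matches the text.

    Stand-in for the SVM classifier: a weighted bag-of-words overlap score.
    """
    senses = dictionary.get(abbreviation)
    if not senses:
        return None
    context = Counter(re.findall(r"[a-z]+", text.lower()))

    def score(sense):
        return sum(context[word] * weight
                   for word, weight in senses[sense].items())

    return max(senses, key=score)


def resolve(text, abbreviation, dictionary=SENSE_DICTIONARY):
    """Prefer a long form stated in the text, else fall back to context."""
    return (find_local_long_form(text, abbreviation)
            or resolve_global(text, abbreviation, dictionary))
```

For example, `resolve("The polymerase chain reaction (PCR) was used.", "PCR")` takes the long form directly from the sentence, while `resolve("PCR was observed in breast tumor patients after chemotherapy.", "PCR")` has no long form available and falls back to the context score.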

Year:  2005        PMID: 16037121     DOI: 10.1093/bioinformatics/bti586

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


  24 in total

1.  Quantitative assessment of dictionary-based protein named entity tagging.

Authors:  Hongfang Liu; Zhang-Zhi Hu; Manabu Torii; Cathy Wu; Carol Friedman
Journal:  J Am Med Inform Assoc       Date:  2006-06-23       Impact factor: 4.497

2.  Word add-in for ontology recognition: semantic enrichment of scientific literature.

Authors:  J Lynn Fink; Pablo Fernicola; Rahul Chandran; Savas Parastatidis; Alex Wade; Oscar Naim; Gregory B Quinn; Philip E Bourne
Journal:  BMC Bioinformatics       Date:  2010-02-24       Impact factor: 3.169

3.  eFIP: a tool for mining functional impact of phosphorylation from literature.

Authors:  Cecilia N Arighi; Amy Y Siu; Catalina O Tudor; Jules A Nchoutmboube; Cathy H Wu; Vijay K Shanker
Journal:  Methods Mol Biol       Date:  2011

4.  Building a high-quality sense inventory for improved abbreviation disambiguation.

Authors:  Naoaki Okazaki; Sophia Ananiadou; Jun'ichi Tsujii
Journal:  Bioinformatics       Date:  2010-03-25       Impact factor: 6.937

5.  Gendoo: functional profiling of gene and disease features using MeSH vocabulary.

Authors:  Takeru Nakazato; Hidemasa Bono; Hideo Matsuda; Toshihisa Takagi
Journal:  Nucleic Acids Res       Date:  2009-06-04       Impact factor: 16.971

6.  eGIFT: mining gene information from the literature.

Authors:  Catalina O Tudor; Carl J Schmidt; K Vijay-Shanker
Journal:  BMC Bioinformatics       Date:  2010-08-09       Impact factor: 3.169

7.  Semantic classification of diseases in discharge summaries using a context-aware rule-based classifier.

Authors:  Illés Solt; Domonkos Tikk; Viktor Gál; Zsolt T Kardkovács
Journal:  J Am Med Inform Assoc       Date:  2009-04-23       Impact factor: 4.497

8.  Discriminative application of string similarity methods to chemical and non-chemical names for biomedical abbreviation clustering.

Authors:  Atsuko Yamaguchi; Yasunori Yamamoto; Jin-Dong Kim; Toshihisa Takagi; Akinori Yonezawa
Journal:  BMC Genomics       Date:  2012-06-11       Impact factor: 3.969

9.  Fast max-margin clustering for unsupervised word sense disambiguation in biomedical texts.

Authors:  Weisi Duan; Min Song; Alexander Yates
Journal:  BMC Bioinformatics       Date:  2009-03-19       Impact factor: 3.169

10.  The eFIP system for text mining of protein interaction networks of phosphorylated proteins.

Authors:  Catalina O Tudor; Cecilia N Arighi; Qinghua Wang; Cathy H Wu; K Vijay-Shanker
Journal:  Database (Oxford)       Date:  2012-12-05       Impact factor: 3.451
