High-performance gene name normalization with GeNo.

Joachim Wermter, Katrin Tomanek, Udo Hahn.

Abstract

MOTIVATION: Recognizing and normalizing textual mentions of gene and protein names is both particularly important and challenging. It is important because genes and proteins constitute the crucial conceptual entities in biomedicine. It remains challenging because of widespread gene name ambiguities within species, across species, with common English words and with medical sublanguage terms.
RESULTS: We present GeNo, a highly competitive system for gene name normalization, which obtains an F-measure performance of 86.4% (precision: 87.8%, recall: 85.0%) on the BioCreAtIvE-II test set, thus being on a par with the best system on that task. Our system tackles the complex gene normalization problem by employing a carefully crafted suite of symbolic and statistical methods, and by fully relying on publicly available software and data resources, including extensive background knowledge based on semantic profiling. A major goal of our work is to present GeNo's architecture in a lucid and perspicuous way to pave the way to full reproducibility of our results.
AVAILABILITY: GeNo, including its underlying resources, will be available from www.julielab.de. It is also currently deployed in the Semedico search engine at www.semedico.org.
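
As a quick consistency check on the figures reported under RESULTS, the F-measure can be recomputed from the stated precision and recall. The sketch below assumes the balanced F-measure (F1, the harmonic mean of precision and recall); the abstract does not say which variant was used, and the function name `f_measure` is illustrative only, not part of GeNo.

```python
def f_measure(precision: float, recall: float, beta: float = 1.0) -> float:
    """Return the F-beta score; beta=1 gives the harmonic mean (F1).

    Assumption: the abstract's "F-measure" is the balanced F1 score.
    """
    if precision <= 0.0 and recall <= 0.0:
        return 0.0  # avoid division by zero when both inputs are zero
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Values as reported in the abstract (BioCreAtIvE-II test set).
precision, recall = 0.878, 0.850
f1 = f_measure(precision, recall)
print(f"F1 = {f1 * 100:.1f}%")  # prints "F1 = 86.4%"
```

Under that assumption the reported values are mutually consistent: 2 × 0.878 × 0.850 / (0.878 + 0.850) ≈ 0.8638, which rounds to the stated 86.4%.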

Year:  2009        PMID: 19188193     DOI: 10.1093/bioinformatics/btp071

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


  43 in total

1.  Cross-species gene normalization by species inference.

Authors:  Chih-Hsuan Wei; Hung-Yu Kao
Journal:  BMC Bioinformatics       Date:  2011-10-03       Impact factor: 3.169

2.  Soft tagging of overlapping high confidence gene mention variants for cross-species full-text gene normalization.

Authors:  Cheng-Ju Kuo; Maurice H T Ling; Chun-Nan Hsu
Journal:  BMC Bioinformatics       Date:  2011-10-03       Impact factor: 3.169

3.  Toward an automatic method for extracting cancer- and other disease-related point mutations from the biomedical literature.

Authors:  Emily Doughty; Attila Kertesz-Farkas; Olivier Bodenreider; Gary Thompson; Asa Adadey; Thomas Peterson; Maricel G Kann
Journal:  Bioinformatics       Date:  2010-12-07       Impact factor: 6.937

4.  Beyond accuracy: creating interoperable and scalable text-mining web services.

Authors:  Chih-Hsuan Wei; Robert Leaman; Zhiyong Lu
Journal:  Bioinformatics       Date:  2016-02-16       Impact factor: 6.937

5.  A literature search tool for intelligent extraction of disease-associated genes.

Authors:  Jae-Yoon Jung; Todd F DeLuca; Tristan H Nelson; Dennis P Wall
Journal:  J Am Med Inform Assoc       Date:  2013-09-02       Impact factor: 4.497

6.  Recent progress in automatically extracting information from the pharmacogenomic literature. (Review)

Authors:  Yael Garten; Adrien Coulet; Russ B Altman
Journal:  Pharmacogenomics       Date:  2010-10       Impact factor: 2.533

7.  SimConcept: A Hybrid Approach for Simplifying Composite Named Entities in Biomedicine.

Authors:  Chih-Hsuan Wei; Robert Leaman; Zhiyong Lu
Journal:  ACM BCB       Date:  2014

8.  Moara: a Java library for extracting and normalizing gene and protein mentions.

Authors:  Mariana L Neves; José-María Carazo; Alberto Pascual-Montano
Journal:  BMC Bioinformatics       Date:  2010-03-26       Impact factor: 3.169

9.  SimConcept: a hybrid approach for simplifying composite named entities in biomedical text.

Authors:  Chih-Hsuan Wei; Robert Leaman; Zhiyong Lu
Journal:  IEEE J Biomed Health Inform       Date:  2015-04-13       Impact factor: 5.772

10.  Biomedical text mining and its applications.

Authors:  Raul Rodriguez-Esteban
Journal:  PLoS Comput Biol       Date:  2009-12-24       Impact factor: 4.475
