Inter-species normalization of gene mentions with GNAT.

Jörg Hakenberg, Conrad Plake, Robert Leaman, Michael Schroeder, Graciela Gonzalez.

Abstract

MOTIVATION: Text mining in the biomedical domain aims at helping researchers to access information contained in scientific publications in a faster, easier and more complete way. One step towards this aim is the recognition of named entities and their subsequent normalization to database identifiers. Normalization helps to link objects of potential interest, such as genes, to detailed information not contained in a publication; it is also key for integrating different knowledge sources. From an information retrieval perspective, normalization facilitates indexing and querying. Gene mention normalization (GN) is particularly challenging given the high ambiguity of gene names: they refer to orthologous or entirely different genes, are named after phenotypes and other biomedical terms, or they resemble common English words.
RESULTS: We present the first publicly available system, GNAT, reported to handle inter-species GN. Our method uses extensive background knowledge on genes to resolve ambiguous names to EntrezGene identifiers. It performs comparably to single-species approaches proposed by us and others. On a benchmark set derived from BioCreative 1 and 2 data that contains genes from 13 species, GNAT achieves an F-measure of 81.4% (90.8% precision at 73.8% recall). For the single-species task, we report an F-measure of 85.4% on human genes.
AVAILABILITY: A web-frontend is available at http://cbioc.eas.asu.edu/gnat/. GNAT will also be available within the BioCreativeMetaService project, see http://bcms.bioinfo.cnio.es.
SUPPLEMENTARY INFORMATION: The test data set, lexica, and links to external data are available at http://cbioc.eas.asu.edu/gnat/
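
The inter-species figures reported under RESULTS can be checked with a few lines of arithmetic. The sketch below assumes the reported F-measure is the balanced F1 score, i.e. the harmonic mean of precision and recall; the abstract does not state this explicitly, although the numbers are consistent with it. The function name `f_measure` is illustrative and not part of GNAT.

```python
def f_measure(precision: float, recall: float) -> float:
    """Balanced F-measure (F1): harmonic mean of precision and recall.

    Assumption: the abstract's "F-measure" is the balanced F1 score.
    """
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both scores are zero
    return 2 * precision * recall / (precision + recall)


# Precision and recall reported for the 13-species benchmark.
inter_species = f_measure(0.908, 0.738)
print(f"{inter_species:.1%}")  # prints "81.4%"
```

The result matches the 81.4% stated in the abstract. The 85.4% single-species figure cannot be checked the same way, because the abstract does not give its precision and recall.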

Year: 2008; PMID: 18689813; DOI: 10.1093/bioinformatics/btn299

Source DB: PubMed; Journal: Bioinformatics; ISSN: 1367-4803; Impact factor: 6.937


  42 in total

1.  Cross-species gene normalization by species inference.

Authors:  Chih-Hsuan Wei; Hung-Yu Kao
Journal:  BMC Bioinformatics       Date:  2011-10-03       Impact factor: 3.169

2.  SimConcept: A Hybrid Approach for Simplifying Composite Named Entities in Biomedicine.

Authors:  Chih-Hsuan Wei; Robert Leaman; Zhiyong Lu
Journal:  ACM BCB       Date:  2014

3.  Moara: a Java library for extracting and normalizing gene and protein mentions.

Authors:  Mariana L Neves; José-María Carazo; Alberto Pascual-Montano
Journal:  BMC Bioinformatics       Date:  2010-03-26       Impact factor: 3.169

4.  LINNAEUS: a species name identification system for biomedical literature.

Authors:  Martin Gerner; Goran Nenadic; Casey M Bergman
Journal:  BMC Bioinformatics       Date:  2010-02-11       Impact factor: 3.169

5.  Improved mutation tagging with gene identifiers applied to membrane protein stability prediction.

Authors:  Rainer Winnenburg; Conrad Plake; Michael Schroeder
Journal:  BMC Bioinformatics       Date:  2009-08-27       Impact factor: 3.169

6.  NEMO: Extraction and normalization of organization names from PubMed affiliations.

Authors:  Siddhartha Reddy Jonnalagadda; Philip Topham
Journal:  J Biomed Discov Collab       Date:  2010-10-04

7.  Discovering drug-drug interactions: a text-mining and reasoning approach based on properties of drug metabolism.

Authors:  Luis Tari; Saadat Anwar; Shanshan Liang; James Cai; Chitta Baral
Journal:  Bioinformatics       Date:  2010-09-15       Impact factor: 6.937

8.  Incorporating rich background knowledge for gene named entity classification and recognition.

Authors:  Yanpeng Li; Hongfei Lin; Zhihao Yang
Journal:  BMC Bioinformatics       Date:  2009-07-17       Impact factor: 3.169

9.  GoGene: gene annotation in the fast lane.

Authors:  Conrad Plake; Loic Royer; Rainer Winnenburg; Jörg Hakenberg; Michael Schroeder
Journal:  Nucleic Acids Res       Date:  2009-05-22       Impact factor: 16.971

10.  Disambiguating the species of biomedical named entities using natural language parsers.

Authors:  Xinglong Wang; Jun'ichi Tsujii; Sophia Ananiadou
Journal:  Bioinformatics       Date:  2010-01-06       Impact factor: 6.937
