Automatic extraction of gene and protein synonyms from MEDLINE and journal articles.

Hong Yu, Vasileios Hatzivassiloglou, Carol Friedman, Andrey Rzhetsky, W John Wilbur.

Abstract

Genes and proteins are often associated with multiple names, and more names are added as new functional or structural information is discovered. Because authors often alternate between these synonyms, information retrieval and extraction benefit from identifying these synonymous names. We have developed a method to automatically extract synonymous gene and protein names from MEDLINE and journal articles. We first identified patterns authors use to list synonymous gene and protein names. We developed SGPE (for synonym extraction of gene and protein names), a software program that recognizes these patterns and extracts candidate synonymous terms from MEDLINE abstracts and full-text journal articles. SGPE then applies a sequence of filters that automatically screen out terms that are not gene and protein names. In our evaluation, the method achieved an overall precision of 71% on both MEDLINE and journal articles, and 90% precision on the more suitable full-text articles alone.
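The two-stage idea described in the abstract (pattern-based candidate extraction followed by filtering) can be illustrated with a minimal Python sketch. This is not SGPE: the abstract does not list the actual patterns or filters, so the cue phrases in `CUES`, the token shape in `NAME`, and the heuristic in `looks_like_gene_name` are all hypothetical stand-ins chosen only to show the shape of such a pipeline.

```python
import re

# Hypothetical cue phrases; the abstract does not list SGPE's actual patterns.
CUES = r"(?:also known as|also called|also termed|a\.k\.a\.)"
# Hypothetical token shape for a candidate name: starts with a letter,
# then letters, digits, hyphens, or slashes.
NAME = r"[A-Za-z][A-Za-z0-9\-/]*"

# Matches constructions such as "X (also known as Y)" and "X, also called Y".
PATTERN = re.compile(rf"({NAME})\s*[,(]\s*{CUES}\s+({NAME})")


def looks_like_gene_name(term: str) -> bool:
    """Toy filter (an assumption, not SGPE's filter): keep terms that
    contain at least one digit or uppercase letter."""
    return any(c.isdigit() or c.isupper() for c in term)


def extract_synonym_pairs(text: str) -> list[tuple[str, str]]:
    """Stage 1: find candidate pairs via the pattern.
    Stage 2: drop pairs in which either term fails the filter."""
    pairs = []
    for first, second in PATTERN.findall(text):
        if looks_like_gene_name(first) and looks_like_gene_name(second):
            pairs.append((first, second))
    return pairs


if __name__ == "__main__":
    sentence = (
        "Expression of p21 (also known as WAF1) was induced, "
        "whereas the receptor, also called channel, was not."
    )
    print(extract_synonym_pairs(sentence))
```

In the sample sentence, both constructions match the pattern, but only the first pair survives the toy filter; a real system would need far richer patterns and filters than this sketch suggests.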

Year:  2002        PMID: 12463959      PMCID: PMC2244511     

Source DB:  PubMed          Journal:  Proc AMIA Symp        ISSN: 1531-605X


