GAPSCORE: finding gene and protein names one word at a time.

Jeffrey T Chang, Hinrich Schütze, Russ B Altman.

Abstract

MOTIVATION: New high-throughput technologies have accelerated the accumulation of knowledge about genes and proteins. However, much knowledge is still stored as written natural language text. Therefore, we have developed a new method, GAPSCORE, to identify gene and protein names in text. GAPSCORE scores words based on a statistical model of gene names that quantifies their appearance, morphology and context.
RESULTS: We evaluated GAPSCORE against the Yapex data set and achieved an F-score of 82.5% (83.3% recall, 81.5% precision) for partial matches and 57.6% (58.5% recall, 56.7% precision) for exact matches. Since the method is statistical, users can choose score cutoffs that adjust the performance according to their needs.
AVAILABILITY: GAPSCORE is available at http://bionlp.stanford.edu/gapscore/
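
The cutoff trade-off mentioned under RESULTS can be illustrated with a minimal sketch. It is not GAPSCORE's implementation: the words, scores, and gold labels below are hypothetical, and the function name `precision_recall_f` is introduced only for illustration. It shows how raising or lowering a score cutoff shifts precision, recall, and the F-score (the harmonic mean of the two).

```python
from typing import List, Tuple

# Hypothetical (word, score, is_gene_name) triples; not taken from the Yapex data set.
SCORED: List[Tuple[str, float, bool]] = [
    ("BRCA1", 0.95, True),
    ("p53", 0.90, True),
    ("kinase", 0.60, False),
    ("Rad51", 0.55, True),
    ("cells", 0.30, False),
    ("sonic", 0.25, True),
    ("the", 0.05, False),
]


def precision_recall_f(
    scored: List[Tuple[str, float, bool]], cutoff: float
) -> Tuple[float, float, float]:
    """Precision, recall, and F-score when every word scoring >= cutoff is called a name."""
    tp = sum(1 for _, s, gold in scored if s >= cutoff and gold)
    fp = sum(1 for _, s, gold in scored if s >= cutoff and not gold)
    fn = sum(1 for _, s, gold in scored if s < cutoff and gold)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F-score: harmonic mean of precision and recall.
    f = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f


# A lower cutoff favors recall; a higher cutoff favors precision.
for cutoff in (0.2, 0.5, 0.8):
    p, r, f = precision_recall_f(SCORED, cutoff)
    print(f"cutoff={cutoff:.1f}  precision={p:.3f}  recall={r:.3f}  F={f:.3f}")
```

On this toy data, the lowest cutoff recovers every name at the cost of false positives, while the highest cutoff returns only correct names but misses half of them.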

Year:  2004        PMID: 14734313     DOI: 10.1093/bioinformatics/btg393

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


  22 in total

1.  NLProt: extracting protein names and sequences from papers.

Authors:  Sven Mika; Burkhard Rost
Journal:  Nucleic Acids Res       Date:  2004-07-01       Impact factor: 16.971

2.  A statistical approach to scanning the biomedical literature for pharmacogenetics knowledge.

Authors:  Daniel L Rubin; Caroline F Thorn; Teri E Klein; Russ B Altman
Journal:  J Am Med Inform Assoc       Date:  2004-11-23       Impact factor: 4.497

3.  Quantitative assessment of dictionary-based protein named entity tagging.

Authors:  Hongfang Liu; Zhang-Zhi Hu; Manabu Torii; Cathy Wu; Carol Friedman
Journal:  J Am Med Inform Assoc       Date:  2006-06-23       Impact factor: 4.497

4.  BioTagger-GM: a gene/protein name recognition system.

Authors:  Manabu Torii; Zhangzhi Hu; Cathy H Wu; Hongfang Liu
Journal:  J Am Med Inform Assoc       Date:  2008-12-11       Impact factor: 4.497

5.  Recent progress in automatically extracting information from the pharmacogenomic literature. (Review)

Authors:  Yael Garten; Adrien Coulet; Russ B Altman
Journal:  Pharmacogenomics       Date:  2010-10       Impact factor: 2.533

6.  eFIP: a tool for mining functional impact of phosphorylation from literature.

Authors:  Cecilia N Arighi; Amy Y Siu; Catalina O Tudor; Jules A Nchoutmboube; Cathy H Wu; Vijay K Shanker
Journal:  Methods Mol Biol       Date:  2011

7.  Systematic identification of pharmacogenomics information from clinical trials.

Authors:  Jiao Li; Zhiyong Lu
Journal:  J Biomed Inform       Date:  2012-04-24       Impact factor: 6.317

8.  BIOADI: a machine learning approach to identifying abbreviations and definitions in biological literature.

Authors:  Cheng-Ju Kuo; Maurice H T Ling; Kuan-Ting Lin; Chun-Nan Hsu
Journal:  BMC Bioinformatics       Date:  2009-12-03       Impact factor: 3.169

9.  EnzyMiner: automatic identification of protein level mutations and their impact on target enzymes from PubMed abstracts.

Authors:  Süveyda Yeniterzi; Ugur Sezerman
Journal:  BMC Bioinformatics       Date:  2009-08-27       Impact factor: 3.169

10.  3D-footprint: a database for the structural analysis of protein-DNA complexes.

Authors:  Bruno Contreras-Moreira
Journal:  Nucleic Acids Res       Date:  2009-09-18       Impact factor: 16.971
