
Extraction of protein interaction data: a comparative analysis of methods in use.

Hena Jose, Thangavel Vadivukarasi, Jyothi Devakumar.

Abstract

Several natural language processing (NLP) tools, both commercial and freely available, are used to extract protein interactions from publications. The methods these tools use range from pattern matching to dynamic programming, each with its own recall and precision rates. A methodical survey of these tools against manual analysis, keeping in mind the minimum interaction information a researcher would need, has not been carried out. We compared data generated by a selection of NLP tools with manually curated protein interaction data (PathArt and IMaps) to determine their recall and precision rates. The rates were found to be lower than the published scores when a normalized definition of interaction is considered. Each data point captured wrongly or missed by a tool was analyzed. Our evaluation brings out critical failures of NLP tools and provides pointers for the development of an ideal NLP tool.
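
The comparison described in the abstract, tool output scored against a manually curated set, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the function names `normalize` and `precision_recall` are hypothetical, the example pairs are invented rather than taken from PathArt or IMaps, and treating an interaction as an unordered, case-insensitive pair of protein names is only one possible reading of a "normalized definition"; the abstract does not specify the authors' definition.

```python
def normalize(pair):
    """Assumed normalization: an unordered, case-insensitive pair of protein names."""
    a, b = pair
    return frozenset((a.strip().upper(), b.strip().upper()))


def precision_recall(extracted, curated):
    """Score tool output against a manually curated reference set.

    Returns precision, recall, and the two error sets (wrongly captured
    and missed interactions) so each failure can be inspected.
    """
    extracted_set = {normalize(p) for p in extracted}
    curated_set = {normalize(p) for p in curated}
    true_positives = extracted_set & curated_set

    precision = len(true_positives) / len(extracted_set) if extracted_set else 0.0
    recall = len(true_positives) / len(curated_set) if curated_set else 0.0

    false_positives = extracted_set - curated_set  # captured wrongly by the tool
    false_negatives = curated_set - extracted_set  # not picked up by the tool
    return precision, recall, false_positives, false_negatives


# Invented example data, for illustration only.
curated = [("TP53", "MDM2"), ("AKT1", "GSK3B"), ("RAF1", "MAP2K1")]
extracted = [("mdm2", "tp53"), ("AKT1", "GSK3B"), ("TP53", "EGFR"), ("MYC", "MAX")]

precision, recall, fp, fn = precision_recall(extracted, curated)
print(f"precision={precision:.2f} recall={recall:.2f}")
print("wrongly captured:", sorted(sorted(p) for p in fp))
print("missed:", sorted(sorted(p) for p in fn))
```

Returning the false-positive and false-negative sets alongside the two rates mirrors the abstract's point that each wrongly captured or missed data point was analyzed individually, not just counted.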


Year:  2007        PMID: 18274648      PMCID: PMC3171344          DOI: 10.1155/2007/53096

Source DB:  PubMed          Journal:  EURASIP J Bioinform Syst Biol        ISSN: 1687-4145


  5 in total

1.  PIMiner: a web tool for extraction of protein interactions from biomedical literature.

Authors:  Rajesh Chowdhary; Jinfeng Zhang; Sin Lam Tan; Daniel E Osborne; Vladimir B Bajic; Jun S Liu
Journal:  Int J Data Min Bioinform       Date:  2013       Impact factor: 0.667

2.  Improved homology-driven computational validation of protein-protein interactions motivated by the evolutionary gene duplication and divergence hypothesis.

Authors:  Christian Frech; Michael Kommenda; Viktoria Dorfer; Thomas Kern; Helmut Hintner; Johann W Bauer; Kamil Onder
Journal:  BMC Bioinformatics       Date:  2009-01-19       Impact factor: 3.169

3.  A realistic assessment of methods for extracting gene/protein interactions from free text.

Authors:  Renata Kabiljo; Andrew B Clegg; Adrian J Shepherd
Journal:  BMC Bioinformatics       Date:  2009-07-28       Impact factor: 3.169

4.  Large scale application of neural network based semantic role labeling for automated relation extraction from biomedical texts.

Authors:  Thorsten Barnickel; Jason Weston; Ronan Collobert; Hans-Werner Mewes; Volker Stümpflen
Journal:  PLoS One       Date:  2009-07-28       Impact factor: 3.240

5.  Information-based methods for predicting gene function from systematic gene knock-downs.

Authors:  Matthew T Weirauch; Christopher K Wong; Alexandra B Byrne; Joshua M Stuart
Journal:  BMC Bioinformatics       Date:  2008-10-29       Impact factor: 3.169

