Dynamic programming re-ranking for PPI interactor and pair extraction in full-text articles.

Richard Tzong-Han Tsai, Po-Ting Lai.

Abstract

BACKGROUND: Experimentally verified protein-protein interactions (PPIs) cannot be easily retrieved by researchers unless they are stored in PPI databases. The curation of such databases can be facilitated by employing text-mining systems to identify genes which play the interactor role in PPIs and to map these genes to unique database identifiers (interactor normalization task or INT) and then to return a list of interaction pairs for each article (interaction pair task or IPT). These two tasks are evaluated in terms of the area under curve of the interpolated precision/recall (AUC iP/R) score because the order of identifiers in the output list is important for ease of curation.
RESULTS: Our INT system, developed for the BioCreAtIvE II.5 INT challenge, achieved a promising AUC iP/R of 43.5% by using a support vector machine (SVM)-based ranking procedure. Using our new re-ranking algorithm, we improved system performance (AUC iP/R) by 1.84%. Our experimental results also show that, with the re-ranked INT results, our unsupervised IPT system achieves a competitive AUC iP/R of 23.86%, which outperforms the best BC II.5 INT system by 1.64%. Compared to using only SVM-ranked INT results, using re-ranked INT results boosts AUC iP/R by 7.84%. A t-test shows that our INT/IPT system with re-ranking outperforms the system without re-ranking by a statistically significant margin.
CONCLUSIONS: In this paper, we present a new re-ranking algorithm that considers co-occurrence among identifiers in an article to improve INT and IPT ranking results. Combining the re-ranked INT results with an unsupervised approach to find associations among interactors, the proposed method can boost the IPT performance. We also implement score computation using dynamic programming, which is faster and more efficient than traditional approaches.
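
The AUC iP/R measure used above to evaluate both tasks can be illustrated with a minimal sketch. It assumes the common definition in which the interpolated precision at a recall level is the highest precision observed at that recall or any higher one, and the area is summed over the recall steps at which a correct identifier appears. The function name `auc_ipr` and the toy identifiers are hypothetical, and the official BioCreAtIvE II.5 evaluation script may differ in details such as tie handling and averaging over articles.

```python
from typing import List, Sequence, Set, Tuple


def auc_ipr(ranked_ids: Sequence[str], gold_ids: Set[str]) -> float:
    """Area under the interpolated precision/recall curve for one ranked list.

    Assumes ranked_ids contains no duplicate identifiers.
    """
    if not gold_ids:
        return 0.0

    # (recall, precision) at every rank where a correct identifier appears.
    points: List[Tuple[float, float]] = []
    hits = 0
    for rank, ident in enumerate(ranked_ids, start=1):
        if ident in gold_ids:
            hits += 1
            points.append((hits / len(gold_ids), hits / rank))

    # Interpolate: precision at recall r is the best precision at any recall >= r.
    best = 0.0
    interpolated: List[Tuple[float, float]] = []
    for recall, precision in reversed(points):
        best = max(best, precision)
        interpolated.append((recall, best))
    interpolated.reverse()

    # Sum rectangle areas between consecutive recall levels, starting at 0.
    area = 0.0
    previous_recall = 0.0
    for recall, precision in interpolated:
        area += precision * (recall - previous_recall)
        previous_recall = recall
    return area


if __name__ == "__main__":
    # Toy example: three gold identifiers, two of them retrieved.
    print(auc_ipr(["A", "X", "B", "Y"], {"A", "B", "C"}))
```

Because the score rewards correct identifiers that appear early in the list, moving a correct identifier up by re-ranking raises the score even when the set of returned identifiers is unchanged.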

Year:  2011        PMID: 21342534      PMCID: PMC3053256          DOI: 10.1186/1471-2105-12-60

Source DB:  PubMed          Journal:  BMC Bioinformatics        ISSN: 1471-2105            Impact factor:   3.169


