The PFP and ESG protein function prediction methods in 2014: effect of database updates and ensemble approaches.

Ishita K Khan, Qing Wei, Samuel Chapman, Dukka B Kc, Daisuke Kihara.

Abstract

BACKGROUND: Functional annotation of novel proteins is one of the central problems in bioinformatics. With the ever-increasing development of genome sequencing technologies, more and more sequence information is becoming available to analyze and annotate. To achieve fast and automatic function annotation, many computational (automated) function prediction (AFP) methods have been developed. To objectively evaluate the performance of such methods on a large scale, community-wide assessment experiments have been conducted. The second round of the Critical Assessment of Function Annotation (CAFA) experiment was held in 2013-2014. Evaluation of participating groups was reported in a special interest group meeting at the Intelligent Systems for Molecular Biology (ISMB) conference in Boston in 2014. Our group participated in both CAFA1 and CAFA2 using multiple in-house AFP methods. Here, we report benchmark results of our methods, obtained while preparing for CAFA2 and before submitting function predictions for CAFA2 targets.
RESULTS: For CAFA2, we updated the annotation databases used by our methods, protein function prediction (PFP) and extended similarity group (ESG), and benchmarked their function prediction performance using the original (older) and updated databases. Performance evaluations for PFP with different settings and for ESG are discussed. We also developed two ensemble methods that combine function predictions from six independent, sequence-based AFP methods. We further analyzed the performance of our prediction methods by enriching the predictions with the prior distribution of gene ontology (GO) terms. Examples of predictions by the ensemble methods are discussed.
CONCLUSIONS: Updating the annotation database was successful, improving the Fmax prediction accuracy score for both PFP and ESG. Adding the prior distribution of GO terms yielded little improvement. Both of the ensemble methods we developed improved the average Fmax score over all individual component methods except for ESG. Our benchmark results will not only complement the overall assessment that will be done by the CAFA organizers, but also help elucidate the predictive power of sequence-based function prediction methods in general.
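
NOTE: The abstract reports results as Fmax scores without defining the measure. As a minimal sketch, assuming the protein-centric definition commonly used in CAFA evaluations (the maximum, over score thresholds, of the harmonic mean of averaged precision and recall), the score could be computed as below. The function name `fmax`, the data layout, and the toy GO identifiers are hypothetical illustrations, not the authors' code, and propagation of GO terms to their ancestors is omitted.

```python
def fmax(predictions, truth, thresholds=None):
    """Protein-centric Fmax (illustrative sketch, not the authors' implementation).

    predictions: {protein_id: {go_term: confidence score in [0, 1]}}
    truth:       {protein_id: set of experimentally annotated GO terms}
    """
    if thresholds is None:
        # Assumed threshold grid: 0.01, 0.02, ..., 1.00
        thresholds = [i / 100 for i in range(1, 101)]

    # Proteins without any true annotation cannot contribute to recall.
    targets = {p: terms for p, terms in truth.items() if terms}
    if not targets:
        return 0.0

    best = 0.0
    for t in thresholds:
        precisions = []
        recall_sum = 0.0
        for protein, true_terms in targets.items():
            scored = predictions.get(protein, {})
            predicted = {term for term, score in scored.items() if score >= t}
            hits = len(predicted & true_terms)
            # Precision is averaged only over proteins with a prediction at t.
            if predicted:
                precisions.append(hits / len(predicted))
            # Recall is averaged over all target proteins.
            recall_sum += hits / len(true_terms)
        if not precisions:
            continue
        precision = sum(precisions) / len(precisions)
        recall = recall_sum / len(targets)
        if precision + recall > 0:
            best = max(best, 2 * precision * recall / (precision + recall))
    return best


# Toy data with placeholder GO identifiers (not real annotations).
toy_predictions = {
    "P1": {"GO:A": 0.9, "GO:B": 0.4},
    "P2": {"GO:C": 0.6},
}
toy_truth = {
    "P1": {"GO:A"},
    "P2": {"GO:C", "GO:D"},
}
print(round(fmax(toy_predictions, toy_truth), 3))
```

In the toy data the best trade-off falls at thresholds between 0.41 and 0.60, where every remaining prediction is correct but one true term of P2 is still missed.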

Keywords: CAFA; ESG; PFP; Protein function; consensus method; ensemble method; function prediction; gene annotation; sequence

Year: 2015; PMID: 26380077; PMCID: PMC4570625; DOI: 10.1186/s13742-015-0083-4

Source DB: PubMed; Journal: Gigascience; ISSN: 2047-217X; Impact factor: 6.524


  4 in total

1.  ContactPFP: Protein function prediction using predicted contact information.

Authors:  Yuki Kagaya; Sean T Flannery; Aashish Jain; Daisuke Kihara
Journal:  Front Bioinform       Date:  2022-06-02

2.  BUSCA: an integrative web server to predict subcellular localization of proteins.

Authors:  Castrense Savojardo; Pier Luigi Martelli; Piero Fariselli; Giuseppe Profiti; Rita Casadio
Journal:  Nucleic Acids Res       Date:  2018-07-02       Impact factor: 16.971

3.  INGA 2.0: improving protein function prediction for the dark proteome.

Authors:  Damiano Piovesan; Silvio C E Tosatto
Journal:  Nucleic Acids Res       Date:  2019-07-02       Impact factor: 16.971

4.  Proteomic profiling of hydatid fluid from pulmonary cystic echinococcosis.

Authors:  Guilherme Brzoskowski Dos Santos; Edileuza Danieli da Silva; Eduardo Shigueo Kitano; Maria Eduarda Battistella; Karina Mariante Monteiro; Jeferson Camargo de Lima; Henrique Bunselmeyer Ferreira; Solange Maria de Toledo Serrano; Arnaldo Zaha
Journal:  Parasit Vectors       Date:  2022-03-21       Impact factor: 3.876
