PFP: Automated prediction of gene ontology functional annotations with confidence scores using protein sequence data.

Troy Hawkins, Meghana Chitale, Stanislav Luban, Daisuke Kihara.

Abstract

Protein function prediction is a central problem in bioinformatics, increasing in importance recently due to the rapid accumulation of biological data awaiting interpretation. Sequence data represents the bulk of this new stock and is the obvious target for consideration as input, as newly sequenced organisms often lack any other type of biological characterization. We have previously introduced PFP (Protein Function Prediction) as our sequence-based predictor of Gene Ontology (GO) functional terms. PFP interprets the results of a PSI-BLAST search by extracting and scoring individual functional attributes, searching a wide range of E-value sequence matches, and utilizing conventional data mining techniques to fill in missing information. We have shown it to be effective in predicting both specific and low-resolution functional attributes when sufficient data is unavailable. Here we describe (1) significant improvements to the PFP infrastructure, including the addition of prediction significance and confidence scores, (2) a thorough benchmark of performance and comparisons to other related prediction methods, and (3) applications of PFP predictions to genome-scale data. We applied PFP predictions to uncharacterized protein sequences from 15 organisms. Among these sequences, 60-90% could be annotated with a GO molecular function term at high confidence (≥80%). We also applied our predictions to the protein-protein interaction network of the Malaria plasmodium (Plasmodium falciparum). High confidence GO biological process predictions (≥90%) from PFP increased the number of fully enriched interactions in this dataset from 23% of interactions to 94%. Our benchmark comparison shows significant performance improvement of PFP relative to GOtcha, InterProScan, and PSI-BLAST predictions. This is consistent with the performance of PFP as the overall best predictor in both the AFP-SIG '05 and CASP7 function (FN) assessments.
PFP is available as a web service at http://dragon.bio.purdue.edu/pfp/. (c) 2008 Wiley-Liss, Inc.
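
The abstract describes scoring individual GO terms across PSI-BLAST hits drawn from a wide E-value range. The snippet below is a minimal sketch of that general idea, not the PFP implementation: the weighting `-log10(E) + offset`, the default `offset` of 2.0, the function name `score_go_terms`, and the example hits are all illustrative assumptions that do not come from the abstract.

```python
import math
from collections import defaultdict


def score_go_terms(hits, offset=2.0):
    """Accumulate a score per GO term over sequence-search hits.

    hits   -- list of (e_value, [go_term, ...]) pairs (hypothetical input format)
    offset -- assumed constant added to -log10(E) so that weak hits
              (E-value above 1) can still contribute a small positive weight
    """
    scores = defaultdict(float)
    for e_value, go_terms in hits:
        # Clamp the E-value so that a reported E-value of 0 does not break log10.
        weight = -math.log10(max(e_value, 1e-300)) + offset
        if weight <= 0:
            # Hits weaker than the assumed cutoff contribute nothing.
            continue
        for term in go_terms:
            scores[term] += weight
    return dict(scores)


# Illustrative hits only; the E-values and annotations are made up.
example_hits = [
    (1e-50, ["GO:0003824"]),
    (1e-3, ["GO:0003824", "GO:0005524"]),
    (10.0, ["GO:0005524"]),
]
example_scores = score_go_terms(example_hits)
ranked = sorted(example_scores.items(), key=lambda kv: kv[1], reverse=True)
```

In this sketch a term supported by several hits, including weak ones, accumulates a higher score than a term seen once, which is the intuition behind using low-significance matches rather than discarding them. Converting such raw scores into the confidence values mentioned in the abstract would require a benchmark calibration that is not shown here.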

Year:  2009        PMID: 18655063     DOI: 10.1002/prot.22172

Source DB:  PubMed          Journal:  Proteins        ISSN: 0887-3585


  41 in total

1.  Real-time ligand binding pocket database search using local surface descriptors.

Authors:  Rayan Chikhi; Lee Sael; Daisuke Kihara
Journal:  Proteins       Date:  2010-07

2.  Computational characterization of moonlighting proteins. (Review)

Authors:  Ishita K Khan; Daisuke Kihara
Journal:  Biochem Soc Trans       Date:  2014-12       Impact factor: 5.407

3.  Structure- and sequence-based function prediction for non-homologous proteins.

Authors:  Lee Sael; Meghana Chitale; Daisuke Kihara
Journal:  J Struct Funct Genomics       Date:  2012-01-22

4.  ESG: extended similarity group method for automated protein function prediction.

Authors:  Meghana Chitale; Troy Hawkins; Changsoon Park; Daisuke Kihara
Journal:  Bioinformatics       Date:  2009-05-12       Impact factor: 6.937

5.  Computational Methods for Predicting Protein-Protein Interactions Using Various Protein Features.

Authors:  Ziyun Ding; Daisuke Kihara
Journal:  Curr Protoc Protein Sci       Date:  2018-06-21

6.  Identification of Moonlighting Proteins in Genomes Using Text Mining Techniques.

Authors:  Aashish Jain; Hareesh Gali; Daisuke Kihara
Journal:  Proteomics       Date:  2018-10-10       Impact factor: 3.984

7.  Integrated protein function prediction by mining function associations, sequences, and protein-protein and gene-gene interaction networks.

Authors:  Renzhi Cao; Jianlin Cheng
Journal:  Methods       Date:  2015-09-11       Impact factor: 3.608

8.  MetaGO: Predicting Gene Ontology of Non-homologous Proteins Through Low-Resolution Protein Structure Prediction and Protein-Protein Network Mapping.

Authors:  Chengxin Zhang; Wei Zheng; Peter L Freddolino; Yang Zhang
Journal:  J Mol Biol       Date:  2018-03-10       Impact factor: 5.469

9.  The PFP and ESG protein function prediction methods in 2014: effect of database updates and ensemble approaches.

Authors:  Ishita K Khan; Qing Wei; Samuel Chapman; Dukka B Kc; Daisuke Kihara
Journal:  Gigascience       Date:  2015-09-14       Impact factor: 6.524

10.  Functional enrichment analyses and construction of functional similarity networks with high confidence function prediction by PFP.

Authors:  Troy Hawkins; Meghana Chitale; Daisuke Kihara
Journal:  BMC Bioinformatics       Date:  2010-05-19       Impact factor: 3.169
