
Enhanced automated function prediction using distantly related sequences and contextual association by PFP.

Troy Hawkins, Stanislav Luban, Daisuke Kihara.

Abstract

The impetus for the recent development and emergence of automated function prediction methods is an exponentially growing flood of new experimental data, the interpretation of which is hindered by a shortage of reliable annotations for proteins that lack experimental characterization or significant homologs in current databases. Here we introduce PFP, an automated function prediction server that provides the most probable annotations for a query sequence in each of the three branches of the Gene Ontology (GO): biological process, molecular function, and cellular component. Rather than relying on precise pattern matching to identify functional motifs in the sequences and structures of these proteins, we designed PFP to increase the coverage of function annotation by lowering the resolution of predictions when a detailed function is not predictable. To do this, we extend a traditional PSI-BLAST search by extracting and scoring annotations (GO terms) individually, including annotations from distantly related sequences, and by applying a novel data mining tool, the Function Association Matrix, to score strongly associated pairs of annotations. We show that PFP can correctly assign function using only weakly similar sequences, with significantly better accuracy and coverage than a standard PSI-BLAST search, improving on it more than fivefold. The most descriptive annotations predicted by PFP (GO depth ≥ 8) can identify a significant subgraph in the GO with >60% accuracy and approximately 100% coverage for our benchmark set. We also provide examples of the superb performance of PFP in an assessment of automated function prediction servers at the Automated Function Prediction Special Interest Group meeting at ISMB 2005 (AFP-SIG '05).
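The scoring idea described in the abstract (score each GO term individually over all search hits, including weak ones, and let strongly associated terms reinforce each other) can be illustrated with a minimal sketch. The abstract does not state the scoring formula, so the E-value weight `-log10(E) + b`, the offset `b = 2`, the function name `score_go_terms`, and the layout of the association table are all assumptions made for illustration. The GO identifiers are placeholders, not real terms.

```python
import math
from collections import defaultdict


def score_go_terms(hits, assoc, b=2.0):
    """Rank GO terms collected from sequence-search hits.

    hits  : list of (e_value, [go_term, ...]) pairs, one per retrieved sequence.
    assoc : dict mapping an observed term to {associated_term: probability},
            a stand-in for a function association matrix (assumed layout).
    b     : offset that lets weak hits (E-value up to 10**b) still contribute
            (assumed value, not taken from the abstract).
    """
    scores = defaultdict(float)
    for e_value, terms in hits:
        # Clamp to avoid log10(0) for hits reported with an E-value of zero.
        weight = -math.log10(max(e_value, 1e-300)) + b
        if weight <= 0:
            continue  # hit is too weak to contribute under this weighting
        for observed in terms:
            # Direct evidence: the hit itself carries this annotation.
            scores[observed] += weight
            # Indirect evidence: terms associated with the observed one.
            for associated, prob in assoc.get(observed, {}).items():
                if associated != observed:
                    scores[associated] += weight * prob
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical input: one strong hit and one weak hit.
example_hits = [(1e-3, ["GO:A"]), (10.0, ["GO:B"])]
example_assoc = {"GO:B": {"GO:C": 0.5}}
ranked = score_go_terms(example_hits, example_assoc)
for term, score in ranked:
    print(f"{term}\t{score:.2f}")
```

In this sketch a hit with E-value 10, which a conventional search would discard, still contributes a small positive weight, and a term that never appears among the hits (`GO:C`) can receive a score through its association with an observed term.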

Year:  2006        PMID: 16672240      PMCID: PMC2242549          DOI: 10.1110/ps.062153506

Source DB:  PubMed          Journal:  Protein Sci        ISSN: 0961-8368            Impact factor:   6.725


