
Genome-scale phylogenetic function annotation of large and diverse protein families.

Barbara E Engelhardt, Michael I Jordan, John R Srouji, Steven E Brenner.

Abstract

The Statistical Inference of Function Through Evolutionary Relationships (SIFTER) framework uses a statistical graphical model that applies phylogenetic principles to automate precise protein function prediction. Here we present a revised approach (SIFTER version 2.0) that enables annotations on a genomic scale. SIFTER 2.0 produces equivalently precise predictions compared to the earlier version on a carefully studied family and on a collection of 100 protein families. We have added an approximation method to SIFTER 2.0 and show a 500-fold improvement in speed with minimal impact on prediction results in the functionally diverse sulfotransferase protein family. On the Nudix protein family, previously inaccessible to the SIFTER framework because of the 66 possible molecular functions, SIFTER achieved 47.4% accuracy on experimental data (where BLAST achieved 34.0%). Finally, we used SIFTER to annotate all of the Schizosaccharomyces pombe proteins with experimental functional characterizations, based on annotations from proteins in 46 fungal genomes. SIFTER precisely predicted molecular function for 45.5% of the characterized proteins in this genome, as compared with four current function prediction methods that precisely predicted function for 62.6%, 30.6%, 6.0%, and 5.7% of these proteins. We use both precision-recall curves and ROC analyses to compare these genome-scale predictions across the different methods and to assess performance on different types of applications. SIFTER 2.0 is capable of predicting protein molecular function for large and functionally diverse protein families using an approximate statistical model, enabling phylogenetics-based protein function prediction for genome-wide analyses. The code for SIFTER and protein family data are available at http://sifter.berkeley.edu.
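The precision-recall and ROC comparison mentioned in the abstract can be illustrated with a minimal sketch. It assumes each prediction is reduced to a confidence score plus a flag saying whether it matches the experimental annotation. The function name, input layout, and example scores are hypothetical and are not taken from the SIFTER code or data.

```python
from typing import List, Tuple

# Hypothetical input: (confidence score, True if the predicted function
# matches the experimentally characterized function).
ScoredPrediction = Tuple[float, bool]


def pr_and_roc_points(
    scored: List[ScoredPrediction],
) -> List[Tuple[float, float, float, float]]:
    """Return (threshold, precision, recall, false_positive_rate) tuples.

    One tuple is produced per distinct score, treating every prediction
    with a score at or above that threshold as "called".
    """
    total_pos = sum(1 for _, correct in scored if correct)
    total_neg = len(scored) - total_pos
    if total_pos == 0 or total_neg == 0:
        raise ValueError("need at least one correct and one incorrect prediction")

    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    points = []
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Consume all predictions tied at this score before emitting a point.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1]:
                tp += 1
            else:
                fp += 1
            i += 1
        precision = tp / (tp + fp)
        recall = tp / total_pos            # also the ROC true-positive rate
        false_positive_rate = fp / total_neg
        points.append((threshold, precision, recall, false_positive_rate))
    return points


if __name__ == "__main__":
    # Made-up scores for illustration only.
    example = [(0.9, True), (0.8, False), (0.7, True), (0.4, False)]
    for point in pr_and_roc_points(example):
        print(point)
```

Plotting precision against recall gives the precision-recall curve, and plotting recall against the false-positive rate gives the ROC curve. Running the same routine on each method's predictions lets the curves be compared on a shared axis.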

Year: 2011; PMID: 21784873; PMCID: PMC3205580; DOI: 10.1101/gr.104687.109

Source DB: PubMed; Journal: Genome Res; ISSN: 1088-9051; Impact factor: 9.043


