
Surveying the manifold divergence of an entire protein class for statistical clues to underlying biochemical mechanisms.

Andrew F Neuwald

Abstract

Certain residues have no known function yet are co-conserved across distantly related protein families and diverse organisms, suggesting that they perform critical roles associated with as-yet-unidentified molecular properties and mechanisms. This raises the question of how to obtain additional clues regarding these mysterious biochemical phenomena with a view to formulating experimentally testable hypotheses. One approach is to access the implicit biochemical information encoded within the vast amount of genomic sequence data now becoming available. Here, a new Gibbs sampling strategy is formulated and implemented that can partition hundreds of thousands of sequences within a major protein class into multiple, functionally-divergent categories based on those pattern residues that best discriminate between categories. The sampler precisely defines the partition and pattern for each category by explicitly modeling unrelated, non-functional and related-yet-divergent proteins that would otherwise obscure the analysis. To aid biological interpretation, auxiliary routines can characterize pattern residues within available crystal structures and identify those structures most likely to shed light on the roles of pattern residues. This approach can be used to define and annotate automatically subgroup-specific conserved domain profiles based on statistically-rigorous empirical criteria rather than on the subjective and labor-intensive process of manual curation. Incorporating such profiles into domain database search sites (such as the NCBI BLAST site) will provide biologists with previously inaccessible molecular information useful for hypothesis generation and experimental design. Analyses of P-loop GTPases and of AAA+ ATPases illustrate the sampler's ability to obtain such information.
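The partitioning idea in the abstract can be illustrated with a toy collapsed Gibbs sampler. This is only a minimal sketch under stated assumptions, not the published sampler: it assumes pre-aligned sequences of equal length, a uniform prior over categories, and a symmetric Dirichlet pseudocount `alpha` for each column, and it omits the explicit modeling of unrelated, non-functional and related-yet-divergent proteins that the abstract describes. The function names `gibbs_partition` and `pattern_columns`, all parameter values, and the two short sequences are hypothetical and invented for illustration.

```python
import math
import random
from collections import Counter

N_RESIDUES = 20  # size of the standard amino-acid alphabet


def gibbs_partition(seqs, k=2, sweeps=100, alpha=0.01, seed=0):
    """Toy collapsed Gibbs sampler assigning aligned sequences to k categories.

    Hypothetical illustration only; not the sampler described in the paper.
    """
    rng = random.Random(seed)
    length = len(seqs[0])
    labels = [i % k for i in range(len(seqs))]  # simple alternating start
    # counts[c][j][r]: sequences in category c with residue r in column j
    counts = [[Counter() for _ in range(length)] for _ in range(k)]
    sizes = [0] * k
    for s, c in zip(seqs, labels):
        sizes[c] += 1
        for j, r in enumerate(s):
            counts[c][j][r] += 1

    for _ in range(sweeps):
        for i, s in enumerate(seqs):
            # Remove sequence i from its current category.
            old = labels[i]
            sizes[old] -= 1
            for j, r in enumerate(s):
                counts[old][j][r] -= 1
            # Log predictive probability of sequence i under each category.
            logw = []
            for c in range(k):
                denom = sizes[c] + alpha * N_RESIDUES
                logw.append(sum(math.log((counts[c][j][r] + alpha) / denom)
                                for j, r in enumerate(s)))
            top = max(logw)
            weights = [math.exp(x - top) for x in logw]
            # Sample a new category in proportion to those probabilities.
            new = rng.choices(range(k), weights=weights)[0]
            labels[i] = new
            sizes[new] += 1
            for j, r in enumerate(s):
                counts[new][j][r] += 1
    return labels


def pattern_columns(seqs, labels):
    """Columns whose consensus residue differs between non-empty categories."""
    groups = {}
    for s, c in zip(seqs, labels):
        groups.setdefault(c, []).append(s)
    cols = []
    for j in range(len(seqs[0])):
        consensus = {Counter(s[j] for s in g).most_common(1)[0][0]
                     for g in groups.values()}
        if len(consensus) > 1:
            cols.append(j)
    return cols


# Made-up toy data: two "families" sharing columns 0, 1 and 6.
seqs = ["GKSTLLAR"] * 12 + ["GKTSIVAQ"] * 12
labels = gibbs_partition(seqs, k=2, sweeps=100, seed=1)
print(labels)
print(pattern_columns(seqs, labels))
```

Because the sampler is stochastic, which category number each family receives is arbitrary. On strongly separated toy input like this, the two families would be expected to settle into different categories, and the reported columns are then the positions where their consensus residues differ.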

Year:  2011        PMID: 22331370      PMCID: PMC3176138          DOI: 10.2202/1544-6115.1666

Source DB:  PubMed          Journal:  Stat Appl Genet Mol Biol        ISSN: 1544-6115


Cited by:  12 in total

1.  Initial Cluster Analysis.

Authors:  Stephen F Altschul; Andrew F Neuwald
Journal:  J Comput Biol       Date:  2017-08-03       Impact factor: 1.479

2.  Evaluating, comparing, and interpreting protein domain hierarchies.

Authors:  Andrew F Neuwald
Journal:  J Comput Biol       Date:  2014-02-21       Impact factor: 1.479

3.  A Bayesian sampler for optimization of protein domain hierarchies.

Authors:  Andrew F Neuwald
Journal:  J Comput Biol       Date:  2014-02-04       Impact factor: 1.479

4.  Tracing the origin and evolution of pseudokinases across the tree of life.

Authors:  Annie Kwon; Steven Scott; Rahil Taujale; Wayland Yeung; Krys J Kochut; Patrick A Eyers; Natarajan Kannan
Journal:  Sci Signal       Date:  2019-04-23       Impact factor: 8.192

5.  Automated hierarchical classification of protein domain subfamilies based on functionally-divergent residue signatures.

Authors:  Andrew F Neuwald; Christopher J Lanczycki; Aron Marchler-Bauer
Journal:  BMC Bioinformatics       Date:  2012-06-22       Impact factor: 3.169

6.  Co-conserved MAPK features couple D-domain docking groove to distal allosteric sites via the C-terminal flanking tail.

Authors:  Tuan Nguyen; Zheng Ruan; Krishnadev Oruganty; Natarajan Kannan
Journal:  PLoS One       Date:  2015-03-23       Impact factor: 3.240

7.  Inference of Functionally-Relevant N-acetyltransferase Residues Based on Statistical Correlations.

Authors:  Andrew F Neuwald; Stephen F Altschul
Journal:  PLoS Comput Biol       Date:  2016-12-21       Impact factor: 4.475

8.  Synthetic biology for the directed evolution of protein biocatalysts: navigating sequence space intelligently. [Review]

Authors:  Andrew Currin; Neil Swainston; Philip J Day; Douglas B Kell
Journal:  Chem Soc Rev       Date:  2015-03-07       Impact factor: 54.564

9.  Identification and classification of small molecule kinases: insights into substrate recognition and specificity.

Authors:  Krishnadev Oruganty; Eric E Talevich; Andrew F Neuwald; Natarajan Kannan
Journal:  BMC Evol Biol       Date:  2016-01-06       Impact factor: 3.260

10.  Statistical investigations of protein residue direct couplings.

Authors:  Andrew F Neuwald; Stephen F Altschul
Journal:  PLoS Comput Biol       Date:  2018-12-31       Impact factor: 4.475
