Learning generative models for protein fold families.

Sivaraman Balakrishnan, Hetunandan Kamisetty, Jaime G Carbonell, Su-In Lee, Christopher James Langmead.

Abstract

We introduce a new approach to learning statistical models from multiple sequence alignments (MSA) of proteins. Our method, called GREMLIN (Generative REgularized ModeLs of proteINs), learns an undirected probabilistic graphical model of the amino acid composition within the MSA. The resulting model encodes both the position-specific conservation statistics and the correlated mutation statistics between sequential and long-range pairs of residues. Existing techniques for learning graphical models from MSA either make strong, and often inappropriate, assumptions about the conditional independencies within the MSA (e.g., Hidden Markov Models), or else use suboptimal algorithms to learn the parameters of the model. In contrast, GREMLIN makes no a priori assumptions about the conditional independencies within the MSA. We formulate and solve a convex optimization problem, thus guaranteeing that we find a globally optimal model at convergence. The resulting model is also generative, allowing for the design of new protein sequences that have the same statistical properties as those in the MSA. We perform a detailed analysis of covariation statistics on the extensively studied WW and PDZ domains and show that our method outperforms an existing algorithm for learning undirected probabilistic graphical models from MSA. We then apply our approach to 71 additional families from the PFAM database and demonstrate that the resulting models significantly outperform Hidden Markov Models in terms of predictive accuracy.
Copyright © 2011 Wiley-Liss, Inc.
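
The abstract names two kinds of statistics that the learned model encodes: position-specific conservation and correlated mutations between pairs of residues. As a rough illustration only, the Python sketch below computes simple empirical versions of both (per-column amino acid frequencies and pairwise mutual information) on a toy alignment. It is not GREMLIN's learning procedure, which the abstract describes as a convex optimization over an undirected graphical model; the function names and the toy alignment are hypothetical and chosen purely for this example.

```python
import math
from collections import Counter


def column_frequencies(msa, i):
    """Empirical amino acid frequencies at alignment column i."""
    counts = Counter(seq[i] for seq in msa)
    n = len(msa)
    return {aa: c / n for aa, c in counts.items()}


def mutual_information(msa, i, j):
    """Mutual information (in bits) between alignment columns i and j."""
    n = len(msa)
    fi = column_frequencies(msa, i)
    fj = column_frequencies(msa, j)
    pair_counts = Counter((seq[i], seq[j]) for seq in msa)
    mi = 0.0
    for (a, b), c in pair_counts.items():
        p_ab = c / n
        # Compare the joint frequency with the product of the marginals.
        mi += p_ab * math.log2(p_ab / (fi[a] * fj[b]))
    return mi


# Toy alignment (hypothetical data): columns 0 and 2 co-vary,
# column 1 is fully conserved.
toy_msa = ["AGL", "AGL", "VGI", "VGI"]

conserved = column_frequencies(toy_msa, 1)
coupled = mutual_information(toy_msa, 0, 2)
uncoupled = mutual_information(toy_msa, 0, 1)
```

Pairwise measures of this kind treat each pair of columns in isolation. A joint graphical model such as the one the abstract describes is instead fitted over all positions at once.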

Year:  2011        PMID: 21268112     DOI: 10.1002/prot.22934

Source DB:  PubMed          Journal:  Proteins        ISSN: 0887-3585


  125 in total

1.  From residue coevolution to protein conformational ensembles and functional dynamics.

Authors:  Ludovico Sutto; Simone Marsili; Alfonso Valencia; Francesco Luigi Gervasio
Journal:  Proc Natl Acad Sci U S A       Date:  2015-10-20       Impact factor: 11.205

2.  Protein contact prediction by integrating joint evolutionary coupling analysis and supervised learning.

Authors:  Jianzhu Ma; Sheng Wang; Zhiyong Wang; Jinbo Xu
Journal:  Bioinformatics       Date:  2015-08-14       Impact factor: 6.937

3.  Synthetic protein alignments by CCMgen quantify noise in residue-residue contact prediction.

Authors:  Susann Vorberg; Stefan Seemayer; Johannes Söding
Journal:  PLoS Comput Biol       Date:  2018-11-05       Impact factor: 4.475

4.  Assessing the utility of coevolution-based residue-residue contact predictions in a sequence- and structure-rich era.

Authors:  Hetunandan Kamisetty; Sergey Ovchinnikov; David Baker
Journal:  Proc Natl Acad Sci U S A       Date:  2013-09-05       Impact factor: 11.205

5.  Evolutionary couplings of amino acid residues reveal structure and function of bacterial signaling proteins.

Authors:  Hendrik Szurmant
Journal:  Mol Microbiol       Date:  2019-07-03       Impact factor: 3.501

6.  Systematic Testing of Belief-Propagation Estimates for Absolute Free Energies in Atomistic Peptides and Proteins.

Authors:  Rory M Donovan-Maiye; Christopher J Langmead; Daniel M Zuckerman
Journal:  J Chem Theory Comput       Date:  2017-12-22       Impact factor: 6.006

7.  Structure of the Bacterial Cytoskeleton Protein Bactofilin by NMR Chemical Shifts and Sequence Variation.

Authors:  Maher M Kassem; Yong Wang; Wouter Boomsma; Kresten Lindorff-Larsen
Journal:  Biophys J       Date:  2016-06-07       Impact factor: 4.033

8.  Learning sequence determinants of protein:protein interaction specificity with sparse graphical models.

Authors:  Hetunandan Kamisetty; Bornika Ghosh; Christopher James Langmead; Chris Bailey-Kellogg
Journal:  J Comput Biol       Date:  2015-05-14       Impact factor: 1.479

9.  Zinc finger domain of the human DTX protein adopts a unique RING fold.

Authors:  Kazuhide Miyamoto; Yuma Fujiwara; Kazuki Saito
Journal:  Protein Sci       Date:  2019-04-12       Impact factor: 6.725

10.  The unique N-terminal zinc finger of synaptotagmin-like protein 4 reveals FYVE structure.

Authors:  Kazuhide Miyamoto; Arisa Nakatani; Kazuki Saito
Journal:  Protein Sci       Date:  2017-10-25       Impact factor: 6.725
