Optimally informative backbone structural propensities in proteins.

Armando D Solis, S Rackovsky.

Abstract

We use basic ideas from information theory to extract the maximum amount of structural information available in protein sequence data. From a non-redundant set of protein X-ray structures, we construct local-sequence-dependent [phi,psi] distributions that summarize the influence of local sequence on backbone conformation. These distributions, approximations of actual backbone propensities in the folded protein, have the following properties: (1) They compensate for the problem of scarce data by an optimized combination of local-sequence-dependent and single-residue-specific distributions; (2) They use multi-residue information; (3) They exploit similarities in the local coding properties of amino acids by collapsing the amino acid alphabet to streamline local sequence description; (4) They are designed to contain the maximum amount of local structural information the data set allows. Our methodology is able to extract around 30 cnats of information from the protein data set out of a total of 387 cnats of initial uncertainty or entropy in a finely discretized [phi,psi] dihedral angle space (18 x 18 structural states), or about 7.8%. This was achieved at the hexamer length scale; shorter as well as longer fragments produce reduced information gains. The automatic clustering of amino acids into groups, a component of the optimization procedure, reveals patterns consistent with their local coding properties. While the overall information gain from local sequence is small, there are some local sequences that have significantly narrower structural distributions than others. Distribution width varies from at least 20% less than the average overall entropy to at least 14% above. This spread is an expression of the influence of local sequence on the conformational propensities of the backbone chain.
The optimal ensemble of local-sequence-specific backbone distributions produced is useful as a guide to structural predictions from sequence, as well as a tool for further explorations of the nature of the local protein code. Copyright 2002 Wiley-Liss, Inc.
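
The entropy figures quoted in the abstract can be illustrated with a minimal sketch. It is not the authors' code. It assumes that "cnats" means centinats (1 nat = 100 cnats) and that the combination of local-sequence-dependent and single-residue distributions is a simple linear mixture; the abstract states neither. The names `entropy_cnats`, `mix_distributions`, and `weight` are hypothetical.

```python
import math

def entropy_cnats(probs):
    """Shannon entropy of a discrete distribution, in centinats.

    Assumption: 1 nat = 100 cnats (the abstract does not define the unit).
    Zero-probability states contribute nothing to the sum.
    """
    return -100.0 * sum(p * math.log(p) for p in probs if p > 0.0)

def mix_distributions(local, single, weight):
    """Linear mixture of two distributions over the same states.

    Assumption: the abstract only says the two kinds of distribution are
    combined in an optimized way; a convex combination with a hypothetical
    `weight` in [0, 1] is used here purely for illustration.
    """
    return [weight * l + (1.0 - weight) * s for l, s in zip(local, single)]

# 18 x 18 grid of [phi,psi] states, as stated in the abstract.
n_states = 18 * 18

# Upper bound on entropy: every state equally likely, i.e. 100 * ln(324).
uniform = [1.0 / n_states] * n_states
max_entropy = entropy_cnats(uniform)

# Figures quoted in the abstract: about 30 cnats gained out of 387 cnats.
gain_fraction = 30.0 / 387.0

# Toy example: a sharply peaked "local" distribution mixed with the uniform one.
peaked = [1.0] + [0.0] * (n_states - 1)
mixed = mix_distributions(peaked, uniform, weight=0.5)

print(f"maximum entropy: {max_entropy:.2f} cnats")
print(f"relative gain:   {100.0 * gain_fraction:.1f}%")
print(f"mixed entropy:   {entropy_cnats(mixed):.2f} cnats")
```

Under these assumptions, the uniform bound for 324 states is about 578 cnats, so the reported 387 cnats of initial entropy already reflects a strongly non-uniform backbone distribution before any sequence information is used.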

Year:  2002        PMID: 12112672     DOI: 10.1002/prot.10126

Source DB:  PubMed          Journal:  Proteins        ISSN: 0887-3585


  10 in total

1.  On the properties and sequence context of structurally ambivalent fragments in proteins.

Authors:  Igor B Kuznetsov; S Rackovsky
Journal:  Protein Sci       Date:  2003-11       Impact factor: 6.725

2.  Global characteristics of protein sequences and their implications.

Authors:  S Rackovsky
Journal:  Proc Natl Acad Sci U S A       Date:  2010-04-26       Impact factor: 11.205

3.  An information theoretic approach to macromolecular modeling: II. Force fields.

Authors:  Tiba Aynechi; Irwin D Kuntz
Journal:  Biophys J       Date:  2005-11       Impact factor: 4.033

4.  Packing regularities in biological structures relate to their dynamics.

Authors:  Robert L Jernigan; Andrzej Kloczkowski
Journal:  Methods Mol Biol       Date:  2007

5.  Dihedral-angle information entropy as a gauge of secondary structure propensity.

Authors:  Shi Zhong; Jeremy M Moix; Stephen Quirk; Rigoberto Hernandez
Journal:  Biophys J       Date:  2006-09-15       Impact factor: 4.033

6.  Fold homology detection using sequence fragment composition profiles of proteins.

Authors:  Armando D Solis; Shalom R Rackovsky
Journal:  Proteins       Date:  2010-10

7.  Information-theoretic analysis of the reference state in contact potentials used for protein structure prediction.

Authors:  Armando D Solis; Shalom R Rackovsky
Journal:  Proteins       Date:  2010-05-01

8.  Early-stage folding in proteins (in silico) sequence-to-structure relation.

Authors:  Michał Brylinski; Leszek Konieczny; Patryk Czerwonko; Wiktor Jurkowski; Irena Roterman
Journal:  J Biomed Biotechnol       Date:  2005-06-30

9.  Automated alphabet reduction for protein datasets.

Authors:  Jaume Bacardit; Michael Stout; Jonathan D Hirst; Alfonso Valencia; Robert E Smith; Natalio Krasnogor
Journal:  BMC Bioinformatics       Date:  2009-01-06       Impact factor: 3.169

10.  Deriving high-resolution protein backbone structure propensities from all crystal data using the information maximization device.

Authors:  Armando D Solis
Journal:  PLoS One       Date:  2014-06-04       Impact factor: 3.240
