Alice Coucke, Guido Uguzzoni, Francesco Oteri, Simona Cocco, Remi Monasson, Martin Weigt.
Abstract
Coevolution of residues in contact imposes strong statistical constraints on the sequence variability between homologous proteins. Direct-Coupling Analysis (DCA), a global statistical inference method, successfully models this variability across homologous protein families to infer structural information about proteins. For each residue pair, DCA infers 21 × 21 matrices describing the coevolutionary coupling for each pair of amino acids (or gaps). To achieve the residue-residue contact prediction, these matrices are mapped onto simple scalar parameters; the full information they contain gets lost. Here, we perform a detailed spectral analysis of the coupling matrices resulting from 70 protein families, to show that they contain quantitative information about the physico-chemical properties of amino-acid interactions. Results for protein families are corroborated by the analysis of synthetic data from lattice-protein models, which emphasizes the critical effect of sampling quality and regularization on the biochemical features of the statistical coupling matrices.
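The contrast drawn in the abstract, between collapsing a 21 × 21 coupling matrix into one scalar and inspecting its spectrum, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not the authors' pipeline: the alphabet ordering, the toy charge-like matrix built by the hypothetical helper `toy_coupling_matrix`, the choice of the Frobenius norm as the scalar score (the abstract does not name the scalar used), and the use of power iteration in place of a full eigendecomposition (in practice one would more likely call a routine such as `numpy.linalg.eigh`).

```python
import math

# Assumed ordering of the 21 symbols (20 amino acids plus gap); real tools may differ.
ALPHABET = "ACDEFGHIKLMNPQRSTVWY-"

# Hypothetical toy charges, used only to build an illustrative coupling matrix.
CHARGE = {"K": 1.0, "R": 1.0, "D": -1.0, "E": -1.0}


def toy_coupling_matrix(strength=0.5):
    """Symmetric 21 x 21 toy matrix J[a][b] = -strength * q_a * q_b (not real DCA output)."""
    q = [CHARGE.get(a, 0.0) for a in ALPHABET]
    return [[-strength * qa * qb for qb in q] for qa in q]


def frobenius_norm(J):
    """Scalar summary of the matrix: square root of the sum of squared entries."""
    return math.sqrt(sum(x * x for row in J for x in row))


def mat_vec(J, v):
    """Plain matrix-vector product for a list-of-lists matrix."""
    return [sum(Jab * vb for Jab, vb in zip(row, v)) for row in J]


def dominant_eigenmode(J, n_iter=200):
    """Power iteration on a symmetric matrix.

    Returns (eigenvalue, unit eigenvector) for the eigenvalue of largest
    magnitude. The overall sign of the eigenvector is arbitrary.
    """
    # Deterministic, non-uniform start vector.
    v = [math.sin(i + 1.0) for i in range(len(J))]
    for _ in range(n_iter):
        w = mat_vec(J, v)
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:  # start vector lies in the null space; give up
            break
        v = [x / norm for x in w]
    # Rayleigh quotient recovers the signed eigenvalue for the unit vector v.
    eigenvalue = sum(vi * wi for vi, wi in zip(v, mat_vec(J, v)))
    return eigenvalue, v


if __name__ == "__main__":
    J = toy_coupling_matrix()
    print("scalar score (Frobenius norm):", round(frobenius_norm(J), 6))
    lam, vec = dominant_eigenmode(J)
    print("dominant eigenvalue:", round(lam, 6))
    # The eigenvector shows which symbols drive the coupling, which the scalar hides.
    for symbol, component in zip(ALPHABET, vec):
        if abs(component) > 1e-6:
            print(symbol, round(component, 3))
```

In this toy case the scalar score reports only how strong the coupling is, while the dominant eigenvector separates K and R from D and E with opposite signs, the kind of physico-chemical pattern that a scalar mapping discards.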
Year: 2016 PMID: 27825220 DOI: 10.1063/1.4966156
Source DB: PubMed Journal: J Chem Phys ISSN: 0021-9606 Impact factor: 3.488