
Amino acid alphabet reduction preserves fold information contained in contact interactions in proteins.

Armando D Solis

Abstract

To reduce complexity, understand generalized rules of protein folding, and facilitate de novo protein design, the 20-letter amino acid alphabet is commonly reduced to a smaller alphabet by clustering amino acids based on some measure of similarity. In this work, we seek the optimal alphabet that preserves as much as possible of the structural information found in long-range (contact) interactions among amino acids in natively folded proteins. We employ the Information Maximization Device, based on information theory, to partition the amino acids into well-defined clusters. Ranging in number from 2 to 19 groups, these optimal clusters of amino acids, while generated automatically, embody well-known properties of amino acids such as hydrophobicity/polarity, charge, size, and aromaticity, and are demonstrated to maintain the discriminative power of long-range interactions with minimal loss of mutual information. Our measurements suggest that reduced alphabets (of fewer than 10 letters) are able to capture virtually all of the information residing in native contacts and may be sufficient for fold recognition, as demonstrated by extensive threading tests. In an expansive survey of the literature, we observe that alphabets derived from various approaches (including those derived from physicochemical intuition, local structure considerations, and sequence alignments of remote homologs) fare consistently well in preserving contact interaction information, highlighting a convergence in the various factors thought to be relevant to the folding code. Moreover, we find that alphabets commonly used in experimental protein design are nearly optimal and are largely coherent with observations that have arisen in this work.
© 2015 Wiley Periodicals, Inc.
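
The idea of "minimal loss of mutual information" under alphabet reduction can be illustrated with a small sketch. This is not the Information Maximization Device from the paper, whose details are not given in the abstract. It assumes a simpler quantity: the mutual information between the residue types at the two ends of a contact, estimated from a table of contact-pair counts. The function names (`reduce_counts`, `mutual_information`), the four-letter toy alphabet, the counts, and the two-cluster grouping are all hypothetical and chosen only for illustration.

```python
import math
from collections import defaultdict


def reduce_counts(counts, clusters):
    """Merge pair counts under a reduced alphabet.

    counts:   dict mapping (residue_a, residue_b) -> contact count
    clusters: dict mapping cluster label -> string of member residues
    """
    letter_to_cluster = {
        aa: label for label, members in clusters.items() for aa in members
    }
    reduced = defaultdict(float)
    for (a, b), n in counts.items():
        reduced[(letter_to_cluster[a], letter_to_cluster[b])] += n
    return dict(reduced)


def mutual_information(counts):
    """Mutual information I(A;B) in bits, estimated from joint pair counts."""
    total = sum(counts.values())
    p_a = defaultdict(float)
    p_b = defaultdict(float)
    for (a, b), n in counts.items():
        p_a[a] += n / total
        p_b[b] += n / total
    mi = 0.0
    for (a, b), n in counts.items():
        if n > 0:
            p_ab = n / total
            mi += p_ab * math.log2(p_ab / (p_a[a] * p_b[b]))
    return mi


# Invented toy counts (NOT data from the paper): two hydrophobic residues
# (L, V) and two charged residues (K, E), stored symmetrically.
toy_counts = {
    ("L", "L"): 30, ("L", "V"): 25, ("V", "L"): 25, ("V", "V"): 20,
    ("L", "K"): 5, ("K", "L"): 5, ("L", "E"): 5, ("E", "L"): 5,
    ("V", "K"): 5, ("K", "V"): 5, ("V", "E"): 5, ("E", "V"): 5,
    ("K", "K"): 2, ("E", "E"): 2, ("K", "E"): 18, ("E", "K"): 18,
}

# Hypothetical two-letter alphabet: H = hydrophobic, P = polar/charged.
hp_clusters = {"H": "LV", "P": "KE"}

mi_full = mutual_information(toy_counts)
mi_reduced = mutual_information(reduce_counts(toy_counts, hp_clusters))
print(f"full alphabet:    {mi_full:.4f} bits")
print(f"reduced alphabet: {mi_reduced:.4f} bits")
```

Because merging letters is a deterministic mapping, the reduced value can never exceed the full-alphabet value (the data-processing inequality). A search for an optimal reduced alphabet would compare candidate groupings by how little of this quantity they lose.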

Keywords:  amino acid sequence; contact potential; knowledge-based potential; protein structure; sequence representation; threading

Year:  2015        PMID: 26407535     DOI: 10.1002/prot.24936

Source DB:  PubMed          Journal:  Proteins        ISSN: 0887-3585


  7 in total

1.  IHEC_RAAC: a online platform for identifying human enzyme classes via reduced amino acid cluster strategy.

Authors:  Hao Wang; Qilemuge Xi; Pengfei Liang; Lei Zheng; Yan Hong; Yongchun Zuo
Journal:  Amino Acids       Date:  2021-01-23       Impact factor: 3.520

2.  Coevolutionary Landscape of Kinase Family Proteins: Sequence Probabilities and Functional Motifs.

Authors:  Allan Haldane; William F Flynn; Peng He; Ronald M Levy
Journal:  Biophys J       Date:  2018-01-09       Impact factor: 4.033

3.  RaacFold: a webserver for 3D visualization and analysis of protein structure by using reduced amino acid alphabets.

Authors:  Lei Zheng; Dongyang Liu; Yuan Alex Li; Siqi Yang; Yuchao Liang; Yongqiang Xing; Yongchun Zuo
Journal:  Nucleic Acids Res       Date:  2022-05-25       Impact factor: 19.160

4.  RAACBook: a web server of reduced amino acid alphabet for sequence-dependent inference by using Chou's five-step rule.

Authors:  Lei Zheng; Shenghui Huang; Nengjiang Mu; Haoyue Zhang; Jiayu Zhang; Yu Chang; Lei Yang; Yongchun Zuo
Journal:  Database (Oxford)       Date:  2019-01-01       Impact factor: 3.451

5.  A Simplified Amino Acidic Alphabet to Unveil the T-Cells Receptors Antigens: A Computational Perspective.

Authors:  Raffaele Iannuzzi; Grazisa Rossetti; Andrea Spitaleri; Raoul J P Bonnal; Massimiliano Pagani; Luca Mollica
Journal:  Front Chem       Date:  2021-02-25       Impact factor: 5.221

6.  Immunoglobulin Classification Based on FC* and GC* Features.

Authors:  Hao Wan; Jina Zhang; Yijie Ding; Hetian Wang; Geng Tian
Journal:  Front Genet       Date:  2022-01-24       Impact factor: 4.599

7.  Research progress of reduced amino acid alphabets in protein analysis and prediction. (Review)

Authors:  Yuchao Liang; Siqi Yang; Lei Zheng; Hao Wang; Jian Zhou; Shenghui Huang; Lei Yang; Yongchun Zuo
Journal:  Comput Struct Biotechnol J       Date:  2022-07-04       Impact factor: 6.155
