
Importance of Fluctuating Amino Acid Residues in Folding and Binding of Proteins.

Renganathan Senthil1, Singaravelu Usha2, Konda Mani Saravanan1,2.   

Abstract

BACKGROUND: Conformational flexibility is central to the binding of proteins to other proteins, DNA, ligands and small molecules, through which they achieve their biological function in the cell. High-resolution structures of protein complexes are a valuable resource for understanding the mechanisms behind such interactions, and they show that the flexibility of amino acid residues at binding sites is crucial for many important functions in the cell.
METHODS: Our statistical method (PreFRP), which is based on fluctuating amino acid residues, was used together with various amino acid indices related to flexibility/rigidity to study the importance of fluctuating amino acid residues in thermonucleases from pathogenic bacteria, cell penetrating peptides, and intrinsically disordered proteins responsible for many neural disorders.
RESULTS: The analysis reveals the importance of fluctuating amino acid residues in the folding and binding of proteins. The role of moderate and high fluctuating residues in thermonucleases, cell penetrating peptides and disordered regions is discussed in detail.
CONCLUSION: Our analysis will help in understanding the importance of fluctuating amino acid residues in proteins that undergo conformational change.
Copyright © 2019 Avicenna Research Institute.

Keywords:  Amino acid; Binding sites; Cell-penetrating peptides; DNA-binding proteins

Year:  2019        PMID: 31908743      PMCID: PMC6925403     

Source DB:  PubMed          Journal:  Avicenna J Med Biotechnol        ISSN: 2008-2835


Introduction

Proteins are the functional units of the cell and play important roles in cellular processes such as transport, metabolism, and signaling. The collection of proteins within a cell determines its health and function, and these proteins interact with other proteins, DNA and small molecules to perform biological functions. During these interactions, the amino acid residues at the interface/binding site alter their conformation so as to attain stability 1. Identifying the binding sites in a protein is one of the major goals of bioinformatics. It was believed that extracting biologically relevant features of protein binding sites might ultimately lead to accurate prediction of binding sites 2. The problem with this kind of approach is due to lack of our understanding of what investigated homologous subsequences in the 3D structure of proteins and reported surprising structural adaptability of identical subsequences 3. Wilson et al stated that common sequences of up to eight residues do occur in unrelated proteins and that sequence-specific antibodies can be generated to test binding to identical sequences contained in unrelated proteins 4. Argos examined the most frequently observed residue substitutions and their correlation with structural changes in pairs of identical pentapeptides from unrelated proteins, which yielded a possible guide for site-directed mutagenesis experiments when no tertiary structural information is at hand 5. Minor and Kim designed an 11 amino acid sequence (chameleon sequence fragment) that folds as an alpha helix at one position and as a beta sheet at another position of the IgG binding domain of the protein, and they demonstrated that non-local interactions can determine the secondary structure of peptide sequences of substantial length 6.
Dalal and Regan demonstrated that careful selection of key amino acid residues can manipulate the balance of short- and long-range interactions that stabilize either a helical or a sheet conformation 7. Taken together, the work of these protein science research groups makes it clear that fluctuating amino acid residues at binding sites or at certain other positions are important for the folding and binding of proteins.

Materials and Methods

Definition of fluctuating residues and amino acid indices

The definition of high, moderate and weak fluctuating amino acid residues by Ruvinsky et al 8 is presented in table 1 and is used in this study. The five highly fluctuating amino acid residues are glycine, alanine, serine, proline and aspartic acid. Glycine and alanine are two small hydrophobic amino acids regarded as the most flexible residues in proteins. Serine contains an OH group in its side chain, which can form a hydrogen bond, whereas aspartic acid contains a carboxylic acid group in its side chain, which can lose a proton to give a negatively charged COO−. The role of the five fluctuating residues in protein structure, folding and stability has been revealed in our previous studies 9,10.
Table 1.

Fluctuating amino acid residues: High, moderate and weak fluctuating amino acid residues

Fluctuating type | Amino acid residues
High fluctuating amino acid residues | Glycine, Alanine, Serine, Proline and Aspartic acid
Moderate fluctuating amino acid residues | Threonine, Glutamic acid, Asparagine, Lysine, Cysteine, Glutamine, Arginine and Valine
Weak fluctuating amino acid residues | Histidine, Leucine, Methionine, Isoleucine, Tyrosine, Phenylalanine and Tryptophan

Hydrophobicity as a physicochemical property is effectively used to characterize secondary structures of proteins and is considered a dominant force in protein folding 9; hence, 70 amino acid indices from the AAindex database (https://www.genome.jp/aaindex/) were the focus of this study. Indices related to hydrophobicity that reflect flexibility and rigidity were carefully selected from the database. For each index, the twenty amino acid residues were written in order (from higher to lower values of the property) to understand the contribution of fluctuating amino acid residues in proteins. The position-specific matrix of each amino acid residue across the indices is given in table 2.
Table 2.

Position profiles of amino acid residues in various hydrophobicity indices

Position A C D E F G H I K L M N P Q R S T V W Y
1016117301122100110002103
21453100112552100100974
30133121111354202200974
4040581071124552222631
5331160332152314030767
62251611329132222001025
7831131523013244033554
8770218300291212271312
9640203101613162247345
10166121951011032078016
11932201492030351167021
12430503605113314317245
1331330382505272972134
1431850360304278552143
15145413212011176377500
16106625231221354153432
17145111321250868160321
18329804253405010741102
19111225114924534753001
201133743116111461111140

Prediction and visualization of fluctuating residues

In our previous work, a web server, “PreFRP”, was developed to predict and visualize fluctuating amino acid residues in proteins 11. The web server takes a Protein Data Bank file or a sequence file as input and predicts fluctuating amino acid residues based on the carbon content of the residues and on the classification of the 20 amino acid residues into high (G, A, S, P, and D), moderate (T, E, N, K, C, Q, R, and V), and weak (H, L, M, I, Y, F, and W) fluctuating residues. The program assigns an index to each of the three groups: −2 for high fluctuating residues, −1 for moderate fluctuating residues, and 2 for weak fluctuating residues.
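
The index assignment described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the PreFRP implementation: it covers only the three-class lookup and ignores the carbon-content component, and the names fluctuation_class and fluctuation_profile, as well as the toy sequence, are hypothetical.

```python
from typing import List

# Residue classes as listed in the text (one-letter codes).
HIGH = set("GASPD")
MODERATE = set("TENKCQRV")
WEAK = set("HLMIYFW")

# Index values quoted in the text: high = -2, moderate = -1, weak = 2.
CLASS_INDEX = {"high": -2, "moderate": -1, "weak": 2}


def fluctuation_class(residue: str) -> str:
    """Return 'high', 'moderate' or 'weak' for a one-letter residue code."""
    code = residue.upper()
    if code in HIGH:
        return "high"
    if code in MODERATE:
        return "moderate"
    if code in WEAK:
        return "weak"
    raise ValueError(f"not a standard amino acid code: {residue!r}")


def fluctuation_profile(sequence: str) -> List[int]:
    """Map every residue of a sequence to its class index."""
    return [CLASS_INDEX[fluctuation_class(res)] for res in sequence]


if __name__ == "__main__":
    # Hypothetical toy sequence, used only to show the mapping.
    print(fluctuation_profile("GAVLW"))  # [-2, -2, -1, 2, 2]
```

Plotting such a profile against residue number gives a per-residue view of the kind described for the web server.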

Fluctuating residues in thermonuclease

For almost three decades, Staphylococcus aureus (S. aureus) has been used as a model system to understand various functions in the cell; hence, the importance of fluctuating amino acid residues in thermonucleases was studied in this research. A dataset of 127 thermonuclease protein structures was retrieved from the Protein Data Bank 12. A UniProt search using 1A2T (thermonuclease in S. aureus) as the reference thermonuclease resulted in 127 protein structures with a resolution of less than 3 Å. Fluctuating residues at different positions, such as helices, were computed for the dataset using the PreFRP web server 11.

Fluctuating residues in cell penetrating peptides

Many reports on the structure and function of small molecules are available in the literature, but studies on peptides showing transport activity are very limited. In the present work, the structure of a cell penetrating peptide, crotamine, was investigated by computing its composition of fluctuating amino acid residues. The peptide sequence and structure were retrieved from the Protein Data Bank (PDB ID: 1H5O). The peptide is 42 amino acid residues long and is made up of one alpha helix, two antiparallel sheets, and three disulfide bridges.
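
Computing the composition of fluctuating residues for a peptide can be sketched as follows. This is an illustrative example only: the function name class_composition is hypothetical, and the toy sequence is made up for the demonstration (it is not the crotamine sequence).

```python
from collections import Counter
from typing import Dict

# Three residue classes from table 1 (one-letter codes).
CLASSES = {"high": "GASPD", "moderate": "TENKCQRV", "weak": "HLMIYFW"}
RESIDUE_TO_CLASS = {
    aa: name for name, members in CLASSES.items() for aa in members
}


def class_composition(sequence: str) -> Dict[str, float]:
    """Return the percentage of residues falling in each fluctuating class."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("empty sequence")
    unknown = set(seq) - set(RESIDUE_TO_CLASS)
    if unknown:
        raise ValueError(f"non-standard residue codes: {sorted(unknown)}")
    counts = Counter(RESIDUE_TO_CLASS[aa] for aa in seq)
    return {name: 100.0 * counts[name] / len(seq) for name in CLASSES}


if __name__ == "__main__":
    # Hypothetical 10-residue toy peptide: 3 high, 4 moderate, 3 weak.
    print(class_composition("GGSKKCCLLW"))
```

Applied to a real sequence taken from a PDB entry, the same function would return the three class percentages for that peptide.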

Fluctuating residues in intrinsically disordered proteins

Proteins of varying length (86 entries) containing long disordered regions (greater than 30 amino acid residues) were downloaded from the DisProt database 13. Sequence-based analysis of intrinsically disordered regions in proteins has revealed that fluctuating amino acid residues play a vital role in promoting disorder 14. The predictors GlobPlot 15, RONN 16 and PONDR 17 were used to generate predictions for comparison with the PreFRP results.
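
The selection of long disordered regions (greater than 30 residues) can be sketched as below. The per-residue mask encoding ('D' for disordered, 'O' for ordered) and the function name long_disordered_regions are assumptions made for illustration; they do not reflect the DisProt file format or the authors' actual filtering code.

```python
from itertools import groupby
from typing import List, Tuple


def long_disordered_regions(mask: str, min_length: int = 31) -> List[Tuple[int, int]]:
    """Return 1-based inclusive (start, end) spans of disordered runs.

    mask is a hypothetical per-residue annotation: 'D' = disordered,
    'O' = ordered. min_length=31 keeps runs of more than 30 residues.
    """
    regions = []
    position = 0  # number of residues consumed so far
    for label, run in groupby(mask):
        length = len(list(run))
        if label == "D" and length >= min_length:
            regions.append((position + 1, position + length))
        position += length
    return regions


if __name__ == "__main__":
    # Toy mask: a 31-residue disordered run is kept, a 30-residue run is not.
    toy_mask = "O" * 5 + "D" * 31 + "O" * 4 + "D" * 30
    print(long_disordered_regions(toy_mask))  # [(6, 36)]
```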

Results

Fluctuating residues in amino acid indices

The amino acid indices rank the twenty amino acid residues according to important physicochemical properties that contribute to protein folding. In the present work, seventy such indices, which define the rigidity or flexibility of amino acid residues in proteins, were used. The position-specific scoring matrix of the seventy indices is shown in table 2. From table 2, it can be observed that fluctuating amino acid residues such as glycine, alanine, and serine occur most often at the tenth and eleventh positions, whereas aspartic acid occurs most often at the eighteenth and nineteenth positions. Similarly, proline prefers to occupy the last ten positions. The location of fluctuating residues in the position-specific scoring matrix suggests that these residues can form chameleon sequence regions, which are responsible for many neural disorders 18.
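
The construction of a position profile like the one in table 2 can be sketched as follows: each index ranks the residues from the highest to the lowest property value, and the profile counts how often each residue lands at each rank. The function name position_profile is hypothetical, and the toy index values over a reduced three-residue alphabet are invented for illustration; they are not AAindex values.

```python
from typing import Dict, List, Sequence


def position_profile(
    indices: Sequence[Dict[str, float]], residues: str
) -> Dict[str, List[int]]:
    """Count, for each residue, how many indices place it at each rank.

    profile[aa][k] is the number of indices in which residue aa is at
    position k + 1 when residues are sorted from higher to lower value.
    """
    n = len(residues)
    profile = {aa: [0] * n for aa in residues}
    for values in indices:
        ranked = sorted(residues, key=lambda aa: values[aa], reverse=True)
        for position, aa in enumerate(ranked):
            profile[aa][position] += 1
    return profile


if __name__ == "__main__":
    # Two made-up indices over a toy alphabet (A, G, W); not real data.
    toy_indices = [
        {"A": 0.5, "G": 0.1, "W": 0.9},  # ranks as W, A, G
        {"A": 0.2, "G": 0.3, "W": 0.8},  # ranks as W, G, A
    ]
    print(position_profile(toy_indices, "AGW"))
```

With seventy indices and the full twenty-letter alphabet, every row and every column of the resulting matrix sums to seventy, which is a quick consistency check on the counts.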

Fluctuating residues in thermonucleases

The fluctuating index of the amino acid residues along the thermonuclease sequences is shown in figure 1. The high and moderate fluctuating amino acid residues occur in higher percentages (28.9 and 49.1%, respectively). Our results show that moderate fluctuating amino acid residues dominate over the other two types of fluctuating residues. Therefore, the distribution of high and moderate fluctuating residues helps the formation of secondary structure, mainly helices.
Figure 1.

Fluctuating index for the amino acid residues along the sequences of thermonucleases.

Fluctuating residues in cell penetrating peptides

The amino acid composition of cell penetrating peptides is shown in figure 2. A comparison of the amino acid compositions of cell penetrating peptides reveals a higher occurrence of high fluctuating amino acid residues. Figure 3 shows that the lack of secondary structure formation is due to the presence of high fluctuating amino acid residues in cell penetrating peptides. The random coil state of the peptides will form a regular secondary structure, such as a helix or strands, on binding to protein/DNA. Interestingly, the cell penetrating peptides are rich in cysteine, which can form disulfide bridges and contribute to the stability of the peptides.
Figure 2.

Amino acid composition in cell penetrating peptides.

Figure 3.

Random coil behavior of cell-penetrating peptides with more high fluctuating residues.


Fluctuating amino acid residues in intrinsically disordered proteins

Analysis of the sequence composition of ordered and disordered proteins has shown differences between them and provides fundamental knowledge for understanding tertiary structure, molecular assembly and protein folding 14. High fluctuating residues dominated in the sequences of intrinsically disordered proteins, whereas moderate fluctuating residues dominated in the other cases. The prediction scores of four methods (GlobPlot, RONN, PONDR and PreFRP) were computed and compared. The use of both positive and negative numbers in PreFRP helped to classify the ranking system better than entirely negative or entirely positive scores (Figure 4). The methods used for comparison and their scoring schemes are shown in table 3, and the results imply that PreFRP predicts comparably to the other methods.
Figure 4.

Comparison of prediction scores of four different predictors.

Table 3.

Sensitivity and specificity of PreFRP compared with the established algorithm

Tool/server | Input (sequence/structure) | Proposed method | Scoring (negative/positive) | Specificity
GlobPlot | Sequence | Hypothesis/Propensity | | Order/globularity and disorder
RONN | Sequence | Neural network | + | Native disordered region
PONDR | Sequence | Neural network | + | Natural disordered region
PreFRP | Sequence/Extraction based on structure | Probability/carbon atom propensity | Both | Flexibility and fold/unfold

Discussion

Despite the explosive growth in the number of high-resolution 3D protein structures, a key challenge for protein scientists is to understand how a sequence of amino acid residues codes for a particular fold. Statistical analysis of the sequence compositions of native folded protein structures has revealed the role of certain amino acid residues in regular (helix or extended) and irregular (coil-like) regions. The conformation change phenomenon is mainly driven by high fluctuating residues that occur in irregular regions, and these residues play an important role in folding and binding. Many computational methods have been developed to identify fluctuating residues, and they rely mainly on sequence or structural similarity. However, the existing methods cannot achieve high accuracy because fluctuating residues are poorly conserved in homologous sequences; hence, a statistical method (PreFRP) to predict the amino acid residues in proteins as high, moderate or weak fluctuating was developed in this study. The PreFRP method focuses on the relationship between protein sequence, the ability to undergo conformation change, and the functionality of protein structures.

Conclusion

In the present study, a higher composition of high and moderate fluctuating amino acid residues was identified in thermonucleases, cell penetrating peptides and intrinsically disordered protein regions. Moreover, the locations of high fluctuating amino acid residues in various classical amino acid indices, which are classified based on flexibility and rigidity, were analyzed and discussed. This article explored the importance of fluctuating amino acid residues in the conformation change phenomenon, which should help protein scientists understand the role of fluctuating residues in the folding and binding of proteins.
  18 in total

1.  Understanding the sequence determinants of conformational switching using protein design.

Authors:  S Dalal; L Regan
Journal:  Protein Sci       Date:  2000-09       Impact factor: 6.725

2.  RONN: the bio-basis function neural network technique applied to the detection of natively disordered regions in proteins.

Authors:  Zheng Rong Yang; Rebecca Thomson; Philip McNeil; Robert M Esnouf
Journal:  Bioinformatics       Date:  2005-06-09       Impact factor: 6.937

3.  Search for identical octapeptides in unrelated proteins: Structural plasticity revisited.

Authors:  K M Saravanan; S Selvaraj
Journal:  Biopolymers       Date:  2011-05-17       Impact factor: 2.505

4.  Analysis of dihedral angle preferences for alanine and glycine residues in alpha and beta transmembrane regions.

Authors:  K M Saravanan; S Krishnaswamy
Journal:  J Biomol Struct Dyn       Date:  2014-03-13

5.  Analysis of sequence-similar pentapeptides in unrelated protein tertiary structures. Strategies for protein folding and a guide for site-directed mutagenesis.

Authors:  P Argos
Journal:  J Mol Biol       Date:  1987-09-20       Impact factor: 5.469

6.  Sequence fingerprints distinguish erroneous from correct predictions of intrinsically disordered protein regions.

Authors:  Konda Mani Saravanan; A Keith Dunker; Sankaran Krishnaswamy
Journal:  J Biomol Struct Dyn       Date:  2017-12-27

7.  Context-dependent secondary structure formation of a designed protein sequence.

Authors:  D L Minor; P S Kim
Journal:  Nature       Date:  1996-04-25       Impact factor: 49.962

8.  PONDR-FIT: a meta-predictor of intrinsically disordered amino acids.

Authors:  Bin Xue; Roland L Dunbrack; Robert W Williams; A Keith Dunker; Vladimir N Uversky
Journal:  Biochim Biophys Acta       Date:  2010-01-25

9.  DisProt 7.0: a major update of the database of disordered proteins.

Authors:  Damiano Piovesan; Francesco Tabaro; Ivan Mičetić; Marco Necci; Federica Quaglia; Christopher J Oldfield; Maria Cristina Aspromonte; Norman E Davey; Radoslav Davidović; Zsuzsanna Dosztányi; Arne Elofsson; Alessandra Gasparini; András Hatos; Andrey V Kajava; Lajos Kalmar; Emanuela Leonardi; Tamas Lazar; Sandra Macedo-Ribeiro; Mauricio Macossay-Castillo; Attila Meszaros; Giovanni Minervini; Nikoletta Murvai; Jordi Pujols; Daniel B Roche; Edoardo Salladini; Eva Schad; Antoine Schramm; Beata Szabo; Agnes Tantos; Fiorella Tonello; Konstantinos D Tsirigos; Nevena Veljković; Salvador Ventura; Wim Vranken; Per Warholm; Vladimir N Uversky; A Keith Dunker; Sonia Longhi; Peter Tompa; Silvio C E Tosatto
Journal:  Nucleic Acids Res       Date:  2016-11-28       Impact factor: 16.971

10.  Chameleon sequences in neurodegenerative diseases.

Authors:  Golnaz Bahramali; Bahram Goliaei; Zarrin Minuchehr; Ali Salari
Journal:  Biochem Biophys Res Commun       Date:  2016-02-23       Impact factor: 3.575
