Eshel Faraggi, A Keith Dunker, Robert L Jernigan, Andrzej Kloczkowski.
Abstract
Entropy should directly reflect the extent of disorder in proteins. By clustering structurally related proteins and studying the multiple-sequence alignment (MSA) of the sequences in these clusters, we were able to link sequence, structure, and disorder information. We introduced several parameters as measures of fluctuations at a given MSA site and used these to represent the sequence and structure entropy at that site. In general, we found a tendency for negative correlations between disorder and structure, and significant positive correlations between disorder and the fluctuations in the system. We also found evidence for residue-type conservation for those residues proximate to potentially disordered sites. Mutations at the disordered site itself appear to be allowed. In addition, we found a positive correlation between disorder and accessible surface area, confirming that disordered residues occur in exposed regions of proteins. Finally, we found that fluctuations in the dihedral angles at the original mutated residue are positively correlated with disorder, while dihedral angle fluctuations in spatially proximal residues are negatively correlated with disorder. Our results seem to indicate permissible variability at the disordered site, but greater rigidity in the parts of the protein with which the disordered site interacts. This is another indication that disordered residues are involved in protein function.
Keywords: entropy; fluctuations; mutations; protein disorder; protein structure
Year: 2019 PMID: 32336912 PMCID: PMC7182347 DOI: 10.3390/e21080764
Source DB: PubMed Journal: Entropy (Basel) ISSN: 1099-4300 Impact factor: 2.524
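The abstract refers to entropy-like measures of fluctuation at each MSA site without giving their formulas here. As a minimal sketch, assuming the plain Shannon entropy of the residue types observed in an alignment column, a per-site value could be computed as follows. The function names, the gap symbol `-`, and the use of natural logarithms are assumptions for illustration, not the authors' definitions; the default threshold of 20 contributing residues mirrors the filter mentioned for the plotted MSA sites.

```python
import math
from collections import Counter

GAP = "-"  # assumed gap symbol in the alignment


def column_entropy(column, min_count=20):
    """Shannon entropy (in nats) of the symbols in one MSA column.

    Gaps are ignored. Returns None when fewer than `min_count`
    residues contribute to the column.
    """
    symbols = [s for s in column if s != GAP]
    if len(symbols) < min_count:
        return None
    total = len(symbols)
    counts = Counter(symbols)
    # Sum of -p * ln(p) over the observed symbol types.
    return sum(-(n / total) * math.log(n / total) for n in counts.values())


def msa_entropies(sequences, min_count=20):
    """Per-site entropies for a list of equal-length aligned sequences."""
    # zip(*sequences) iterates over alignment columns.
    return [column_entropy(col, min_count) for col in zip(*sequences)]
```

The same function could be applied to a column of secondary structure labels (for example `H`, `E`, `C`) in place of residue letters, which would give an analogous structure entropy under the same assumption.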
Figure 1. Cartoons of the structures of the seeds for the sets L1, L2, and L3. (A) Seed for the L1 sets L1A and L1B, an antibody fragment, PDBID: 5U68E. (B) Seed for the L2 sets L2A and L2B, JNK3 mitogen-activated protein kinase 10, PDBID: 3TTIA. (C) Seed for the L3 sets L3A and L3B, nanobody MU551, PDBID: 5F1OB. The color scheme follows the secondary structure types, with beta strands in yellow, helices in red, and coil in green. Note that we keep a dark background to aid in viewing loops, especially loops with missing residues.
Properties of the protein sets used.
| Set | L1 | L1 | L2 | L2 | L3 | L3 |
|---|---|---|---|---|---|---|
| SID to seed | 30–50% | 60–80% | 30–50% | 60–80% | 30–50% | 60–80% |
| Number of proteins | 1759 | 586 | 398 | 261 | 378 | 393 |
| Length of MSA | 666 | 565 | 930 | 498 | 535 | 228 |
| >20 MSA sites | 494 | 397 | 569 | 157 | 355 | 182 |
| TMS mean | 0.45 | 0.49 | 0.51 | 0.37 | 0.53 | 0.67 |
| TMS median | 0.43 | 0.49 | 0.24 | 0.36 | 0.23 | 0.87 |
| TMS STDEV | 0.06 | 0.06 | 0.26 | 0.05 | 0.32 | 0.30 |
| Number of residues in seed protein | 294 | 294 | 464 | 464 | 163 | 163 |
| Seed PDBID | 5U68E | 5U68E | 3TTIA | 3TTIA | 5F1OB | 5F1OB |
| Function title | Antibody fragment | Antibody fragment | JNK3 mitogen-activated kinase | JNK3 mitogen-activated kinase | Nanobody MU551 | Nanobody MU551 |
Properties of the most abundant clusters of related proteins in the PDB.
Figure 2. Example plots of: (A); the two entropies and (B); (C); and (D), for all MSA sites having at least 20 residues contributing to the alignment for the set L1A.
Correlations between the disorder propensity and entropic parameters.
| Parameter | L1, 30–50% | L1, 60–80% | L2, 30–50% | L2, 60–80% | L3, 30–50% | L3, 60–80% |
|---|---|---|---|---|---|---|
| | −0.074 | 0.092 | 0.030 | 0.240 | 0.020 | 0.250 |
| | −0.402 | −0.316 | −0.346 | −0.286 | −0.145 | 0.042 |
| | −0.213 | −0.138 | 0.050 | −0.077 | −0.205 | 0.080 |
| | −0.141 | −0.216 | −0.039 | −0.101 | −0.128 | 0.022 |
The parameters are the entropies with respect to fluctuations in residue type for the MSA and CLRC sites, and the entropies with respect to fluctuations in secondary structure assignment for the MSA and CLRC sites.
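The tabulated values are correlations taken across MSA sites between the disorder propensity and a per-site parameter. The type of correlation coefficient is not stated in this excerpt; the sketch below assumes the Pearson coefficient and skips sites where either value is missing. The function name `pearson` and the use of `None` for missing values are illustrative assumptions.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two per-site value lists.

    Sites where either value is None are skipped (an assumed convention
    for sites that did not pass a coverage filter).
    """
    pairs = [(x, y) for x, y in zip(xs, ys) if x is not None and y is not None]
    n = len(pairs)
    if n < 2:
        raise ValueError("need at least two sites with both values present")
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var_x = sum((x - mean_x) ** 2 for x, _ in pairs)
    var_y = sum((y - mean_y) ** 2 for _, y in pairs)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)
```

Under this convention, a negative value means that sites with higher disorder propensity tend to have a lower value of the parameter, and a positive value means the opposite.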
Correlations between the entropic parameters.
| Parameter | L1, 30–50% | L1, 60–80% | L2, 30–50% | L2, 60–80% |
|---|---|---|---|---|
| | 0.467 | 0.319 | 0.526 | 0.681 |
| | 0.267 | 0.179 | 0.280 | 0.403 |
| | 0.207 | 0.206 | 0.280 | 0.544 |
| | 0.442 | 0.424 | 0.284 | 0.483 |
| | 0.562 | 0.601 | 0.458 | 0.745 |
| | 0.375 | 0.370 | 0.374 | 0.549 |
Correlation between the different entropic parameters calculated for the different sets of aligned proteins.
Correlations between the disorder propensity and spatial parameters.
| Parameter | L1, 30–50% | L1, 60–80% | L2, 30–50% | L2, 60–80% |
|---|---|---|---|---|
| | 0.082 | −0.097 | 0.354 | 0.296 |
| | 0.191 | 0.114 | 0.425 | 0.248 |
| | 0.259 | 0.181 | 0.026 | 0.147 |
| | −0.131 | −0.163 | −0.013 | −0.182 |
| | 0.247 | 0.142 | 0.052 | 0.236 |
| | −0.137 | −0.148 | −0.042 | −0.143 |
| | −0.050 | −0.097 | 0.061 | −0.237 |
| | −0.243 | −0.159 | −0.002 | −0.161 |
Correlations between spatial characteristics and the disorder propensity at a given MSA site.
Correlations between the disorder propensity and ASA parameters.
| Parameter | L1, 30–50% | L1, 60–80% | L2, 30–50% | L2, 60–80% |
|---|---|---|---|---|
| | 0.167 | 0.093 | 0.388 | 0.450 |
| | −0.077 | −0.051 | 0.340 | 0.120 |
| | 0.129 | −0.040 | 0.317 | 0.253 |
| | −0.153 | −0.235 | 0.215 | −0.078 |
| | 0.180 | 0.090 | 0.409 | 0.409 |
| | −0.040 | −0.042 | 0.393 | 0.133 |
| | 0.163 | 0.006 | 0.320 | 0.214 |
| | −0.163 | −0.206 | 0.214 | −0.186 |
Correlations between ASA characteristics and the disorder propensity at a given MSA site.
Correlations between the disorder propensity and dihedral angle parameters.
| Parameter | L1, 30–50% | L1, 60–80% | L2, 30–50% | L2, 60–80% |
|---|---|---|---|---|
| | 0.269 | 0.216 | 0.410 | 0.326 |
| | 0.106 | 0.029 | 0.480 | −0.026 |
| | −0.006 | −0.041 | 0.213 | 0.054 |
| | 0.035 | 0.028 | 0.262 | 0.235 |
| | 0.184 | 0.272 | −0.003 | 0.269 |
| | −0.204 | −0.173 | −0.015 | −0.314 |
| | −0.264 | −0.243 | −0.079 | −0.292 |
| | −0.156 | −0.125 | −0.082 | −0.192 |
Correlations between dihedral angle characteristics and the disorder propensity at a given MSA site.
Correlations between the disorder propensity and secondary structure probabilities.
| Parameter | L1, 30–50% | L1, 60–80% | L2, 30–50% | L2, 60–80% |
|---|---|---|---|---|
| ss1h | −0.093 | −0.067 | −0.378 | −0.120 |
| ss1c | −0.333 | −0.183 | −0.303 | −0.354 |
| ss1e | −0.339 | −0.189 | −0.207 | −0.393 |
| ss2h | −0.104 | −0.084 | −0.385 | −0.150 |
| ss2c | −0.316 | −0.144 | −0.400 | −0.319 |
| ss2e | −0.424 | −0.264 | −0.294 | −0.575 |
Correlations between secondary structure probabilities and the disorder propensity at a given MSA site.
Figure 3. The entropy of secondary structure at a given MSA site versus the value of at that site.
Figure 4. The entropy of secondary structure at the CLRC to the MSA site versus the value of at that site.
Correlations between secondary structure type probabilities.
| Type | L1, 30–50% | L1, 60–80% | L2, 30–50% | L2, 60–80% |
|---|---|---|---|---|
| Helix | 0.198 | 0.179 | 0.509 | 0.289 |
| Coil | 0.611 | 0.480 | 0.414 | 0.442 |
| Sheet | 0.741 | 0.716 | 0.643 | 0.620 |
| Helix | −0.033 | −0.007 | 0.551 | 0.175 |
| Coil | 0.566 | 0.332 | 0.893 | 0.993 |
| Sheet | 0.768 | 0.585 | 0.721 | 0.658 |
| Helix | 0.277 | 0.307 | 0.450 | 0.278 |
| Coil | 0.588 | 0.559 | 0.341 | 0.354 |
| Sheet | 0.707 | 0.770 | 0.632 | 0.523 |
Correlations between the secondary structure probabilities of the original MSA site and its CLRC.