Markus Fleck, Anton A Polyansky, Bojan Zagrovic.
Abstract
Recently developed NMR techniques enable estimation of the protein configurational entropy change from the change in the average methyl order parameters. This experimental observable, however, does not directly measure the contribution of intramolecular couplings, protein main-chain motions, or angular dynamics. Here, we carry out a self-consistent computational analysis of the impact of these missing contributions on an extensive set of molecular dynamics simulations of different proteins undergoing binding. Specifically, we compare the configurational entropy change in protein complex formation as obtained by the maximum information spanning tree approximation (MIST), which treats the above entropy contributions directly, with the change in the average NMR methyl and NH order parameters. Our parallel implementation of MIST allows us to treat hard angular degrees of freedom as well as couplings up to full pairwise order explicitly, while still involving a high degree of sampling and tackling molecules of biologically relevant sizes. First, we demonstrate a remarkably strong linear relationship between the total configurational entropy change and the average change in both methyl and backbone-NH order parameters. Second, in contrast to canonical assumptions, we show that the main-chain and angular terms contribute significantly to the overall configurational entropy change and also scale linearly with it. Consequently, linear models starting from the average methyl order parameters are able to capture the contribution of main-chain and angular terms well. After applying the quantum-mechanical harmonic oscillator entropy formalism, we establish a similarly strong linear relationship for X-ray crystallographic B-factors. Finally, we demonstrate that the observed linear relationships remain robust against drastic undersampling and argue that they reflect an intrinsic property of compact proteins. Despite their remarkable strength, however, the above linear relationships yield estimates of configurational entropy change whose accuracy appears to be sufficient for qualitative applications only.
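The quantum-mechanical harmonic oscillator entropy mentioned in the abstract has a standard closed form, sketched below. The mapping from a B-factor to an oscillator frequency (isotropic B = 8π²⟨u²⟩ combined with classical equipartition, ⟨u²⟩ = k_B T/(mω²)), the default temperature, and the function names are assumptions made for illustration; the abstract does not spell out the procedure actually used.

```python
import math

K_B = 1.380649e-23        # Boltzmann constant, J/K
HBAR = 1.054571817e-34    # reduced Planck constant, J*s
AMU = 1.66053906660e-27   # atomic mass unit, kg


def qho_entropy(omega, temperature=300.0):
    """Entropy (J/K) of one quantum harmonic oscillator of angular frequency omega (rad/s)."""
    x = HBAR * omega / (K_B * temperature)
    # S/k_B = x / (e^x - 1) - ln(1 - e^-x)
    return K_B * (x / math.expm1(x) - math.log(-math.expm1(-x)))


def bfactor_to_omega(b_factor_A2, mass_amu, temperature=300.0):
    """ASSUMED mapping: B = 8*pi^2*<u^2> and <u^2> = k_B*T / (m*omega^2)."""
    msd = b_factor_A2 * 1e-20 / (8.0 * math.pi ** 2)   # mean-square displacement, m^2
    return math.sqrt(K_B * temperature / (mass_amu * AMU * msd))


# Hypothetical example: a carbon atom whose B-factor drops from 40 to 20 A^2 upon binding.
delta_s = qho_entropy(bfactor_to_omega(20.0, 12.0)) - qho_entropy(bfactor_to_omega(40.0, 12.0))
```

Under these assumptions a lower B-factor corresponds to a stiffer oscillator and therefore to a lower entropy, so `delta_s` is negative in the example.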
Year: 2018 PMID: 29799751 PMCID: PMC9245193 DOI: 10.1021/acs.jctc.8b00100
Source DB: PubMed Journal: J Chem Theory Comput ISSN: 1549-9618 Impact factor: 6.578
Simulated Protein Set: Molecule Names, Numbers of Atoms, PDB Codes, and Abbreviations
| Name | No. atoms | PDB code | Complex | Short name | Abbreviation |
|---|---|---|---|---|---|
| PPIase A | 1641 | | | PPIA | 1 |
| PR160Gag-Pol | 1408 | | | gag-pol | 2 |
| Alkaline protease | 4503 | | | aprA | 3 |
| Alkaline protease inhibitor | 997 | | | aprI | 4 |
| Subtilisin Carlsberg | 2433 | | | apr | 5 |
| Ovomucoid | 498 | | | OM | 6 |
| Uracil-DNA Glycosylase | 2333 | | | UNG | 7 |
| Uracil-DNA Glycosylase inhibitor | 788 | | | UGI | 8 |
| Micronemal protein 6 | 496 | | | MIC6 | 9 |
| Micronemal protein 1 | 1226 | | | MIC1 | 10 |
| Tsg101 protein | 1480 | | | TSG101 | 11 |
| Ubiquitin | 760 | | | UBQ | 12a,b,c,d,e |
| ESCRT-I complex subunit VPS23 | 1493 | | | sst6 | 13 |
| Ubiquitin | 760 | | | UBQ | 14a,b,c,d,e |
| GGA3 Gat domain | 949 | 1YD8* | | GGA3 | 15 |
| Ubiquitin | 760 | | | UBQ | 16a,b,c,d,e |
| E3 Ubiquitin-protein ligase CBL-B | 457 | | | CBLB | 17 |
| Ubiquitin | 760 | | | UBQ | 18a,b,c,d,e |
Number of atoms in individual proteins.
PDB codes[22,23] of individual proteins.
PDB codes of complexes.
Short names used in the text.
Key to abbreviations used in Figure a and SI Figure 3 and SI Figure 7.
For ubiquitin, five separate simulations were used to generate the plots, reflected as the additional abbreviation tags a, b, c, d, and e.
The constituent GGA3 Gat domain was extracted from the PDB structure of the 1YD8 complex and named 1YD8*, accordingly.
Figure 4. Error analysis. (a) Comparison of un-normalized ΔS_MIST against the average methyl order parameters scaled by the respective number of degrees of freedom. (b) Errors as the absolute value of the deviation along the y-axis from the linear regression in (a). (c) Relationship between fractional coupling, |ΔS_MIST − ΔS_1D|/|ΔS_1D|, and fractional error, |ΔS_MIST^estim − ΔS_MIST|/|ΔS_MIST|, for all three experimental probes. For clarity, the range of the graph has been truncated to show 91% of all of the data. The remaining outliers stem from vanishing denominators on both axes. The inset shows the medians and the quartiles of fractional couplings and fractional errors for all three experimental probes. In panel (c), errors were estimated based on the linear regressions given in Figure .
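The two ratios plotted in panel (c) follow directly from the definitions in the caption. A minimal sketch is given below; the function names, the choice to return infinity for a vanishing denominator, and the numerical values are illustrative assumptions rather than data from the study.

```python
import math


def fractional_coupling(ds_mist, ds_1d):
    """|dS_MIST - dS_1D| / |dS_1D|: weight of coupling terms relative to the uncoupled entropy change."""
    if ds_1d == 0:
        return math.inf   # vanishing denominator; such points appear as outliers
    return abs(ds_mist - ds_1d) / abs(ds_1d)


def fractional_error(ds_mist_estim, ds_mist):
    """|dS_MIST^estim - dS_MIST| / |dS_MIST|: relative error of a regression-based estimate."""
    if ds_mist == 0:
        return math.inf
    return abs(ds_mist_estim - ds_mist) / abs(ds_mist)


# Hypothetical values in arbitrary entropy units.
coupling = fractional_coupling(ds_mist=-1.5, ds_1d=-1.0)
error = fractional_error(ds_mist_estim=-1.2, ds_mist=-1.5)
```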
Figure 1. Self-consistent comparison of protein configurational entropy changes and experimental proxies of protein dynamics. For every protein, we independently calculate ΔS_MIST and (a) Δ⟨O²_CH⟩ and Δ⟨O²_NH⟩ or (b) ΔS_Bfact and correlate them against each other. (c) Compactness and (d) fraction of methyl-bearing residues for proteins used in this study as compared to the analogous values for the representative set of 1109 complete 3D structures[21] from the PDB[22,23] (red distributions).
Figure 2. Comparison between experimentally accessible measures of protein dynamics and ΔS_MIST. (a) Δ⟨O²_CH⟩ vs ΔS_MIST, (b) Δ⟨O²_NH⟩ vs ΔS_MIST, and (c) ΔS_Bfact vs ΔS_MIST. All values reflect the entropy changes upon complex formation, evaluated separately for each individual protein. The ΔS_MIST and ΔS_Bfact values are normalized by the number of degrees of freedom in each protein (3N − 6, where N is the number of atoms). For each comparison, we provide the least-squares linear fit and the associated Pearson correlation coefficient R.
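The normalization by 3N − 6 and the least-squares fit with its Pearson R can be sketched as follows. The atom counts are taken from the table of simulated proteins; the entropy and order-parameter values are placeholders invented for the example, and the function names are hypothetical.

```python
import numpy as np


def normalize_by_dof(delta_s, n_atoms):
    """Divide each entropy change by the internal degrees of freedom, 3N - 6."""
    return np.asarray(delta_s, dtype=float) / (3 * np.asarray(n_atoms) - 6)


def linear_fit_with_r(x, y):
    """Least-squares line y = slope*x + intercept and the Pearson correlation R."""
    slope, intercept = np.polyfit(x, y, 1)
    r = np.corrcoef(x, y)[0, 1]
    return slope, intercept, r


n_atoms = [1641, 1408, 4503]              # PPIase A, PR160Gag-Pol, alkaline protease
delta_s_mist = [-120.0, -95.0, -310.0]    # PLACEHOLDER un-normalized entropy changes
delta_o2 = [0.020, 0.018, 0.019]          # PLACEHOLDER average order-parameter changes

slope, intercept, r = linear_fit_with_r(delta_o2, normalize_by_dof(delta_s_mist, n_atoms))
```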
Figure 3. Dependence of the relationship between ΔS_MIST and different entropy proxies on the completeness of the set of experimental reporters. Distributions of Pearson correlation coefficients R between ΔS_MIST, evaluated for the full set of degrees of freedom, and the undersampled (a) Δ⟨O²_CH⟩, (b) Δ⟨O²_NH⟩, or (c) ΔS_Bfact over the set of 34 binding processes. The degree of undersampling is given in the inset. Each distribution is based on 1000 independent repetitions of the undersampling procedure. All values are based on the changes upon complex formation, evaluated separately for each constituent and normalized by the number of degrees of freedom for ΔS_MIST and ΔS_Bfact. The arrow marks the Pearson correlation R when taking the full set of reporters into account. (d) Absolute values of the medians of Pearson R histograms as a function of the degree of undersampling.
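One way to read the undersampling procedure is sketched below: in each repetition a random subset of reporters is kept for every protein, the proxy is averaged over that subset, and the Pearson R against ΔS_MIST is recomputed. The data layout, the function name, and the uniform random selection without replacement are assumptions; the caption does not specify how the subsets were drawn.

```python
import numpy as np


def undersampled_r_distribution(proxy_per_protein, delta_s_mist,
                                keep_fraction, n_rep=1000, seed=0):
    """Pearson R between dS_MIST and a proxy averaged over random reporter subsets.

    proxy_per_protein: one 1D array per protein holding the per-reporter change
                       (for example, per-methyl order-parameter changes).
    keep_fraction:     fraction of reporters retained in each repetition.
    """
    rng = np.random.default_rng(seed)
    delta_s_mist = np.asarray(delta_s_mist, dtype=float)
    r_values = np.empty(n_rep)
    for rep in range(n_rep):
        means = []
        for reporters in proxy_per_protein:
            reporters = np.asarray(reporters, dtype=float)
            n_keep = max(1, int(round(keep_fraction * reporters.size)))
            subset = rng.choice(reporters, size=n_keep, replace=False)
            means.append(subset.mean())
        r_values[rep] = np.corrcoef(means, delta_s_mist)[0, 1]
    return r_values
```

The median of the returned array, taken for several values of `keep_fraction`, would correspond to the kind of curve described for panel (d).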
Figure 5. Effect of pairwise couplings in the MIST approximation. Shown are configurational entropy changes upon binding for every protein in the simulated set, whereby coupling corrections of pairwise order are included on the y-axis and excluded on the x-axis. The values are normalized by the number of degrees of freedom of the respective molecules.
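The difference between the two axes can be illustrated with a toy version of the pairwise MIST approximation, in which the entropy is the sum of one-dimensional entropies minus the mutual information collected along a maximum spanning tree. The sketch below is a simplified, serial, histogram-based illustration using discrete plug-in entropies (no Jacobian or bin-width corrections); it is not the parallel implementation described in the abstract, and all names are hypothetical.

```python
import numpy as np


def entropy_from_counts(counts):
    """Plug-in Shannon entropy (nats) of a histogram."""
    p = counts[counts > 0] / counts.sum()
    return -np.sum(p * np.log(p))


def mist_entropy(samples, bins=10):
    """Return (S_1D, S_MIST) for samples of shape (n_frames, n_dof)."""
    n_frames, n_dof = samples.shape
    # Discretize every degree of freedom into bin indices 0 .. bins-1.
    digitized = np.empty((n_frames, n_dof), dtype=int)
    for i in range(n_dof):
        edges = np.histogram_bin_edges(samples[:, i], bins=bins)
        digitized[:, i] = np.digitize(samples[:, i], edges[1:-1])
    h1 = np.array([entropy_from_counts(np.bincount(digitized[:, i], minlength=bins))
                   for i in range(n_dof)])
    # Pairwise mutual information I(i;j) = H(i) + H(j) - H(i,j).
    mi = np.zeros((n_dof, n_dof))
    for i in range(n_dof):
        for j in range(i + 1, n_dof):
            joint = np.bincount(digitized[:, i] * bins + digitized[:, j],
                                minlength=bins * bins)
            mi[i, j] = mi[j, i] = h1[i] + h1[j] - entropy_from_counts(joint)
    # Prim's algorithm: grow a maximum spanning tree over the MI matrix.
    in_tree = np.zeros(n_dof, dtype=bool)
    in_tree[0] = True
    best = mi[0].copy()          # strongest link from the tree to each outside node
    coupling = 0.0
    for _ in range(n_dof - 1):
        candidates = np.where(in_tree, -np.inf, best)
        k = int(np.argmax(candidates))
        coupling += candidates[k]
        in_tree[k] = True
        best = np.maximum(best, mi[k])
    s_1d = h1.sum()
    return s_1d, s_1d - coupling
```

Applying `mist_entropy` to trajectories of the free and bound states and taking the difference gives an uncoupled (first return value) and a coupling-corrected (second return value) entropy change, analogous to the x- and y-axes of the figure.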
Figure 6. Contributions to the total configurational entropy change. (a) Average contributions across the whole protein set. Shown are the magnitudes of the change in torsional (ΔS_MIST^tor) and angular (ΔS_MIST^ang) entropy contributions and their mutual coupling (−ΔI_MIST^ang/tor), the main-chain (ΔS_MIST^mc) and side-chain (ΔS_MIST^sc) contributions and their mutual coupling (ΔS_MIST^mc/sc), as well as the uncoupled entropy change (ΔS_1D) and the total configurational MIST entropy with vibrations excluded by coarse-graining the sampled probability distributions to three bins only (ΔS_MIST^3bins). The bars represent the value of the slope of the linear fit between the contributions in question and the total configurational entropy change, while the values in parentheses indicate the associated Pearson R values. The fitting procedure is illustrated for the case of the side-chain contribution in the inset. (b) Absolute values of different entropic terms, including temperature and without normalization, for two different binding processes. (c) The high-quality linear relationships involved allow one to use the slopes of individual steps in order to estimate the full configurational entropy change ΔS_MIST (top arrow) starting from the vibration-suppressed, uncoupled torsional side-chain entropy, which is directly approximated by NMR methyl order parameters. However, starting from the order parameter changes, such transitivity is broken.
Effect of Different Configurational Entropy Contributions on Estimated Error for Methyl Order Parameters
| Quantity | Pearson R | Slope |
|---|---|---|
| Δ | –0.33 | –0.59 |
| Δ | –0.01 | –0.03 |
| Δ | 0.33 | 6.63 |
| Δ | 0.12 | 0.34 |
| Δ | –0.14 | –0.47 |
| Δ | –0.02 | –0.13 |
| Δ | –0.21 | –0.57 |
| Δ | –0.28 | –0.76 |