Csilla Várnai, Nikolas S Burkoff, David L Wild.
Abstract
Maximum Likelihood (ML) optimization schemes are widely used for parameter inference. They iteratively maximize the likelihood of some experimentally observed data with respect to the model parameters, following the gradient of the logarithm of the likelihood. Here, we employ an ML inference scheme to infer a generalizable, physics-based coarse-grained protein model (which includes Gō-like biasing terms to stabilize secondary structure elements in room-temperature simulations), using native conformations of a training set of proteins as the observed data. Contrastive divergence, a novel statistical machine learning technique, is used to efficiently approximate the direction of the gradient ascent, which enables the use of a large training set of proteins. Unlike in previous work, the generalizability of the protein model allows the folding of peptides and a protein (protein G) which are not part of the training set. We compare the same force field with different van der Waals (vdW) potential forms: a hard cutoff model, and a Lennard-Jones (LJ) potential with vdW parameters inferred or adopted from the CHARMM or AMBER force fields. Simulations of peptides and protein G show that the LJ model with inferred parameters outperforms the hard cutoff potential, which is consistent with previous observations. Simulations using the LJ potential with inferred vdW parameters also outperform the protein models with adopted vdW parameter values, demonstrating that model parameters generally cannot be transferred between force fields with different energy functions. The software is available at https://sites.google.com/site/crankite/.
Year: 2013 PMID: 24683370 PMCID: PMC3966533 DOI: 10.1021/ct400628h
Source DB: PubMed Journal: J Chem Theory Comput ISSN: 1549-9618 Impact factor: 6.006
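The contrastive divergence idea described in the abstract can be illustrated on a toy problem. The sketch below is not the CRANKITE implementation: it assumes a hypothetical one-parameter harmonic energy E(x; k) = k·x²/2 in units of RT, and all function names and default settings are invented for illustration. The gradient of the log-likelihood is the model average of dE/dk minus the data average; contrastive divergence approximates the model average by running only a few Metropolis MC steps started from the data points themselves.

```python
import math
import random

def energy(x, k):
    """Toy harmonic energy in units of RT (hypothetical stand-in for a force field)."""
    return 0.5 * k * x * x

def denergy_dk(x):
    """Derivative of the toy energy with respect to its single parameter k."""
    return 0.5 * x * x

def metropolis_steps(x, k, n_steps, step_size, rng):
    """Run a few Metropolis MC steps at beta = 1, starting from x."""
    for _ in range(n_steps):
        trial = x + rng.uniform(-step_size, step_size)
        delta = energy(trial, k) - energy(x, k)
        if delta <= 0 or rng.random() < math.exp(-delta):
            x = trial
    return x

def contrastive_divergence(data, k_init, n_iter=500, cd_steps=3,
                           step_size=0.5, lr=2.0, seed=0):
    """Estimate k by gradient ascent on the log-likelihood, CD-approximated."""
    rng = random.Random(seed)
    k = k_init
    # Data average of dE/dk; it does not depend on k, so compute it once.
    data_term = sum(denergy_dk(x) for x in data) / len(data)
    for _ in range(n_iter):
        # Short MC runs started at the observed data replace full equilibration.
        perturbed = [metropolis_steps(x, k, cd_steps, step_size, rng) for x in data]
        model_term = sum(denergy_dk(x) for x in perturbed) / len(perturbed)
        # Ascent direction: <dE/dk>_model - <dE/dk>_data.
        k = max(k + lr * (model_term - data_term), 1e-6)
    return k
```

For data drawn from a Gaussian, the estimate should settle near the reciprocal of the sample mean of x², which is the exact ML solution for this toy energy. Because the MC runs are short and stochastic, the returned value fluctuates around that solution rather than matching it exactly.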
Table 1. The CHARMM and AMBER Atom Types Whose LJ Parameters Were Adopted for the CRANKITE Atom Types in the LJCHARMM and LJAMBER Models
| CRANKITE | CA | CB | C | N | O | S |
|---|---|---|---|---|---|---|
| CHARMM | CT1 | CT2 | C | NH2 | O | S |
| AMBER | CT | CT | C | N | O | S |
Figure 1. Converged potential parameter values as a function of the Monte Carlo (MC) step size, inferred using the ASTRAL PDB structures after removing overlapping atoms (solid lines), thus using a dataset that better represents the Boltzmann distribution. The plots correspond to (a) hydrogen-bond strength (H), (b) α-helix backbone dihedral angle bias potential strength (ηα), (c) β-strand backbone dihedral angle bias potential strength (ηβ), (d) β–β contact bias potential strength (κβ), (e) β–β contact equilibrium distance (r0,β), and (f) Cα valence angle stress potential strength (kτ). For the hydrogen-bond strength plot (panel a) only, parameter values inferred using the ASTRAL PDB structures without removing overlapping atoms are also shown (dotted line). Vertical dashed lines mark a crankshaft MC step size of 0.01. The error bars correspond to one standard deviation of the distribution of the converged parameter value.
Table 2. Inferred Potential Parameters Using Contrastive Divergence, for the Protein Models Using the Hard Cutoff and the Lennard-Jones (LJ)-Type van der Waals (vdW) Potentials
vdW and Backbone Stress Potential Parameters
| vdW potential | r(CA) | r(CB) | r(C) | r(N) | r(O) | r(S) | ε (RT) | kτ |
|---|---|---|---|---|---|---|---|---|
| hard cutoff | 1.57 | 1.57 | 1.42 | 1.29 | 1.29 | 2.00 | | 90 |
| LJlearnt | 2.43 | 1.97 | 1.82 | 1.74 | 1.98 | 3.10 | 0.018 | 98 |
| LJCHARMM | 2.275 | 2.175 | 2.00 | 1.85 | 1.70 | 2.00 | from CHARMM | 103 |
| LJAMBER | 1.908 | 1.908 | 1.908 | 1.824 | 1.6612 | 2.00 | from AMBER | 114 |
The vdW potential parameters of the hard cutoff model were taken from ref (20), while those of the LJCHARMM and LJAMBER models were taken from the CHARMM[55] and AMBER[54] force fields, respectively.
ε/RT values from the CHARMM force field (0.0338, 0.0929, 0.186, 0.338, 0.203, and 0.760 for the CA, CB, C, N, O, and S atom types respectively).
ε/RT values from the AMBER force field (0.185, 0.185, 0.145, 0.287, 0.355, and 0.422 for the CA, CB, C, N, O, and S atom types respectively). The potential parameters are described in section 2.3; wherever a unit of length is not indicated, the unit of length is Å.
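To make the two vdW forms concrete, the sketch below evaluates a 12-6 Lennard-Jones energy and a hard cutoff energy for a pair of atom types, using the LJCHARMM and hard cutoff values listed above. Several points are assumptions not stated in this excerpt: that the tabulated lengths act as per-atom radii that add for a pair, that ε values combine by geometric mean, and that the LJ form is the CHARMM-style ε[(Rmin/r)¹² − 2(Rmin/r)⁶]. The function and dictionary names are hypothetical.

```python
import math

# Per-atom-type radii (Å) and ε/RT copied from the LJCHARMM row and its footnote.
CHARMM_RADIUS = {"CA": 2.275, "CB": 2.175, "C": 2.00, "N": 1.85, "O": 1.70, "S": 2.00}
CHARMM_EPS = {"CA": 0.0338, "CB": 0.0929, "C": 0.186, "N": 0.338, "O": 0.203, "S": 0.760}
# Radii (Å) copied from the hard cutoff row.
HARD_RADIUS = {"CA": 1.57, "CB": 1.57, "C": 1.42, "N": 1.29, "O": 1.29, "S": 2.00}

def lj_energy(r, type_i, type_j, radius=CHARMM_RADIUS, eps=CHARMM_EPS):
    """12-6 LJ energy in units of RT at separation r (Å)."""
    r_min = radius[type_i] + radius[type_j]        # assumed: radii add for a pair
    depth = math.sqrt(eps[type_i] * eps[type_j])   # assumed: geometric-mean well depth
    ratio6 = (r_min / r) ** 6
    return depth * (ratio6 * ratio6 - 2.0 * ratio6)

def hard_cutoff_energy(r, type_i, type_j, radius=HARD_RADIUS):
    """Assumed hard-sphere form: infinite on overlap, zero otherwise."""
    return math.inf if r < radius[type_i] + radius[type_j] else 0.0
```

With these conventions the LJ minimum sits at the sum of the two radii with depth equal to the combined ε, so `lj_energy(4.55, "CA", "CA")` evaluates to about −0.0338 RT, while the hard cutoff model only forbids overlaps and contributes no attraction.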
Figure 2. Hydrogen-bond pattern from MC simulations of an Ala16 peptide, using the protein models employing the hard cutoff vdW potential (solid line), the LJlearnt model (dashed line), the LJCHARMM model (dotted line), and the LJAMBER model (dash-dotted line). Potential parameters are listed in Table 2. On the horizontal axis, −4 represents a hydrogen bond between amino acid residues i→j = i–4, typical of α-helices, while −3 is typical of 3₁₀-helices, and −5 of π-helices. The small peak between +3 and +5 corresponds to left-handed helices.
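The horizontal axis of Figure 2 is the sequence offset j − i of each hydrogen bond i→j. A minimal sketch of how such a pattern could be tabulated from a list of hydrogen-bonded residue pairs follows; the input format and the function name are assumptions for illustration, not the analysis code used in the paper.

```python
from collections import Counter

# Offsets named in the caption: -3 (3-10 helix), -4 (alpha helix), -5 (pi helix).
HELIX_LABELS = {-3: "3-10 helix", -4: "alpha helix", -5: "pi helix"}

def hbond_offset_distribution(hbonds):
    """Return {offset: fraction} for hydrogen bonds given as (i, j) residue pairs.

    The offset is j - i, so a bond i -> j = i - 4 contributes to offset -4.
    """
    counts = Counter(j - i for i, j in hbonds)
    total = sum(counts.values())
    return {offset: n / total for offset, n in sorted(counts.items())}

# Usage: three alpha-helical bonds and one 3-10-type bond.
pattern = hbond_offset_distribution([(5, 1), (6, 2), (7, 3), (8, 5)])
for offset, fraction in pattern.items():
    print(offset, HELIX_LABELS.get(offset, "other"), fraction)
```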
Table 3. Relative Probabilities of the Turn Types Identified from Nested Sampling Simulations of 16-Residue Peptides Applying a β-Hairpin Bias, at 298 K
| turn residues | vdW model | turn II′ | turn I′ | turn I | turn II |
|---|---|---|---|---|---|
| AAAA | hard cutoff | 0.968 | 0.000 | 0.000 | 0.032 |
| AAAA | LJlearnt | 0.983 | 0.000 | 0.003 | 0.014 |
| AAAA | LJCHARMM | 0.965 | 0.000 | 0.028 | 0.000 |
| AAAA | LJAMBER | 0.997 | 0.000 | 0.002 | 0.001 |
| AGAA | hard cutoff | 0.980 | 0.000 | 0.001 | 0.020 |
| AGAA | LJlearnt | 0.993 | 0.000 | 0.001 | 0.006 |
| AGAA | LJCHARMM | 0.997 | 0.000 | 0.003 | 0.000 |
| AGAA | LJAMBER | 1.000 | 0.000 | 0.000 | 0.000 |
| AAGA | hard cutoff | 0.864 | 0.022 | 0.001 | 0.113 |
| AAGA | LJlearnt | 0.873 | 0.023 | 0.001 | 0.102 |
| AAGA | LJCHARMM | 0.619 | 0.091 | 0.029 | 0.182 |
| AAGA | LJAMBER | 0.588 | 0.383 | 0.001 | 0.025 |
| AAAG | hard cutoff | 0.944 | 0.000 | 0.009 | 0.046 |
| AAAG | LJlearnt | 0.980 | 0.000 | 0.007 | 0.012 |
| AAAG | LJCHARMM | 0.931 | 0.000 | 0.066 | 0.000 |
| AAAG | LJAMBER | 0.969 | 0.000 | 0.030 | 0.000 |
Turn type IV was excluded from the analysis. Substituting the i+1, i+2, or i+3 residue of the turn by glycine (AGAA, AAGA, and AAAG, respectively) increases the relative probability of the type II′, the types I′ and II, and the type I turn, respectively.
Table 4. Critical Temperatures (Tc) of Heat-Capacity Curves and the Heat-Capacity Value at Tc (Cv,c) in Units of R for the Ala16 Nested Simulations with α-Helix and β-Hairpin Secondary Structure Bias, Using the Hard Cutoff (Hard) and Lennard-Jones Type vdW Models
Critical Temperature Data (°C)
| secondary structure bias | hard | LJlearnt | LJCHARMM | LJAMBER | exp |
|---|---|---|---|---|---|
| α-helix | 130 | 70 | 0 | 150 | 0–30 |
| β-hairpin | 10 | 40 | 20 | 30 | |
Approximate experimental values (exp) are taken from ref (74).
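A critical temperature of this kind is the position of the peak of the heat-capacity curve. The sketch below shows one way to obtain Cv/R from energy fluctuations and to locate the peak on a temperature grid. It assumes, for illustration only, that each sampled energy comes with the logarithm of a phase-space volume weight (as nested sampling can provide) and that energies are expressed as E/R in kelvin; the function names are hypothetical.

```python
import math

def heat_capacity(energies, log_volumes, temperature):
    """Cv/R at one temperature from weighted energy fluctuations.

    energies: E/R in kelvin; log_volumes: log phase-space weight of each sample.
    """
    log_w = [lv - e / temperature for e, lv in zip(energies, log_volumes)]
    shift = max(log_w)                      # subtract the maximum for numerical stability
    w = [math.exp(x - shift) for x in log_w]
    z = sum(w)
    mean_e = sum(wi * e for wi, e in zip(w, energies)) / z
    mean_e2 = sum(wi * e * e for wi, e in zip(w, energies)) / z
    return (mean_e2 - mean_e ** 2) / temperature ** 2

def critical_temperature(energies, log_volumes, temperatures):
    """Return (Tc, Cv_c / R): the grid temperature with the largest heat capacity."""
    curve = [(heat_capacity(energies, log_volumes, t), t) for t in temperatures]
    cv_c, t_c = max(curve)
    return t_c, cv_c
```

Temperatures here are in kelvin, so a value comparable with the table would be `t_c - 273.15`. The resolution of Tc is limited by the spacing of the temperature grid supplied by the caller.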
Figure 3. The backbone RMSD from the native state (top), and the angle of the helix with respect to the axis of the β-strands (bottom), as a function of the potential energy for the conformations in the main basin of the energy landscape, explored by nested sampling simulations using the protein model with (left) hard cutoff vdW potential and (right) Lennard-Jones type vdW potential with inferred vdW parameters. The estimated energy at room temperature is marked by solid vertical lines. Conformations obtained by using the LJ potential show a wide range of allowed helix orientation angles at room temperature, including the native angle in the crystal structure, 21.8° (dashed horizontal line), while simulations using the hard cutoff potential fail to find the native helix orientation.
Figure 4. Distribution of the helix angle at room temperature from an MC simulation for the different models: (top left) hard cutoff model, (top right) LJlearnt, (bottom left) LJCHARMM, and (bottom right) LJAMBER. Simulation length: 10¹⁰ MC steps, starting from the crystal structure. Vertical dashed lines show the helix orientation angle in the crystal structure.
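The helix orientation angle in Figures 3 and 4 is an angle between two segment axes. How those axes were defined is not given in this excerpt, so the sketch below uses a deliberately crude, assumed definition: each axis is the vector between the centroids of the first and last few Cα atoms of the segment. All names are hypothetical.

```python
import math

def centroid(points):
    """Mean position of a list of (x, y, z) tuples."""
    n = len(points)
    return tuple(sum(p[k] for p in points) / n for k in range(3))

def segment_axis(ca_coords, window=4):
    """Crude axis estimate (assumed): vector from the centroid of the first
    `window` C-alpha atoms to the centroid of the last `window` C-alpha atoms."""
    start = centroid(ca_coords[:window])
    end = centroid(ca_coords[-window:])
    return tuple(e - s for s, e in zip(start, end))

def angle_between(u, v):
    """Angle in degrees between two 3D vectors, in the range 0 to 180."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    cos_theta = max(-1.0, min(1.0, dot / norm))   # clamp against rounding error
    return math.degrees(math.acos(cos_theta))

def helix_orientation_angle(helix_ca, strand_ca, window=4):
    """Angle between the estimated helix axis and the estimated strand axis."""
    return angle_between(segment_axis(helix_ca, window), segment_axis(strand_ca, window))
```

Collecting this angle over the conformations of a trajectory and histogramming it would give a distribution of the kind shown in Figure 4. A principal-axis fit would be a more robust axis estimate than the end-to-end centroid vector used here.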