Mark A Olson1. 1. Systems and Structural Biology Division, USAMRIID, Frederick, Maryland 21702, United States.
Abstract
This study presents parallel-tempering lattice Monte Carlo simulations based on the side-chain-only (SICHO) model for calculating the conformational landscape of a 28-residue intrinsically disordered peptide extracted from the Ebola virus protein VP35. The central issue is the applicability of the SICHO potential energy function and in general coarse-grained (CG) representations of intermediate resolution for modeling large-scale conformational heterogeneity that includes both folded and unstructured peptide states. Crystallographic data shows that the peptide folds in a 410-helix-turn-310-helix topology upon complex formation with the Ebola virus nucleoprotein, whereas in isolation, the peptide transitions to a disordered conformational ensemble as observed in circular dichroism experiments. The simulation reveals a potential of mean force that displays conformational diversity along the helix-forming reaction coordinate consistent with disorder-order transitions, yet unexpectedly the bound topology is poorly sampled, and a population shift to an unstructured state incurs a significant free-energy penalty. Applying an elastic network interpolation model suggests a hybrid binding mechanism through conformational selection of the 410-helix followed by an induced fit of the 310-helix. A comparison of the CG model with previously reported all-atom CHARMM-based simulations highlights a lattice-based approach that is computationally fast and with the correct parameterization yields good resolution to modeling conformational plasticity.
This study presents parallel-tempering lattice Monte Carlo simulations based on the side-chain-only (SICHO) model for calculating the conformational landscape of a 28-residue intrinsically disordered peptide extracted from the Ebola virus protein VP35. The central issue is the applicability of the SICHO potential energy function and in general coarse-grained (CG) representations of intermediate resolution for modeling large-scale conformational heterogeneity that includes both folded and unstructured peptide states. Crystallographic data shows that the peptide folds in a 410-helix-turn-310-helix topology upon complex formation with the Ebola virus nucleoprotein, whereas in isolation, the peptide transitions to a disordered conformational ensemble as observed in circular dichroism experiments. The simulation reveals a potential of mean force that displays conformational diversity along the helix-forming reaction coordinate consistent with disorder-order transitions, yet unexpectedly the bound topology is poorly sampled, and a population shift to an unstructured state incurs a significant free-energy penalty. Applying an elastic network interpolation model suggests a hybrid binding mechanism through conformational selection of the 410-helix followed by an induced fit of the 310-helix. A comparison of the CG model with previously reported all-atom CHARMM-based simulations highlights a lattice-based approach that is computationally fast and with the correct parameterization yields good resolution to modeling conformational plasticity.
The Ebola virus encodes seven structural proteins of which most
occupy unique points in topological fold space. Given the proteome
size and repertoire of globular domains, the interactome is thought
to be largely multifunctional during the virus lifecycle. Elucidation
of the landscape of viral protein interactions in terms of association
networks is of fundamental importance for the discovery of effective
therapeutics. A step forward is the recent X-ray crystallographic
structure of a 28-residue peptide extracted from the Ebola virus protein
VP35 in association with the Ebola virus nucleoprotein.[1] The peptide folds in a 410-helix-turn-310-helix topology upon binding the nucleoprotein, while, in
isolation, a disordered ensemble of states is observed in circular
dichroism (CD) experiments.The coupled folding and binding
suggests that the peptide can be
broadly characterized as an intrinsically disordered peptide (IDP).[2−5] What is intriguing is when the peptide is added to a solution of
50% trifluoroethanol (TFE), CD experiments show a transition from
a predominately unstructured form to helical structures of ∼30–40%
helicity, thus suggesting a strong underlying secondary-structure
propensity.[1] Questions remain on how the
fold propensity influences the binding mechanism and whether the process
is one of conformational selection, induced fit, or a hybrid scenario.
A leverage of gaining a better understanding is therapeutic discovery
of peptide-based inhibitors.A recent simulation study of the
conformational heterogeneity of
the VP35 peptide was reported by applying CHARMM-based potential energy
functions and a generalized Born (GB) solvent model.[6,7] The study found that the extent of conformational plasticity along
the secondary-structure forming reaction coordinate depended on the
force field and sampling algorithm. In the best outcome, the conformational
ensemble of the peptide was determined to be a large manifold of thermally
accessible states, which is characteristic of the observed CD disordered
ensemble. The less favorable outcome was kinetic trapping of helical
conformational states. Regardless of the selected force field, all
simulation models tested found the highly populated states to exhibit
collapsed measures of radius of gyration (Rg).This brief work extends the analysis of the VP35 peptide
and reports
a simulation study based on a reductionist strategy of modeling the
peptide by a coarse-grained (CG) model. The advantage of CG models
is a significant reduction in computational time in generating structural
ensembles as compared to fully atomistic explicit solvent simulations.
Current strategies of CG models are quite broad and vary in their
atomic granularity of modeling peptides and proteins.[8] The most popular approach is the molecular dynamics simulation
method of applying CG force fields. An alternate to continuous representations
and a computationally much faster method is lattice-based polymer
physics models. Formulations range from the notable HP (hydrophobic-polar)
lattice model[9] to the more realistic approach
of the CABS (C-alpha, beta, and side chain) model.[10] The former is a simplistic representation based on low-resolution
modeling of amino acid residues being classified as either hydrophobic
or polar, and the configurational ensemble of residue contacts is
modeled via a self-avoiding walk on a lattice. In contrast, the CABS
is a medium resolution model of which side chains are represented
by two spherical virtual atoms (viz., pseudoatoms): one centered on
Cβ, and the other placed in the center of mass of the remaining
portion of the side chain. The main-chain Cα positions in CABS
are constrained to a cubic lattice.An intermediate resolution
lattice model is the side-chain-only
(SICHO) model constructed of a single pseudoatom per residue.[8,11,12] As pointed out by Kolinski and
co-workers in a recent review of CG models and their applications,[8] the SICHO model has not been extensively studied,
despite promising preliminary results.[13−15] Here, we apply the SICHO
model to the VP35 peptide using replica-exchange lattice Monte Carlo
simulations. Accurate reconstruction of all-atom models from lattice
generated conformations is carried out by a technique developed by
Feig and co-workers.[16] Previous application
of the SICHO model with parallel tempering and the rebuilding of all-atom
structures is illustrated by the multiscale refinement of protein
loops.[17] Here, the issue is whether the
terms of the SICHO potential energy function can be weighted to avoid
kinetic traps introduced by a starting helical topology and at the
same time allow large-scale conformational heterogeneity of unfolded
and folded states. Heterogeneity should include the underlying propensity
in the secondary structure as observed by CD with the addition of
TFE.As with previous parallel-tempering simulations of the
VP35 peptide,
conformational sampling of transitions is assessed by potentials of
mean force (PMFs) as a function of fractional helicity (fH) and values of Rg. What
is new here from earlier studies is an application of a CG elastic
network interpolation model (ENI) developed by Kim and co-workers[18−21] to estimate the transition-state cost of excursions on a conformational
network. This analysis combined with the PMF suggests a possible hybrid
binding process of conformational selection and an induced fit for
molecular association with the Ebola virus nucleoprotein. Finally,
a comparison is presented of sampling distributions extracted from
previous all-atom simulations using a statistical potential for assessing
peptide folds.
Results and Discussion
Conformational Landscape
The advantage
of lattice protein models in comparison with all-atom representations
is the reduced computational demand to obtain exhaustive conformational
sampling and provide efficient characterization of the phase diagram.
While sampling is more robust, the issue is the accuracy of CG potential
energy functions to correctly generate known conformations and populate
the energy landscape. An example of a previous application of the
SICHO model is the modeling of protein loops, where the pseudoatoms
of the protein stem outside of the loop region were tethered to their
initial lattice positions.[17] The prior
study found that, depending on the loop environment, the SICHO model
sampled native-like conformational states among the ensembles and
their detection depended on the density probability distributions.Unlike loop modeling, the computational exploration of IDPs is
represented as unrestrained sampling and thought to take place on
flat energy surfaces. Consequently, the difficulty arises from the
inherent biases in potential energy functions that favor the formation
of secondary-structure elements. An illustrative example of the effect
of biases on secondary structure formation is given by the modeling
of the VP35 peptide using different CHARMM-based force fields.[6,7] The kinetic trapping of states in helical topologies was found to
be influenced by the force field and the conformational sampling algorithm.
Using a parallel tempering method, the force field CHARMM22+CMAP (designated
as C22) distinguished itself from CHARMM36m (C36m)[22] in producing greater conformational plasticity along the
helix-forming reaction coordinate, although the refined C36m produced
greater diversity of helical, β-hairpins, and coils when applied
with an adaptive tempering method.[7]To avoid the pitfalls of falling into a kinetic trap on the unfolding–folding
landscape, the SICHO potential term that favors the formation of regular
protein secondary structure elements was scaled in attempt to replicate
a set of minimum energy conformers taken from the C22 simulations.[7] It is found that the lattice simulation with
the stiffness parameter αstiffness set to a value
of 1.0, αshort = 0.5, and αcentrosym = 0.1 yielded the best agreement with the C22 energy landscape.
Alternative values explored for αstiffness were set
as 0.5 and 0.0, where the latter produced no helical structures.Shown in Figure A,B
are the PMFs at a reduced temperature T = 1
calculated by the parallel tempering weighted analysis method (PTWHAM)
algorithm.[23] The probability density profiles
are denoted as W(fH,Rg) and W(fH,Z) for order parameters fH, Rg and the SICHO energy Z-score. Reconstructed all-atom structures were applied
in evaluating fH and Rg from lattice generated conformations. The calculated W(fH,Rg) reveals the SICHO model produced significant conformational
diversity among multiple states, and while the landscape is of lower
resolution in defining density contours as compared to all-atom simulations,[6,7] the SICHO model exhibits conformational specificity in lattice energies
as detected in the W(fH,Z) profile. At the same time, the model
avoids kinetic traps to produce PMFs that are consistent with the
notion of a heterogeneous ensemble.
Figure 1
Energy landscapes and conformational network.
(A) Potential of
mean force (PMF) is presented at a Monte Carlo effective temperature T = 1 for order parameters fractional helicity and radius
of gyration. Transitions are displayed by a free-energy scale where
the color light blue represents minima and is followed in order by
the colors darker blue, green, yellow, and red, the last of which
denotes high-energy states of low population. Illustrated conformations
of the peptide are the X-ray crystallographic bound conformation (fH = 0.43) and those extracted from the CG lattice
model simulation to include the energy minimum (Emin) conformation (fH = 0.43),
a helical-like bundle (fH = 0.64), and
an unstructured peptide (fH = 0). (B)
PMF for Z-score of the lattice potential energies
as a function of fractional helicity. (C) Predicted conformational
network starting with the bound folded peptide extracted from the
association with the Ebola virus protein NP (molecular surface colored
gray). The transition-state costs are given by CENI for morphing conformational state i → j. Selected conformations were taken from clustering and
selected based on minimization of CENI between clusters. The term min-PMF is the minimum in the free-energy
landscape and corresponds to an unfolded peptide state.
Energy landscapes and conformational network.
(A) Potential of
mean force (PMF) is presented at a Monte Carlo effective temperature T = 1 for order parameters fractional helicity and radius
of gyration. Transitions are displayed by a free-energy scale where
the color light blue represents minima and is followed in order by
the colors darker blue, green, yellow, and red, the last of which
denotes high-energy states of low population. Illustrated conformations
of the peptide are the X-ray crystallographic bound conformation (fH = 0.43) and those extracted from the CG lattice
model simulation to include the energy minimum (Emin) conformation (fH = 0.43),
a helical-like bundle (fH = 0.64), and
an unstructured peptide (fH = 0). (B)
PMF for Z-score of the lattice potential energies
as a function of fractional helicity. (C) Predicted conformational
network starting with the bound folded peptide extracted from the
association with the Ebola virus protein NP (molecular surface colored
gray). The transition-state costs are given by CENI for morphing conformational state i → j. Selected conformations were taken from clustering and
selected based on minimization of CENI between clusters. The term min-PMF is the minimum in the free-energy
landscape and corresponds to an unfolded peptide state.As noted in section , transitions among helical
conformations are with the secondary structure of the bound state
annotated in the starting sequence file and the weights of the scaling
parameters selected for allowing excursions. The bound peptide shows
an initial folded topology of fH = 0.43
composed of 410-helical conformation for residues Trp28-Gly36
and a 310-helix for Val41-Asp43. For problems of modeling
peptides without a priori knowledge of the secondary structure, predictions
must be made. For this particular peptide, predictions without bias
of the crystallographic conformation return an estimate of fH ≈ 0.3 with probabilities of >0.9
for
helical formation in the peptide region of Gly27 to Met34.[24] Starting with no secondary structure annotated
in the sequence leads to rare formation of any helical structures
independent of the weights placed on the potential energy function.A simple measure of sampled space is the analysis of the population
density distributions and their free-energy values. The global minimum
in W(fH,Rg) is observed at (fH = 0, Rg = 8.2 Å), and the transition
to the weakly populated bound conformation modeled by an approximate
PTWHAM bin (fH = 0.42, Rg = 10.4 Å) yields a ΔW of 4.6kBT with
scaling of the reduced T to 300 K. To put this magnitude
into perspective, the conformational basin that contains the minimum
in the SICHO lattice energy (Emin) is
observed at (fH = 0.43, Rg = 8.2 Å), and its transition to the bound conformation
shows a ΔW of 1.6kBT. A similar transition from the Emin to an unfolded conformation shows a ΔW of 1.8kBT. A final comparison is the free-energy difference between
the minimally folded state given by (fH = 0.18, Rg = 8.2 Å) and the unfolded
state, which yields ΔW ≈
1kBT.The free
energy of shifting a highly populated folded conformation
to the global minimum in W(fH,Rg) is observed to be greater
than thermal fluctuations and suggests a multistate transition pathway.
While Markov state models provide powerful tools to investigate conformational
transition networks, their applications to parallel tempering of swapping
configurations between ensembles are challenging and require further
benchmarking. Alternatively, the ENI model is straightforward, and Figure C illustrates its
use where the transition-state costs are given by CENI for morphing state i → j. The network was constructed by applying low-energy conformations
extracted from clustering and selecting modeled structures that minimize CENI between states. Results reveal a CENI network that is relatively flat across the fH range up to ∼0.5 and displays values
that are comparatively lower than the peak value of CENI = 4200 for the transition from a helical-like bundle
to the minimum in W(fH,Rg). The flatness in transition-state
costs combined with free-energy differences near thermal fluctuations
is characteristic of a glassy-like protein state.The unambiguous
determination of the mechanism of coupled folding
and binding of intrinsically disorder peptides and proteins is a difficult
task from either experimental measurements or simulations. Binding
mechanisms encompass two different limiting scenarios: (i) conformational
selection where a shift in population takes place in an ensemble of
unbound states to a bound-like conformation prior to binding and (ii)
an induced-fit model where conformational changes accompany the binding
processes. From a modeling perspective, the first step in the analysis
is to understand the unbound form and the parameters that govern the
state populations and their transitions on the free-energy landscape.
Here, the implication of the observed conformational plasticity of
the peptide up to an fH threshold of 0.5,
which include bound-like structures, is a binding mechanism by a process
of conformational selection. An observable determined from the simulation
is low free-energy barriers for this subset of helical transitions.
In contrast, the fold topology in the C-terminal region of the peptide
chain that contains the 310-helix is weakly populated within
the SICHO potential energy function and disappears in sampling with
αstiffness = 0.5. The ΔW for the transition from the SICHO Emin to the helical-like bundle (fH = 0.75, Rg = 8.2 Å) is unfavorable
by 2.4kBT and exhibits
a CENI of nearly twice the transition Emin → bound state. Because of the high
transition barrier, the inference is kinetics that favors a process
of induced fit of the helical fold in the C-terminal region. To obtain
a more complete picture of binding, the interaction details of the
fit requires molecular docking to the viral nucleoprotein and simulations
to observe conformational reorganization upon binding. Given the caveat
of only modeling the unbound state, the net mechanism of binding is
proposed to be a hybrid process.Figure displays
the root-mean-square deviation (RMSD) of the main-chain Cα of
all-atom conformations reconstructed from the lattice simulation at
the replica client T = 1. The conformational deviation
is from the starting bound fold of the peptide. The plot demonstrates
that the lattice Monte Carlo simulation achieved convergence in generating
a statistical average Cα-RMSD of 6.4 Å with a deviation
of ±0.8 Å. In addition, the lattice model reports a sampled
RMSD space beyond a narrow range, showing a significant spread and
revisiting in conformational space to find a minimum RMSD of ∼4
Å and an extreme of ∼10 Å. As illustrated in Figure , the average RMSD
contains a coexistence of a diverse set of peptide folds, ranging
from a small helix to an extended helix in the N-terminal region.
Figure 2
The plot
displays the root-mean-square deviation (RMSD) of the
main-chain Cα of conformations extracted from the lattice simulation
at replica client T = 1. The number of extracted
conformations consisted of 200,000 peptide structures. The RMSD is
from the starting bound conformation of the peptide and reports a
statistical average for Cα-RMSD of 6.4 ± 0.8 Å. Illustrated
are diverse peptide topological folds that represent the statistical
average from a simple moving average.
The plot
displays the root-mean-square deviation (RMSD) of the
main-chain Cα of conformations extracted from the lattice simulation
at replica client T = 1. The number of extracted
conformations consisted of 200,000 peptide structures. The RMSD is
from the starting bound conformation of the peptide and reports a
statistical average for Cα-RMSD of 6.4 ± 0.8 Å. Illustrated
are diverse peptide topological folds that represent the statistical
average from a simple moving average.
Comparison of the CG Model and CHARMM-Based
Distributions
Figure A illustrates a contrast of the SICHO model simulation results
with PMF data extracted from previously reported all-atom simulations
of the peptide using the C36m and C22 force fields with a GB solvent
model.[6,7] Two calculations are presented for the CG
model: (i) results of extracting values from W(fH,Rg) along a fH pathway of connecting PTWHAM
bins that contain minimum free-energy values and (ii) a pathway of
connecting the fH bins along a constant Rg pathway where the value of Rg is taken from the bin W(fH,Rg) =
0. For C36m and C22, the calculations follow the second approach as
with the CG model and provide comparative boundaries. Uncertainty
in the free energies is ∼0.2kBT, determined from different bin sizes in constructing weighted
histograms and the application of the multistate Bennett acceptance
ratio method[25] to test numerical convergence.
Errors are greater where populations are limited due to sampling fold
topologies that are weakly favorable in the SICHO function.
Figure 3
(A) Potential
of mean force (PMF) as a function of fractional helicity
for comparing the CG model with CHARMM-based C36m and C22 force-field
simulations, where the latter two are re-evaluation of data from earlier
studies.[6,7] Red symbols and dashed line represent the
PMF computed by the CG model using minimum free-energy values along
the fractional helicity reaction coordinate independent of the radius
of gyration (Rg). Dark salmon colored
circles denote the PMF for the CG model computed with the constraint Rg = 8.4 Å. PMF for C36m is represented
by blue colored circles computed at Rg = 8.0 Å, and that for C22 is green colored circles at Rg = 8.8 Å. Areas colored gray are either
restricted regions of helical length formation or energy landscapes
of low probability of conformational sampling. (B) Plots of fractional
helicity as a function of Rg across the
ensemble of the 24 replica clients. Shown are statistical averages
over datasets for the CG model and a comparison with C22 and C36m
force fields. Symbols and their designated colors are noted above.
(A) Potential
of mean force (PMF) as a function of fractional helicity
for comparing the CG model with CHARMM-based C36m and C22 force-field
simulations, where the latter two are re-evaluation of data from earlier
studies.[6,7] Red symbols and dashed line represent the
PMF computed by the CG model using minimum free-energy values along
the fractional helicity reaction coordinate independent of the radius
of gyration (Rg). Dark salmon colored
circles denote the PMF for the CG model computed with the constraint Rg = 8.4 Å. PMF for C36m is represented
by blue colored circles computed at Rg = 8.0 Å, and that for C22 is green colored circles at Rg = 8.8 Å. Areas colored gray are either
restricted regions of helical length formation or energy landscapes
of low probability of conformational sampling. (B) Plots of fractional
helicity as a function of Rg across the
ensemble of the 24 replica clients. Shown are statistical averages
over datasets for the CG model and a comparison with C22 and C36m
force fields. Symbols and their designated colors are noted above.The plot of Figure A shows that, even though on a fixed Rg path, C22 displays broad helical plasticity among low
free-energy
states, and W(fH,Rg = 8.8 Å) = 0 takes place
at fH = 0.36 (recall the bound form exhibits fH = 0.43). In stark contrast, C36m reveals the
kinetic trapping of a helical fold at fH = 0.26 following a downhill slope from the unfolded peptide state
and proceeded by a sharp incline to form additional helicity at a
significant free-energy cost. Between these two bounds, the CG model
with an unrestrained Rg path best mirrors
C22 with the display of coexistence between states of varying fractional
helicity at free-energy values near or lower than thermal fluctuations.
Despite the coarse-grained treatment, the rebuilding of lattice conformations
to all-atom representations obtained sufficient resolution to capture
disorder–order transitions among the helical folds.Illustrated
in Figure B are plots
of helix formation as a function of Rg evaluated across the 24 replica clients. Shown are statistical
averages over datasets for the CG model and simulations applying C22
and C36m taken from examination of earlier data.[6,7] The
analysis shows that the CG model exhibits the weakest cooperativity
among the simulation models. What is furthermore important are the
initial slopes of the fH–Rg profiles in terms of modeling helix stiffness,
where both the CG and C22 models show sharp declines, while C36m presents
a slow initial drop in fH. This qualitative
ranking of helix stiffness is consistent with the observed plasticity
along the helix-forming reaction coordinate shown in Figure A. While the statistical average
of ∼30% helicity produced by C36m would appear to be a result
of the refined force field aimed at reducing secondary-structure biases
and would seem to be more consistent with CD experiments with TFE,
C36m failed to capture significant disorder–order helical fluctuations.
Conversely, the CG model of SICHO sampled significant helical transitions,
and although the statistical average is greater than CD experiments,
weighting of the populations across multiple low free-energy states
in W(fH,Rg) is more consistent with the notion of an
IDP.A further observation of the fH–Rg profile for the CG model is
that the sampled
conformations are more collapsed, approaching the unfolded state than
the models C22 and C36m. This result is the outcome of the parametrization
of the terms αstiffness and αshort along with αcentrosym to allow the CG model to
generate an ensemble of conformational transitions. While the Rg is lower for the CG model at replica clients
containing higher temperatures, values for these extended folds correctly
encompass a negligible secondary structure and provide a sufficiently
broad statistical distribution for modeling the PMF in Figure A.It is also observed
that all simulation models at the lowest client
in the exchange show native states that are overly collapsed in comparison
to a typical unfolded 28-residue peptide showing an Rg ≈ 13 Å.[26] A similar
effect is detected of applying an explicit/implicit solvent hybrid
replica-exchange simulation method[27] of
modeling the peptide to determine the effect of explicit water interactions.[6] These combined studies note the inherent deficiencies
of all-atom force fields and solvent models that tend to overstabilize
fold propensities. Conceivably, alternative CG models CABS,[10] MARTINI,[28] and UNRES[29] may avoid some of these problems. For all-atom
simulations, recent developments in molecular mechanics force fields
(e.g., refs (30−32)) offer wide-ranging
parametrizations to reproduce more accurately fold propensities and
should help improve modeling the temperature dependence of IDPs.[33]Figure shows a
distinction of the CG models with datasets extracted from C36m and
C22 simulations. The two CG models are with the stiffness parameters
(αstiffness) set to 1.0 and 0.5. Profiles are constructed
from 5000 conformations and are evaluated using the statistical potential
DFIRE[34] to assess measures of the minimum
end-to-end distance of the peptide (denoted as dmin) and the fraction of native contacts, Q. The X-ray crystallographic conformation shows a DFIRE energy value
of -36 kcal/mol and a dmin of 22.5 Å.
Figure 4
Conformational
sampling distributions. The statistical potential
DFIRE is applied to evaluate conformations taken from the simulations
using order parameters minimum end-to-end distances and fraction of
native contacts, denoted by Q. (A) CG simulation
results with the stiffness parameter set to 1. Several conformations
are extracted from the simulation to highlight fold topologies. As
a reference, the X-ray crystallographic bound structure exhibits a
DFIRE value of -36 kcal/mol and a minimum end-to-end distance of 22.5
Å (represented by vertical line). (B) CG simulation results with
the stiffness parameter set to 0.5. (C) C36m simulation data. (D)
C22 simulation data.
Conformational
sampling distributions. The statistical potential
DFIRE is applied to evaluate conformations taken from the simulations
using order parameters minimum end-to-end distances and fraction of
native contacts, denoted by Q. (A) CG simulation
results with the stiffness parameter set to 1. Several conformations
are extracted from the simulation to highlight fold topologies. As
a reference, the X-ray crystallographic bound structure exhibits a
DFIRE value of -36 kcal/mol and a minimum end-to-end distance of 22.5
Å (represented by vertical line). (B) CG simulation results with
the stiffness parameter set to 0.5. (C) C36m simulation data. (D)
C22 simulation data.A comparison of the two
CG simulations (Figure A,B) shows as anticipated that the lower
stiffness value yields a profile of dmin that populates conformations with a more opened fold topology. While
the sampling range of dmin values is satisfactory,
both simulations show limited Q values of populating
the bound topology. As noted above, the principal reason of selecting
αstiffness = 1 is the better sampling of the short
helix in the C-terminal half of the peptide. Examples are shown at
the minimum energy in DFIRE and the maximum values of Q.Figure C,D
shows
profiles for C36m and C22. Unlike the CG model, both CHARMM-based
simulations only weakly populated values of dmin near the bound state. A further distinction is that the
all-atom simulations appear to “refine” conformations
in terms of DFIRE values that are much more favorable. Among the simulations,
C22 samples greater Q values, and the energy minimum
exhibits a conformation with a helical topology in the C-terminal
region, a fold most similar to the bound state from X-ray diffraction.
Conclusions
This work presented an application
of a coarse-grained model in
replica-exchange Monte Carlo simulations of a 28-residue peptide extracted
from the Ebola virus VP35 protein. With a selection of the weights
in the SICHO potential function, the simulation of the peptide produced
a free-energy landscape that displayed conformational diversity along
the helix-forming reaction coordinate. Despite the diversity in populations,
the generated conformational ensemble produced states that showed
compacted folds unlike the bound conformation. To better understand
the ensemble, an elastic network interpolation model was applied to
culled peptide structures and estimate the transition costs of morphing
state i → j. The analysis
showed a relatively flat manifold of transition costs of helix formation
along the N-terminal end of the peptide and weak helical populations
in the C-terminal end. While determination of the binding mechanism
is a challenge from either experiments or simulations, the SICHO model
suggests from analysis of the unbound transition network a hybrid
mechanism through conformational selection of the 410-helix
followed by an induced fit of the 310-helix. Finally, a
comparison of the model simulations with previously reported CHARMM-based
simulations of the peptide demonstrates distinctions in the PMFs along
helix-forming paths. By applying a statistical potential, the all-atom
simulations appear to refine conformations to lower-energy distributions,
as anticipated from refined force fields. Among the model simulations,
CHARMM22 results are distinct in populating a higher fraction of native
contacts, although the SICHO model performed admirably for an intermediate
resolution representation. A promising application of the CG model
is generating starting conformations for refinement by either all-atom
simulations or a combined hybrid approach.
Computational
Methods
Lattice Ensemble Tempering
Chain
conformations were generated on a cubic lattice using the MONSSTER
program developed by Skolnick and co-workers.[11,12] In the SICHO model, each amino acid is represented by a single virtual
particle located at the side-chain center of mass and projected onto
the cubic lattice. The SICHO potential energy function incorporates
a combination of statistical and empirical terms and takes on the
following formwhere the additive V terms are atom-specific potential
functions, and the α
parameters denote scaling factors. The stiffness parameter αstiffness controls the scaling of the potential term favoring
the formation of secondary-structure elements of the peptide chain.
The parameter αshort scales short-range interactions
that depend on the type of secondary-structure elements in the sequence
file calculated from an algorithm such as DSSP.[35] The remaining potentials account for N-body interactions of pairwise, long-range, and soft-core repulsive
interactions. The scaling parameter αcentrosym sets
the scaling of the centrosymmetric potential that is used to favor
more compact structures over extended forms.The peptide sequence
is 20-MPGPELSGWISEQLMTGRIPVSDIFCDI-47, and the folded form of the
peptide was extracted from PDB 4YPI.[1] The grid
size for the cubic lattice was set at a value of 125 lattice units
in each direction at a resolution of 1.45 Å grid spacing. Three
separate simulations were conducted, where the parameter αstiffness was varied. One simulation had the parameter set
to a value of 1.0 (where the default is 1.25), another set to 0.5,
and the last model set to 0.0. Other potential energy function parameters
were set at αshort = 0.5 (default: 0.5) and αcentrosym = 0.1 (default: 0.9). Each simulation was started
from the PDB conformation of the bound peptide fold, and the sequence
file was annotated with the DSSP secondary structure.The number
of lattice parallel-tempering simulation cycles at each
temperature was set to 20, and the number of Monte Carlo moves per
cycle was 50. Culled conformations from the simulations were extracted
from a sampled population using 24 replicas. Replicas were exponentially
spaced from a reduced temperature T of 1.0 to 2.4,
where T is normalized by a reference temperature
such that β–1 = kBT represents the energy unit (where kB is the Boltzmann constant). The temperatures for parallel
tempering were selected based on a geometrically spaced set. The value T = 1 is set to represent the distribution of conformations
modeled by the SICHO force field at approximately 300 K. All-atom
structures were reconstructed from the lattice simulations by using
the Multiscale Modeling Tools for Structural Biology (MMTSB).[36] Each structure was refined by energy minimization
of applying the C36m force field with the GBMV2 implicit solvent model
using a procedure given in the previous study of loop predictions.[17] PMFs were calculated using the PTWHAM algorithm[23] with order parameters set at modeling fH, Rg and the lattice
energy Z-score. The extracted conformations were
clustered using a k-means algorithm documented in
MMTSB.Comparison of the SICHO lattice model with data sets
from C22 and
C36m simulations presented in a previous study[7] centers on computing additional order parameters. These include
distributions of the minimum distance between residues Met20 and Ile47,
the fraction of side-chain contacts equivalent to the native fold
(denoted as Q), and scoring conformations using the
statistical potential DFIRE.[34] The simulations
with C22 and C36m were conducted using the GBMV2 solvent model, and
details of the methods are given in the previous studies.[6,7] Previous studies also include the application of an adaptive temperature-based
replica-exchange simulation method in improve conformational sampling.[37]
Elastic Network Model
Conformational
transitions between clustered states of the VP35 peptide are further
evaluated by applying the ENI model.[18−21] Calculations morph state i to state j and start by transforming
the coordinates for two peptide conformations into a non-standardized
form by adjusting the center of mass and principal axis for each molecule.
Rather than applying internal coordinates, ENI converts two sets of
distances between spatially close representative atoms along the two
chains. The coordinates for the starting conformer is denoted by {x} and the final conformer {X} along a pathway. Intermediate
conformations are generated iteratively by solving a quadratic potential-like
cost function given bywhere k is a spring constant, N is the number
of representative atoms, and δ is a 3N-dimensional vector of displacements. The term I is the desired length at a certain intermediate state determined
by distance interpolation and is given bywhere α is the coefficient
specifying how far a given state is along the transition from the
starting conformer to the final conformer. When the coefficient α
is 0.5, the calculated conformation is positioned with inter-residue
distances at the average of conformations {x} and {X}. For the interpolation scheme applied here, analyzed conformations
were extracted from the lattice replica exchange at T = 1 in the PDB format as an input to KOSMOS.[14] The query option in KOSMOS was set to advance level with
coarse-graining performed at the Cα scale, and the cutoff distance
was set at 50 Å.
Authors: Daisuke Kihara; Yang Zhang; Hui Lu; Andrzej Kolinski; Jeffrey Skolnick Journal: Proc Natl Acad Sci U S A Date: 2002-04-16 Impact factor: 11.205
Authors: Jonathan E Kohn; Ian S Millett; Jaby Jacob; Bojan Zagrovic; Thomas M Dillon; Nikolina Cingel; Robin S Dothager; Soenke Seifert; P Thiyagarajan; Tobin R Sosnick; M Zahid Hasan; Vijay S Pande; Ingo Ruczinski; Sebastian Doniach; Kevin W Plaxco Journal: Proc Natl Acad Sci U S A Date: 2004-08-16 Impact factor: 11.205
Authors: Sidhartha Chaudhury; Mark A Olson; Gregory Tawa; Anders Wallqvist; Michael S Lee Journal: J Chem Theory Comput Date: 2012-02-03 Impact factor: 6.006