Anna S Kamenik1, Uta Lessel2, Julian E Fuchs3, Thomas Fox2, Klaus R Liedl1. 1. Institute of General, Inorganic and Theoretical Chemistry, Center for Molecular Biosciences Innsbruck , University of Innsbruck , 6020 Innsbruck , Austria. 2. Medicinal Chemistry , Boehringer Ingelheim Pharma GmbH & Co. KG , 88397 Biberach , Germany. 3. Department of Medicinal Chemistry , Boehringer Ingelheim RCV GmbH & Co KG , 1120 Vienna , Austria.
Abstract
Macrocycles are of considerable interest as highly specific drug candidates, yet they challenge standard conformer generators with their large number of rotatable bonds and conformational restrictions. Here, we present a molecular dynamics-based routine that bypasses current limitations in conformational sampling and extensively profiles the free energy landscape of peptidic macrocycles in solution. We perform accelerated molecular dynamics simulations to capture a diverse conformational ensemble. By applying an energetic cutoff, followed by geometric clustering, we demonstrate the striking robustness and efficiency of the approach in identifying highly populated conformational states of cyclic peptides. The resulting structural and thermodynamic information is benchmarked against interproton distances from NMR experiments and conformational states identified by X-ray crystallography. Using three different model systems of varying size and flexibility, we show that the method reliably reproduces experimentally determined structural ensembles and is capable of identifying key conformational states that include the bioactive conformation. Thus, the described approach is a robust method to generate conformations of peptidic macrocycles and holds promise for structure-based drug design.
Macrocycles are of considerable interest as highly specific drug candidates, yet they challenge standard conformer generators with their large number of rotatable bonds and conformational restrictions. Here, we present a molecular dynamics-based routine that bypasses current limitations in conformational sampling and extensively profiles the free energy landscape of peptidic macrocycles in solution. We perform accelerated molecular dynamics simulations to capture a diverse conformational ensemble. By applying an energetic cutoff, followed by geometric clustering, we demonstrate the striking robustness and efficiency of the approach in identifying highly populated conformational states of cyclic peptides. The resulting structural and thermodynamic information is benchmarked against interproton distances from NMR experiments and conformational states identified by X-ray crystallography. Using three different model systems of varying size and flexibility, we show that the method reliably reproduces experimentally determined structural ensembles and is capable of identifying key conformational states that include the bioactive conformation. Thus, the described approach is a robust method to generate conformations of peptidic macrocycles and holds promise for structure-based drug design.
Macrocycles are an
intriguing compound class, as they promise to
address long-standing druggability challenges such as protein–protein
interfaces with remarkably high affinity.[1−3] Especially peptidic
macrocycles show significantly increased activity and bioavailability
compared to their acyclic counterparts.[4] The advantages of peptidic macrocycles were demonstrated for several
challenging drug targets, such as the HIV-1 protease,[5,6] where the cyclization of peptidic compounds increased their affinity
by up to 4 orders of magnitude. The striking affinity enhancement
due to macrocyclization, which has been observed repeatedly, is proposed
to originate from a structural preorganization:[6−8] In macrocyclic
compounds, the ring closure reduces the accessible conformational
states which ideally would be able stabilize the peptide in its bioactive
conformations. Following the concept of conformational selection,
this shift of state populations toward the active state favors binding.[9] However, in case the macrocyclization stabilizes
a nonbioactive conformational state in solution, it could also slow
down binding. Additionally, the conformational restraints decrease
the loss in conformational entropy upon binding, which contributes
to the increased affinities found for macrocyclic compounds.[10]Hence, macrocycles are among the most
promising compound classes
in the “beyond-rule-of-5” orally available drug space.[4,11−13] Peptidic macrocycles in particular benefit from the
conformational restraints, which allow them to bypass the fast proteolytic
degradation observed for their acyclic analogs.[14] Peptidic macrocycles are typically characterized by a large
polar surface area, which presumably hinders membrane permeability.[15] Nevertheless, it has already been demonstrated
that the exceptional structural characteristics of peptidic macrocycles
can allow considerably high permeability and bioavailability.[16,17] In the prominent case of cyclosporine A, the observed permeability
is attributed to a particular N-methylation pattern which promotes
a conformational switch upon changes in solvent polarity.[18−21] Several other studies on varying model systems also reported a remarkable
permeability and metabolic stability for macrocyclic peptides with
rigidified scaffolds.[6,22]To perform rational drug
design, a reliable and extensive description
of the conformational space of putative ligands is vital.[23,24] For peptidic macrocycles, this task is particularly important as
their benefits in selectivity and bioavailability originate from the
exceptional characteristics of their structural ensembles.[25] Hence, a computational characterization of their
conformational space may significantly aid the drug development process
for macrocyclic compounds.Several comprehensive studies demonstrated
that standard algorithms
to generate small molecule conformations fail to sample physically
meaningful ensembles of macrocycles.[26,27] Hence, dedicated
macrocycle conformation generators were developed, but some challenges
yet remain.[14,28,29] While these state-of-the-art algorithms excel in creating a diverse
ensemble of macrocycle conformations, they are typically not designed
to provide reliable thermodynamic insights.[14,30,31] Peptidic macrocycles are particularly challenging
even for dedicated macrocycle conformation generators due to their
unique structural preferences, dynamics, and biophysical properties.[14,30,32] However, a recent study demonstrated
the reliability of current force-field-based sampling techniques by
elucidating several key features defining the conformational space
of peptidic macrocycles.[32] Also other force-field-based
studies applying enhanced sampling techniques, such as metadynamics[33,34] and replica-exchange MD,[35] to characterize
the structural ensemble of cyclic peptides showed highly promising
results.[36−38]In the present work, we show accelerated molecular
dynamics (aMD)
simulations[39] to be a strikingly reliable
tool to gather a diverse and thermodynamically meaningful conformational
ensemble for peptidic macrocycles. In aMD simulations, the introduction
of a bias potential decreases the height of the barrier between individual
conformational states. Thus, aMD simulations sample the conformational
space more efficiently than conventional MD (cMD) simulations, yet
allow the recalibration to the “real” phase space potential
energy surface (PES).[40−42] Two further advantages of aMD simulations are that
no prior knowledge of the PES is required and that the computational
effort is comparable to cMD simulations. The approach is well established
as demonstrated in several studies where aMD simulations capture diverse
conformational ensembles of biomolecules while preserving the thermodynamic
information on the original PES.[43−47]For this first evaluation of the proposed aMD-based
approach, we
exclusively considered data from NMR experiments in aqueous solution.
An enhanced understanding of macrocycles in aqueous solution is urgently
needed to enhance our understanding of conformational preorganization
for binding. Despite the great interest in macrocyclic peptides as
potent drug candidates, we found that the available experimental data
on their structural ensembles in aqueous solution are rather sparse,
probably due to their limited solubility in water.[48]Additionally, organic solvents still hold numerous
challenges in
MD-based computational modeling. There is much less experience for
MD simulations in explicit organic solvents compared to water, which
results in a limited reliability of available force field parameters.[49−51] Moreover, the correlation times, which describe the time scale needed
for molecular reorientation, are much longer for organic solvents
than for water, and thus the necessary simulation time increases significantly.[52]To keep the computational effort tractable,
we further restricted
the size of our model systems to a maximum of 12 rotatable bonds,
corresponding to 6 amino acid residues. We profile the conformational
space of three different cyclopeptidic model systems of varying size,
flexibility, and available experimental data (Figure ). The first system studied is the integrin-binding
pentapeptide cyclo-(Pro-Ser-leu-Asp-Val)[53] for which NOE restraints in aqueous solution and the relative prevalence
of two of its major components, a cis- and a trans-proline state, were reported.[54] Furthermore, we characterize the structural ensemble of the cyclic
pentapeptide cilengitide, cyclo-(Arg-Gly-Asp-phe-[(N-Me)-Val]). Cilengitide
has been studied as an anticancer drug candidate and its conformational
space in aqueous solution as well as its target-bound structure have
been thoroughly characterized using NMR and X-ray crystallography.[55−57] The third system is the antimicrobial hexapeptide cyclo-(Arg-Arg-Trp-Trp-Arg-Phe)[58] for which comprehensive NMR data in aqueous
solution were kindly provided by Peter Schmieder.[59]
Figure 1
Studied peptidic macrocycles. (1) The integrin-binding pentapeptide
cyclo-(Pro-Ser-leu-Asp-Val), where ωVal-5,
(the peptide bond between Val-5 and Pro-1) distinguishes the cis- and trans-proline state, (2) the anticancer
drug candidate cyclo-(Arg-Gly-Asp-phe-([N-Me]Val)), known as cilengitide,
and (3) the antimicrobial hexapeptide cyclo-(Arg-Arg-Trp-Trp-Arg-Phe).
Studied peptidic macrocycles. (1) The integrin-binding pentapeptide
cyclo-(Pro-Ser-leu-Asp-Val), where ωVal-5,
(the peptide bond between Val-5 and Pro-1) distinguishes the cis- and trans-proline state, (2) the anticancer
drug candidate cyclo-(Arg-Gly-Asp-phe-([N-Me]Val)), known as cilengitide,
and (3) the antimicrobial hexapeptide cyclo-(Arg-Arg-Trp-Trp-Arg-Phe).The scope of present study is
to evaluate the accuracy of the enhanced
sampling approach in reproducing structural features and relative
state ratios.
Methods
Structure Preparation
All structures were generated
and prepared in Molecular Operation Environment (MOE) 2016.08[60] using the implemented protonate3D tool.[61] For the cis and trans state of the cyclic pentapeptide cyclo-(Pro-Ser-leu-Asp-Val), starting
conformations were constructed based on published backbone torsions
using the “Protein Builder” tool in MOE. Starting coordinates
of cilengitide were obtained by isolating the macrocycle from the
complex X-ray structure (PDB 1L5G).[57] To reduce a possible
bias introduced by the starting structure, a geometry optimization
in vacuum was performed using the Amber10:EHT force field and default
settings. This yielded a starting conformation with a backbone RMSD
of 1.0 Å (2.6 Å all atom RMSD) to the crystal structure
(Figure S2). For the simulations of the
cyclic hexapeptide cyclo-(Arg-Arg-Trp-Trp-Arg-Phe), we started from
an unpublished NMR structure kindly provided by Peter Schmieder.[58,59]
Simulation Setup
MD simulations were performed and
analyzed using the Amber16 simulation package.[62] We used the program tleap to generate an Amber topology
and initial coordinate files.[62] The starting
coordinates of the macrocycles were prepared as described above, then
a octahedral box of TIP3P waters[63] was
added so that every macrocycle atom was at least 12 Å from the
box boundary. The three model systems were parametrized using the
Amber force field 14SB;[64] additional parameters
for N-methylated amino acids were taken from Forcefield_NCAA.[65] A thorough equilibration scheme was applied
as described previously[66] to minimize and
relax the system. NpT MD simulations were run at atmospheric pressure,
by applying the Berendsen barostat,[67] and
300 K using the Langevin thermostat.[68] The
SHAKE algorithm[69] was applied to restrain
all bonds involving hydrogens, allowing a time step of 2 fs. Productive
cMD and aMD simulations of 1 μs length were performed using
the GPU implementation of pmemd[70] with
a nonbonded cutoff of 8 Å and stored as 500,000 equally spaced
snapshots. For cilengitide, 20 aMD simulations of 50 ns each (25,000
frames) were run with randomized starting velocities and combined
to a 1 μs trajectory for further analysis. In Figure S1 we illustrate that we are not able to capture significant
structural changes with the 1 μs cMD simulation using cyclo-(Pro-Ser-leu-Asp-Val)
as a representative example.All aMD simulations were performed
using the dual-boost algorithm implemented in Amber16,[62] thus a bias was applied on the total potential
with an additional boost on the dihedral term.[39,71] The parameters determining the boosting were derived as proposed
by Pierce et al.[42,43] based on average energies from
prior cMD simulations, the number of atoms and residues, and are listed
in the Supporting Information.
Trajectory
Analysis
Thermodynamic information on the
unbiased ensemble is reconstructed from the aMD ensemble with Boltzmann
reweighting using Maclaurin series expansion up to the 10th order.[40] For small biasing potentials, which exactly
show Gaussian distribution, the cumulant expansion to the second order
produces the most accurate results.[40,72] For large
biomolecular systems and high boosting potentials, Maclaurin series
expansion is the more robust method.[42,43,73] To sample the state transitions in the studied macrocycles,
we applied parameters that are expected to achieve high boosting potentials,
and thus Maclaurin series expansion was used to estimate the exponential
term in the reweighting scheme.The aMD trajectories are analyzed
with cpptraj[74] and in-house scripts. A
protocol provided by Miao et al.[40] was
applied to perform the Boltzmann reweighting. For each system, the
covered conformational space was analyzed by performing principal
component analysis (PCA) based on the backbone dihedral angles φ,
ψ, and ω as implemented in cpptraj. Likewise, also principal
components based on Cartesian coordinates of backbone heavy atoms
were calculated (Figure S1). For the studied
macrocycles, we found that PCA in Cartesian space leads to a less
distinct representation of highly populated states compared to dihedral
PCA. Additionally, we characterized the internal motions using the
eccentricity ε[75] as a global variable
for each snapshot. A value of ε near 1 describes an aspherical
compound and 0 indicates perfect globularity. As shown in Figure S3 we find that this metric is able to
capture the topological differences between the cis and trans state of cyclo-(Pro-Ser-leu-Asp-Val).
Yet, similar to the Cartesian PCA, the differentiation between densely
sampled states is not as distinct as observed in the dihedral PCA
space. Therefore, for further investigations we used the representation
of the structural ensemble in dihedral space.The resulting
PCA plots of the reweighted structural data projected
on the respective eigenvectors were created using Gnuplot 5.0.[76] To study varying structural properties, dihedral
angles and backbone RMSD to the bioactive conformation were calculated
with cpptraj and used to color-code the projected data in PCA coordinates. Furthermore,
reweighted flexibilities were retrieved by calculating dihedral entropies
using in-house scripts.[43] Dihedral entropies
are an alignment-independent metric to quantify backbone motions of
proteins and peptides and have been shown to be applicable for cMD
as well as aMD simulations.[43,77]To simplify the
clustering, we benchmarked the minimum number of
representative snapshots required to achieve reliable and coherent
results. We applied an energetic cutoff based on the boosting potential
discarding high-energy conformations. For the studied systems, we
found that incorporating 2000 snapshots, that is, 0.4% of the total
number of snapshots, significantly reduces the calculation time and
increases robustness of cluster analysis compared to using all 500,000
snapshots. We found that including only snapshots with high boosting
potentials consistently leads to cluster representatives that are
located near the free energy minima in PCA space. Height and distribution
of the boosting potentials are strongly varying for each system, hence
so does the range of the boosting potential of the top 2000 snapshots.
For cyclo-(Pro-Ser-leu-Asp-Val), the incorporated snapshots comprise
boosting potentials from 31 to 49 kcal/mol, and for cilengitide, this
window ranges from 27 to 42 kcal/mol and for the largest system cyclo-(Arg-Arg-Trp-Trp-Arg-Phe)
from 31 to 51 kcal/mol. The average interproton distances and asymmetric
errors calculated from the reduced ensemble are consistent with the
full ensemble. However, the optimal number of snapshots entirely relies
on the analysis objectives and is strongly dependent on the system
under investigation.We conducted a hierarchical agglomerative
clustering based on the
sine and cosine contributions of the backbone dihedrals using cpptraj
to accomplish a thorough structural characterization of representative
conformations. The number of clusters was used as a stop criterion
for the clustering and was gradually increased until at least one
representative structure was found for each free energy minimum in
the reweighted PCA.To compare the aMD ensemble to experimental
NOE measurements, we
calculated an inverse sixth power ensemble average of the reweighted
hydrogen distances, as described in the literature.[78,79] The aMD reweighting procedure was performed using a python toolkit
by Miao et. al as described above. An additional distance weighting,
that is, the use of r–6 averaging
is necessary to make aMD and NMR-derived distances comparable.[79] As expected, we observed that the sampled NOE
distances are not evenly distributed around the average. To account
for this asymmetric distance distribution, we determined asymmetric
error bars by separately calculating the population weighted RMSD
from the mean NOE distance for values above and below the mean.
Experimental References
Special care has to be taken
when comparing interproton distances from MD simulations and NMR experiments.
First of all, the sign of the NOE in NOESY spectra depends on the
correlation time of the molecule. Particular unfortunate combinations
of field strength and molecular size can even lead to zero or near
zero NOEs. Thus, the measured NOE may be biased toward too long distances
or completely vanish even though two protons are close in space.[52,80] Furthermore, experimental NOE signals may overlap and thus cannot
be resolved.[52,80] Hence, not all close hydrogen
distances observed in MD simulations are expected to be found experimentally,
but all experimental NOEs should be present within a given range in
our MD simulations.
Cyclo-(Pro-Ser-leu-Asp-Val)
A comprehensive
study characterized
the conformational ensemble of the cyclic pentapeptide cyclo-(Pro-Ser-leu-Asp-Val)
using NMR experiments.[54] In the 1H-spectrum published in this work, five slowly interconverting conformational
states were identified, but only two of them were sufficiently populated
to distinguish them in the NOESY spectrum. Hence, hydrogen distance
restraints and populations for two out of the five states were reported,
which were identified as cis- and trans-proline states (Figure ).
Cilengitide
Structure and activity
of the integrin
binding peptide cilengitide have been thoroughly characterized in
several experimental studies.[55,56,81] The available structural data comprise distance restraints from
ROESY experiments in aqueous solution[55] as well as a crystal structure of the cyclic pentapeptide bound
to its target integrin (PDB 1L5G)[57] (Figure ).
Cyclo-(Arg-Arg-Trp-Trp-Arg-Phe)
Structural characteristics
of cyclo-(Arg-Arg-Trp-Trp-Arg-Phe) were profiled in an extensive NMR
study comprising 2D NOESY and ROESY spectra in varying environments.[58] To determine the structure in aqueous solution,
a simulated annealing protocol was applied using ROE and NOE distance
restraints as well as homonuclear coupling constants. Based on the
high number of NOE/ROE violations and high RMSD values, the cyclic
hexapeptide was reported as a remarkably flexible macrocycle in water
(Figure ).
Results
Cyclo-(Pro-Ser-leu-Asp-Val)
To illustrate the sampled
conformational space, we performed a PCA. We found that 36.9% and
14.3% of the total variance of dihedral movements in the simulation
were described by PC1 and PC2, respectively. In Figure A, the 1 μs aMD trajectory was projected
onto the first two principal component vectors (PC1 and PC2) and color-coded
according to the Boltzmann reweighted free energy. In the conformational
space spanned by PC1 and PC2, we found at least three clearly separated
free energy minima located approximately at [−3, −0.5],
[0.8, −0.5], and [0.8, 2.5].
Figure 2
Conformational space of cyclo-(Pro-Ser-leu-Asp-Val).
(A) The aMD
ensemble was color-coded according to the reweighted free energies
and depicted as projection onto the first two PCA eigenvectors. (B)
The 2000 most favorable snapshots, color-coded according to their
conformational state, were projected onto the same PCA eigenvectors.
Structures with |ωVal-5| ≤ 90°
were considered as cis (blue) and the remaining structures
with |ωVal-5| > 90° as trans state (red).
Conformational space of cyclo-(Pro-Ser-leu-Asp-Val).
(A) The aMD
ensemble was color-coded according to the reweighted free energies
and depicted as projection onto the first two PCA eigenvectors. (B)
The 2000 most favorable snapshots, color-coded according to their
conformational state, were projected onto the same PCA eigenvectors.
Structures with |ωVal-5| ≤ 90°
were considered as cis (blue) and the remaining structures
with |ωVal-5| > 90° as trans state (red).As described in the Methods section, we
were focusing on the 2000 snapshots with the highest boosting potential,
that is, the snapshots with the lowest potential energy. We compared
relative state ratios from aMD simulations and NMR experiments by
calculating the backbone torsion of Valine-5, ωVal-5 (highlighted in Figure B), as defined by Viles et al.[54] to distinguish between cis- and trans-proline states. Structures with |ωVal-5|
≤ 90° were considered as cis and the
remaining structures with |ωVal-5| > 90°
as trans conformational states. Within the 2000 investigated
snapshots, we found 491 structures in the trans state
and 1509 in the cis state. This corresponds to a
distribution of 25/75 which agrees strikingly well with the experimentally
reported trans to cis state ratio
of 20/80 (Figure B).
Furthermore, we reweighted the state distribution of the full aMD
ensemble, again finding a trans to cis state ratio of 25/75.The average interproton distances and
the respective asymmetric
errors are shown in Figure , where we plotted the aMD distances for the cis and trans ensemble of cyclo-(Pro-Ser-leu-Asp-Val)
compared to the published NMR distances and boundaries. For the trans structure, 10 NOE restraints were reported, which
were all fulfilled by the weighted ensemble average of the aMD distance
distributions. The cis state was characterized by
18 NOE restraints. In the simulated ensemble, all but one restraint,
NOE 11, were fulfilled. The distance between the amidehydrogen of
Val-5 and Asp-4 Cα hydrogens was found to be too
short in the aMD ensemble compared to the experimental boundaries.
However, as discussed in the Methods section,
it is possible that NOE-derived distance boundaries underestimate
the true distance between two atoms. Thus, our results do not necessarily
disagree with the experimental data.
Figure 3
Comparison of interproton distances in
Å from NOESY experiments
and aMD simulations of the cis and trans state of cyclo-(Pro-Ser-leu-Asp-Val). The gray diamonds indicate
the experimentally estimated average distance. The allowed area within
the NMR boundaries is depicted in white. Average distances calculated
from the aMD snapshots corresponding to the cis and trans state with the according asymmetric error bands are
shown in color.
Comparison of interproton distances in
Å from NOESY experiments
and aMD simulations of the cis and trans state of cyclo-(Pro-Ser-leu-Asp-Val). The gray diamonds indicate
the experimentally estimated average distance. The allowed area within
the NMR boundaries is depicted in white. Average distances calculated
from the aMD snapshots corresponding to the cis and trans state with the according asymmetric error bands are
shown in color.For a detailed characterization
of the structural features captured
with aMD simulations, we clustered the 2000 snapshots with the highest
boosting potential—thus lowest energy—resulting in 30
representative structures. In Figure the cluster representatives were projected onto the
PCA space. As the color coding in Figure B shows, the cluster in the highly populated
area on the left-hand side of the PCA around [−3, −0.5]
represented structures in the trans state. Cluster
structures from the two remaining minima on the right-hand side of
the PCA plot both represented cis-proline states.
The structures around [0.8, 0.0] and [0.8, 2.5] could be distinguished
by the dihedrals of the amide bond between Asp-4 and Val-5. In structures
of cluster c23, representing the least populated minimum, the carbonyl
oxygen was oriented toward the center of the ring. In cluster c19
which was found in the center of the most populated minimum around
[0.8, 0.0], the same carbonyl oxygen is oriented perpendicular to
the ring plane.
Figure 4
Representative structures of the cyclo-(Pro-Ser-leu-Asp-Val)
ensemble.
The 30 representative structures are color-coded according to their
reweighted free energies and projected onto the PCA eigenvectors.
Free energy minima in the PCA space are indicated with gray contour
lines from 5.2 to 6.2 kcal/mol in 0.2 kcal/mol steps. Representatives
of cluster c23 and c19 were both identified as cis clusters differing most prominently in the backbone torsion ψAsp-4, see Table S2 for further
details.
Representative structures of the cyclo-(Pro-Ser-leu-Asp-Val)
ensemble.
The 30 representative structures are color-coded according to their
reweighted free energies and projected onto the PCA eigenvectors.
Free energy minima in the PCA space are indicated with gray contour
lines from 5.2 to 6.2 kcal/mol in 0.2 kcal/mol steps. Representatives
of cluster c23 and c19 were both identified as cis clusters differing most prominently in the backbone torsion ψAsp-4, see Table S2 for further
details.In summary, Figure indicated that three to five states were
visited in the aMD simulation.
The clustering result (Figure ) further highlighted that at least three clearly distinct
conformational states were densely sampled by the aMD ensemble. These
strongly sampled areas in the PCA comprised one trans and two different cis states, which is in good
agreement with the available NMR experiments.
Cilengitide
We
projected the reweighted structural
data of cilengitide on the first two PCA eigenvectors, which comprise
25% and 22% of the total variance (Figure A). The resulting free energy surface showed
two distinct areas separated by the second eigenvector of the PCA.
In each of the two densely sampled areas, several local minima were
connected by rather broad pathways. The receptor-bound conformation
(gray diamond) was located in the free energy minimum around [1.8,
0.8]. The 2000 conformations with the highest boost are displayed
in Figure B, color-coded
according to their RMSD to the bioactive conformation. We observed
that snapshots that were found in the same free energy minimum as
the bioactive conformation in PCA space also showed structural characteristics
highly similar to the target bound conformation.
Figure 5
Conformational space
of cilengitide. (A) The aMD ensemble was color-coded
according to the reweighted free energies and depicted as projection
onto the first two PCA eigenvectors. The gray diamond marks the bioactive
conformation. (B) The 2000 lowest energy snapshots were projected
onto the PCA eigenvectors and color-coded according to the RMSD to
the bioactive conformation.
Conformational space
of cilengitide. (A) The aMD ensemble was color-coded
according to the reweighted free energies and depicted as projection
onto the first two PCA eigenvectors. The gray diamond marks the bioactive
conformation. (B) The 2000 lowest energy snapshots were projected
onto the PCA eigenvectors and color-coded according to the RMSD to
the bioactive conformation.The comparison of the calculated average interproton distances
with the experimentally derived NOE boundaries showed that 9 out of
10 experimental upper boundaries are fulfilled by the aMD ensemble.
(Figure ) The outlier,
NOE 6, represented the distance between amidehydrogens of Asp-3 and
phe-4, which on average was sampled as 0.22 Å too long in the
aMD simulation. The violation was rather small and might result from
force field inaccuracies, especially in the description of the d-phenylalanine.
Figure 6
Comparison of interproton distances in Å from NMR
experiments
and aMD simulations of cilengitide. Upper distance boundaries from
ROESY experiments in aqueous solution were used to benchmark the structural
accuracy of the aMD ensemble. The white background area indicates
distances that fulfill the NMR-derived restraints. Interproton distances
calculated from the aMD ensemble of cilengitide with the respective
asymmetric errors bands are depicted in orange.
Comparison of interproton distances in Å from NMR
experiments
and aMD simulations of cilengitide. Upper distance boundaries from
ROESY experiments in aqueous solution were used to benchmark the structural
accuracy of the aMD ensemble. The white background area indicates
distances that fulfill the NMR-derived restraints. Interproton distances
calculated from the aMD ensemble of cilengitide with the respective
asymmetric errors bands are depicted in orange.Clustering the reduced aMD ensemble, as described earlier,
allowed
more detailed structural investigations. Figure shows the projection of the resulting cluster
representatives onto PC1 and PC2 and selected representative snapshots
to illustrate structural features occurring in the individual free
energy basins. The representative of cluster c16 represented the same
minimum as the bioactive conformation, and the superposition of both
structures, shown in Figure , highlighted their conformational similarity with a backbone
RMSD of 0.6 Å. Further details can be found in Table S3.
Figure 7
Twenty representative structures of the cilengitide ensemble
obtained
by cluster analysis. The representative structures were color-coded
according to their reweighted free energies and projected onto the
PCA eigenvectors. The free energy minima in the PCA space are indicated
with gray contour lines from 4.6 to 4.9 kcal/mol in 0.1 kcal/mol steps.
The backbone conformation of cluster c16 closely resembled the bioactive
conformation. When comparing representatives of other minima with
the bioactive conformation (gray diamond), the most prominent differences
were reflected by φArg-1 and ψAsp-3. See Table S2 for further details.
Twenty representative structures of the cilengitide ensemble
obtained
by cluster analysis. The representative structures were color-coded
according to their reweighted free energies and projected onto the
PCA eigenvectors. The free energy minima in the PCA space are indicated
with gray contour lines from 4.6 to 4.9 kcal/mol in 0.1 kcal/mol steps.
The backbone conformation of cluster c16 closely resembled the bioactive
conformation. When comparing representatives of other minima with
the bioactive conformation (gray diamond), the most prominent differences
were reflected by φArg-1 and ψAsp-3. See Table S2 for further details.
Cyclo-(Arg-Arg-Trp-Trp-Arg-Phe)
As our third example,
we characterized the conformational space of the antimicrobial cyclic
hexapeptide cyclo-(Arg-Arg-Trp-Trp-Arg-Phe). In the performed PCA,
the first two eigenvectors comprised 33% and 15% of the total variance
of dihedral motions. Projection of the 1 μs aMD trajectory onto
PCA space yielded a potential energy surface with a large number of
minima, indicating a very diverse conformational ensemble (Figure S3). This is in line with NMR structure
calculations, where large RMSD values were reported for this macrocycle
in aqueous solution.The average interproton distances, calculated
from 2000 snapshots, were all consistent with the NMR-derived boundaries
(Figure ).
Figure 8
Comparison
of interproton distances in Å from ROESY experiments
and aMD simulations of cyclo-(Arg-Arg-Trp-Trp-Arg-Phe). The white
background area indicates distances that fulfill the NMR-derived restraints.
Average distances calculated from the aMD ensemble and the according
asymmetric error band are depicted in green.
Comparison
of interproton distances in Å from ROESY experiments
and aMD simulations of cyclo-(Arg-Arg-Trp-Trp-Arg-Phe). The white
background area indicates distances that fulfill the NMR-derived restraints.
Average distances calculated from the aMD ensemble and the according
asymmetric error band are depicted in green.To illustrate some of the conformational motifs obtained
with the
aMD approach, we discretized the ensemble into 15 clusters. Projecting
the representative snapshots onto the PCA space, we note that the
selected cluster structures represented the minima obtained from the
full ensemble (Figure ).
Figure 9
Representative structures of the cyclo-(Arg-Arg-Trp-Trp-Arg-Phe)
ensemble obtained by clustering. The representative structures were
color-coded according to their reweighted free energies and projected
onto the PCA eigenvectors. The free energy minima in the PCA space
are indicated with gray contour lines reaching from 6.2 to 7.4 kcal/mol
in 0.2 kcal/mol steps. See Table S5 for
further details.
Representative structures of the cyclo-(Arg-Arg-Trp-Trp-Arg-Phe)
ensemble obtained by clustering. The representative structures were
color-coded according to their reweighted free energies and projected
onto the PCA eigenvectors. The free energy minima in the PCA space
are indicated with gray contour lines reaching from 6.2 to 7.4 kcal/mol
in 0.2 kcal/mol steps. See Table S5 for
further details.
Discussion
In
the present study, we applied aMD simulations as a reliable
tool to characterize the structural as well as thermodynamic properties
of peptidic macrocycles. We assessed the performance of the approach
for three different cyclic peptides with varying physicochemical properties.While previous studies reported unsatisfying structural diversity
from conventional MD simulations,[14] we
showed that the aMD enhanced sampling approach is able to efficiently
sample a very diverse ensemble and reliable relative state ratios.We used PCA as a straightforward tool to illustrate the conformational
space covered in the simulations of each cyclo-peptide. For the first
system, cyclo-(Pro-Ser-leu-Asp-Val), we found at least three clearly
delineated free energy minima in the reweighted PCA plot. The representation
of the conformational ensemble in PCA space is well in line with NMR
experiments on this cyclic pentapeptide, which report clearly distinguishable,
slowly interconverting conformational states. The antimicrobial cyclo-(Arg-Arg-Trp-Trp-Arg-Phe)
on the other hand shows exceptionally high flexibility in NMR experiments
in aqueous solution. The cyclic hexapeptide interconverts relatively
fast between different conformational states, as reflected by a high
number of NOE violations and large RMSD values in the NMR structure
calculations. The PCA from aMD simulations of this system showed a
large number of rather broad minima and maxima, indicating a high
number of rapidly interchanging conformational states. Compared to
the other, less flexible, investigated compounds, the ensemble of
cyclo-(Arg-Arg-Trp-Trp-Arg-Phe) also shows a higher structural variance
as indicated by the significantly larger axes of the PCA. Thus, PCA
of the aMD ensemble reflects the experimentally determined trends
in conformational flexibility. To affirm these findings, we further
quantified the backbone flexibility of each macrocycle using sum of
dihedral entropies[43,77] (Table S4) and again found good agreement with the experimental trends in
flexibility.To assess the structural accuracy of the macrocycle
ensemble generated
with aMD, we calculated average interproton distances and compared
them to distance restraints from NOESY and ROESY experiments. Observables
from NMR experiments excel as reference for MD simulations as they
provide information on the entire structural ensemble of a compound.
For all tested systems, we find reasonable agreement with the NMR
restraints. For the highly flexible macrocycle cyclo-(Arg-Arg-Trp-Trp-Arg-Phe),
all NMR restraints are fulfilled by the aMD ensemble. Yet, calculation
of interproton distances for single representative structures shows
large violations, which further highlights that NOE restraints from
NMR experiments represent ensemble averages and not necessarily single
structural states (Figure S2). Hence, the
good agreement with the NMR restraints implies physically meaningful
sampling of the conformational space of cyclo-(Arg-Arg-Trp-Trp-Arg-Phe).For the cis state of cyclo-(Pro-Ser-leu-Asp-Val),
one distance is found to be on average too short in the aMD ensemble.
As described in the Methods section, the possibility
of a zero crossing of the NOE cannot be ruled out. Thus, the experimental
interproton distance restraint might also be artificially shifted
toward higher values, and we do not consider this an alarming result.
In the cilengitide ensemble, on the other hand, we find one average
interproton distance as being 0.22 Å too long in the aMD ensemble.
This outlier clearly indicates a slight discrepancy between aMD and
experimental state populations. This discrepancy might derive from
simplifications of the force field, for example, d- and l-amino acids being described by the same parameters. Nevertheless,
considering the accuracy of the experimental reference paired with
the limitation of current force fields, the overall agreement of aMD
and NMR is strongly convincing.Further experimental indications
on the structural features of
the studied macrocycles were given by 3JαNH coupling constants. These NMR coupling constants
can be related to the backbone dihedral angle Θ (Θ = |φ
– 60| for l-amino acids and Θ = |φ + 60|
for d-amino acids) via the Karplus equation.[54,78] We calculated the reweighted distribution of Θ for the cis and trans state of cyclo-(Pro-Ser-leu-Asp-Val)
(Figure S5). From the published 3JαNH coupling constants, we would
expect two significant changes in the respective backbone torsion.
On the one hand, the experimental data imply a shift of the Θ(Val)
distribution toward larger torsional angles when switching from cis to trans. For the Θ(Asp) distribution,
on the other hand, the NMR measurements indicate a shift toward smaller
angles in the trans state compared to the cis state. As shown in Figure S5, we find good agreement between the distinct structural preferences
of cis and trans states captured
with NMR and aMD. However, the agreement between average 3JαNH coupling constants estimated
by NMR coupling constants and aMD simulations is rather unsatisfying
with a RMSD of 2 Hz. Due to the nature of the Karplus relation, calculation
of 3JαNH coupling constants
from dihedral data is very sensitive. Small fluctuations of Θ
around large angles lead to large deviations in the average 3JαNH. Considering the height of
the boosting potential which is applied on all torsions and the overall
potential, it is not expected to achieve an accurate reproduction
of 3JαNH coupling constants.
The strength of the aMD approach lies in the efficient sampling of
a large portion of the conformational space but in the reweighting
step, a large, so-called “energetic noise” is introduced.[41] Consequently, the overall effects of large conformational
rearrangements on dihedral distributions are captured. But we observe
that the method does not provide the resolution to reweight the subtle
differences within the dihedral distributions, which would be needed
to accurately estimate the experimental coupling constants. Nevertheless,
while the small dissonances in the backbone angle distributions lead
to large deviations in the average 3JαNH coupling constants, the overall structures still
agree well with experiment, as indicated by interproton distances.Summarizing, we showed that aMD simulations adequately capture
the dynamic and structural properties of peptidic macrocycles in solution.
Even more strikingly, we found that also the relative ratio of trans- and cis-proline states of the integrin
binding peptide cyclo-(Pro-Ser-leu-Asp-Val) is coherent with the published
NMR data. Despite the known limitations in current force fields and
sampling techniques, the presented results strongly indicate that
aMD simulations are well suited to structurally and also thermodynamically
profile the conformational space of peptidic macrocycles.
Conclusion
We evaluated the performance of explicit solvent aMD simulations
as conformer generator for peptidic macrocycles. Applying the approach
on three different cyclic peptides with varying structural characteristics,
we find excellent agreement with experimentally determined structural
features.We showed that aMD simulations are a valid approach
to capture
a diverse as well as structurally and thermodynamically meaningful
conformational ensemble for peptidic macrocycles. The proposed approach
allows an extremely thorough characterization of the conformational
space and the dynamic behavior of cyclic peptides. Furthermore, for
cilengitide we found that also the target-bound conformation is among
the frequently sampled states in the aMD simulation. Hence, when the
target-bound conformation is already significantly populated in the
solution ensemble, aMD identifies the bioactive conformation as a
key state, which should be considered in further analysis. So far,
the approach is computationally rather expensive. Yet, the calculations
can easily be parallelized and further optimized to increase the sampling
efficiency.Altogether, aMD simulations are an excellent technique
to perform
a comprehensive study on promising macrocyclic compounds. Besides
its function as a conformation generator, the aMD approach also allows
to estimate thermodynamic properties, such as the flexibility of macrocycles.
Furthermore, it is possible to trace changes in the relative ratio
of stable conformational states upon modifications within the backbone
or functional groups of a macrocyclic compound. The comprehensive
computational characterization can be extremely beneficial in macrocycle
drug design projects, where synthesis is often rather complicated
and costly. Thus, we surmise that aMD simulations hold great potential
to significantly enhance the development of macrocyclic drugs.
Authors: Nils-Ole Friedrich; Christina de Bruyn Kops; Florian Flachsenberg; Kai Sommer; Matthias Rarey; Johannes Kirchmair Journal: J Chem Inf Model Date: 2017-10-18 Impact factor: 4.956
Authors: C López-Martínez; P Flores-Morales; M Cruz; T González; M Feliz; A Diez; Josep M Campanera Journal: Phys Chem Chem Phys Date: 2016-04-21 Impact factor: 3.676
Authors: Adrian Glas; Eike-Christian Wamhoff; Dennis M Krüger; Christoph Rademacher; Tom N Grossmann Journal: Chemistry Date: 2017-08-30 Impact factor: 5.236
Authors: Daniel W Carney; Karl R Schmitz; Jonathan V Truong; Robert T Sauer; Jason K Sello Journal: J Am Chem Soc Date: 2014-01-24 Impact factor: 15.419
Authors: George A Khoury; James Smadbeck; Phanourios Tamamis; Andrew C Vandris; Chris A Kieslich; Christodoulos A Floudas Journal: ACS Synth Biol Date: 2014-01-14 Impact factor: 5.110
Authors: Ishika Saha; Eric K Dang; Dennis Svatunek; Kendall N Houk; Patrick G Harran Journal: Proc Natl Acad Sci U S A Date: 2020-09-18 Impact factor: 11.205
Authors: Satoshi Ono; Matthew R Naylor; Chad E Townsend; Chieko Okumura; Okimasa Okada; R Scott Lokey Journal: J Chem Inf Model Date: 2019-05-08 Impact factor: 4.956
Authors: Paris R Watson; Timothy W Craven; Xinting Li; Stephen Rettie; Parisa Hosseinzadeh; Fátima Pardo-Avila; Asim K Bera; Vikram Khipple Mulligan; Peilong Lu; Alexander S Ford; Brian D Weitzner; Lance J Stewart; Adam P Moyer; Maddalena Di Piazza; Joshua G Whalen; Per Jr Greisen; David W Christianson; David Baker Journal: Nat Commun Date: 2021-06-07 Impact factor: 14.919
Authors: Anna S Kamenik; Isha Singh; Parnian Lak; Trent E Balius; Klaus R Liedl; Brian K Shoichet Journal: Proc Natl Acad Sci U S A Date: 2021-09-07 Impact factor: 11.205
Authors: Ilda D'Annessa; Francesco Saverio Di Leva; Anna La Teana; Ettore Novellino; Vittorio Limongelli; Daniele Di Marino Journal: Front Mol Biosci Date: 2020-05-05
Authors: Anna S Kamenik; Johannes Kraml; Florian Hofer; Franz Waibl; Patrick K Quoika; Ursula Kahler; Michael Schauperl; Klaus R Liedl Journal: J Chem Inf Model Date: 2020-06-29 Impact factor: 6.162
Authors: Ajay N Jain; Ann E Cleves; Qi Gao; Xiao Wang; Yizhou Liu; Edward C Sherer; Mikhail Y Reibarkh Journal: J Comput Aided Mol Des Date: 2019-05-03 Impact factor: 3.686