Derya Meral1, Davide Provasi1, Marta Filizola1. 1. Department of Pharmacological Sciences, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, New York, New York 10029, USA.
Abstract
Computational strategies aimed at unveiling the thermodynamic and kinetic properties of G Protein-Coupled Receptor (GPCR) activation require extensive molecular dynamics simulations of the receptor embedded in an explicit lipid-water environment. A possible method for efficiently sampling the conformational space of such a complex system is metadynamics (MetaD) with path collective variables (CVs). Here, we applied well-tempered MetaD with path CVs to one of the few GPCRs for which both inactive and fully active experimental structures are available, the μ-opioid receptor (MOR), and assessed the ability of this enhanced sampling method to estimate the thermodynamic properties of receptor activation in line with those obtained by more computationally expensive adaptive sampling protocols. While n-body information theory analysis of these simulations confirmed that MetaD can efficiently characterize ligand-induced allosteric communication across the receptor, standard MetaD cannot be used directly to derive kinetic rates because transitions are accelerated by a bias potential. Applying the principle of Maximum Caliber (MaxCal) to the free-energy landscape of morphine-bound MOR reconstructed from MetaD, we obtained Markov state models that yield kinetic rates of MOR activation in agreement with those obtained by adaptive sampling. Taken together, these results suggest that the MetaD-MaxCal combination creates an efficient strategy for estimating the thermodynamic and kinetic properties of GPCR activation at an affordable computational cost.
Computational strategies aimed at unveiling the thermodynamic and kinetic properties of G Protein-Coupled Receptor (GPCR) activation require extensive molecular dynamics simulations of the receptor embedded in an explicit lipid-water environment. A possible method for efficiently sampling the conformational space of such a complex system is metadynamics (MetaD) with path collective variables (CVs). Here, we applied well-tempered MetaD with path CVs to one of the few GPCRs for which both inactive and fully active experimental structures are available, the μ-opioid receptor (MOR), and assessed the ability of this enhanced sampling method to estimate the thermodynamic properties of receptor activation in line with those obtained by more computationally expensive adaptive sampling protocols. While n-body information theory analysis of these simulations confirmed that MetaD can efficiently characterize ligand-induced allosteric communication across the receptor, standard MetaD cannot be used directly to derive kinetic rates because transitions are accelerated by a bias potential. Applying the principle of Maximum Caliber (MaxCal) to the free-energy landscape of morphine-bound MOR reconstructed from MetaD, we obtained Markov state models that yield kinetic rates of MOR activation in agreement with those obtained by adaptive sampling. Taken together, these results suggest that the MetaD-MaxCal combination creates an efficient strategy for estimating the thermodynamic and kinetic properties of GPCR activation at an affordable computational cost.
G protein-coupled receptors (GPCRs) are broadly expressed cell surface receptors whose
functional role is to transmit signals from the exterior to the interior of the cell through
recognition of different ligands, such as bioactive peptides, amines, nucleosides, and
lipids. It is therefore not
surprising that about 30% of drugs available in the market today target GPCRs for the
purpose of alleviating the effects of a wide range of diseases and conditions. Understanding the mechanistic, thermodynamic,
and kinetic details of ligand-induced GPCR activation and consequent signaling through
intracellular G proteins or β-arrestins is very important as it informs the rational design
of improved therapeutics. However, this remains a fairly challenging undertaking both for
experimental and computational approaches.Molecular dynamics (MD) simulations are a particularly valuable tool to probe the level of
atomistic detail that is necessary to identify testable hypotheses of molecular determinants
that are responsible for GPCR allostery, energetics, and kinetics. However, simulating
ligand-induced activation of GPCRs in a realistic lipid-water environment is computationally
expensive and one cannot rely on standard MD alone to obtain converged free-energy
landscapes, as the largest conformational changes accompanying receptor activation occur at
the time scale of hundreds of microseconds to milliseconds.Enhanced sampling algorithms can help speed up these time scales as we demonstrated a few
years ago by applying well-tempered metadynamics (MetaD) with path collective variables (CVs) to study the ligand-induced modulation of the free-energy
landscape of two prototypic GPCRs, specifically rhodopsin and the β2-adrenergic receptor embedded in an explicit
1-palmitoyl-2-oleoyl-sn-glycero-3-phosphocholine (POPC)/10% cholesterol bilayer. More
recently, using a high-throughput molecular dynamics (HTMD) adaptive sampling protocol run on large distributed computational
resources, we demonstrated that without
introducing bias potentials, a total simulation time of ∼240 µs was still
necessary to achieve convergence of the free-energy landscape of a morphine-bound μ-opioid
receptor (MOR) system embedded in an explicit POPC/10% cholesterol bilayer and to build
reliable Markov State Models (MSMs) of the system’s activation dynamics. This approach
successfully revealed the presence of two distinct metastable regions of the conformational
space in addition to metastable regions comprising crystal-like inactive and active
conformations of MOR. Furthermore, it unveiled ligand-specific kinetic rates between these
regions and, combined with information theory, it elucidated molecular details of the allosteric transmission of
the signal across the receptor. However, the conformational sampling portion of this
approach is still computationally very expensive and often requires exclusive computational
resources.Here, we propose a computational strategy that combines well-tempered MetaD using path
CVs with Maximum Caliber (MaxCal) and
n-body information theory (nBIT) to
efficiently study ligand-induced activation of GPCRs at the atomistic scale. Using
MetaD-derived free energies as an input for Maximum Caliber (MaxCal), we built MSMs and estimated kinetic rates of
morphine-induced MOR activation. Moreover, the n-body information theory method nBIT was
used to elucidate molecular details of MOR allosteric modulation leading to signal
transmission. The efficiency and accuracy of the proposed approach were evaluated by
comparing these results with those obtained with a more expensive HTMD adaptive sampling
protocol run on distributed computational resources.
COMPUTATIONAL DETAILS
Setup of ligand-free active and inactive MOR systems
Inactive and active three-dimensional models of the MOR were built based on corresponding
available experimental structures at the start of this work (PDB 4DKL and 5C1M, respectively). For the active system, the nanobody was removed,
the N-terminal region was truncated at residue S641.40, and the missing
residues of helix 8 (H8) were modeled using as a template the atomic coordinates of H8 in
the inactive crystal structure. The receptor termini were capped with an N-terminal acetyl
and a C-terminal N-methyl amide using Maestro. An 80 × 80 Å2 POPC/10%
cholesterol bilayer with TIP3P water was generated using the CHARMM-GUI webserver and equilibrated according to its
standard protocol, comprising a 20 ns unrestrained MD simulation of the membrane. The
inflategro script,
edited to use a deflation ratio of 0.97 instead of the default 0.95, was used to embed the
final MOR structures in the membrane mimetic environment. After system neutralization with
NaCl at a concentration of ∼150 mM, the entire simulation system of approximately 54 000
atoms with dimensions of 75 × 75 × 100 Å3 was minimized and equilibrated for 50
ns using the CHARMM36 force field for protein, lipids, and ions. All simulations reported in this
work were carried out using the GROMACS 5.1.4 package.
Setup of the morphine-bound active MOR system
Force-field parameters for morphine were obtained from the CHARMM General Force Field
(CGenFF) ParamChem webserver and
subsequently optimized and verified following established protocols. The initial binding pose of morphine was obtained by
aligning the ligand’s alkaloid scaffold with the equivalent one in the morphinan agonist
BU72, as seen in the 5C1M crystal structure. The system was subjected to minimization followed by a 1 ns run in
the NVT ensemble with restraints on lipids, receptor, and ligand. Restraints were
gradually relaxed over 20 ns, and a final 40 ns equilibration run was carried out without
restraints.All production runs were carried out in the NPT ensemble at 300 K and 1 bar with periodic
boundary conditions. The Parinello-Rahman and velocity rescaling algorithms were used for pressure and temperature coupling,
respectively. A time step of 4.0 fs coupled with hydrogen mass repartitioning was used
alongside the standard leapfrog algorithm and the LINCS algorithm. Electrostatic interactions were handled via the particle-Mesh
Ewald method and a Verlet scheme with a cutoff of 1.2 nm, while the van
der Waals modifier force switch
was set to 1.0 nm.
Adiabatic biased MD simulations
To obtain an initial representation of the transition between the inactive and active
crystal structures of MOR, we carried out adiabatic biased MD (ABMD) simulations on the ligand-free receptor using Plumed
2.1. Specifically, 10 simulations
were carried out starting from the equilibrated active MOR crystal structure guided
towards the receptor inactive conformation, and 10 simulations were run in the opposite
direction. In ABMD simulations, the system is biased with an elastic potential towards a
final value of a chosen CV. The bias, however, acts on the system only when the distance
of the current value of the CV to its final value is larger than its previous minimum
distance, allowing the system to evolve undisturbed otherwise. The difference between the
contact map calculated during the simulation (with a stride of 10 simulation steps) and
the contact map of the final structure (i.e., that of the inactive or active MOR
structures, depending on the direction of the simulation), defined by a subset of contacts
relevant to the activation process, was used as a CV. Specifically, a switching function
in the form ofwas used to describe contacts between polar
atoms or the side chains of residues within the MOR transmembrane region. In this
equation, r is the distance between two polar atoms or between the center
of mass of apolar side chains, and r0 was set to 6.5 Å or 4.5
Å for side chain contacts or polar contacts, respectively. Contacts for which
, i.e., whose distance in the active and inactive structures
is significantly different, were included in the contact map definition
R =
s(r), and the matrix
norm , where RRef is either the
contact map for the active or for the inactive receptor structure, was used to drive the
ABMD simulations. The simulations were performed in a stepwise fashion with the force
constant switched from 0.1 to 15 over the span of 35 ns to ensure a smooth transition
between receptor conformations.
Path definition for MetaD simulations
In order to define a path based on information from contact maps, a pairwise distance
matrix was built for the complete set of frames from the ABMD
trajectories by running an in-house python script on contact maps calculated with the
Plumed 2.1 plugin. In order to select
frames optimally describing the low free-energy de-activation/activation pathway, we
performed simulated annealing over paths of constant length N = 10 starting from a
random initial selection of 8 points between the two furthest points in the set,
j0 and j,
respectively, and minimizing the functionwhere i,
i′ enumerate the edges along the path, is the length of the ith edge,
ρ represents the density of neighboring
points around the beginning point j, and k is the
effective temperature for the annealing procedure, which was slowly decreased over the
span of the annealing iterations. Minimization of this function, which was performed using
an in-house python script, allows us to (a) find the most densely populated regions along
the activation path, (b) ensure that each edge is of similar length, and (c) verify that
the overall length of the path is the shortest possible while fulfilling the two prior
conditions. The value of W converged to yield paths with edge length
variances below 0.05, ensuring that the chosen frames could be used to define smooth path
collective variables that describe the position S(R) of
the system along the pathway and the distance Z(R) from
this path. These were defined, respectively, aswhere
R refers to the values of the contact
maps for each point in the selected path and λ, a parameter to aid the
creation of a smooth description of the path, is set to ∼2.0.
MetaD simulations
We used well-tempered MetaD with the path collective variables S and
Z, as implemented in Plumed 2.1, to enhance the exploration of the CV space by adding a
history-dependent bias potential at regular intervals to discourage the system from
visiting previously explored regions of the conformational space. Specifically, the bias
potential acting on a conformation with contact map R iswhere t′ is a multiple of
the deposition time τ, s
are the CVs over which the bias is being deployed (S and
Z of the path CVs in our case), and the
σ are the standard deviations of the
Gaussian bias. In well-tempered MetaD, the height of the Gaussian bias,
w, is adjusted according
towhere ΔT is a parameter in
units of temperature, k is the Boltzmann
constant, and w is the initial height of the Gaussian bias potential.
This adjustment allows the total bias potential to smoothly converge in time so that the
free energy of the system can be calculated as the limitwhere T is the temperature
of the simulation. The temperature ratio
ΔT/(T+ΔT) in Eq. (6) is referred to as the “bias factor” and
ensures that the relevant free-energy barriers can be overcome within the time scale of
metadynamics simulations. Here, we run two sets of simulations of the morphine-bound MOR
system, where the bias factor was set to 12, the deposition rate was set to
τ = 5 ps, and either (σ =
0.3, σ = 0.15) or
(σ = 0.1,
σ = 0.05) were used. Both simulations
were run for 1.5 µs. Convergence was checked by ensuring that the
standard deviation of the free-energy differences converged to below 20 kJ/mol within the
last 150 ns for both simulations. We present the combined free energies and the standard
deviations of the free-energy differences in Figs. 1(a) and 1(b) of the
supplementary
material, respectively, where the two simulations,
totaling 3 µs, were combined using weights of 1:.
Free-energy calculations
To derive distributions of order parameters other than the path collective variables, it
is necessary to remove the effect of the bias on the trajectories of sampled
conformations. This can be achieved using the reweighting method introduced by Tiwary and
Parrinello and implemented via
in-house python scripts, which allows us to reconstruct a time-independent free-energy
landscape for any function of the coordinates of the system. Specifically, this method was
used to recast the free-energy landscape as a function of two parameters that are relevant
to the activation process, namely, the distance between the Cα atoms of the
residues R1653.50 and T2796.34 and the root mean square deviation
(RMSD) of Cα atoms of the NPxxYA motif (residues N3327.49 to
A3377.54) to the inactive crystal structure of MOR. We also used this method
to reweigh the distributions of the variables used in the n-body information theory and
MaxCal analyses presented in Secs. II G and II H.
Information theory analysis
We applied the n-body information theory nBIT method to reweighted metadynamics trajectories using in-house
python scripts to study the contribution of each receptor residue to the transmission of
information between the ligand binding pocket and the intracellular region of the
receptor. To this end, we limited our analysis to three classes of variables: (a) the
first two principal components of the positions of the heavy atoms of the side chains
inside the ligand binding pocket (PC1,
PC2), (b) the two CVs used to represent the activation
process, namely, the TM3-TM6 distance between the Cα atoms of residues
R1653.50 and T2796.34 and the RMSD of the NPxxYA motif (NPxxYA
RMSD) from the MOR inactive crystal, and (c) the Cartesian coordinates
(x, y, z) of the Cα atoms of
each receptor transmembrane residue.To calculate the co-information value, we built 5-dimensional free-energy landscapes for
the aforementioned parameters using the reweighting scheme described in Sec. II F.
Using these free energies, we calculated the (Shannon) entropy of a set of degrees of
freedom X aswhere
p(x) is the probability
distribution of x ∈ X. The
co-information can then be calculated for any set of variables aswhere MI is the mutual information, defined
as MI(X, Y) = H(X) +
H(Y) − H(X,
Y), and the conditional mutual information and conditional entropy are
defined as MI(X|Z) =
H(X|Z) +
H(Y|Z) −
H(X,Y|Z) and
H(X|Z) =
H(X,Z) −
H(Z), respectively. Using these definitions, it is
easy to see that the 3-body co-information is fully symmetric under permutations of its
three variables and that Eq. (8) can be
recast asA positive value of CI(X,
Y, Z) suggests that the sum of the information on
Z gained from knowledge of either X or
Y [i.e., MI(X, Z) + MI(Y, Z)] is
larger than the information on Z gained from knowing both
X and Y [i.e., ]; hence, there is redundant information pertaining to
Z in X and Y. This can be described
as a common cause structure; the correlation between X and
Y is partially explained by the value of Z. By the
same logic, a negative co-information value suggests that knowledge of both
X and Y provides additional information regarding
Z. Co-information values calculated in this way are very sensitive to
the way the histograms are built, and it is therefore crucial to use the same bin sizes
and histograms while building the multidimensional histograms for each residue, ensuring
that the ranking of the co-information values for the residues is preserved.
Maximum caliber kinetic model
By design, the MaxCal approach, which
is the maximum entropy principle applied to dynamic quantities, allows for the
construction of the most probable kinetic model that is compatible with a given
free-energy landscape and with constraints that depend on the temporal evolution of the
system. For a given set of stationary probabilities
π, we maximize, using in-house python
scripts, the path entropy, , with respect to the Markov model transition probabilities
p,under constraints that fix the average
value of dynamical quantities to specified values . To study continuous stochastic processes, the mean jump
rate is conveniently chosen as one of these dynamical
constraints. With this choice of constraints, it has been shown that the transition probabilities
p that result in maximum caliber are
proportional towhere a and
b are the Lagrange multipliers associated
with and with all other dynamical constraints
, respectively. For short lag-times δt, we
can rewrite W defining a matrix Δ and making explicit the dependence on
the lag-timewhere I is the identity
matrix, and we defined μ δt =
e−. Further imposing the condition of
detailed balance, the transition rates κij can be calculated asand the final transition matrix
isPractically, the Lagrange multipliers
a and b are derived so
that and equal given values N0 and
estimated from simulations or experiments.In order to design a self-contained strategy, we introduce here a way to estimate
N0 and directly from the MetaD simulations. We apply concepts
introduced recently for the calculation of time-lagged independent component analysis
(tICA) correlation matrices and taking
into consideration that the stochastic dynamics under a metadynamics bias can be
rigorously described by an ordinary differential equation with established asymptotic
behavior. Thus, we calculate the
unbiased value of the constraint averageswhere is the value of the collective variable at time
t and T is the total length of the trajectory, by
reweighting the biased MetaD trajectories asHere, the correction function
c(t), which can be calculated from the bias, tends
asymptotically to the irreversible work performed on the system. Since the bias
accelerates the dynamics, the lag-time δt′ must also be rescaled, as
described in Ref. 37, so thatwhere Δt is the trajectory
time step. For validation purpose, the constraints calculated from the MetaD simulations
were compared to those obtained from previously published HTMD adaptive sampling
simulations. Specifically, using the
MSMs built from the HTMD trajectories, we calculated the path ensemble averages of the
distances traveled along each CV describing receptor activation (i.e., change in TM3-TM6
distance and NPxxYA RMSD from the MOR inactive crystal) per unit time and compared these
values to those obtained from MetaD by application of Eq. (16).The kinetic model obtained from the transition matrix
p defined in Eq. (14) was then studied using standard tools for
MSM analysis available in PyEMMA 2.4,
which yielded the mean first passage times (MFPTs) reported herein.
RECAPITULATING THERMODYNAMIC PROPERTIES OF GPCR ACTIVATION
As mentioned in the Introduction, our recent study of the dynamics and kinetics of the
morphine-induced MOR activation process using an HTMD adaptive sampling protocol revealed
two metastable regions in the free-energy landscape, referred to as intermediate I and II,
in addition to metastable regions comprising crystal-like inactive and active receptor
conformations. We showed in that work that these regions contain highly populated
conformational states of MOR with intermediate region II exhibiting a much higher free
energy compared to the other metastable regions. We proceeded to compare these results with
those obtained from the MetaD-based path sampling of the morphine-induced MOR activation
process (see Sec. II) and confirmed the finding of four
main metastable regions.The reweighted free-energy landscape obtained from the MetaD simulation of the
morphine-induced activation of MOR is shown in Fig. 1(a) as a function of CVs describing major conformational changes occurring upon
activation (i.e., TM3-TM6 distance and NPxxYA RMSD from the MOR inactive crystal). In this
figure, overlaid gray dots correspond to the positions of the k-means centers of the
free-energy landscape obtained from our previously published analysis of the HTMD adaptive
sampling simulations of the morphine-induced MOR activation. As evident from this figure, the 3 µs MetaD
simulations we carried out explore a free-energy landscape that is comparable to that
obtained by ∼240 µs HTMD adaptive sampling simulations, with the only
exception of regions with very high free energies (i.e., intermediate region II). We further
calculated the populations of each metastable region, i.e., the crystal-like inactive and
active, intermediate I, and intermediate II regions of the free-energy landscape of the
morphine-induced MOR activation, and found good agreement between the two approaches, as
shown in Fig. 1(b). Taken together, these results
suggest that well-tempered MetaD using path CVs is capable of recapitulating the
thermodynamic properties of complex processes, such as GPCR activation, as seen in more
computationally intensive methods such as the HTMD adaptive sampling protocol.
FIG. 1.
Comparison between thermodynamic properties from MetaD and the HTMD adaptive sampling
protocol. (a) Free-energy landscape of the MetaD simulations of the morphine-bound MOR
as a function of CVs describing major conformational changes upon receptor activation
(i.e., TM3-TM6 distance and NPxxYA RMSD from the MOR inactive crystal). Overlaid gray
dots refer to the positions of the k-means centers of the free-energy landscape obtained
from our previously published HTMD simulations of the morphine-induced MOR activation.
(b) Correlation between the populations of each identified metastable region, i.e., the
crystal-like active and inactive, intermediate I, and intermediate II regions, of the
free-energy landscape of the morphine-induced MOR activation derived from MetaD or HTMD
simulations.
Comparison between thermodynamic properties from MetaD and the HTMD adaptive sampling
protocol. (a) Free-energy landscape of the MetaD simulations of the morphine-bound MOR
as a function of CVs describing major conformational changes upon receptor activation
(i.e., TM3-TM6 distance and NPxxYA RMSD from the MOR inactive crystal). Overlaid gray
dots refer to the positions of the k-means centers of the free-energy landscape obtained
from our previously published HTMD simulations of the morphine-induced MOR activation.
(b) Correlation between the populations of each identified metastable region, i.e., the
crystal-like active and inactive, intermediate I, and intermediate II regions, of the
free-energy landscape of the morphine-induced MOR activation derived from MetaD or HTMD
simulations.
REPLICATING INFORMATION TRANSFER ACROSS THE RECEPTOR
In addition to replicating free-energy landscapes, it is also important to verify that
molecular dynamics details obtained from the MetaD-based strategy proposed herein can be
used to capture how information is transferred from the ligand-binding pocket to the
intracellular G protein-binding region of the receptor. To this end, we calculated (see Sec.
II) the contribution of each receptor residue to the
mutual information between the ligand binding pocket and the intracellular region of the
receptor upon activation. This contribution is quantified by a three-body co-information
value calculated using (a) the first two principal components of the positions of the heavy
atoms of the side chains inside the ligand binding pocket, (b) the two CVs used to represent
the activation process, namely, the TM3-TM6 distance and the NPxxYA RMSD from the MOR
inactive crystal, and (c) the Cartesian coordinates of the Cα atoms of each
receptor transmembrane residue. Co-information values larger than 1 (in absolute value) are
listed in Table I of the supplementary
material.The calculated co-information values from the MetaD and HTMD simulation trajectories are
compared in Fig. 2(a). As mentioned in Sec. II, the more negative the co-information value is for a
given residue, the more significant is that residue’s contribution to the information
transfer between the extra- and intra-cellular regions of the receptor. A visualization of
the location of the most contributing residues to the information transfer in MOR as
calculated from MetaD or HTMD simulations is provided in Figs. 2(b) and 2(c), respectively. As can be seen
in these figures, the two simulation protocols yield similar co-information patterns, with
the majority of highly contributing residues to the information transfer [Table I of the
supplementary
material and red color in Figs. 2(b) and 2(c)] either located in the
lower halves of TM5, 6, and 7 and in H8 or in the extracellular regions of TM1, 6, and 7.
These results suggest that the well-tempered MetaD strategy reported herein replicates the
dynamical information obtained by HTMD simulations and can therefore be employed as an
effective tool for studying allostery in GPCRs.
FIG. 2.
Comparison between co-information values derived from MetaD and HTMD simulations. (a)
Normalized co-information values from MetaD and HTMD simulations plotted for each MOR
transmembrane residue. [(b) and (c)] Normalized co-information values depicted on the
inactive crystal structure of MOR using a color scheme ranging from white to yellow to
red, where red represents the highest contributing residues to the information transfer
between the ligand binding pocket and the intracellular region of MOR, as derived from
MetaD and HTMD simulations, respectively. Co-information values for the loop regions
were not calculated.
Comparison between co-information values derived from MetaD and HTMD simulations. (a)
Normalized co-information values from MetaD and HTMD simulations plotted for each MOR
transmembrane residue. [(b) and (c)] Normalized co-information values depicted on the
inactive crystal structure of MOR using a color scheme ranging from white to yellow to
red, where red represents the highest contributing residues to the information transfer
between the ligand binding pocket and the intracellular region of MOR, as derived from
MetaD and HTMD simulations, respectively. Co-information values for the loop regions
were not calculated.
DERIVING KINETIC PROPERTIES OF GPCR ACTIVATION FROM METAD USING MAXCAL
PRINCIPLE
Deriving kinetic rates directly from standard MetaD simulations is not possible because
transitions are accelerated by a bias potential in these simulations. The recently proposed
infrequent metadynamics strategy
based on transition state theory allows
to extract unbiased kinetic information from biased trajectories, but it requires a reduced
bias deposition rate, as well as multiple validation simulations. Another recently proposed
strategy based on Girsanov reweighting of the transition matrix is promising, but has yet to be validated for large
biological systems. Here, we test the efficiency and accuracy of the MaxCal principle in
extracting kinetic rates from MetaD. Specifically, the MaxCal principle, which is the
maximum entropy principle applied to dynamic quantities, allows one to construct the most
probable, minimally biased kinetic model that is compatible with a given stationary
distribution of the system and with a number of constraints imposed on the system’s kinetics
that are derived here from MetaD. A combination of MetaD with MaxCal was proposed earlier in the
literature to obtain optimal low-dimensional reaction coordinates from a larger set of
candidate collective variables.Although the MaxCal approach has previously been tested on various small peptides, GPCRs are far more complex systems
and an assessment that this method could accurately capture the kinetics of GPCR activation
was necessary prior to its application to MetaD-derived free energies. Thus, we first
applied MaxCal to the free-energy landscape of morphine-induced MOR activation derived from
the more computationally expensive HTMD adaptive sampling protocol. To capture the system’s dynamics as a function of two
commonly used variables that describe the activation process, we constrained the path
ensemble averages of (a) the difference in TM3-TM6 distance between pairs of microstates and
(b) the difference in NPxxYA RMSD from the MOR inactive crystal structure between pairs of
microstates. To compare the kinetic model derived from MaxCal to that obtained from the
direct MSM analysis of the HTMD trajectories, we calculated the path ensemble averages of
the TM3-TM6 distance between pairs of microstates and the NPxxY RMSD from the MOR inactive
crystal structure between pairs of microstates to be used as constraints from the MSM built
from these trajectories (see Table II of the supplementary
material) and then applied them during path entropy
maximization. Despite using only these two constraints, the MFPTs between the four
metastable regions of the free-energy landscape of morphine-induced MOR activation derived
from the HTMD- and the MaxCal-derived MSMs [Figs. 2(a) and 2(b) of the
supplementary
material, respectively] are in excellent agreement, as
shown by their correlation in Fig. 2(c) of the supplementary
material. This observation suggests that MaxCal is capable
of replicating the kinetics of GPCR activation estimated from MSM analysis of the HTMD
simulations.To assess the effectiveness of MaxCal in obtaining from MetaD MOR activation kinetics
comparable to that obtained from the HTMD adaptive sampling protocol, we recalculated the values of the aforementioned path
ensemble averages from the MetaD simulations. These values are in good agreement with those
obtained from the HTMD MSMs (see Table II of the supplementary
material). These MetaD-derived averages were then used,
together with the MetaD-derived free energies, to constrain the path entropy during its
maximization. The MFPTs between the most populated metastable states obtained from the
HTMD-derived MSMs and those obtained by MaxCal from the MetaD runs are shown in Figs. 3(a) and 3(b),
respectively. As shown in this figure, the MFPTs from MetaD simulations are, on average,
within a factor of 2 of the corresponding HTMD adaptive sampling results. By contrast, the
MaxCal approach failed to replicate the kinetic behavior of the system’s intermediate region
II, which contains very high free-energy areas for NPxxYA RMSDs larger than 4 Å.
FIG. 3.
Comparison between the kinetic properties of MOR activation derived from MetaD and HTMD
simulations. The mean first passage times (MFPTs) obtained for the most populated
regions of the free-energy landscapes of morphine-induced MOR activation from (a) the
MSMs built using the HTMD adaptive sampling runs and (b) the MaxCal MSMs built using the
MetaD runs.
Comparison between the kinetic properties of MOR activation derived from MetaD and HTMD
simulations. The mean first passage times (MFPTs) obtained for the most populated
regions of the free-energy landscapes of morphine-induced MOR activation from (a) the
MSMs built using the HTMD adaptive sampling runs and (b) the MaxCal MSMs built using the
MetaD runs.Given the difficulty associated with extracting kinetic rates from standard MetaD
simulations, it is gratifying that the MaxCal approach provides a means for successfully
estimating kinetic rates for metastable regions that are most relevant to GPCR activation.
We propose that additional constraints derived experimentally may further increase the
accuracy of kinetic properties obtained by this integrated MetaD-MaxCal strategy. However, a
systematic assessment of the effect of different constraints on pathway determination and
MFPT values will be required to test this hypothesis while providing further validation of
the accuracy of kinetics properties derived from the proposed methodology.
SUMMARY AND CONCLUSIONS
Our results confirm that well-tempered MetaD with path collective variables is a fast and
reliable method for studying the thermodynamic properties of GPCR activation. Furthermore,
the propagation of information across the receptor calculated from MetaD simulations is
comparable to that derived from adaptive sampling protocols, suggesting that this simulation
method also offers a more efficient strategy for studying allostery in GPCRs. Finally, we
show that the kinetic properties of complex systems can be derived from MetaD simulations
using the MaxCal principle, and these results are comparable to those obtained from more
computationally expensive adaptive sampling protocols.In conclusion, MetaD can be employed as an efficient method for studying ligand-induced
activation mechanisms in GPCRs by reducing the necessary computation time by ∼2 orders of
magnitude, and its combination with MaxCal allows to derive kinetic properties that are in
agreement with those obtained by computationally more expensive methods.Tables reporting co-information details, as well as path ensemble average values used as
constraints for MaxCal estimations, are available as supplementary
material alongside plots showing the convergence of the
metadynamics simulations and the validation of the MaxCal kinetic model based on the HTMD
free energy.
Authors: Mariona Torrens-Fontanals; Tomasz Maciej Stepniewski; David Aranda-García; Adrián Morales-Pastor; Brian Medel-Lacruz; Jana Selent Journal: Int J Mol Sci Date: 2020-08-18 Impact factor: 5.923