Thomas Tarenzi1,2,3, Vania Calandrini3, Raffaello Potestio4,5, Paolo Carloni2,3,6. 1. Computation-based Science and Technology Research Center CaSToRC , The Cyprus Institute , 20 Konstaninou Kavafi Street , 2121 Aglantzia, Nicosia , Cyprus. 2. Departments of Physics , Faculty of Mathematics, Computer Science and Natural Sciences, Aachen University , Otto-Blumenthal Straße , 52062 Aachen , Germany. 3. Computational Biomedicine, Institute for Advanced Simulations IAS-5 and Institute of Neuroscience and Medicine INM-9 , Forschungszentrum Jülich , 52428 Jülich , Germany. 4. Department of Physics , University of Trento , via Sommarive 14 Povo , Trento 38123 , Italy. 5. INFN-TIFPA, Trento Institute for Fundamental Physics and Applications , I-38123 Trento , Italy. 6. JARA-HPC, Jülich Supercomputing Center , Forschungszentrum Jülich , 52428 Jülich , Germany.
Abstract
G-protein-coupled receptors (GPCRs) constitute as much as 30% of the overall proteins targeted by FDA-approved drugs. However, paucity of structural experimental information and low sequence identity between members of the family impair the reliability of traditional docking approaches and atomistic molecular dynamics simulations for in silico pharmacological applications. We present here a dual-resolution approach tailored for such low-resolution models. It couples a hybrid molecular mechanics/coarse-grained (MM/CG) scheme, previously developed by us for GPCR-ligand complexes, with a Hamiltonian-based adaptive resolution scheme (H-AdResS) for the solvent. This dual-resolution approach removes potentially inaccurate atomistic details from the model while building a rigorous statistical ensemble-the grand canonical one-in the high-resolution region. We validate the method on a well-studied GPCR-ligand complex, for which the 3D structure is known, against atomistic simulations. This implementation paves the way for future accurate in silico studies of low-resolution ligand/GPCRs models.
G-protein-coupled receptors (GPCRs) constitute as much as 30% of the overall proteins targeted by FDA-approved drugs. However, paucity of structural experimental information and low sequence identity between members of the family impair the reliability of traditional docking approaches and atomistic molecular dynamics simulations for in silico pharmacological applications. We present here a dual-resolution approach tailored for such low-resolution models. It couples a hybrid molecular mechanics/coarse-grained (MM/CG) scheme, previously developed by us for GPCR-ligand complexes, with a Hamiltonian-based adaptive resolution scheme (H-AdResS) for the solvent. This dual-resolution approach removes potentially inaccurate atomistic details from the model while building a rigorous statistical ensemble-the grand canonical one-in the high-resolution region. We validate the method on a well-studied GPCR-ligand complex, for which the 3D structure is known, against atomistic simulations. This implementation paves the way for future accurate in silico studies of low-resolution ligand/GPCRs models.
Membrane proteins constitute
as much as 60% of the overall proteins
targeted by FDA-approved drugs.[1] Among
them, G-protein-coupled receptors (hGPCRs)[2,3] are
the largest family, comprising more than 800 members,[4] and the most important one from a pharmaceutical perspective.[5−7] Unfortunately, the lack of experimental structural information for
most (more than 90%) hGPCRs[8,9] along with low sequence
identity (SI) across these proteins (often below 30%[10,11]) has hampered rational design efforts for many highly promising
hGPCR drug targets. For instance, 99% hGPCRs have an SI < 30% with
bovinerhodopsin,[12] the first solved GPCR
structure.[13] This condition strongly limits
the predictive power of computer-aided structural predictions because
traditional homology modeling techniques applied to proteins with
low SI with their template may lead to rather inaccurate models.[14,15] All-atoms molecular dynamics (MD) simulations, very successful in
refining high-quality homology-based models,[16,17] may fail to improve the predictions and even lead to partial unfolding[18−21] if one starts from structures with highly inaccurate side-chain
orientations as encountered in very low resolution protein models.
An alternative strategy to address this issue consists of including
the minimum number of degrees of freedom of the system for the specific
problem that one has in mind, leaving out unreliable information that
could bias the simulation results.[18,22,23] In this context, hybrid multiscale simulation methods,
transcending a single, uniform resolution, represent a highly optimized
approach to predict ligands poses;[24−27] on one hand, a high-resolution,
atomistic description of the region of interest (which includes the
binding site) allows unveiling of the interactions between the receptor
and the ligand; on the other hand, the surrounding environment can
be described in a coarse-grained, less computationally intensive way.Here, we describe and validate a novel hybrid multiscale approach,
the open-boundary molecular mechanics/coarse-grained (OB-MM/CG) scheme,
tailored for the study of low-resolution GPCR–ligand complexes.
It involves a concurrent multiscale simulation of both the protein
and the water molecules (Figure ).
Figure 1
Scheme of the OB-MM/CG setup for hGPCR–ligand complexes.
Protein residues belonging to the MM, I, and CG regions
are represented in blue, orange, and black, respectively. The ligand
is represented in red in the binding site and is modeled at the MM
level. The two atomistic MMw hemispheres are coupled to
the coarse-grained reservoir CGw through the hybrid region
HY.
Scheme of the OB-MM/CG setup for hGPCR–ligand complexes.
Protein residues belonging to the MM, I, and CG regions
are represented in blue, orange, and black, respectively. The ligand
is represented in red in the binding site and is modeled at the MM
level. The two atomistic MMw hemispheres are coupled to
the coarse-grained reservoir CGw through the hybrid region
HY.(i) The dual-resolution representation
of the protein[28] is based on a scheme previously
developed in
our team,[22,28−30] successfully applied
to predict binding poses in several low-resolution GPCR–ligand
complexes.[18,19,23,31] It aims at preserving an atomistic description
of the binding pocket and of the neighboring residues (MM region,
described with the Gromos force field[32]) while leaving to a coarse-grained description the rest of the protein[22] (CG region, where each residue is modeled through
a CG bead centered on the Cα and interacting with
the other beads through a Go̅-type potential[33]). An intermediate region (I) between the
MM and the CG domains ensures the coupling between the two levels
of description and the protein backbone integrity. I residues are described atomistically, and their Cα and Cβ atoms interact with CG beads according to
the CG potential. The selection of the atoms in the MM, I, or CG regions does not require updates during the simulation. The
presence of the membrane is implicitly modeled as a series of boundary
potentials (Figure ): two repulsive potentials placed at the level of the lipids’
heads prevent penetration of water molecules in the membrane space,
while a softened Lennard-Jones-like potential (2-1) acting on transmembrane
residues’ Cα mimics the interactions between
the protein and the membrane.(ii) As opposed to the original
MM/CG scheme, where hydration around
the binding site is taken into account by introducing a droplet of
atomistic water molecules confined through a repulsive potential preventing
water evaporation (Figure SI 1), in the
new implementation we adopt a multiscale representation of the solvent,
based on an adaptive resolution scheme[34−37] in the Hamiltonian formulation
(H-AdResS).[38] This allows water molecules
to freely diffuse between fully atomistic (MMw) and coarse-grained
(CGw) domains, maintaining a uniform density between the
two while changing on-the-fly their resolution. The MMw regions, where atomistic water is described by the SPC/E potential[39] (VwMM), are shaped as hemispheres capping the extracellular and intracellular
domains of the protein, whereas coarse-grained water is present outside
of these boundaries in a bigger box (CGw). Coarse-grained
water is modeled by beads coinciding with the center of mass (CoM)
of each molecule, interacting through a potential, VwCG, derived from fully atomistic simulation
of pure water at the same state point through iterative Boltzmann
inversion.[40] The smooth coupling between
the two domains is achieved by introducing an intermediate hybrid
region (HY), where the potentials describing the MMw and
CGw models are linearly combined through a switching function
λα = λ(Rα), which depends on the instantaneous position Rα of the CoM of the α water molecule. Specifically,
λα smoothly changes from 1 to 0 when moving
from MMw to CGw boundaries. A correction term
is added to water molecules in the HY region to ensure uniform density
across the MMw, HY and CGw domains. This term
preserves a constant chemical potential,[41] leading to the simulation of a grand canonical ensemble in the high-resolution
region. In this respect, the CGw region can be regarded
as a reservoir of coarse-grained water molecules, and we call the
overall setup open-boundary (OB)-MM/CG (see the Theory section for a more complete description of the method).Compared
to the previous MM/CG method, the new OB-MM/CG enables
thus a priori rigorous ligand binding free energy calculations.[41] Moreover, by ensuring the free diffusion of
water between the binding cavity and the CGW reservoir,
it prevents possible artifacts due to the solvent confinement in the
hemisphere while keeping the number of the additional degrees of freedom
smaller than a fully atomistic simulation. Given the recognized role
of water for ligand binding in membrane receptors[42−47] and the overall interplay between water dynamics and protein properties,[48] this is likely to improve the description of
interactions between the ligand and the protein. In this paper, we
tackle in particular this point.It is worth noticing that the
applicability of H-AdResS for the
simulation of water solvating biomolecules has been recently assessed[49] using as test systems fully atomistic cytoplasmic
proteins in the center of a spherical MMw region (OB-MM).
However, it has never been applied in the presence of a dual-resolution
protein and, importantly, never in the presence of a discontinuity,
here represented by the repulsive walls mimicking the membrane, which
break the spherical symmetry of the MMw region of H-AdResS
tested in the previous OB-MM implementation.[49] The higher pressure attained by the CGw model with respect
to the atomistic one requires a correction of the repulsive membrane
potential in order to impose the correct virial pressure on the CGw domain[50−52] and guarantee a uniform density profile in the vicinity
of the membrane planes. Details on the derivation of this correction
term are given in the Theory section.The paper is organized as follows. After the Theory section, where the OB-MM/CG method is presented in detail, and a Methods section with the simulation details, we
validate the approach by comparing structural and dynamical properties
computed with the OB-MM/CG scheme with those from all-atom MD simulations
of a GPCR–ligand complex, namely, the β2-adrenergic receptor
in complex with its inverse agonist S-carazolol.[53] This validation represents the necessary preliminary
step toward future usage of the method for the calculation of binding
free energies, on which we are currently working. For comparison,
we present results obtained from simulations performed using the original
MM/CG scheme as well.
Theory
The total Hamiltonian of
the system readsK represents
the total kinetic
energy. We now describe the terms associated with the potential energy
of the system.Vp is the protein’s
potential
energy[22]The first term on the right-hand
side of eq describes
the MM and interface
(I) regions of the protein, along with the ligand.
The function of the I region is to mechanically couple
the MM and CG domains of the protein. MM and I regions
are described by the Gromos43a1 force field,[32] widely validated for protein simulations in the MM/CG framework.[19,22,31] The same potential is used for
the coupling between the MM and I regions (VpMM/). VpCG represents the CG protein potential
energy. The mapping employed for the coarse-graining procedure reduces
each residue to a singe CG bead, centered on the Cα. VpCG is a Go̅-type potential.[33] It integrates out the degrees of freedom defining the chemical nature
of the CG residues while preserving the folded structure of the protein
and its characteristic fluctuations by stabilizing the native contacts.
It includes both bonded and nonbonded interactions between CG protein
beads; the former are modeled as a quartic potential and the second
as a Morse-type potential.[22] The same scheme
is applied to model the interactions between the CG and I regions. In particular, the nonbonded potential is applied here
not only between Cα’s but also between Cα belonging to CG residues and Cβ belonging
to I residues.According to the H-AdResS[38] scheme,
water is described by the potential energy term Vw, which includes the contributions from the atomistic
water molecules in the MMw region (VwMM), and those from
the coarse-grained water molecules in the CGw region (VwCG). Vw reads[38]As pointed
out in the introduction, VwMM is described by the SPC/E water potential
energy,[39] while VwCG is obtained through
iterative Boltzmann inversion.[40] The switching
function λα = λ(Rα) depends on the position Rα of the
CoM of the α water molecule.[38] Its
value is 1 in the MMw region,
0 in the CGw region, and smoothly changes from 1 to 0 in
the HY region. The term represents
a correction that is applied
independently on each water molecule in the HY region. It has the
double purpose of (1) removing on average the effect of the spurious
force that emerges as a consequence of the linear combination between VwMM and VwCG(38,41) and (2) correcting for the density imbalance
across the MMw and CGw regions.[54] Specifically, readsHere
ΔF(λα) and Δp(λα) represent, respectively, the
Helmholtz free energy difference and
the pressure difference between a system at hybrid resolution λα and the CGw system at resolution λα = 0. N is the number of water molecules,
and ρ0 is the reference density of water. Correspondingly,
in each subdomain, the pressure attains the value at which that model
(either atomistic or coarse-grained) gets the target density.[54]The term Vp/w describes protein/water
interactions. Only MMw water molecules are in contact with
the protein. The interactions between atomistic water and the atomistic
region of the protein are described by the standard atomistic force
field, while the interactions between atomistic water and CG residues
are modeled with a Lennard-Jones potential so as to reproduce the
excluded volume around the Cα of the CG residues.[55]The last term in eq , Vsurf, is the
potential mimicking the
membrane effects. Specifically, it contains a term aiming at reproducing
the interactions between the protein and the membrane and a term preventing
water penetration in the transmembrane space. The first is represented
by a softened Lennard-Jones-like potential (2-1), with the minimum
at a distance rp from a virtual surface
enveloping the transmembrane portion of the protein, acting on the
Cα of the transmembrane protein residues and on each
atom of TRP and TYR residues.[22] This term
constrains the shape of the protein while providing a good degree
of flexibility. The second one is a repulsive term proportional to
1/r, where r is the distance from
virtual planar walls coinciding with the heads of the lipid bilayer.[22] These virtual walls represent boundaries that
introduce spatial inhomogeneity in the solvent. Because of the higher
pressures attained in the CGw region with respect to the
atomistic one, the water confinement achieved by applying the repulsive
force mimicking the membrane is less effective in the CGw region than that in the MMw one. Therefore, closer to
the repulsive membrane surface, CGw water expands more
than MMw water, leading to density fluctuations in the
vicinity of the membrane upon passing from CGw to MMw regions (Figure a,c). This requires the introduction of a correction of the
repulsive boundary potential in order to impose the correct virial
pressure on the CGw domain.[50−52] The correction of the
repulsive force is proportional to the gradient of the difference
between the target density profile (resulting from an atomistic simulation)
in the direction perpendicular to the membrane, ρt(r), and the current CGw density profile,
ρCG(r)[56]where r is the distance
to
the planar wall. In this way, a uniform depletion layer in proximity
of the repulsive planes of the membrane is guaranteed (Figure b,c), and the uniform bulk
density across MMw, HY, and CGw is preserved
(Figure SI 2). In the latter, only small
fluctuations in the HY regions, smaller than 5%, are observed.
Figure 2
Snapshots with
the uncorrected (a) and corrected (b) membrane potentials
in OB-MM/CG simulations. The regions parallel to the membrane considered
for the calculation of the planar radial density are highlighted with
a dashed line. (c) Planar radial density computed from the center
of the upper MMw region in a disk of height 2.0 nm just
above the membrane planes, before and after the correction to the
membrane potential. The yellow-shaded area corresponds to the presence
of the protein. The density value lower than 1 g/cm3 computed
in the corrected simulation is attributable to the natural depletion
layer in proximity of the membrane plane.
Snapshots with
the uncorrected (a) and corrected (b) membrane potentials
in OB-MM/CG simulations. The regions parallel to the membrane considered
for the calculation of the planar radial density are highlighted with
a dashed line. (c) Planar radial density computed from the center
of the upper MMw region in a disk of height 2.0 nm just
above the membrane planes, before and after the correction to the
membrane potential. The yellow-shaded area corresponds to the presence
of the protein. The density value lower than 1 g/cm3 computed
in the corrected simulation is attributable to the natural depletion
layer in proximity of the membrane plane.Here we stress that the differences between the new OB-MM/CG
scheme
and the previous MM/CG lie in the boundary conditions and in the solvent
representation, while the hybrid scheme used for the protein–ligand
system is identical. Comparing term by term the MM/CG Hamiltonian
with the OB-MM/CG one (eq ), the potential energy internal to the protein, Vp, is identical; the potential term for water, Vw, reduces to the MMw term, Vw,αMM, with λα = 1 (no coarse-grained water
is present in the MM/CG); the Vsurf term
contains an additional boundary potential, proportional to 1/r, to confine water in the hemispherical domain (placed
above the binding cavity) and prevent its evaporation. Moreover, the
repulsive potential from the membrane planes is uniform in the xy-plane, as opposed to the analogous term in the OB-MM/CG
scheme.
Methods
System Setups
The system simulated
is the human β2-adrenergic
receptor (β2-AR) in complex with its inverse agonist S-carazolol (PDB code: 2RH1).[53] The third
intracellular loop (ICL3, between residues 231 and 262), lacking in
the PDB file, was modeled using the Web server HHPred[57] in conjunction with MODELLER,[58] as in ref (22). The
system was first equilibrated by all-atoms MD. To this aim, the protein
was inserted into a bilayer of 1-palmitoyl-2-oleoyl-phosphatidylcholine
(POPC), which represents the most abundant phospholipid in animal
cell membranes.[59] After having solvated
the box with SPC/E water molecules,[39] the
total size of the system amounted to ∼232000 atoms. The protein
is described using the Gromos43a1 force field,[60] while the parameters for the POPC are those by Berger,
Edholm, and Jähnig.[61] The fully
atomistic setup differs from a previous atomistic simulation of the
same system[62] (employed in the validation
of the MM/CG approach[22]) in the choice
of the protein force field and in the composition of the membrane.
The atomic charges for the ligand have been derived by RESP fitting[63−65] using the Gaussian03 package,[66] as in
ref (62). Details on
the simulation parameters for the equilibration step are reported
in the next section.In the OB-MMCG setup, performed using the
equilibrated structure, the radius of each MMw hemisphere, rMM, is set to 3.4 nm, while the width of the
HY region, rHY, is 1.2 nm. The former
is chosen so that the distance between the atomistic protein and the
boundary of the MMw region is at least 1.6 nm, as suggested
in ref (49); the latter,
instead, corresponds to the cutoff used for the nonbonded interactions,
so that water molecules at
the border of the MMw region do not interact directly with
CGwater molecules. The x,y-coordinates
of the center of the MMw hemisphere correspond to the x,y-coordinates of the CoM of the protein.
The distance between the membrane-mimicking planes is 52 Å. As
in previous MM/CG works,[22,31] the MM region in the
OB-MM/CG simulation includes all of the residues at a maximum distance
of 5 Å from the ligand. Specifically, they consist of residues
79–82, 86, 109–118, 164, 165, 193–195, 199–208,
282, 286, 289, 290, 293, 308, and 311–316. The total number
of particles in the OB-MM/CG system is ∼163000 (including the
fourth site on each water molecule, which plays a role only in the
CG interactions).Comparisons are made both with an all-atoms
MD, in order to validate
the OB-MM/CG simulation, and with the original version of the MM/CG
in order to assess possible differences in structural and dynamical
properties. The setup for the all-atoms production runs is the same
as that for the fully atomistic equilibration. In the MM/CG simulation,
the protein is coarse-grained in the same way as in the OB-MM/CG case.
The radius of the hemisphere, rhemi, is
slightly bigger than rMM; in this way,
a similar number of atomistic water molecules (∼2200) can be
accommodated with the right density when the potential preventing
evaporation is switched on. The planar walls are placed as in the
OB-MM/CG system. The total number of particles in the MM/CG system
is ∼8100.
Equilibration and Production Simulations
The equilibration
simulation, fully atomistic, has been performed with Gromacs 5.1.2.[67] First, the all-atoms system underwent 1661 steps
of minimization using the steepest descent integrator. Subsequently,
the temperature was slowly raised to 300 K using the Nosé–Hoover
thermostat;[68] once the final temperature
was reached, we ran a 300 ns simulation in the NPT ensemble, fixing the pressure at 1 bar by means of the Parrinello–Rahman
barostat,[69] with a time constant of 5.0
ps. The pressure coupling was isotropic in the x,y-directions and independent in the z-direction.
In the NPT simulation, a temperature of 300 K was
controlled through a leapfrog stochastic dynamics integrator,[70] with an inverse friction constant of 0.5 ps.
For all of the simulations, a 2 fs integration time step was used,
constraining the hydrogen-containing bonds with the LINCS algorithm.[71] A cutoff of 1.2 nm was used for electrostatics
and van der Waals interactions. Periodic boundary conditions were
used with the minimum image convention. The equilibrated system was
used as a starting structure for the OB-MM/CG, MM/CG, and fully atomistic
production runs. All of these simulations were performed with the
same parameters as the equilibration but in the NVT ensemble. The OB-MM/CG and MM/CG simulations were performed using
customized versions of Gromacs 4.[72] For
the calculation of structural properties, two 10 ns replicas of the
production run were performed on each system with a sampling time
of 10 ps. For calculation of dynamical properties, two replicas, each
2 ns long, were performed with a sampling time of 0.05 ps.
Postprocessing
Analyses
Visual inspection of the trajectories
was made using VMD 1.9.2.[73] Data analyses
were performed with GROMACS utilities, VMD, and in-house scripts.
The 10 ns replicas were used for the calculation of structural properties
and the 2 ns replicas for dynamical properties. Water properties were
calculated on the MMw regions of the OB-MM/CG and MM/CG
simulations and on the corresponding, hemispherical volume of the
all-atom system (with an approach similar to that in refs (22), (49), and (74)). The probability distribution
of the water tetrahedral order parameter was calculated as in ref (75). The reorientation time
correlation function (tcf) of the water OH bonds was defined as ⟨P2[u(0) · u(t)]⟩, where P2 is the
second-order Legendre polynomial and u is the vector
along the OH bond. The characteristic reorientation times were computed
as the integral of the corresponding tcfs. Receptor ligand hydrogen
bonds were identified assuming that an H-bond is formed when the distance
is less than 3.5 Å and the angle is less than 30°. Hydration
of the binding site has been assessed by monitoring the number of
water molecules at a maximum distance of 5 Å from the ligand.
The reorientational tcf of the ligand in the binding pocket was computed
with respect to the angle between the vector crossing the carbazole
ring of the ligand and the z-axis (defined as the
axis perpendicular to the membrane plane).
Results and Discussion
Our approach is validated by comparing structural and dynamical
properties computed with the OB-MM/CG scheme with those from all-atoms
MD simulations of the human β2-adrenergic receptor[76] (β2-AR) rhodopsin-like (or “Class
A”) hGPCR. β2-AR is an important target for a variety
of drugs, including the FDA-approved antiasthma agonist salbutamol.[77] We focus on the complex with the inverse agonist S-carazolol [4-(2-hydroxy-3-isopropylamino-propoxy)-carbazole][78−80] (Chart ), for which
experimental structural information (PDB code 2RH1(53)) and MD studies[62,81] have been reported.
The same analyses have been carried out using the original MM/CG scheme
as well in order to assess possible differences. In the following
analysis, hydration water refers to the water molecules in the hemispherical
MMw region of MM/CG and OB-MM/CG simulations and to those
in an equivalent hemispherical region in the case of all-atoms MD
(Figure SI 3).
Chart 1
Chemical Structure
of the Inverse Agonist S-Carazolola
The orange dashed line represents
the vector coplanar to the carbazole ring that is used for calculation
of the reorientational tcf ⟨P2(cos
θ)⟩.Structural properties of
the hydration water, such as the oxygen–oxygen
and oxygen–hydrogen pair radial distribution functions (Figure SI 4) and the tetrahedral order parameter[75] (Figure ), show that the OB-MM/CG is in closer agreement with fully
atomistic simulations than the MM/CG. We observe here that also the
average orientation times of hydration water molecules are similar
(1.88 ± 0.10 and 1.82 ± 0.03 ps for OB-MM/CG and all-atoms
MD, respectively, as opposed to 2.07 ± 0.03 ps for MM/CG). These
are slightly higher values than that for all-atoms MD of pure water
using the same force field as the one used here (SPC/E)[39] (1.7 ps[82]). This
can be related to the presence of the protein, which is known to slow
down the dynamics of water molecules in its first hydration shells.[83]
Figure 3
Histograms of the tetrahedral order parameter qtet for the water molecules above the binding
site, calculated
using all-atoms, MM/CG, and OB-MM/CG approaches.
Histograms of the tetrahedral order parameter qtet for the water molecules above the binding
site, calculated
using all-atoms, MM/CG, and OB-MM/CG approaches.In accordance with the experimental structural study[53] and the reported all-atoms MD simulations,[62,81] the carbazole ring of the ligand forms, in our simulations, hydrophobic
interactions with the side chains of residues Trp286 (6.48 according
to Ballesteros–Weinstein numbering[84]), Phe289 (6.51), and Phe290 (6.52) (Figure SI 5a,c), as well as directed H-bonds with Asp113 (3.32), Ser203
(5.42), and Asn312 (7.39) (Figure SI 5b,d). The ligand–Asn312 interaction is at times mediated by a
water molecule, as in previous MD studies.[62] Another water molecule bridges the ligand and residue Tyr316 (7.43).
In the MM/CG simulation, in addition to the mentioned interactions,
Ser204 (5.43) competes with Ser203 for binding to the ligand. Histograms
of the H-bond distances involving residues Asp113, Ser203, and Asn312
as obtained from OB-MM/CG, all-atom MD, and MM/CG are displayed in Figure . The OB-MM/CG scheme
reproduces fairly well the atomistic case, both in terms of the average
and variance of the distances, which are instead slightly underestimated
by the original MM/CG in the case of Asp113 and Asn312. In the case
of Ser203, the overall distribution obtained from MM/CG is shifted
to larger distances than that in fully atomistic or OB-MM/CG, very
likely because of the competitive interaction with Ser204.
Figure 4
Histograms
of the distances of hydrogen bonds between the receptor
and the ligand. (a) Distance Asp113(3.32) CCOO–S-car
HOH. (b) Distance Ser203(5.42) OCO–S-car
HNH. (c) Distance Asn312(7.39) OCO–S-car
NNH2+.
Histograms
of the distances of hydrogen bonds between the receptor
and the ligand. (a) Distance Asp113(3.32) CCOO–S-car
HOH. (b) Distance Ser203(5.42) OCO–S-car
HNH. (c) Distance Asn312(7.39) OCO–S-car
NNH2+.We also investigate ligand dynamics through the tcfs relevant
to
the reorientational dynamics of the vector crossing its carbazole
ring (Figure ). In
the system under study, we observe that the reorientational time scales
estimated by atomistic MD and OB-MM/CG are consistent among them,
while MM/CG produces faster relaxation dynamics. The plateau values
of the tcf, related to the degree of restriction of the reorientational
dynamics, show that both atomistic MD and OB-MM/CG lead to less restricted
dynamics compared to MM/CG. These differences are consistent with
the root-mean-square fluctuation (RMSF) values of Cα atoms of the residues stabilizing the ligand through hydrophobic
interactions or hydrogen bonds (Figure ). They show that indeed these residues can in general
undergo smaller fluctuation in MM/CG simulations, indicating more
rigid binding compared to atomistic MD or OB-MM/CG. The spreading
of ϕ and ψ Ramachandran angles estimated through the PADω parameter[85] (Figure SI 6), on the other hand, shows that these
residues maintain comparable torsional plasticity in the three simulation
schemes. Regarding hydration properties, Figure , illustrating the number of water molecules
in the binding cavity, clearly proves that the binding pocket hydration
in the OB-MM/CG simulation is more representative of the fully atomistic
case with respect to the MM/CG simulation. Therefore, although the
original MM/CG scheme reproduces ligand poses fairly well, as proved
here and for a plethora of protein–ligand complexes including
GPCRs,[18,19,22,23,30,31] the more realistic hydration achieved in our system through OB-MM/CG
can explain the observed improvements in the description of the binding
configurations explored by the ligand–protein complex and of
the binding site flexibility in particular.
Figure 5
Reorientational tcfs
of the ligand in the binding pocket. When
the tcfs are fitted with a “model-free” function of
the form C(t) = S2 + (1 – S2)e–, the generalized order parameter S2 takes the values 0.963, 0.972, and 0.963 for
the all-atom MD, MM/CG, and OB-MM/CG case, respectively.
Figure 6
RMSFs of the Cα atoms in the MM region.
The overall
protein backbone flexibility is in line with that of previously published
atomistic and MM/CG models of β2-AR,[22] with differences typically smaller than 0.3 Å. The shaded areas
correspond to the residues stabilizing the ligand through hydrophobic
interactions or hydrogen bonds.
Figure 7
Histograms of the number of water molecules at a maximum distance
of 5 Å from the ligand in the binding cavity.
Reorientational tcfs
of the ligand in the binding pocket. When
the tcfs are fitted with a “model-free” function of
the form C(t) = S2 + (1 – S2)e–, the generalized order parameter S2 takes the values 0.963, 0.972, and 0.963 for
the all-atom MD, MM/CG, and OB-MM/CG case, respectively.RMSFs of the Cα atoms in the MM region.
The overall
protein backbone flexibility is in line with that of previously published
atomistic and MM/CG models of β2-AR,[22] with differences typically smaller than 0.3 Å. The shaded areas
correspond to the residues stabilizing the ligand through hydrophobic
interactions or hydrogen bonds.Histograms of the number of water molecules at a maximum distance
of 5 Å from the ligand in the binding cavity.
Conclusions
This paper introduced
a novel hybrid multiscale approach, named
OB-MM/CG, for the simulation of membrane protein–ligand complexes.
The method couples concomitant atomistic/coarse-grained representations
of the protein and the solvent, while the cell membrane is represented
implicitly through a series of confining potentials. Specifically,
the representation of the protein–ligand complex (based on
the MM/CG scheme) aims at reducing the number of degrees of freedom
of residues far from the binding site, making the approach tailored
for low-resolution protein models where the inaccurate orientations
of side chains would introduce artifacts in all-atoms simulations.
On the other side, the implementation of H-AdResS for the representation
of hydration water leads to the simulation of a rigorous statistical
ensemble—the grand canonical one—in the region at atomistic
resolution, allowing a more accurate description of hydration and
ligand–protein interactions and, in principle, opening the
possibility of binding free energy calculations.In this paper,
we validated the OB-MM/CG method on a well-studied
GPCR, the β2-adrenergic receptor, in complex with its inverse
agonist S-carazolol. Structural and dynamical properties
of both the solvent and the complex in the OB-MM/CG simulations are
in good agreement with results from fully atomistic simulations. Moreover,
some analyses (namely, hydration properties and binding site flexibility)
show appreciable improvement with respect to the previous MM/CG implementation.
These results provide solid ground for the use of the OB-MM/CG scheme
for drug design applications and binding free energy calculations
in GPCR–ligand complexes, where a fully atomistic description
of the receptor is still impaired by the lack of experiment-based
structural information.