Sangyun Lee1, Ruibin Liang1, Gregory A Voth1, Jessica M J Swanson1. 1. Department of Chemistry, Institute for Biophysical Dynamics, James Franck Institute, and Computation Institute, University of Chicago , 5735 S. Ellis Avenue, Chicago, Illinois 60637, United States.
Abstract
An important challenge in the simulation of biomolecular systems is a quantitative description of the protonation and deprotonation process of amino acid residues. Despite the seeming simplicity of adding or removing a positively charged hydrogen nucleus, simulating the actual protonation/deprotonation process is inherently difficult. It requires both the explicit treatment of the excess proton, including its charge defect delocalization and Grotthuss shuttling through inhomogeneous moieties (water and amino residues), and extensive sampling of coupled condensed phase motions. In a recent paper (J. Chem. Theory Comput. 2014, 10, 2729-2737), a multiscale approach was developed to map high-level quantum mechanics/molecular mechanics (QM/MM) data into a multiscale reactive molecular dynamics (MS-RMD) model in order to describe amino acid deprotonation in bulk water. In this article, we extend the fitting approach (called FitRMD) to create MS-RMD models for ionizable amino acids within proteins. The resulting models are shown to faithfully reproduce the free energy profiles of the reference QM/MM Hamiltonian for PT inside an example protein, the ClC-ec1 H(+)/Cl(-) antiporter. Moreover, we show that the resulting MS-RMD models are computationally efficient enough to then characterize more complex 2-dimensional free energy surfaces due to slow degrees of freedom such as water hydration of internal protein cavities that can be inherently coupled to the excess proton charge translocation. The FitRMD method is thus shown to be an effective way to map ab initio level accuracy into a much more computationally efficient reactive MD method in order to explicitly simulate and quantitatively describe amino acid protonation/deprotonation in proteins.
An important challenge in the simulation of biomolecular systems is a quantitative description of the protonation and deprotonation process of amino acid residues. Despite the seeming simplicity of adding or removing a positively charged hydrogen nucleus, simulating the actual protonation/deprotonation process is inherently difficult. It requires both the explicit treatment of the excess proton, including its charge defect delocalization and Grotthuss shuttling through inhomogeneous moieties (water and amino residues), and extensive sampling of coupled condensed phase motions. In a recent paper (J. Chem. Theory Comput. 2014, 10, 2729-2737), a multiscale approach was developed to map high-level quantum mechanics/molecular mechanics (QM/MM) data into a multiscale reactive molecular dynamics (MS-RMD) model in order to describe amino acid deprotonation in bulk water. In this article, we extend the fitting approach (called FitRMD) to create MS-RMD models for ionizable amino acids within proteins. The resulting models are shown to faithfully reproduce the free energy profiles of the reference QM/MM Hamiltonian for PT inside an example protein, the ClC-ec1 H(+)/Cl(-) antiporter. Moreover, we show that the resulting MS-RMD models are computationally efficient enough to then characterize more complex 2-dimensional free energy surfaces due to slow degrees of freedom such as water hydration of internal protein cavities that can be inherently coupled to the excess proton charge translocation. The FitRMD method is thus shown to be an effective way to map ab initio level accuracy into a much more computationally efficient reactive MD method in order to explicitly simulate and quantitatively describe amino acid protonation/deprotonation in proteins.
The hydrated excess
“proton” is in fact a unit of
net positive charge[1] due to a missing electron
that can be passed between and among biomolecules[2] in a seemingly simple dance of charge neutralization. However,
the molecular nature of this dance is complicated, involving a dynamic
charge delocalization between molecules and a constant restructuring
of covalent and hydrogen bond topology. Understanding and being able
to characterize the migration of excess protons is important given
the many roles that proton transport (PT) plays throughout biology.
Virtually all biomolecules are sensitive to pH, and many incorporate
PT into their functional mechanisms, including transporters, proton
pumps, proton channels, and enzymes. For example, the bacterial H+/Cl– antiporter ClC-ec1, a homologue of
mammalianClC antiporters, energetically couples the transmembrane
transport of two Cl– ions and one proton.[3,4] The mammalian ClCs have many physiological functions, including
maintenance of the membrane potential, regulation of transepithelial
Cl– transport, and control of pH in the cytoplasm
and intracellular organelles.[5−9] Cytochrome c oxidase (CcO), a
proton pump in the respiratory chain of mitochondria and bacteria,
reduces oxygen to water and utilizes the released free energy to pump
protons across the membrane, contributing to the transmembrane electrochemical
potential gradient that is necessary for ATP synthesis.[10−13] The influenza A M2 proton channel protein[14] transports the protons across the influenza virus membrane and triggers
the dissociation of the viral matrix proteins, which is an essential
step in the influenza virus replication cycle.[15] This is a short list of the many systems in which PT plays
a role in a functional mechanism. For such systems, the ionizable
amino acids that influence the PT pathways often play an active role
via transient protonation and deprotonation. For example in the ClC-ec1
antiporter, an intracellular-facing glutamic acid, E203, shuttles
an excess proton to an extracellular-facing glutamic acid, E148, during
transmembrane PT.[16,17] In CcO, the
highly conserved glutamic acidE286 is the key for both transmembrane
proton pumping and PT for the chemical reaction.[18,19] In the influenza A M2 channel, a highly conserved tetrad of histidine
residues (H37) is responsible for the pH-dependent channel activation
and proton selectivity.[20]Computational
approaches can play an important role in the investigation
of PT mechanisms in proteins, adding molecular level insight as well
as increased temporal and spatial resolution to experimental data.
However, it is very challenging to explicitly model
the PT process, even in simple bulk water solution, because it involves
charge defect delocalization, Grotthuss shuttling (proton hopping),
and solvent reorganization. Moreover, the migration of an excess proton
in proteins and other confined spaces can be nontrivially coupled
with changes in the hydration along the PT pathway.[21−23] As will be
discussed later, water molecules move in/out of internal protein cavities
in response to the excess positive charge as it moves between water
molecules and ionizable amino acids. Therefore, to more completely
describe PT in biological systems, a computational method should (1)
explicitly treat charge defect delocalization and Grotthuss shuttling
of the excess proton(s) between water molecules and ionizable amino
acids undergoing protonation/deprotonation; (2) allow exchange of
water molecules between different protein internal cavities as well
as between those cavities and bulk solution; and (3) be computationally
efficient enough to achieve sufficient sampling of the charge translocation
and protein and solvent configuration space for large-scale biomolecular
systems, including protein, solvent, ions, and (where needed) the
membrane. Quantum mechanical approaches, such as ab initio molecular dynamics (AIMD) or quantum mechanics molecular mechanics
(QM/MM) MD methods, can in principle explicitly treat the reactive
nature of amino acid protonation/deprotonation and the Grotthuss hopping
phenomenon. However, their computational expense limits the application
of the former to small systems much smaller than proteins and restricts
the free energy sampling of the latter when applied to large biomolecular
systems. A lack of sufficient free energy sampling generally leads
to results with large systematic errors that depend on the initial
state of the system. In addition, QM/MM methods without adaptive partitioning[24,25] prohibit the exchange of MM and QM atoms across the QM/MM boundary,
which is unphysical. Although adaptive QM/MM partitioning methods
enable the exchange of MM and QM atoms across the boundary, they can
suffer from both inaccuracy of forces on atoms near the boundary and
a lack of sufficient energy conservation.[26] The QM/MM boundary may be especially problematic when the hydration
along the PT pathway changes in response to the migration of the excess
proton charge defect since this necessitates the exchange of water
molecules between protein cavities and bulk solution, which often
occurs beyond the nanosecond time scale.An alternative approach
that explicitly accounts for the reactive
nature of the hydrated excess proton is the multiscale reactive molecular
dynamics (MS-RMD) method.[2,27,28] This approach describes the migration of an excess proton including
explicit Grotthuss shuttling and charge defect delocalization by evolving
the system on a reactive potential energy surface defined as a dynamical
linear combination of diabatic basis states, as in the earlier multistate
empirical valence bond (MS-EVB) method.[2,29,30] As will be shown in detail later, the newer MS-RMD
approach differs from the older MS-EVB approach primarily because
the underlying reactive force field of MS-RMD is largely derived from
AIMD or QM/MM data via force-matching and other means, rendering it
far less empirical than the original MS-EVB approach. MS-RMD is also
“multiscale” in the sense that quantum information on
the electronic states is variationally bridged upward in scale to
describe the forces felt by the system nuclei in the RMD model. As
shown later as well, MS-RMD is several orders of magnitude more computationally
efficient than MD from QM/MM, while still describing the charge delocalization
and reactive nature of the PT. The MS-RMD method is also not complicated
by the QM/MM boundary issues because the molecules that participate
in the proton charge defect charge delocalization are dynamically
redefined at each time step in such a way that the forces on the atoms
are more continuous. Thus, as water molecules or residues move away
from the excess proton, the forces acting on the atoms gradually transition
to those described by the classical force field. These advantages
make the MS-RMD method (and MS-EVB before it) a powerful tool for
investigating PT in a variety of biological systems.However,
the primary challenge of the MS-RMD method is that it
needs to be properly parametrized in order to faithfully simulate
PT. The parametrization of MS-RMD models is the focus of this work.
In ref (31), an approach
for parametrizing MS-RMD models for ionizable amino acids in bulk
water was developed. In this fitting approach, forces from QM/MM calculations
are bridged via an iterative variational framework into the reactive
MS-RMD model. In particular, the MS-RMD model parameters are obtained
by sampling configurations with a guess MS-RMD model and then fitting
the model parameters to best reproduce forces from QM/MM calculations
that are as accurate as possible. Our fitting procedure was partially
motivated by the work of Wang and co-workers, in which force fields
for liquid water were developed with an adaptive force matching method.[32,33] It also bears similarity to the work of Zhou and Pu on reactive
path force matching.[34] Because of the use
of configurations from condensed phase MS-RMD trajectories, the resulting
model takes into account the condensed phase environment, which is
not captured in gas phase fitting. For a given MS-RMD potential energy
functional form and with restrictions on the ranges of parameter values,
this fitting approach was shown to work well for investigating glutamic
acid (Glu) and aspartic acid (Asp) deprotonation in bulk water.[31] However, one cannot simply use these models
to study the deprotonation of amino acids such as Asp and Glu within
proteins. The electrostatic and hydration environment affecting ionizable
amino acid protonation/deprotonation inside biomolecules can be very
different from that in bulk solution. For example, deprotonated Asp
and Glu residues often form salt bridges with positively charged residues
in proteins, and the surrounding hydration structure is rarely bulk-like.
As a consequence, the pKa values of amino
acids inside proteins are often largely shifted from those measured
in bulk solution. These differences require reparameterization of
the MS-RMD models to fit the protein environment.In this work,
we demonstrated the use of FitRMD parametrization
to generate MS-RMD models for ionizable amino acid in proteins. The
paper is organized as follows. We first outline a framework for fitting
MS-RMD models for amino acid residues in biomolecular systems. We
then give the computational details for following this framework in
two example systems: the ClC-ec1 antiporter and CcO. The ability of the generated MS-RMD models to faithfully reproduce
the free energy profiles (potentials of mean force, PMFs) of the reference
QM/MM Hamiltonian is demonstrated for PT inside the ClC-ec1 channel
in two different states. The importance of sampling and related QM/MM
limitations are also demonstrated in ClC-ec1. Finally, the advantages
of developing efficient MS-RMD models are demonstrated with the calculation
of 2-dimensional free energy profiles (2D-PMFs) of PT coupled to hydration
changes in the central hydrophobic region of CcO.
Methods
Multiscale
Reactive Molecular Dynamics (MS-RMD)
The
MS-RMD (and MS-EVB) method describes electronic delocalization of
the excess proton as a linear combination of distinct states with
different chemical bonding topologies. In the central region of the
ClC-ec1 antiporter (see Figure for the images of the whole protein structure and its central
region), Figure shows
examples of a set of states in certain configurations, where the excess
proton is delocalized into a glutamate residue and surrounding water
molecules. The Hamiltonian is defined aswhere r are the system nuclear
coordinates, h is the
potential energy for state i described by a classical
force field, and h is
the coupling between states |i⟩ and |j⟩. The diagonalization of the Hamiltonian matrix
on the fly gives the energy and eigenvector of the ground state for
every configuration of nuclei r:
Figure 1
(A) Overview of the structure of the ClC-ec1
antiporter and transport
pathways for Cl– (green dashed) and H+ (red dashed). The central region is highlighted in the black box.
(B) Representative configurations of the central region with (left)
Cl–cen absent and (right) present for
the “P” state in the PMFs in Figure . The protein and water molecules (in sticks)
and Cl–cen (in a sphere) are included
in the QM region in the QM/MM calculation of the reference force.
Figure 2
Representative configurations of a set of MS-RMD
states with different
bonding topologies for the same atomic coordinates in the ClC-ec1
antiporter. The configurations represent the three mostly populated
diabatic states (with the highest values of c2(r)), captured at the windows for ζ = 0.8, when Cl–cen is
either absent (A–C) or present (D–F). The excess proton
is attached to E148 (A and D), the first water from E148 (B and E),
or the second water (C and F). The protonated species in each diabatic
state is shown inside the blue dashed line.
(A) Overview of the structure of the ClC-ec1
antiporter and transport
pathways for Cl– (green dashed) and H+ (red dashed). The central region is highlighted in the black box.
(B) Representative configurations of the central region with (left)
Cl–cen absent and (right) present for
the “P” state in the PMFs in Figure . The protein and water molecules (in sticks)
and Cl–cen (in a sphere) are included
in the QM region in the QM/MM calculation of the reference force.
Figure 3
PMFs of PT through the central region of the ClC-ec1 antiporter
with (A) Cl–cen absent and (B) present,
as calculated with MS-RMD (blue) and QM/MM (red). Labels “R”,
“T”, and “P” indicate the reactant state,
the center of the CV range, and the product state, respectively. The
CV ζ varies from 0 to 1 as the
excess proton CEC moves from E203 to E148. The average side chain
positions of four important residues are shown for state “T”
in the MS-RMD systems with (C) Cl–cen absent or (D) present. The solvent positions and proton center of
the excess charge (CEC, yellow sphere) are taken from the last snapshot.
Representative configurations of a set of MS-RMD
states with different
bonding topologies for the same atomic coordinates in the ClC-ec1
antiporter. The configurations represent the three mostly populated
diabatic states (with the highest values of c2(r)), captured at the windows for ζ = 0.8, when Cl–cen is
either absent (A–C) or present (D–F). The excess proton
is attached to E148 (A and D), the first water from E148 (B and E),
or the second water (C and F). The protonated species in each diabatic
state is shown inside the blue dashed line.PMFs of PT through the central region of the ClC-ec1 antiporter
with (A) Cl–cen absent and (B) present,
as calculated with MS-RMD (blue) and QM/MM (red). Labels “R”,
“T”, and “P” indicate the reactant state,
the center of the CV range, and the product state, respectively. The
CV ζ varies from 0 to 1 as the
excess proton CEC moves from E203 to E148. The average side chain
positions of four important residues are shown for state “T”
in the MS-RMD systems with (C) Cl–cen absent or (D) present. The solvent positions and proton center of
the excess charge (CEC, yellow sphere) are taken from the last snapshot.The forces are evaluated by the
Hellmann–Feynman theorem
and are used to propagate the system in an MD simulation. It is important
to note that the method explicitly treats the excess proton charge
defect delocalization, Grotthuss shuttling, and the polarization effect
associated with the excess proton complex. The resulting MD trajectory
is continuous and deterministic to within numerical error.The
diagonal elements h of the MS-RMD Hamiltonian are given by the potential energy
function of each basis state i. Note that there is
a single excess proton in the system and that either glutamate or
water is protonated at each state. The h corresponding to the state with protonated glutamate
(GLUH) is described aswhere the first
three terms are the inter-
and the intramolecular potentials of protonated glutamate and all
other surrounding molecules, such as waters, other protein residues,
lipids, and ions in the system. They are computed with the CHARMM22
force field,[35] with the exception of the
O–H bond in the carboxyl (−COOH) group of glutamate.
To properly represent its dissociation as a proton transfers from
the carboxyl oxygen to the wateroxygen, the harmonic bond stretch
potential is replaced with a standard Morse potential, UMorse(r):where r is the O–H
bond length, and D, α, and r0 are parameters, which are taken from our previous work.[31] Since the classical force fields between two
protonated forms of glutamate and water do not share a common energy
origin, V is introduced
to compensate the constant energy shift between the two states.In order to correct overestimated electrostatic interaction at
a short distance[36] between opposite charges
on hydronium and deprotonated glutamate, two repulsive terms, V and V, are introduced in h corresponding to the state
with deprotonated glutamate:where R is the
distance between the hydroniumoxygen, O, and the
carboxyl oxygen of glutamate, Oϵ (OE1 and OE2 in the PDB), and R is the distance between each of the three
hydroniumhydrogen atoms, H, and the carboxyl oxygen of glutamate. The functional forms
for the repulsive terms are the same as those used in the MS-EVB3.1
model.[31]B, b, b′, C, and c are fitted parameters, and d0 and d0 are fixed with the same value used in MS-EVB3.1,
which are 2.4 and 1.0 Å, respectively.The off-diagonal
element h for the coupling
between protonated glutamate and water is
given bywhere r is the distance between the donoroxygen of the carboxyl group
of glutamate and the acceptor hydrogen of the adjacent hydronium molecule. c1, c2, and c3 are fitted parameters. h for the coupling between hydronium and
water is the same as the one used in the MS-EVB3.1 model.The
MS-RMD (and MS-EVB) formalism also provides a convenient and
physically intuitive description of the excess proton center of the
excess charge (CEC), defined as[2]where the NEVB is the total number of EVB states, c2(r) is the population of state i contributing
to the MS-RMD ground state, and rCOC is the geometric
center-of-charge for state i. This CEC definition
allows the use of a continuous reaction coordinate (further discussed
below) for the PMF calculation of PT in biological systems.[2] The protonated moiety in the state with the largest
coefficient, c1 (eq ), possesses the majority of the excess positive
charge, when in bulk water this state is the most hydronium-like species[29,30,37,38] (or the so-called “pivot” hydronium, a technical term
used below).
FitRMD Parametrization Scheme
One
possible fitting
procedure of an MS-RMD potential energy function to QM/MM data was
first described in ref (31). In such an approach, configurations along the PT reaction coordinate
are sampled by MS-RMD umbrella sampling simulations with the initial
guess MS-RMD model parameter set. In the present article, the initial
guess amino acid models were taken from the previous work done in
bulk water,[31] except that the constant
energy shift between protonated/deprotonated states in the model (V) was determined by the difference
in the Coulomb energy of the RMD (EVB) complex between the most favorable
hydronium state and the protonated state. The range of the PT reaction
coordinate was set to sample configurations for both protonated/deprotonated
states (further defined below). Next, for each configuration a QM/MM
calculation was performed to collect the reference forces on each
atom in the MS-RMD reactive complex. Then, the MS-RMD model parameter
set was optimized by minimizing the variational residual:where N and N are the
number of configurations and the number atoms in each configuration,
respectively. The weight of each atom w(r) is set to unity here, but it should
be noted that other choices are possible. The atomic force F is the one obtained from the current
MS-RMD model parameter set, and FQM/MM is the
reference force from the QM/MM calculations. The whole set of the
model parameters were then divided into two groups: (1) the diagonal
terms V and (2) the
off-diagonal and repulsive terms. The model optimization was done
individually for each group. First, the off-diagonal and repulsive
terms were fit with the value of V fixed, and then V was refit with new values of the off-diagonal and repulsive
terms held fixed.
Developing MS-RMD Models Using FitRMD for
ClC-ec1
MS-RMD
simulations of the PT process in the bacterial ClC homologue, ClC-ec1,
are extensively described in ref (39). The FitRMD method was used to parametrize the
MS-RMD models from QM/MM data for the two glutamate residues in the
central region, namely, the E148 (Gluex) and the E203 (Gluin), which have been identified as intermediate sites for proton
binding along the transport pathway (see Figure ). Depending on the presence of Cl– at the central site (Cl–cen), two different
systems were setup. For each system setup, two sets of umbrella sampling
simulations were performed (deprotonating either E148 or E203) to
sample configurations for the generation of the QM/MM force data.
In order to obtain the initial configurations for the umbrella sampling,
50 ns of unconstrained classical MD was run both with Cl–cen present and absent. In this time, water penetrated
the central region from the intracellular side of the membrane, forming
a continuous hydrogen bonded network between the carboxyl oxygens
of E148 and E203. A harmonic potential with a force constant of 20
kcal·mol–1·Å–2 was
applied to a collective variable (CV), defined as |r – r|, where “X” is the carboxyl
center of mass for either E148 or E203. The centers of the windows
ranged from 2.0 to 4.0 Å and were separated by 0.25 Å. At
CV = 2.0 Å, all configurations have the amino acid fully protonated,
while at CV = 4.0 Å all were deprotonated with protonated state
contributing less than 0.01% (c2(r) ≤ 10–4).The configurations were
selected with a 2 ps interval from each window for ∼500 configurations
for each protonation site. For each configuration, a single point
QM/MM calculation was performed to evaluate the forces acting on the
QM atoms. As shown in Figure B, the QM region included Cl–cen, if present, the water molecules in the central region, and the
side chain of the pore-lining residues, including E113, E148, E202,
E203, Y445, and S107. The waters in the central region within the
third solvation shell from E148 and E203 were also included in the
QM region. The QM box size was set to be 20–30 Å in each
dimension to ensure that it was 6–8 Å larger than the
size of the QM atoms in each dimension. The Gaussian Expansion of
the Electrostatic Potential (GEEP) scheme was used to treat the QM/MM
electrostatic coupling with periodic boundary conditions (PBCs),[40,41] and the spurious QM/QM periodic image interactions were decoupled
as described in ref (42). The Cα–Cβ chemical bonds
that crossed the QM/MM boundary were capped with hydrogen atoms, the
forces on which were calculated following the IMOMM scheme[43] with a scaling factor of 1.50. The QM atoms
were treated with density functional theory (DFT) using the BLYP functional[44,45] with empirical dispersion corrections,[46] under the Gaussian and plane wave (GPW) scheme.[47] Goedecker–Teter–Hutter (GTH) pseudopotentials[48] were used, and the Kohn–Sham orbitals
were expanded in the Gaussian TZV2P basis set.After the QM/MM
forces were obtained, the MS-RMD model parameters
were optimized by minimizing the residual given in eq using a genetic algorithm.[49] All atoms in the reactive complex, as defined
by the MS-RMD state-selection algorithm,[30] were included in the fitting (i.e., the glutamic acid side chains
and solvent atoms in the central region). The V and the off-diagonal terms were iteratively
optimized for 3–6 rounds, depending on the system, until they
changed less than 1%. The models for E203 and E148 were developed
independently because they are separated by 10–14 Å, depending
on the presence of Cl–cen, and therefore
never participate in coupled delocalization of the excess charge.
The MS-RMD simulations were performed with the RAPTOR software[28] interfaced with the LAMMPS MD package (http://lammps.sandia.gov),[50] and
the umbrella potentials were controlled by the PLUMED package.[51] All single point QM/MM calculations are performed
by the CP2K package.[52] The FitRMD calculation
was performed by an in-house code.[31] Parameters
for E148 and E203 for ClC-ec1 with Cl–cen either absent or present are given in Table .
Table 1
MS-RMD Model Parameters
of E148 and
E203 in ClC-ec1
ClC-ec1
Cl–cen absent
Cl–cen present
E148
E203
E148
E203
B
0.063153
0.118175
0.012536
0.003282
b
1.571751
0.558680
0.232384
0.424087
b′
1.320947
1.311301
1.469007
0.480047
dOO0
2.4
2.4
2.4
2.4
C
0.363648
0.076533
0.014044
0.026472
c
1.117167
1.063753
1.152912
1.309021
dOH0
1.0
1.0
1.0
1.0
rsl
3.5
3.5
3.5
3.5
rsh
4.0
4.0
4.0
4.0
Vii
–147.095673
–144.522565
–151.11473
–147.01392
c1
–36.090543
–40.406344
–30.414842
–35.686750
c2
1.879933
3.826462
3.331769
1.291340
c3
1.193253
1.439519
1.422240
1.598104
D
143.003
143.003
143.003
143.003
α
1.8
1.8
1.8
1.8
r0
0.975
0.975
0.975
0.975
1D-PMF Calculations
of PT from E203 to E148 in ClC-ec1
In order to directly compare
MS-RMD and QM/MM free energy profiles,
umbrella sampling was performed with a CV defined as ζ, which is a function of the excess proton CEC (rCEC) and the center of mass of the carboxyl groups
of the E203 (r203) and the E148 (r148) residues:The excess proton CEC coordinate, however,
was defined as[53]where the r is the
position of the jth heavy atom in the QM region,
which is either the wateroxygen atoms or the carboxyl oxygen atoms in the E203 or the E148,
and the r is the position of the ith hydrogen
atom bound to those heavy atoms. (It should be noted that this CEC
definition was used for both the QM/MM and MS-RMD simulations, as
opposed to eq for the
latter, in order to have a common definition of the CV in the two
PMFs for comparison.) The weighting factor, w, was set to
be two for all the wateroxygen atoms and zero for carboxyl oxygen
atoms in the E203 and E148, which reflect the hydrogen coordination
number in the deprotonated state of the heavy atom. The term d denotes the distance
between X and H atoms, and the f(d) = 1/(1 + exp[(d – r)/d]) is the switching
function describing the coordination number of H to X, with the parameters chosen as d = 0.04 Å, r = 1.25 Å.[54]The CV ζ varies from 0 to 1
as the excess proton CEC moves from E203 to E148. The centers of the
harmonic umbrella potentials were separated by 0.1–0.2 Å
between adjacent windows. The umbrella sampling force constant for
ζ was chosen to be in the range
of 3000–7000 kcal·mol–1, depending on
the sampling overlap between umbrella windows. Given that the denominator
of ζ, |rCEC – r203| + |rCEC – r148|, is ∼15 Å, the
effective force constant acting on |rCEC – r203| was 15–30 kcal·mol–1·Å–2. Because the central region is accessible
by the water molecules from the intracellular bulk, the CEC can locate
outside the central region at the windows near E203 when a restraint
is only placed on ζ. To avoid sampling
irrelevant positions (leading to the intracellular bulk phase), an
additional restraint defining the upper limit for |rCEC – r148| was applied such
that the CEC was always situated between E148 and E203. The upper
limit was chosen to be 2σ above the average of |rCEC – r148| in each window.The details of the preparation of the initial configuration and
the classical MD equilibration for the ClC-ec1 antiporter are described
in ref (39). The initial
configurations used for the FitRMD umbrella sampling were used to
initiate the MS-RMD umbrella sampling simulations. Near the middle
of the reaction path, when E203 and E148 were both deprotonated, a
water molecule close to the center of each umbrella window was replaced
with a hydronium cation. All windows were first equilibrated for 100
ps, followed by production runs of ∼2 ns. The integration time
step was 1 fs. The CV ζ was collected
every 10 time steps (10 fs), and the PMF was constructed by the weighted
histogram analysis method (WHAM).[55] Statistical
error bars in the PMFs were estimated using the block averaging method
by dividing each trajectory into four consecutive blocks.
1D-PMF Calculation
with QM/MM
The initial configurations
used in the MS-RMD simulations were also used for the QM/MM umbrella
sampling simulations. The simulation details for the QM/MM MD were
kept consistent with the single point force calculations used in FitRMD.
The window spacing and force constants for the umbrella windows were
similar to those used in the MS-RMD simulations. All windows were
equilibrated with QM/MM MD for another ∼5 ps, followed by production
runs of ∼20 ps. (It should be noted that in our experience
if a FitRMD model differs significantly from the underlying QM/MM
forces due to a bad fit, then subsequent QM/MM MD configurations taken
starting with the MS-RMD initial conditions will diverge quickly from
the MS-RMD ones.) The integration time step was 0.5 fs, and the CV
ζ was collected every time step.
The PMF was again constructed by WHAM. All QM/MM MD simulations were
performed by the CP2K package.[52]Upon finding that the PMFs (see Figure ) calculated with QM/MM and MS-RMD MD disagreed
for the Cl–cen absent system in the CV
range 0.18 < ζ < 0.26, a
second set of QM/MM simulations was run in order to investigate the
origin of the PMF discrepancy. The second set of QM/MM umbrella sampling
simulations were initiated from the last MD snapshots of the MS-RMD
umbrella sampling simulations, which had an increased hydration level
compared to that of the original QM/MM simulations. All windows in
the CV range 0.18 < ζ < 0.26
were then sampled for 5 ps. All other simulation details were unchanged
from the original QM/MM umbrella sampling.
Comparative Structural
Analysis in ClC-ec1
Structural
elements of the ClC-ec1 protein are shown in Figures –6. To obtain these structures, the production
run simulation conformations were averaged in each umbrella window
following three steps. First, configurations were aligned based on
α carbon atoms at least 15 Å away from the central region.
Second, the positions of the side chains of the four residues in the
central region (S107, E148, E203, and Y445) and Cl–cen, if present, were averaged, and full moieties representative
of these averages were depicted. Third, the positions of the water
molecules and the excess proton were taken from the last MD frame.
Figure 6
Panel A: The PMF from Figure A for ζ = 0.18–0.26.
The second QM/MM PMF (green) was initiated from the final MS-RMD configuration
in each window. (B) The average and the standard error of c12(r), reflecting the delocalization of the hydrated excess
proton (see text), from the original QM/MM, the MS-RMD, and the second
QM/MM umbrella sampling at the window ζ = 0.24. Panels C–E: The averaged structure of the four
residues and the water structure in the last snapshot from the (C)
the QM/MM, (D) the MS-RMD, and (E) the second QM/MM umbrella sampling
at the window ζ = 0.24. The region
circled by a gray line represents the position of the second water.
(See further description in the main text.) Panel F: The coordination
number of the pivot hydronium to oxygen atoms of surrounding water
molecules or the carboxyl group of E203 in the MS-RMD simulation,
initiated from the configuration at low hydration level, which mimics
the configuration of the QM/MM in C.
In the
system with Cl–cen absent,
RDFs from E148 carboxyl carbon (CD in the PDB) to wateroxygens are
shown in A1 to A3, for the “R”, “T”, and
“P” states (Figure ), respectively. The RDFs from the E203 carboxyl carbon
to wateroxygens are shown in B1 to B3 in the same order. The averaged
positions of the four side chains are shown in C1 to C3 in the same
order, with the QM/MM structures in red and the MS-RMD structures
in blue.In the CLC system as in Figure but with Cl–cen present.
All notation is consistent with Figure .
Figure 4
In the
system with Cl–cen absent,
RDFs from E148 carboxyl carbon (CD in the PDB) to water oxygens are
shown in A1 to A3, for the “R”, “T”, and
“P” states (Figure ), respectively. The RDFs from the E203 carboxyl carbon
to water oxygens are shown in B1 to B3 in the same order. The averaged
positions of the four side chains are shown in C1 to C3 in the same
order, with the QM/MM structures in red and the MS-RMD structures
in blue.
Panel A: The PMF from Figure A for ζ = 0.18–0.26.
The second QM/MM PMF (green) was initiated from the final MS-RMD configuration
in each window. (B) The average and the standard error of c12(r), reflecting the delocalization of the hydrated excess
proton (see text), from the original QM/MM, the MS-RMD, and the second
QM/MM umbrella sampling at the window ζ = 0.24. Panels C–E: The averaged structure of the four
residues and the water structure in the last snapshot from the (C)
the QM/MM, (D) the MS-RMD, and (E) the second QM/MM umbrella sampling
at the window ζ = 0.24. The region
circled by a gray line represents the position of the second water.
(See further description in the main text.) Panel F: The coordination
number of the pivot hydronium to oxygen atoms of surrounding water
molecules or the carboxyl group of E203 in the MS-RMD simulation,
initiated from the configuration at low hydration level, which mimics
the configuration of the QM/MM in C.
Hydration Dynamics in the Central Region of ClC-ec1
To calculate
the time scale for the transition from a low to high
hydration state in the central region of ClC-ec1, the low hydration
state first needed to be created in the MS-RMD structure. To do this,
a high hydration configuration was taken from the last MS-RMD snapshot
at umbrella window ζ = 0.24 (Figure D). An additional
harmonic potential was applied to the water density CV (N) in the predefined box (see refs (23) and (39) and for definition). The
force constant of the harmonic potential was set to be 10 kcal·mol–1, and the center of the potential was at N = 0. The center of the box was chosen
to be the midpoint between the center of mass of each carboxyl group
of E148 and E203, and the box size was set to be 2
× 2 × 2 Å3, which covered
the position of the second water. The second water present in the
initial configuration was expelled from the central region by the
harmonic potential centered at N = 0. The MS-RMD simulation was run for 200 ps, until the system
was equilibrated in the absence of the second water.Ten independent trajectories were initiated from
the last MD frame of the simulation above, after the velocities of
all atoms were randomized and the harmonic potential centered at N = 0 was released. The coordination
number of the pivot hydronium was defined as the number of oxygen
atoms (from water or the E203 carboxyl group) within 3 Å from
the pivot hydroniumoxygen. The time for the second water entering
the central region was estimated by averaging the 10 simulations time
to when there was a productive (not transient) transition of the coordination
number from 2 to 3.
Developing MS-RMD Models Using FitRMD for
CcO
The FitRMD
method was also used to parametrize the MS-RMD models from QM/MM data
for three protonatable sites in the hydrophobic cavity of CcO, including the E286, PRDa3, and PRAa3 (Figure ). Umbrella sampling simulations were first
carried out along the PT pathway identified by the MS-RMD metadynamics
(MTD) simulation.[56,57] Then for each protonatable site,
∼100 configurations were selected from the trajectories from
windows within 4 Å of the transition state of proton dissociation
(defined as the windows with ∼50% of the configurations having
the largest amplitude on the amino acid and the other 50% on the first
water molecule) for each protonatable site. Single point QM/MM calculations
were then performed for each configuration using the B3LYP level density
functional theory.[58] The MM models were
the CHARMM22 and CHARMM36[59] force fields
for the protein and lipids, respectively. The QM region included the
side chain of each protonatable amino acid, the hydrated excess proton,
and water molecules within 3 solvation shells of the carboxyl group
(Figure B). In all
calculations, the QM box size was chosen to be 6–8 Å larger
than the actual size of the QM atoms in each dimension. The GEEP scheme
was used to treat the QM/MM electrostatic coupling with periodic boundary
conditions (PBCs), and the spurious QM/QM periodic image interactions
were decoupled as described in ref (42). The Cα–Cβ chemical bonds that cross the QM/MM boundary were capped with hydrogen
atoms, the forces on which were calculated following the IMOMM scheme
with a scaling factor of 1.50. The forces generated by the QM/MM calculations
were used to parametrize the MS-RMD parameters for the protonatable
sites using FitRMD approach. The MS-RMD simulations were performed
with the RAPTOR software interfaced with the LAMMPS MD package, as
described earlier. The QM/MM calculation was performed with the CP2K
package, and FitRMD was carried out with in-house software, again
as described earlier. Parameters for E286, PRDa3 and PRAa3 are given in Table .
Figure 7
(A) Cytochrome c oxidase from Rhodobacter
sphaeroides. The subunits I, II, III, and IV are colored
in blue, red, gray, and orange, respectively. The yellow rectangular
box highlights the E286 residue and the heme-copper groups that are
essential for proton pumping and reaction. (B) Enlargement of the
region where E286, heme a, heme a3, and BNC are located. E286 is shown in green, the propionate
groups of the PLS in yellow, the hydrated excess proton in purple,
the iron atom of the heme groups in gray, and the copper atom in the
BNC in orange. The heme groups are shown as sticks. In the QM/MM calculation,
the QM atoms include the E286 side chain, the hydronium, and the 3
solvation shells of water molecules around the E286 residue (shown
in VDW representation).
Table 2
MS-RMD Model Parameters of E286, PRD3, and PRAa3 in CcO
E286
PRDa3
PRAa3
B
0.037588
0.995945
0.994509
b
0.208388
1.999090
1.993038
b′
0.533589
0.000009
0.006308
dOO0
2.4
2.4
2.4
C
4.925621
0.988369
0.322227
c
1.975947
1.990330
1.981954
dOH0
1.0
1.0
1.0
rsl
3.5
3.5
3.5
rsh
4.0
4.0
4.0
Vii
–135.809617
–149.99445
–146.174
c1
–21.659933
–31.931186
–38.677822
c2
2.785857
2.689357
1.903222
c3
1.299987
1.147862
1.245548
D
143.003
143.003
143.003
α
1.8
1.8
1.8
r0
0.975
0.975
0.975
(A) Cytochrome c oxidase from Rhodobacter
sphaeroides. The subunits I, II, III, and IV are colored
in blue, red, gray, and orange, respectively. The yellow rectangular
box highlights the E286 residue and the heme-copper groups that are
essential for proton pumping and reaction. (B) Enlargement of the
region where E286, heme a, heme a3, and BNC are located. E286 is shown in green, the propionate
groups of the PLS in yellow, the hydrated excess proton in purple,
the iron atom of the heme groups in gray, and the copper atom in the
BNC in orange. The heme groups are shown as sticks. In the QM/MM calculation,
the QM atoms include the E286 side chain, the hydronium, and the 3
solvation shells of water molecules around the E286 residue (shown
in VDW representation).The QM/MM calculations for CcO were
performed
at the B3LYP DFT level and as such were much more expensive than the
BLYP level DFT QM/MM calculations carried out for ClC-ec1. As a result,
fewer configurations were used for CcO than for ClC-ec1
in the FitRMD, and for the same reason, no explicit PMF was calculated
with the B3LYP level QM/MM for CcO (see below). In
addition, the CcO system is too complex to converge
any reasonable accurate QM/MM PMF, and this fact serves to highlight
the strength of the MS-RMD approach.
2D-PMF for PT in the CcO
Hydrophobic Cavity with MS-RMD
Full details of the CcO simulations and PMF calculations
are presented in ref (60). Some of the relevant details are given here, including the fact
that the MS-RMD umbrella sampling simulations in the hydrophobic cavity
(HC) were carried out by restraining the excess proton CEC position
(eq ) along the PT pathway
defined from the MTD simulations and the water density in a predefined
box that encompasses the HC (see ref (23) for the definition). The force constant for
the harmonic umbrella sampling restraint potential was 10 kcal/mol/Å2 on the proton migration CV and 20 kcal/mol on the water density.
The window spacing was ∼0.5 Å for the CEC and ∼0.5
for the water density. For each umbrella window, the MS-RMD simulation
length was ∼500 ps. The integration time step was 1 fs. The
CV ζ was collected every 10 time
steps (10 fs). The 2D PMF was constructed by the WHAM.
Comparing
the Computational Efficiency of Different Methods
The computational
speed for the MS-RMD, self-consistent density
functional tight binding (SCC-DFTB)-based QM/MM,[61,62] and BLYP-based QM/MM methods were compared for the MD simulation
of the CcO system. (The QM/MM with B3LYP is much
too slow for viable QM/MM MD in these systems.) For MS-RMD, the setup
was the same as described in the above section. For the BLYP QM/MM
method, the QM/MM setup was the same as described in the above section
for FitRMD, except that the QM atoms were treated by the BLYP functional
with empirical dispersion corrections. More details on the BLYP-based
QM/MM MD simulation setup are presented in ref (60). For the SCC-DFTB-based
QM/MM setup, the QM atoms were the same as those in the BLYP-based
QM/MM setup. The point charge based Ewald summation was used to treat
QM/MM electrostatic coupling under PBCs.[63] More details on the SCC-DFTB-based QM/MM setup are described in
ref (64). The MS-RMD
simulation and BLYP-based QM/MM simulations were performed as described
previously. The SCC-DFTB-based QM/MM simulation was performed with
the CHARMM package.[65]
Results and Discussion
Comparing
QM/MM and MS-RMD Free Energy Profiles in the ClC-ec1
Antiporter
An important measure of success of the FitRMD
method is its ability to reproduce results of the reference Hamiltonian
for properties other than those that were fit (atomistic forces),
and there is arguably no property more important than the PT free
energy profile (PMF). Unlike our previous work demonstrating the FitRMD
method for amino acid deprotonation in bulk water,[31] calculating a PMF with QM/MM is possible in a protein environment
because the water molecules are more confined. (In the bulk environment,
the QM/MM boundary issues introduce such large errors that a direct
comparison between MS-RMD and QM/MM PMFs is highly problematic at
best.) However, for many protein cavities the waters involved in PT
between two residues are largely surrounded by protein, and they all
fit into the QM region, making exchange across the QM/MM boundary
less of a complication. Although many protein systems will still require
extensive sampling (beyond the limits of QM/MM) as well as exhibit
water exchange across the QM/MM boundary on longer time scales, the
level of confinement in the ClC-ec1 antiporter system enables a direct
comparison of the QM/MM and MS-RMD PMFs.PT through the central
region of ClC antiporters is one of the essential intermediate steps
in Cl–/H+ exchange. The migration of
an excess proton from the internal E203 to the external E148 through
water molecules in the central region is coupled with migration of
approximately 2 Cl– ions in the opposite direction
(Figure ). Therefore,
the PMFs for PT in the central region of the ClC-ec1 antiporter were
calculated both in the presence and absence of Cl– at the central binding site, Cl–cen.The MS-RMD and QM/MM PMFs (Figures A and B) show excellent agreement for most
parts of
the reaction coordinate and for both states of the system (with and
without Cl–cen), although there are some
differences (discussed below). The free energy barrier of the PT process
is significantly decreased when Cl–cen is present, mainly due to the electrostatic interaction between
the excess proton and the Cl–cen ion.
These results demonstrate that the FitRMD approach is capable of generating
MS-RMD models that reproduce the free energy profile of the reference
QM/MM Hamiltonian, using only forces on the atomic nuclei from a relatively
small set of configurations as input for the fitting. Moreover, the
FitRMD approach is robust enough to quantitatively capture the effect
of the Cl–cen ion on the free energy
surface of the reference QM/MM Hamiltonian. This suggests that the
FitRMD approach is also capable of describing the shift in proton
affinity in different protein environments, which has significant
value for simulating PT in different protein systems with the MS-RMD
method.The hydration structure surrounding the reactive protein
residues
was also compared. The radial distribution functions (RDFs) from the
carboxyl carbon of either E148 or E203 to the water and hydroniumoxygens were calculated with the excess proton CEC restrained in the
reactant, transition, and product states of the PMF (the positions
are labeled as “R”, “T”, and “P”
in Figure A and B).
The RDFs (Figures and 5 with and without Cl–cen, respectively) demonstrate that the solvation structure
around E203 and E148 is quite similar between the QM/MM and MS-RMD
methods. This result provides additional evidence that the model generated
by the FitRMD approach faithfully reproduces the underlying free energy
landscape of the reference QM/MM Hamiltonian, even though only QM/MM
forces on the atom nuclei were used as input for the fitting. The
discrepancies in the RDFs shown in Figures and 5 are likely
due to the different water behavior of QM versus MM water as well
as short QM/MM sampling where the water dynamics are slow in the confined
space of the central region, which will be discussed below. We note,
however, that this level of agreement may not be expected for more
bulk-like water environments since the QM and MM water will have such
different structural properties.
Figure 5
In the CLC system as in Figure but with Cl–cen present.
All notation is consistent with Figure .
Limitations of the QM/MM
Free Energy Profiles
Although
the PMFs in Figure A and B show that the MS-RMD and QM/MM PMFs generally agree well,
some discrepancies appear. Focusing on the most significant, Figure A highlights the
region 0.18 < ζ < 0.26 from Figure A, where the MS-RMDPMF (blue) dips to a modest metastable minimum, but the QM/MM PMF
(red) shows an uphill rise with only a slight dip at ζ ≅ 0.21. This discrepancy between the two
PMFs is caused by the change of the local environment for the excess
proton. The structures of the central region captured from the ζ = 0.24 umbrella windows, where the two PMFs
disagree most, are shown in Figure C (QM/MM) and D (MS-RMD). The MS-RMD structure shows
one additional water (gray circle) compared to the QM/MM structure.
The center of excess charge (yellow sphere) is close to the first
water coordinated to E203 in this window for both the QM/MM and MS-RMD
configurations, meaning that the first water is the most hydronium-like
species (the “pivot” hydronium, which has the largest
MS-RMD state coefficient, c1, in eq ). To better understand
the difference in solvation structures and charge delocalization,
we calculated the average value of c12 for all configurations
in the ζ = 0.24 umbrella windows
from the QM/MM and MS-RMD simulations (Figure B). It should be kept in mind that the c12 value for an Eigen cation (H9O4+) is ∼0.65, which indicates that the pivot hydronium holds
∼65% of the positive charge (hydronium-like) defect, while
that for a Zundel cation (H5O2+)
is ∼0.5 since in that case, excess charge is more equally shared
between two water molecules. The MS-RMD simulations yield larger ⟨c12(r)⟩ values (0.7) because the pivot hydronium
is coordinated to two waters in the central region, in addition to
the carboxyl group, leading to a stable Eigen-like (CO-H7O3+) complex. However, one of these waters
is missing in the QM/MM simulations (Figure C, gray circle), shifting ⟨c12(r)⟩ to a lower value (0.55) and the delocalization
to more of a Zundel-like complex between the carboxyl group and first
water molecule (CO-H3O+). The MS-RMD Eigen-like
complex forms a contact ion pair (CIP) with the carboxyl group of
deprotonated E203 and the stabilizing electrostatic attraction between
oppositely charged ions causes a small energy well (∼1 kcal/mol)
in the MS-RMDPMF. However, in the QM/MM configurations, no hydrogen
bond acceptor is found near one of the hydrogen atoms of the pivot
hydronium, causing the PMF to continue its uphill climb.To
see if the PMF discrepancy was caused by sampling different conformational
phase space, another set of QM/MM umbrella sampling simulations was
run for 5 ps (as described earlier), initiated from the last snapshots
from the MS-RMD umbrella sampling simulations in the CV range 0.18
< ζ < 0.26. The resulting
QM/MM PMF (green in Figure A) shows much better agreement with the MS-RMDPMF. Figure B and E show that
for this second QM/MM simulation the value of ⟨c12(r)⟩ is close to the MS-RMD value, and that the missing second
water is present. Thus, the PMF discrepancy is indeed caused by sampling
different conformational phase space with that sampled by MS-RMD having
one additional water molecule to stabilize the excess proton.To estimate the time scale required for the additional water molecule
(missing in the original QM/MM simulations) to enter the central region,
an MS-RMD configuration was prepared (see Methods) at ζ = 0.24 in which this water
was removed (mimicking the QM/MM solvation structure shown in Figure C). The system was
equilibrated in this low solvation state, and then 10 independent
MS-RMD simulations were run for 500 ps. Figure F shows the coordination number of the pivot
hydronium to the oxygen atoms of the surrounding water molecules or
the carboxyl group of E203 in one of the trajectories. The coordination
number at t = 0 ps is 2, when the pivot hydronium
is coordinated to one wateroxygen and E203’s carboxyl oxygen.
Although there are some transient transitions to a coordination of
3 around t ≃ 30–70 ps, due to an additional
wateroxygen, the second water is not stable for another 100 ps. The
second water then enters the central region from the intracellular
bulk. The coordination number remains 3 for most of the last 400 ps
of the trajectory (∼89% of the time). This trajectory is representative
of the other nine trajectories, in which the second water did not
enter the central region until t = 60–185
ps (⟨t⟩ = 99 ps), after which the coordination
number remained 3 for 88 ± 5% of the rest of the trajectories.
These results suggest that the entrance of the second water is energetically
favored in this umbrella window but that the time scale of this event
exceeds the sampling time possible in the QM/MM simulations.Although the discrepancy that motivated this detailed investigation
was small, it highlights another important limitation of QM/MM MD
PMFs. In addition to the errors and artificial dynamics introduced
by QM/MM boundary issues, limited sampling, especially of slow degrees
of freedom such as changing hydration, can hide insidious PMF errors.
This reinforces the importance of being able to map accurate ab initio forces onto an efficient method that is capable
of extensive sampling. Free energy profiles in complicated condensed
phase environments are generally a balance of enthalpic and entropic
contributions, requiring both accurate potential energy representations
and extensive sampling. This is further explored below.
MS-RMD Can
Capture Coupling between Hydration and PT in CcO
CcO offers another example
of PT through an interior protein region being coupled to hydration
changes. During the reaction cycle of aa3-type CcO, as found in mitochondria, protons from
the intracellular side of the membrane are transported through the
so-called D-channel to the glutamic acidE286 in the middle of the
membrane. The protons are then transported through water molecules
in a hydrophobic cavity (the HC) either to the pump loading site (PLS)
to be further released to the periplasmic side of the membrane or
to the binuclear center (BNC) to react with oxygen and form water.
A particularly interesting aspect of the proton pumping mechanism
in CcO is the role of water molecules in the HC during
PT from amino acid E286 to the PLS or BNC (Figure B). The number of water molecules in the
HC and their role in PT has been the focus of much debate.[21,22,66−70] To further complicate this issue, the water molecules
can move in and out of the HC in CcO in response
to the migration of the excess proton, and the two processes can be
intrinsically coupled to each other. Capturing this type of cooperativity
often requires computationally demanding enhanced sampling of multiple
degrees of freedom. Here, we show that the computationally efficient
MS-RMD method parametrized by the FitRMD approach allowed us to address
this challenge.We calculated 2D PMFs for the PT from E286 to
the PLS in different redox states during the A → F transition of CcO. The collective variables used
to define these 2D PMFs are (1) the progress of the excess proton
CEC through the HC (horizontal axis) and (2) the degree of hydration
of the HC (vertical axis) (Figure ; see Methods for more discussion).
The 2D PMF shown in Figure clearly reveals the cooperativity between the PT and dynamic
hydration in the HC. The minimum free energy pathway (black line)
along the 2D PMF (Figure ) verifies that as the excess proton migrates from E286 to
the PRDa3, the hydration level of HC gradually
increases. The nonhorizontal feature of the minimum free energy pathway
indicates that the HC hydration is intrinsically coupled to the proton
charge defect translocation in this activated process.
Figure 8
Two-dimensional free
energy profiles (2D-PMFs) for PT from the
E286 to the PLS in the PM′ state, as a function
of the CEC coordinate through the hydrophobic cavity (HC) as the horizontal
axis and the water hydration in the HC as the vertical axis. The minimum
free energy pathways (black lines) are diagonal in nature, indicating
that the two processes are coupled. The statistical errors of the
2D-PMFs are ∼0–3 kcal/mol.
Two-dimensional free
energy profiles (2D-PMFs) for PT from the
E286 to the PLS in the PM′ state, as a function
of the CEC coordinate through the hydrophobic cavity (HC) as the horizontal
axis and the water hydration in the HC as the vertical axis. The minimum
free energy pathways (black lines) are diagonal in nature, indicating
that the two processes are coupled. The statistical errors of the
2D-PMFs are ∼0–3 kcal/mol.It is also important to emphasize how the computational efficiency
of the MS-RMD method makes it feasible to calculate such 2D PMFs. Figure shows a relative
speed and scaling over the processor plot of the MS-RMD, SCC-DFTB-based
QM/MM, and BLYP-based QM/MM MD methods for CcO. It
is seen that MS-RMD is at least 3 orders of magnitude faster than
the BLYP-based QM/MM simulation and 2 orders of magnitude faster than
the SCC-DFTB-based QM/MM simulation. MS-RMD also scales better over
processors than the other two methods (the SCC-DFTB CHARMM code is
not scalable at all.)
Figure 9
Scaling plot for the MS-RMD, SCC-DFTB-based QM/MM, and
BLYP-based
QM/MM simulations of CcO. The blue curve is for MS-RMD,
the green is for SCC-DFTB, and the red curve is for BLYP-based QM/MM.
The MS-RMD method is at least 3 orders of magnitude faster than the
BLYP-based QM/MM method and 2 orders of magnitude faster than SCC-DFTB-based
QM/MM, while showing more favorable scaling properties over processors
(the SCC-DFTB CHARMM code is not scalable).
Scaling plot for the MS-RMD, SCC-DFTB-based QM/MM, and
BLYP-based
QM/MM simulations of CcO. The blue curve is for MS-RMD,
the green is for SCC-DFTB, and the red curve is for BLYP-based QM/MM.
The MS-RMD method is at least 3 orders of magnitude faster than the
BLYP-based QM/MM method and 2 orders of magnitude faster than SCC-DFTB-based
QM/MM, while showing more favorable scaling properties over processors
(the SCC-DFTB CHARMM code is not scalable).
Conclusions
Describing reactive processes in biomolecular
systems remains a
challenging domain for molecular simulations. The ab initio methods that are capable of describing chemical reactions generally
do not scale to the size of most biomolecules or to the time scales
needed to converge multidimensional free energy profiles of rare events
in condensed phase environments. Thus, bridging a quantum description
of the reactive processes with more computationally efficient approaches
(e.g., classical dynamics and enhanced sampling methods) that are
capable of extensive phase space sampling is of great utility and
fundamental importance. The work presented herein contributes a multiscale
framework aimed at accomplishing this multiscale bridging, specifically
for the purpose of simulating the protonation/deprotonation of ionizable
moieties such as amino acids in biomolecules. The core of our framework
is the MS-RMD method, which describes reactive processes in classical
MD and explicitly treats the charge defect delocalization and Grotthuss
shuttling of the hydrated excess proton in the PT process. However,
MS-RMD must be carefully parametrized in order to simulate PT faithfully,
and the approach is not yet a “black box”. Considerable
effort has gone into the parametrization of the MS-RMD model for PT
in water[27,29,30,37,38] and amino acids.[31,71,72] Herein, we have extended to proteins
the parametrization scheme from QM/MM data to MS-RMD models (FitRMD)
and demonstrated its use on protonatable amino acids in biomolecular
systems.As presented here, the FitRMD approach variationally
maps quantum
data (DFT-level QM/MM forces in this case) onto the MS-RMD nonlinear
reactive force field. We have demonstrated how FitRMD can be used
to parametrize MS-RMD models for amino acids in proteins with two
example systems, ClC-ec1 and CcO. In ClC-ec1, the
MS-RMD models were shown to faithfully reproduce the PT PMFs of the
reference QM/MM MD, both in the presence and absence of a Cl– ion in the central region. Thus, the FitRMD parametrization was
robust enough to capture quite different free energy profiles due
to the presence or absence of a single ion. Moreover, the local structure
of the protein and confined water was shown to be quite similar between
the MS-RMD and QM/MM simulations. In CcO, the developed
MS-RMD models were shown to be efficient enough to capture the coupling
between PT and hydration changes in the HC region. Two-dimensional
free energy surfaces in which both PT and hydration levels were explicitly
sampled are required for this analysis. Given that individual umbrella
windows often required over 500 ps to converge, these calculations
would not be possible with a QM/MM MD approach. Moreover, QM/MM boundary
issues, such as the lack of exchange of water molecules, can lead
to large systematic errors in such PT PMFs. Even for the ClC-ec1 system
where exchange of water across the QM/MM boundary is minimal due to
protein confinement, lack of water exchange was argued to introduce
errors in the calculated QM/MM free energy profile.It is worth
noting that in this article the original parametrization
method of Nelson et al.[31] was extended
and shown to work in much more complex molecular situations. For example,
in ClC-ec1 the MS-RMD models produce a PMF that is consistent with
the QM/MM Hamiltonian even though the CV used for the configuration
sampling (absolute distance) is different from the CV used for the
PMF calculation (ratio of distances) (see Methods for a full description). Moreover, different models for the same
Glu residues were parametrized in different states, with and without
the central Cl– ion, resulting in quite different
PMFs that were independently consistent with the QM/MM PMFs. It should
also be emphasized that although a low-level QM method (BLYP-D) was
used here in the QM/MM calculations, our goal was to demonstrate that
a reference ab initio Hamiltonian (and free energy
surface) can be reproduced. Thus, an MS-RMD model obtained from FitRMD
using a higher level reference ab initio method can
be expected to also faithfully reproduce the free energy surface from
the latter, if only the latter could be calculated, which it currently
cannot be. In principle, therefore, our multiscale FitRMD with MS-RMD
framework can be used in the future to estimate high-level (e.g.,
MP2) free energy profiles for reactive processes, although the accuracy
of the surrounding environment will remain dependent on the chosen
classical force field.It also seems clear that the MS-RMD model
parameters will depend
on the given system under study and hence not likely to be “transferable”
to some other system. However, this is precisely the point of the
MS-RMD approach (and also the QM/MM approach which it is fitting).
It is very unlikely that it is generally possible to have transferable
potential parameters for reactive processes. However, we note that
the MS-RMD fitting methodology is itself transferable to different
systems and that its use is practical because the total computational
cost for both the model fit and the MD sampling with MS-RMD will be
still much cheaper than directly performing QM/MM MD simulations on
the same system.Our future efforts will focus on improving
the FitRMD protocol
to make it more robust to the choice of the trial MS-RMD model, as
free as possible from the coevolution of parameters, and insensitive
to discrepancies between the QM and MM descriptions of nonreactive
atoms in the system. We anticipate that reactive processes in many
biomolecular and other systems can eventually be studied with the
MS-RMD approach.
Authors: B R Brooks; C L Brooks; A D Mackerell; L Nilsson; R J Petrella; B Roux; Y Won; G Archontis; C Bartels; S Boresch; A Caflisch; L Caves; Q Cui; A R Dinner; M Feig; S Fischer; J Gao; M Hodoscek; W Im; K Kuczera; T Lazaridis; J Ma; V Ovchinnikov; E Paci; R W Pastor; C B Post; J Z Pu; M Schaefer; B Tidor; R M Venable; H L Woodcock; X Wu; W Yang; D M York; M Karplus Journal: J Comput Chem Date: 2009-07-30 Impact factor: 3.376
Authors: Ruibin Liang; Jessica M J Swanson; Jesper J Madsen; Mei Hong; William F DeGrado; Gregory A Voth Journal: Proc Natl Acad Sci U S A Date: 2016-10-24 Impact factor: 11.205