Timothy J Giese1, Maria T Panteva, Haoyuan Chen, Darrin M York. 1. Center for Integrative Proteomics Research, BioMaPS Institute for Quantitative Biology and Department of Chemistry and Chemical Biology, Rutgers University, Piscataway, New Jersey 08854-8087, United States
Abstract
A fully quantum mechanical force field (QMFF) based on a modified “divide-and-conquer” (mDC) framework is applied to a series of molecular simulation applications, using a generalized Particle Mesh Ewald method extended to multipolar charge densities. Simulation results are presented for three example applications: liquid water, p-nitrophenylphosphate reactivity in solution, and crystalline N,N-dimethylglycine. Simulations of liquid water using a parametrized mDC model are compared to TIP3P and TIP4P/Ew water models and experiment. The mDC model is shown to be superior for cluster binding energies and generally comparable for bulk properties. Examination of the dissociative pathway for dephosphorylation of p-nitrophenylphosphate shows that the mDC method evaluated with the DFTB3/3OB and DFTB3/OPhyd semiempirical models bracket the experimental barrier, whereas DFTB2 and AM1/d-PhoT QM/MM simulations exhibit deficiencies in the barriers, the latter for which is related, in part, to the anomalous underestimation of the p-nitrophenylate leaving group pKa. Simulations of crystalline N,N-dimethylglycine are performed and the overall structure and atomic fluctuations are compared with the experiment and the general AMBER force field (GAFF). The QMFF, which was not parametrized for this application, was shown to be in better agreement with crystallographic data than GAFF. Our simulations highlight some of the application areas that may benefit from using new QMFFs, and they demonstrate progress toward the development of accurate QMFFs using the recently developed mDC framework.
A fully quantum mechanical force field (QMFF) based on a modified “divide-and-conquer” (mDC) framework is applied to a series of molecular simulation applications, using a generalized Particle Mesh Ewald method extended to multipolar charge densities. Simulation results are presented for three example applications: liquid water, p-nitrophenylphosphate reactivity in solution, and crystalline N,N-dimethylglycine. Simulations of liquid water using a parametrized mDC model are compared to TIP3P and TIP4P/Ew water models and experiment. The mDC model is shown to be superior for cluster binding energies and generally comparable for bulk properties. Examination of the dissociative pathway for dephosphorylation of p-nitrophenylphosphate shows that the mDC method evaluated with the DFTB3/3OB and DFTB3/OPhyd semiempirical models bracket the experimental barrier, whereas DFTB2 and AM1/d-PhoT QM/MM simulations exhibit deficiencies in the barriers, the latter for which is related, in part, to the anomalous underestimation of the p-nitrophenylate leaving group pKa. Simulations of crystalline N,N-dimethylglycine are performed and the overall structure and atomic fluctuations are compared with the experiment and the general AMBER force field (GAFF). The QMFF, which was not parametrized for this application, was shown to be in better agreement with crystallographic data than GAFF. Our simulations highlight some of the application areas that may benefit from using new QMFFs, and they demonstrate progress toward the development of accurate QMFFs using the recently developed mDC framework.
Molecular simulations
provide atomic-level insight into a wide
range of complex chemical processes, many of which benefit from—if
not outright require—a fully quantum mechanical (QM) description.
There has been considerable interest and effort made toward the development
of linear-scaling quantum mechanical force fields (QMFFs) for molecular
simulations.[1,2] QMFFs achieve their speed through
a partitioning of the energy, whereby localized fragments are treated
using traditional QM methods, such as density-functional[3] or semiempirical[4] models,
and the interactions between the fragments are approximated by classical
electrostatics and empirical potentials. In some QMFFs, such as the
modified “divide-and-conquer” (mDC) method,[5] the electrostatic interactions between the fragments
occur through atomic multipole expansions. Methods that generalize
the linear-scaling evaluation of long-ranged electrostatic interactions
to multipolar charge densities must be available and integrated into
QMFF frameworks to be able to broadly test them within molecular simulations.[6]In the first part of this series,[7] we
described a multipolar Particle Mesh Ewald (PME) method that makes
condensed phase mDC simulations tractable. In the second part, which
is presented here, we explore a series of molecular simulation applications
demonstrating the mDC framework using a preliminary parametrization
of the nonbond interaction model.[8] The
first demonstration involves simulations of liquid water and the properties
of small water clusters. The mDC binding energies and geometries of
water clusters and the properties of bulk water are compared to the
TIP3P and TIP4P/Ew water models, high-level QM benchmark data, and
experimental measurements, where available. The second demonstration
involves the study of the dissociative pathway for dephosphorylation
of p-nitrophenylphosphate. The mDC barrier height
and transition-state structure are compared with combined quantum
mechanical/molecular mechanical (QM/MM) simulations performed with
the DFTB2 and AM1/d-PhoT semiempirical models. The third demonstration
involves simulation of crystalline N,N-dimethylglycine, and comparisons are made to experiment[9] and the general AMBER force field (GAFF).[10] These applications highlight areas that may
benefit from using a QMFF and further serve to demonstrate considerable
progress toward their use in molecular dynamics (MD) simulations.
Methods
The mDC Quantum Force Field
The mDC
method presented in ref (8) is a fully variational algorithm that is conceptually similar to
the embedded X-Pol model pioneered by Gao,[4,11−14] but differs in the quantum methods used, the treatment of electrostatic
interactions, and other technical details involving the interfragment
interaction potential. The underlying goal of fragment-based linear-scaling
methods, such as mDC and X-Pol, is to formulate a quantum model that
is sufficiently fast to be routinely applied to condensed-phase environments.
To achieve this speed, linear-scaling algorithms are used to compute
the Fock matrix and to construct orthonormal molecular orbitals (MOs).
Fragment-based quantum methods distinguish themselves from traditional
QM models by avoiding the construction of globally orthogonal MOs via diagonalization of a matrix whose size is proportional
to the entire system. Instead, mDC and other closely related QMFFs
approximate the wave function using a Hartree product of antisymmetrized
determinants of molecular fragments.[15] In
other words, the system is divided into fragments or molecules, each
of which is treated quantum mechanically using a basis of atomic orbitals
(AOs) for the construction of MOs that are presumed to not overlap
with the MOs of any other fragment. Because the MOs are taken to be
a linear combination of AOs, this presumption also extends to AOs,
so that the Fock or Kohn–Sham and overlap matrices become block
diagonal in their AO representation. The solution of the system’s
MOs from the Roothaan or Kohn–Sham equations consequently simplifies
to the diagonalization of a block-diagonal matrix, which reduces the
computational complexity of this operation from O(N3) to O(NK3), where K is the number of AOs within
a fragment. The above approximations appear, relative to a standard
QM method, as an artificial lowering of the number of MO constraints,
because the MOs of different fragments are now only presumed to not
overlap, as opposed to having been enforced with Lagrange multipliers.
When the molecules are separated by small distances, such that their
AOs would be expected to significantly overlap, the lack of interfragment
MO overlap constraints fails to produce an associated increase in
the electronic energy, which must be modeled through some other means
to recapture the effect. For example, other fragment models have used
perturbative corrections,[16−18] valence bond theory,[19] or empirical models,[5,8,12,13] such as Lennard-Jones
(LJ) or Buckingham potentials. The present work adopts the strategy
of using standard pairwise LJ potentials to empirically account for
the short-ranged repulsions described above, in addition to providing
a treatment for London dispersion forces.Although the fragments
are not directly coupled through MO overlap, they remain coupled via
the interaction of their densities. From this viewpoint, the fragments
are embedded within an effective chemical potential arising from the
other fragments. The mDC total energy iswhere E is the ab initio electronic energy of fragment A with N electrons
and atom positions R subjected
to the external potential p = ∂Einter(R,q)/∂q. The
DFTB3/3OB semiempirical model[20] is used
in the present work to compute E, unless noted otherwise. The interfragment interaction energy Einter depends on the atomic positions and atomic
multipole momentswhich are determined from the nuclear charge Z and the atom-partitioned
electron density ρ(r). The partitioning of the electron density to atomic centers is
a choice, and our choice is to relate the density to the AO basis
functions χ(r) and
AO basis single-particle density matrix P in a manner
analogous to Mulliken partitioningis
a Mulliken bond-order, S is
the AO overlap matrix, and f(b) is a smooth
switching function that allows us to bias the partition between atom
pairs to improve molecular electrostatic potentials.[8] For the water model described in the next section, a standard
Mulliken partitioning is used between H and O [f(b) = 1/2], whereas the parameters used to partition the density
between other elements (C, N, P) are provided in ref (8). The working expression
for the multipole moments is derived by inserting eq 3 into eq 2 while limiting the expansion
of the two-center densities to chargeswhereare radial integrals that are
treated as parameters
to control the magnitude of the multipolar contribution arising from
the single-center AO products. The two-center components of the density
could be expanded to higher-order multipoles without undue computational
effort, but we have not yet found their inclusion to be essential.
When the electrostatics are limited to charges, the largest errors
in DFTB3’s molecular electrostatic potentials occur for sp3 oxygen and sulfur lone-pairs, nitrogen lone-pairs, and sp
and sp2 carbons.[8,21] The one-center AO products’
description of the atom hybridizations provides enough information
to empirically improve upon these deficiencies with the parameters
shown in eq 6.The intermolecular interaction
energy,consists of the Coulomb interaction
between
the point multipoles using the multipolar PME method described in
Part 1[7] and standard LJ interactions,between atoms not within the same molecule.
To ensure smooth atomic forces, the LJ interactions are switched off:where u = (Rcut–R)/(Rcut–Rsw), and a long-range tail correction is smoothly
switched
on:The σ-spin
Fock matrix of fragment A,is obtained from elementary differentiation
of the energy. Upon reaching a global self-consistency, the nuclear
gradients of atom a in fragment A arewhereand nσ, Eσ, and Cσ are
the spin-resolved MO occupation numbers, eigenvalues, and eigenvectors,
respectively.In summary, the self-consistent field (SCF) procedure
is as follows:The LJ interactions can be computed before
entering the SCF
procedure or after the SCF has been completed.Compute the spin-resolved
density
matrix Pσ = ∑nσCσCσ and total density matrix P = Pα + Pβ of each fragment from their spin-resolved
trial MO coefficients Cσ.Evaluate the atomic multipole moments from the density
matrices (see eqs 5 and 6).Compute the interfragment
electrostatic
interaction energy and corresponding multipolar potentials p from the PME method described in Part
1 of this series.[7]Evaluate the DFTB3 energy (E) and Fock matrices (Fσ) of each fragment from the density matrices Pσ, and apply the external potential contribution
to the Fock matrices (see eq 11).If the SCF has not converged, construct
a new set of MO coefficients for each fragment and go back to step
(1).
Water
Model and Simulations
The LJ
parameters Rmin and ε of O and H
and the Msp(1) and Mpp(2) multipole moment parameters
of oxygen (see Table 1) were adjusted
to simultaneously reproduce the geometries and binding energies of
small water clusters (Table 2 and Figure 1) and the condensed phase properties of water (Table 3) using a single set of parameters. The parametrized
mDC model is compared to the standard TIP3P[22] and TIP4P/Ew[23] water models, denoted
by 3P and 4P in the tables and figures.
Table 1
Water Model
Parameters
mDC
3P
4P
ROH (Å)
0.9571
0.9572
0.9572
ROM (Å)
0.125
∠HOH (deg)
110.48
104.52
104.52
qH (a.u.)
[0.36155]
0.417
0.52422
εOO (a.u.)
3.096 × 10–4
2.422 × 10–4
2.594 × 10–4
Rmin,OO (a.u.)
6.669
6.683
6.712
εHH (a.u.)
9.500 × 10–7
Rmin,HH (a.u.)
2.800
Msp(1) (a.u.)
–0.08
Mpp(2) (a.u.)
2.4
Table 2
Water Cluster Binding Energy and Geometry
Statisticsa
Water cluster binding energies. Reference values taken from ref (24).
Table 3
Properties of Liquid Water
Density, ρ (g cm–3)
Heat
Capacity, cp (cal mol–1 K–1)
κT (× 10–6 bar–1)
T (K)
expta
mDC
3P
4P
exptb
mDC
3P
4P
expta
mDC
3P
4P
263
0.9981
0.9971
1.0134
0.9988
21.9
17.5
20.4
56
63
51
50
273
0.9998
0.9995
1.0061
0.9996
18.2
22.9
17.2
20.4
51
57
52
46
285
0.9995
0.9996
0.9971
0.9985
18.1
21.5
16.8
19.5
47
54
54
47
298
0.9970
0.9969
0.9856
0.9951
18.0
21.8
17.4
19.0
45
49
57
47
310
0.9933
0.9920
0.9752
0.9908
18.0
20.5
16.7
19.1
44
50
58
47
323
0.9880
0.9853
0.9628
0.9842
18.0
21.6
17.0
18.8
44
50
63
49
335
0.9822
0.9775
0.9508
0.9771
18.0
20.9
17.4
18.9
45
51
68
50
348
0.9748
0.9674
0.9370
0.9687
18.1
19.4
17.0
19.0
46
53
72
53
373
0.9584
0.9445
0.9077
0.9491
18.2
19.4
17.6
18.3
49
58
90
59
Data taken from
ref (79).
Data taken from ref (80).
Water cluster binding energies. Reference values taken from ref (24).Reference values taken from ref (24).Data taken from
ref (79).Data taken from ref (80).Table 2 and Figure 1 compare the binding energies ΔE of geometry
optimized gas-phase water clusters [(H2O) for n = 2–10] computed with mDC, TIP3P,
and TIP4P/Ew to the ab initio results taken from
ref (24) (RI-MP2/CBS
with a CCSD(T)/aug-cc-pVDZ correction). The coordinate root-mean-square
displacement (crms) is an all-atom comparison of the geometry optimized
model and reference structures. The internal geometry of the mDCwater
was not constrained during the optimization of cluster geometries,
which is consistent with the procedure used to generate the reference
data. The MM water models are inherently rigid because they were designed
to be used in condensed-phase environments only.The condensed-phase
water simulations used to generate Figures 2 and 3, the data shown in
Table 3, and the dielectric constants appearing
in Table 4 were performed on cubic boxes containing
512 waters simulated in the isothermal–isobaric ensemble at
atmospheric pressure with a modified version of PMEMD.[25] The Andersen thermostat[26] (1 ps coupling constant) and Monte Carlo (MC) barostat[25] (0.1 ps per isotropic volume scaling attempt)
were used to control the temperature and pressure of the system. Statistics
were collected from 8 ns of sampling with a time step of 1 fs for
all simulations except the liquid-phase TIP3P and TIP4P/Ew simulations,
which were run for 25 ns. For all models, the LJ interactions were
explicitly computed within a 9 Å cutoff and supplemented with
a tail correction to mimic the long-range interactions. Our implementation
of the mDC model transitions between these two limits (see eqs 7–10) from 8 Å to
9 Å to pedantically guarantee that smooth gradients are rigorously
enforced; however, the difference between smooth and precipitous transitions
at 9 Å does not appear to effect our results beyond the statistical
uncertainties engendered by the simulationsʼ finite sampling.
PME[27,28] and multipolar PME[7] electrostatics were performed with sixth-order Cardinal B-spline
interpolation on a 1 pointt/Å regular grid. The SHAKE algorithm[29,30] is applied to all water models when performing condensed-phase simulations
to avoid the use of excessively small time steps. SHAKE was chosen
to constrain the mDCwater to the DFTB3/3OB[20] gas-phase water structure, instead of the experimental gas phase
structure. This choice was physically motivated by providing a definition
for a zero of potential energy that avoids the need for applying monomer
deformation energies when computing the interactions of water.
Figure 2
Density of
ice (I (dashed line
(- - -)), supercooled water
(dotted line (···)), and liquid water
(solid line (—)). Experimental values taken from
refs (79) and[51].
Figure 3
Water radial distribution functions. Experimental values taken
from ref (83) (Soper
00), ref (84) (Skinner
13), and ref (85) (Soper
86).
Table 4
Water Diffusion Coefficient
and Dielectric
Constant at 298 K
Water
Diffusion Coefficient, D (× 10–5 cm2 s–1)
Avg.
Extrap.
dielectric constant,
ε
expta
2.30
2.30
78
mDC
2.29
2.49
48
3P
5.49
6.32
99
4P
2.42
2.79
63
Experimental diffusion
taken from
ref (81). Experimental
dielectric taken from ref (82).
Density of
ice (I (dashed line
(- - -)), supercooled water
(dotted line (···)), and liquid water
(solid line (—)). Experimental values taken from
refs (79) and[51].Water radial distribution functions. Experimental values taken
from ref (83) (Soper
00), ref (84) (Skinner
13), and ref (85) (Soper
86).Experimental diffusion
taken from
ref (81). Experimental
dielectric taken from ref (82).Several condensed-phase
properties are computed[31] and compared
with the experiment, including the heat of
vaporization, the isothermal compressibility, the isobaric heat capacity,
and the static dielectric constant:orwhere brackets
indicate time-averaged quantities
from the generated ensemble, Upot is the
total potential energy of the system, V is the volume
of the simulation box, H is the enthalpy (H = Upot + Ekin + pV), μ is the
dipole moment of the system, and Epol is
an ad hoc correction applied to the nonpolarizable
water models (Epol = Δμwat2/2α) to
account for the condensed-phase water’s increased self-energy
associated with having used an enhanced dipole moment, relative to
the experimental gas-phase dipole moment (Δμwat). Upon inserting the model and experimental dipole moment and the
experimental gas phase polarizability, the Epol corrections for TIP3P and TIP4P/Ew are 1.18 and 1.05 kcal/mol,
respectively. The Epol correction is not
applied to the mDC model, because of its explicit treatment of polarization
within ⟨Upot⟩. Cni(T), Cvib(T), and ∂Evib(T)/∂T are ad hoc corrections introduced by Horn[23] that
account for the nonideality of the water vapor, the difference between
the vacuum and liquid-phase vibrational energies, and the vibrational
energy contribution to the heat capacity, respectively. These quantities
are provided for several temperatures in (23). We computed the corrections
by spline interpolation of Horn’s pretabulated values at our
simulation temperatures. The ΔHvap comparisons in Table 3 are made with (using
eq 15) and without (using eq 14) these corrections for all models.Table 4 and Figure 4 compare the self-diffusion
coefficients of water at 298 K. The diffusion
coefficients were computed from canonical ensemble simulations, similar
to the protocol performed by others,[32,33] using a series
of system sizes (512, 768, 1024, and 1280 waters) at equilibrated
system densities. Each simulation was run for 1.2 ns using the Berendsen
weak coupling scheme to control the temperature.[34] The diffusion constant was computed from numerical differentiation
of the Einstein relation:In other words, the water-averaged autocorrelation
of the mean-squared displacement of the oxygen position was computed
for a series of time lags, and a linear regression was used to determine
the slope of the resulting line.
Figure 4
Box size dependence on the self-diffusion
coefficient of water
at 298 K. Experimental value taken from ref (81).
Box size dependence on the self-diffusion
coefficient of water
at 298 K. Experimental value taken from ref (81).The self-diffusion coefficient is known to exhibit a dependence
on system size that is almost linear with the reciprocal length of
the simulation box (see Figure 4).[35,36] The values “Avg.” and “Extrap.” in Table 4 are the average of the four diffusion coefficients
and the diffusion constant extrapolated to infinite box size from
linear regression, respectively.Figure 2 compares the experimental density
of ice (I) to mDC and TIP4P/Ew simulations
at 200, 225, and 248 K. For clarity, the properties of ice were not
considered during the parametrization of either model. The ice simulations
were performed in a manner analogous to those described above for
the liquid phase, which we will not fully repeat here. The ice I crystals are composed of 576 waters replicated
from the unit cell to fill an orthorhombic simulation box. The crystals
were equilibrated for 2 ns in the isothermal–isobaric ensemble
with anisotropic volume changes to relax the lattice vectors. The
reported densities are simulation averages from 8 ns of production
in the isothermal–isobaric ensemble. Figure 2 does not display the density of TIP3P, because it has a very
low melting point.[37]
p-Nitrophenyl Phosphate Reaction
Figure 5 compares the free-energy profiles
of the p-nitrophenyl phosphate (pNPP) dissociative reaction in solution using semiempirical QM/MM
and mDC models. The simulations were performed in the isothermal–isobaric
ensemble at 298 K within a box of 1550 waters, following a protocol
that is analogous to the water simulations described in the previous
section. The reaction profile was generated from a series of 56 umbrella
window simulations that force the separation between the phosphorus
and oxygen depicted in Figure 5, using a harmonic
restraining potential with a force constant of 75 kcal mol–1 Å–2. Each window was equilibrated for 300
ps and another 300 ps was simulated for production. The value of the
reaction coordinate was printed every 20 steps, which was then analyzed
using the variational free-energy profile (vFEP) method[38] to produce the profiles displayed in Figure 5. The QM/MM simulations were performed with TIP3P
water in combination with the DFTB2/MIO[39−41] or AM1/d-PhoT[42] semiempirical models, which were readily available
for use in SANDER.[25] The mDC simulations
were performed twice: the “mDC/3OB” profile employed
the 3OB parametrization of DFTB3,[20] whereas
“mDC/OPhyd” made use of the DFTB3/3OB model supplemented
with a modified O–P repulsive potential designed to improve
phosphate hydrolysis reaction energies.[43] The mDC electrostatic and LJ parameters of the solute atoms were
taken from ref (8).
The standard GAFF LJ parameters were used in the QM/MM simulations.
Figure 5
p-nitrophenol phosphate. Experimental value taken
from ref (65).
p-nitrophenol phosphate. Experimental value taken
from ref (65).
N,N-Dimethylglycine
Crystal Simulations
Table 5 compares
experimental[9] isotropic and anisotropic
heavy-atom displacement parameters of crystalline N,N-dimethylglycine (DMG) as a function of temperature
to those obtained from GAFF and mDC simulations. The crystal lattice
was constructed from the experimentally determined asymmetric unit
of the orthorhombic polymorph,[9] which consists
of two DMG molecules. The fundamental unit cell of the crystal contains
16 DMG molecules that are related to the asymmetric unit by fundamental
symmetry operations. A supercell composed of 18 unit cells (288 DMG
molecules), shown in Figure 6, was then constructed
from elementary translations of the unit cell along the lattice vectors.
The crystal supercell was simulated at each temperature in the canonical
ensemble using the experimental density for 325 ps, the first 25 ps
of which was discarded as equilibration, and a trajectory of 1200
frames of production was stored for analysis. The 18 unit cells within
each frame of the trajectory were translated to a common center to
produce 21600 frames of a 16-molecule unit cell. Similarly, the reverse
operations that relate the unit cell to the asymmetric unit were applied
to produce 172 800 frames of the two-molecule asymmetric unit
(see Figure 6). The 3 × 3 covariance matrix
of each heavy-atom position, which can be defined asis diagonalized to produce three
eigenvalues
(u1,, u2,, u3,), whose average value and standard deviation are
taken to be the isotropic and anisotropic “displacement parameters”,
respectively.The values of uiso, and uaniso, of
each heavy element are compared to those reported from
the analysis of experimental diffraction data, and the statistics
arising from these comparisons are shown in Table 5.
Table 5
Pearson Correlation
and Mean Unsigned
Errors of the Heavy-Atom Isotropic and Anisotropic Displacement Parameters
in N,N-Dimethylglycine, Relative
to the Experiment (See Ref (9))
Pearson
Correlation, R
Mean
Unsigned Error, mue (Å2)
T (K)
GAFF
mDC
GAFF
mDC
Isotropic
Displacement Parameters
225
0.79
0.93
0.0466
0.0126
250
0.88
0.94
0.0724
0.0168
275
0.88
0.95
0.0521
0.0234
295
0.90
0.94
0.0529
0.0328
Anisotropic
Displacement Parameters
225
0.76
0.89
0.0550
0.0153
250
0.79
0.91
0.0860
0.0169
275
0.83
0.92
0.0520
0.0199
295
0.85
0.93
0.0492
0.0231
Figure 6
N,N-dimethylglycine crystal lattice
used in the MD simulations. The experimentally observed asymmetric
unit was taken from ref (9). The mDC and GAFF ensemble averaged asymmetric unit structures (black
lines) at 225 K are overlaid on the experimental coordinates (colored
molecules). The mDC heavy atom fluctuations at 225 K are shown by
overlaying the entire ensemble of structures on the experimental coordinates.
The display of so many structures causes them to appear as “fuzzy
clouds”. (Inset shows the correlation between experimental
and calculated heavy atom isotropic and anisotropic displacement parameters.
Each data point corresponds to one heavy atom in the asymmetric unit.)
N,N-dimethylglycine crystal lattice
used in the MD simulations. The experimentally observed asymmetric
unit was taken from ref (9). The mDC and GAFF ensemble averaged asymmetric unit structures (black
lines) at 225 K are overlaid on the experimental coordinates (colored
molecules). The mDC heavy atom fluctuations at 225 K are shown by
overlaying the entire ensemble of structures on the experimental coordinates.
The display of so many structures causes them to appear as “fuzzy
clouds”. (Inset shows the correlation between experimental
and calculated heavy atom isotropic and anisotropic displacement parameters.
Each data point corresponds to one heavy atom in the asymmetric unit.)
Results
and Discussion
mDC Water Model
Simple MM models,
such as TIP4P/Ew, are capable of adequately modeling the condensed-phase
properties of water for most applications; however, the limitations
of simple models often mean that this type of targeted parametrization
limit their ability to simultaneously reproduce ab initio intermolecular interaction energies within small molecule clusters.
The strategy used to develop the parameters listed in Table 1, therefore, was to choose a large set of trial
parameters that each reproduced the ab initio water
binding energies (see Table 2 and Figure 1) to within a mean unsigned error (mue) of 1 kcal/mol,
and then prune and refine the parameters based on the results of short
condensed-phase simulations until the mDC condensed-phase properties
were comparable to TIP4P/Ew results. Specifically, we chose the parameters
to reproduce the location and height of the 3.4 and 4.6 Å peaks
in the experimental O–O RDF at 298 K. Second, we rejected parameters
if they did not reproduce the experimental densities at 273, 298,
and 310 K well. Finally, the heat of vaporization (without Horn’s C(T) and Cvib(T) corrections[23]) were monitored at 298 K and compared with our
TIP4P/Ew results. Considering that TIP4P/Ew is remarkably good model
for condensed-phase properties, we rejected parameters that caused
mDC to deviate from the ΔHvap value
of TIP4P/Ew by more than 0.1 kcal/mol. Automated procedures to aid
parameter development, such as the ForceBalance method[44] used to develop iAMOEBA,[45] have been described in the literature,[46−49] but were not used here. During
this process, we observed a very strong correlation between the mue
values of the water cluster and the ΔHvap value of the condensed phase. Furthermore, the RDFs (Figure 3) were found to be highly sensitive to the value
of Rmin,OO, and the water cluster hydrogen
bond angles were sensitive to oxygen’s M(2) quadrupole parameter. As a result of this procedure, the mDC cluster
mue’s are 10 times smaller than TIP4P/Ew and the agreement
with the ab initio geometries, as measured by the
crms values, improve by a factor of 2.The temperature dependence
of the mDCwater density (Figure 2) is comparable
to TIP4P/Ew, and it exhibits a maximum near 279 K, which is similar
to the experimental maximum of 277 K. The mDC and TIP4P/Ew simulations
yield similar isothermal compressibilities and heats of vaporization,
as shown in Table 3; however, the TIP4P/Ew
heat capacity is closer than mDC to the reference value by 1–2
cal mol–1 K–1. The TIP4P/Ew self-diffusion
coefficient (Figure 4) is 5%–20% too
high, depending on how it is measured, whereas the mDCwater model
is 0 to 8% too high. On the other hand, the experimental dielectric
constant (Table 4) is better reproduced by
TIP4P/Ew than mDC. The underestimation of the dielectric constant
is indicative of either having a dipole moment that is too small,
an orientational interaction that is overstabilizing (the quadrupole-charge
interactions, for example), or some combination of the two. The mDCwater molecule has a gas-phase dipole moment of 1.95 D, which is close
to the experimental value of 1.85 D. However, the gas-phase polarizability
of mDCwater, being based on the minimal valence basis DFTB3 semiempirical
model, is considerably underestimated, relative to ab initio calculations. It is possible that the quadrupolar interactions are
slightly too strong and thereby mask the attraction that would be
incurred from additional dipole polarization.A comparison between
mDC, TIP4P/Ew, and experimental ice I densities is included in Figure 2. The mDC
and TIP4P/Ew densities are too low and
too high, respectively, by approximately equal amounts; however, neither
model was parametrized to reproduce solid-state properties. At 248
K, the density of mDC (0.904 g/cm3) is in good agreement
with the literature value of TIP4P/Ice (0.909 g/cm3).[50] We have not determined the melting point of
mDC ice; however, we note that the enthalpy of mDC ice I is 0.34 kcal/mol more stable than TIP4P/Ew for all
3 temperatures, and both models yield nearly identical supercooled
liquid enthalpies at 263 K (Table 3). We are
encouraged to see mDC’s increased lattice stability in this
comparison, because TIP4P/Ew’s melting temperature is too low.[50] The comparison of calculated sublimation enthalpies
to experimental measurements[51] would likely
require a more rigorous treatment for nuclear quantum effects than
what we have performed here.[52] With the
exception of the ad hoc corrections for liquid water
(eq 15), our calculated enthalpies assume that
the intramolecular vibrational energy of water is unchanged by the
condensed-phase environment, and our classical simulations of ice
I ignore the zero-point vibrational energy
of the lattice. These assumptions are faulty at low temperatures[53] and they can also affect liquid water properties,[54,55] as is evident by comparing ΔHvap to ΔHvap′ in Table 3.
Previous comparisons between Path Integral and classical simulation
methods suggest nuclear quantum phenomena can also subtly lower the
density and increase the self-diffusion coefficient of liquid water
at 298 K.[56,57]The TIP3P and TIP4P/Ew water models
achieve awesome performance
on Graphics Processing Unit accelerated MD programs.[58,59] For example, AMBER’s PMEMD CUDA program[60] can simulate a box of 512 waters using the protocol described
in this manuscript at a rate of 100 ns/day on a NVIDIA GeForce GTX
765M. On an 8-core Intel Xeon workstation, the parallel Central Processing
Unit implementation of PMEMD[25] achieves
22 ns/day, whereas mDC simulates 0.8 ns/day on the same computer.
Our implementation of the brute force O(N3) variant of traditional DFTB3 has not been adapted for
periodic boundary conditions; however, vacuum calculations of 512
waters on 8 cores produces only 10–4 ns/day. A more
detailed analysis of mDC’s performance can be found in Part
1 of this series.[7]Overall, the mDC
simulations are able to model the structure and
energetics of small clusters, relative to high-level ab initio calculations, considerably better than the empirical TIP3P and TIP4P/Ew
models, while, at the same time, reproducing experimental condensed-phase
properties generally better than TIP3P and similar to TIP4P/Ew. It
is a considerably challenging problem in force field development to
design robust models for water that can accurately reproduce both
cluster and condensed-phase properties. The relative ease by which
the mDC model achieved this result has important implications for
simulations in heterogeneous environments where the local electrostatic
environment can vary considerably. One could attribute mDC’s
transferability to its multipolar electrostatics and/or electronic
polarization provided by the underlying QM Hamiltonian’s MOs;
however, care should be taken to not misrepresent mDC as being merely
a multipolar, polarizable force field. In other words, our examination
of water properties, by itself, does not well-characterize the breadth
of benefits offered by QMFFs, because QM Hamiltonians are not really
required to achieve high quality results for bulk water and water
clusters. On the contrary, QMFFs offer the possibility to extend the
scope of applications to include problems that require a detailed
description of the electronic distribution and/or MOs (chemical reactions
and spectroscopy, for example) of large systems in which a localized
“QM region” is either: ill-defined (molecular crystals
and ionic liquids, for example), is sensitive to the QM/MM decomposition
or potential, or when well-established MM force field parameters are
unavailable (nonstandard drug ligands, for example). Moreover, the
efficiency of QMFFs afford the degree of configurational sampling
required to make meaningful comparisons with experiment, and their
nonbond interactions can be improved beyond the capabilities of the
underlying QM Hamiltonian through the tuning of empirical parameters.
Thus, we assert that MO-based QMFFs are arguably better described
as being extraordinarily fast and empirically accurate linear-scaling
quantum methods rather than exotic, polarizable force
fields. Nevertheless, QMFF methodologies live in a “middle
ground” between polarizable force fields and linear-scaling
quantum methods and may appear to take on characteristics of each,
depending on how they are applied.[61]Relative to traditional MM force fields, QMFFs have the advantage
in being capable to model processes where bond formation and cleavage
occur. In this section, we examine the dissociative mechanism of dephosphorylation
of pNPP in aqueous solution, which is a prototype
phosphoryl transfer reaction,[62] as an example
application. Unlike the water cluster and bulk liquid properties discussed
in the previous section, no effort was made to tune the mDC LJ or
electrostatic parameters specifically for this application.The pNPP model system plays an important role in
phosphoryl transfer reaction studies. The pK of p-nitrophenol (∼7.2)
is considerably lower than the native 5′-alkoxide leaving group
in RNA cleavage transesterification. Therefore, pNPP is often used as an “enhanced” phosphoryl transfer
leaving group to study linear free-energy relationships (LFERs)[63] and kinetic isotope effects (KIEs).[62,64] LFERs and KIEs are powerful experimental methods to infer catalytic
mechanisms, because they report directly on the structure, bonding,
and local charge of the transition state.The barrier for this
reaction in solution has been experimentally
estimated to be 29.6 kcal/mol.[65] As a reference
for further comparison, we performed optimizations for the reactant
and transition-state geometries using B3LYP/6-31++G** coupled with
the polarizable continuum implicit solvent model (PCM)[66,67] and UAKS radii.[68] The B3LYP calculations
predict a barrier of 19.5 kcal/mol, which is 10 kcal/mol lower than
experiment. The discrepancy is, to some extent, influenced by the
error in the B3LYP proton affinity,[69] which
is 4.2 kcal/mol lower than high-level reference calculations. Previous ab initio calculations have shown that the barrier for this
reaction is very sensitive to microsolvation.[70] In particular, it was found that B3LYP/PCM calculations supplemented
with 12 and 14 explicit water molecules increase the barrier to 21.1
and 29.3 kcal/mol, respectively[70]The pNPP dephosphorylation dissociative mechanism
free-energy profiles are shown in Figure 5.
Generally, none of the calculated barriers compare well with the experiment.
Nonetheless, the two mDC models bracket the experimental barrier and
are more similar to it than any of the other models. The mDC/3OB and
mDC/OPhyd models underpredict and overpredict the experimental barrier
by 3.2 and 9.1 kcal/mol, respectively. The appearance of an artificial
dip in the mDC/3OB profile at RO–P = 2.5 Å is a symptom that we have encountered with DFTB3/3OB
in other unpublished applications involving phosphorus chemistry.
We hypothesize that this is produced from the increased stabilization
of the d–d tight-binding
diagonal matrix elements that were given to phosphorus in the DFTB3/3OB
parameters, relative to DFTB3/MIO;[71] however,
a more-detailed analysis would be required to support this supposition.
This feature is largely eliminated in the mDC/OPhyd profile.The DFTB2 and AM1/d-PhoT QM/MM simulations produce barriers significantly
lower than the experiment. The AM1/d-PhoT model was originally developed
specifically for phosphoryl transfer reactions; however, its treatment
of nitro groups is poor. For example, the AM1/d-PhoT proton affinity
of p-nitrophenol is 12.7 kcal/mol lower than experiment.
This effect is, to some extent, an amplified byproduct of a systematic
error in the B3LYP proton affinity discussed above that served as
the benchmark reference used during the AM1/d-PhoT parametrization.[42] We suggest that each methods’ capacity
to reproduce the experimental barrier depends on their ability to
simultaneously model the P–O covalent bond strength and the
electron-withdrawing power of the nitro group. Overall, although there
is very considerable room for improvement in all of the simulations,
the simulations with the mDC model perform considerably better than
the QM/MM models considered here, as well as the B3LYP model with
simple PCM-only solvation.Another application area that stands
to benefit from periodic boundary simulations using QMFFs is the study
of small-molecule crystals. Of particular pharmaceutical interest
is the study of polymorphism in molecular crystals of drug compounds.[72−75] A great deal of effort is devoted to the understanding of polymorphism
of pharmaceutical compounds, because different synthetic pathways
can lead to different crystal polymorphs, which, in turn, have dramatically
different bioavailability and biological activity. There has been
much research effort and progress made in recent years in the study
of the stability of organic crystals from molecular simulations, and,
for these applications, many-body polarization effects have been demonstrated
to be very important.[76−78] Here, we apply mDC to crystal simulations of N,N-dimethylglycine and make comparison
to experiment and GAFF simulations.[10]Figure 6 and Table 5 compare the structure and correlation of the experimentally observed
isotropic and anisotropic displacement parameters to those obtained
from GAFF and mDC simulations. A superposition of the average asymmetric
unit structure from the crystal simulation with the crystal structure
gives a root-mean-square (rms) displacement of 0.096 Å for mDC
and 0.15 Å for GAFF. The average structures from both simulations
thus are very close to the experimental structure, and the mDC result
is particularly impressive. The Pearson correlation coefficients of
the mDC displacement parameters are generally greater than 0.9, whereas
the GAFF correlation coefficients range from 0.76 to 0.9. As displayed
in Figure 6, both GAFF and mDC produce large
errors in both the isotropic and anisotropic displacement parameters,
even though their correlation to the experiment is reasonable. Nonetheless,
the mDC errors are two or more times smaller than GAFF, as shown in
Table 5. Overall, the mDC method, without having
specifically tuned the parameters for this application, performs markedly
better than GAFF.
Conclusion
In this
work, we incorporated the recently developed multipolar
PME method into the mDC linear-scaling quantum force field framework
and applied it to condensed-phase simulations of water, the p-nitrophenylphosphate (pNPP) dissociation
reaction free-energy profile, and the reproduction of the experimental
isotropic and anisotropic displacement parameters observed in crystalline N,N-dimethylglycine for several temperatures.
An mDCwater model was parametrized to simultaneously reproduce the
interaction energies and geometries of small water clusters and the
bulk properties of liquid water. The parametrized mDCwater model
is found to be comparable to the TIP4P/Ew at reproducing the bulk
properties of water, including the density, heat of vaporization,
isothermal compressibility, and diffusion coefficient; however, the
static dielectric constant of mDCwater was found to be too low, which
may arise from the underpredicted polarizability inherent within DFTB3.
On the other hand, the mDC model performs substantially better than
TIP4P/Ew, for reproducing the high-level ab initio energetics of small water clusters, and significantly better, with
respect to the geometries. The mDC model was applied to the pNPP dissociation reaction to demonstrate its ability to
model chemical reactions in complex condensed-phase environments.
At the same time, comparisons were made to two other widely available
QM/MM methods. Although the barrier height of the mDC free-energy
profile was substantially closer to the experiment than either DFTB2/TIP3P,
AM1/d-PhoT/TIP3P, or B3LYP with PCM solvation, more effort must be
made to further develop and test the DFTB3 model and mDC parameters
to increase the predictive capability in applications to enzymes and
ribozymes. The DMG crystal simulation was performed to highlight an
application area that extends beyond simple solvation with water.
The mDC displacement parameters were found to correlate better with
the experiment than the general AMBER force field by as much as 15%,
and the mean unsigned errors (mues), relative to the experiment, were
reduced by a factor of 2. In summary, the results suggest that QMFFs
developed within the mDC framework are both computationally tractable
and robust, exceeding the accuracy of standard force fields and QM/MM
models. Full development of new QMFFs promise to add new levels of
accuracy and predictive capability to application areas involving
chemical reactions and processes where inherent limitations in the
form of standard force fields are a bottleneck to accuracy and predictive
capability.
Authors: Romelia Salomon-Ferrer; Andreas W Götz; Duncan Poole; Scott Le Grand; Ross C Walker Journal: J Chem Theory Comput Date: 2013-08-20 Impact factor: 6.006
Authors: Lawrie B Skinner; Congcong Huang; Daniel Schlesinger; Lars G M Pettersson; Anders Nilsson; Chris J Benmore Journal: J Chem Phys Date: 2013-02-21 Impact factor: 3.488