Water models with realistic physical-chemical properties are essential to study a variety of biomedical processes or engineering technologies involving molecules or nanomaterials. Atomistic models of water are constrained by the feasible computational capacity, but calibrated coarse-grained (CG) ones can go beyond these limits. Here, we compare three popular atomistic water models with their corresponding CG model built using finite-size particles such as ellipsoids. Differently from previous approaches, short-range interactions are accounted for with the generalized Gay-Berne potential, while electrostatic and long-range interactions are computed from virtual charges inside the ellipsoids. Such an approach leads to a quantitative agreement between the original atomistic models and their CG counterparts. Results show that a timestep of up to 10 fs can be achieved to integrate the equations of motion without significant degradation of the physical observables extracted from the computed trajectories, thus unlocking a significant acceleration of water-based mesoscopic simulations at a given accuracy.
Water models with realistic physical-chemical properties are essential to study a variety of biomedical processes or engineering technologies involving molecules or nanomaterials. Atomistic models of water are constrained by the feasible computational capacity, but calibrated coarse-grained (CG) ones can go beyond these limits. Here, we compare three popular atomistic water models with their corresponding CG model built using finite-size particles such as ellipsoids. Differently from previous approaches, short-range interactions are accounted for with the generalized Gay-Berne potential, while electrostatic and long-range interactions are computed from virtual charges inside the ellipsoids. Such an approach leads to a quantitative agreement between the original atomistic models and their CG counterparts. Results show that a timestep of up to 10 fs can be achieved to integrate the equations of motion without significant degradation of the physical observables extracted from the computed trajectories, thus unlocking a significant acceleration of water-based mesoscopic simulations at a given accuracy.
Coarse-grained (CG)
molecular dynamics (MD) simulations offer an
efficient way to study the most diverse systems at a mesoscopic scale,
with applications ranging from biochemistry[1,2] to
material science.[3−5] The basic idea behind CG is to decrease the number
of interacting sites describing individual molecules. By reducing
the model resolution, the computational cost and the configuration
space of the system decrease, thus enabling the modeling of larger
and more complex systems compared to atomistic simulations. For some
phenomena, such as the conformation change of enzymes and functional
proteins,[6,7] the limiting factor of all-atom (AA) simulations
is the timescale needed to witness a specific process. In this regard,
CG models enable longer timesteps and thus accessible simulation times
by suppressing the high-frequency motion characteristics of light
atoms and/or averaging out some intramolecular degrees of freedom.
Several CG models of water have been proposed during the last two
decades, following different approaches for the mapping process.[8−12] However, current CG models often do not provide an adequate description
of intermolecular interactions due to the lack of many-body contributions,
which are particularly important for the accurate description of,
for example, water properties.[13] Instead,
the MOLC model uses finite-size aspherical particles[14] connected with directional bonds. The particles are decorated
with virtual sites representing point charges. Differently from previous
approaches, short-range interactions are accounted for with the generalized
Gay–Berne potential,[15] while electrostatic
and long-range interactions with a modified version of the usual Coulomb
pairwise summation and the reciprocal-space Ewald solver, respectively.[16]This work explores the tantalizing possibility
of carrying out
CG-MD simulations of liquid water with near-atomistic quality using
the MOLC model,[16] which is available as
a user-defined package for the popular materials modeling code LAMMPS.[17,18] The CG models of water presented in this work use one site to represent
the whole molecule but with the possibility to host an arbitrary number
of virtual charges to account for realistic electrostatic interactions.
Other one-site water models were described in the literature, e.g.,
using a Stillinger–Weber potential to mimic the effect of anisotropic
hydrogen-bonding interactions,[19] projecting
the many-body interactions into pairwise basis sets,[20] or accounting for electrostatic interactions with a point
dipole,[10] even if significant discrepancies
with AA models have been shown. However, many biological processes
such as electropore formation in membranes are governed by electrostatic
interactions, and thus more detailed modeling of electrostatics is
required to capture the dynamic behavior of water.[21] Our solution is to explicitly include three virtual charges
per water molecule, having a 1-to-1 correspondence between the CG
model and the source AA model. Hence, the aim of this work is to test
that atomistic models of water can be rewritten in terms of finite-size
spheres (thus taking advantage of the longer timesteps allowed by
the Richardson iteration method[22]), without
significantly losing the accuracy of all-atom description for structural
and dynamical properties of liquid water.
Computational Methods
The MOLC force field has been described in detail elsewhere.[16] In the MOLC model, the electrostatic interactions
are accounted for via virtual sites acting as point
charges (see Figure a,b). The sites are placed within a skin distance from the external
surface of a parent CG particle, i.e., an ellipsoid. For the specific
case of water, we use spheres, which are a special class of ellipsoids.
The short- and long-range interactions are evaluated with a custom
algorithm based on the Coulomb pairwise summation in real space and
a particle–particle particle-mesh (PPPM) Ewalds solver in the
reciprocal space.[23] Typically, a reduced
set of charges is used to replace the complete set of all-atom charges.
However, here, we use the three charges of the original water model
without any further modification. The novelty introduced in the MOLC
model is that the point charges are defined in the ellipsoid framework
to which they are related. In other words, each ellipsoid is decorated
with a set of charges whose position is defined with respect to the
three principal axes of the ellipsoid. As the virtual sites are defined
anywhere inside the ellipsoid, we refer to them as “off-center”
charges. At each timestep, the Cartesian coordinates of each virtual
charge are reconstructed by combining the position and orientation
of the parent ellipsoidal particle. In this way, there is no need
to keep track of the position of the virtual charges nor to integrate
it explicitly.
Figure 1
Schematic view of the (a) AA and (b) CG water models.
The dashed
line in (a) represents the Lennard-Jones sphere centered at the oxygen
atom, while the small white point is the center-of-mass of the molecule.
The red circle in (b) represents a finite-size sphere centered at
the cross, while the white dots represent the virtual charges. Pictorial
view of an (c) AA and the equivalent (d) CG sample of bulk water.
The arrows in (d) show the orientation of the three axes of inertia
per each finite-size sphere representing a single water molecule.
Schematic view of the (a) AA and (b) CG water models.
The dashed
line in (a) represents the Lennard-Jones sphere centered at the oxygen
atom, while the small white point is the center-of-mass of the molecule.
The red circle in (b) represents a finite-size sphere centered at
the cross, while the white dots represent the virtual charges. Pictorial
view of an (c) AA and the equivalent (d) CG sample of bulk water.
The arrows in (d) show the orientation of the three axes of inertia
per each finite-size sphere representing a single water molecule.Electrostatic interactions in the MOLC model are
computed by rewriting
the usual Coulombic expression from the charge frame of reference
to the ellipsoid frame of reference. The direct Coulombic potential
in real space is given bywhere C is an energy-conversion
constant, q1 and q2 are the virtual charges, and the norm is the scalar distance. As shown in Figure , the vector distance
between the virtual charges is expressed aswhere R = P + S (i = 1, 2) is the
position of the virtual charge, P is the center of the ellipsoid, and S is the relative position of the virtual
charges in the ellipsoid system of reference. In this notation, R is obtained by rotating the
charge position from the ellipsoid’s frame to the frame of
reference using the quaternion of the parent ellipsoid followed by
a translation. The force that the virtual charge 1 exerts on its parent
ellipsoid bead is computed from the gradient of the potential with
respect to P1 aswhile the force
on the virtual charge 2 is
simply defined as F2 = −F1. Finally, the torque is defined aswhile, in this case, the torque
on the second
particle can be written asAll of the details of the derivation
of forces
and torques are reported in Supporting Information Note 1.
Figure 2
Diagram showing two off-center charges, in blue and red,
defined
for two ellipsoids.
Diagram showing two off-center charges, in blue and red,
defined
for two ellipsoids.The AA water models studied
are based on a three-site or four-site
rigid molecule with a point charge on each site (in the case of Tip4P,
the oxygen charge is displaced by 0.1546 Å), plus a single Lennard-Jones
potential on the oxygen atom. The molecules are kept rigid with the
SHAKE[24] or RATTLE[25] algorithms. In the MOLC force field, three atoms are replaced with
a single sphere, whose mass is that of the water molecule and whose
radius is the Lennard-Jones σ parameter. The virtual site representing
the oxygen charge is placed at the center of the sphere, while the
virtual sites representing the hydrogen atoms are placed on the plane z = 0 in the molecular frame of reference. Despite using
one particle per molecule, the MOLC model of water includes all of
the terms of the corresponding AA model, described with a different
mathematical formalism. Consequently, an arbitrary AA configuration
mapped to its CG counterpart will have the same intermolecular energy
in both representations. A slight difference is expected for torques:
in the AA model, each atom will generate a contribution acting on
the center-of-mass of the rigid molecule; in the CG model, only the
virtual particles carrying the charge of hydrogen atoms will produce
a torque acting on the center of the CG sphere. However, this difference
is limited as the center-of-mass of the water molecule lies some 0.07
Å from the position of oxygen, as shown in Figure a. The CG trajectories can be easily reverse-mapped
to the AA representation using the position and quaternion of each
ellipsoid, fully exploiting the extra rotational degrees of freedom
associated with finite-size particles. The reverse-mapped trajectories
were computed with the open-source Backmap code.[26] For the special cases of the rigid water models (e.g.,
MD simulations integrated with the SHAKE algorithm[24]), the AA trajectories can be directly compared to the reverse-mapped
trajectories using structural observables such as radial distribution
functions (RDFs). All of the details related to the generation and
equilibration of the simulated systems as well as the protocols adopted
for the evaluation of the physical observables are reported in Supporting
Information Note 2. Representative samples
of the LAMMPS codes employed for the simulations are available in
the Zenodo archive at https://doi.org/10.5281/zenodo.5552351.
Results and Discussion
We test the validity of the MOLC representation for three widely
used all-atom water models, namely, SPC-E,[27] Tip3P-Ew,[28] and Tip4P-05,[29] by comparing the computed self-diffusion coefficient,
dynamic viscosity, surface free energy, and enthalpy of vaporization[30−37] between the AA and corresponding CG models at T = 298 K and p = 1 atm (see Supporting Information Note 2 for methodological details). For each
AA and CG model, samples made of 500, 1000, and 5000 water molecules
were studied to evaluate possible size dependence on the calculated
physical observables.[38] For the sake of
simplicity, from this point on, we will refer to the Tip3P-Ew and
Tip4P-05 models as Tip3P and Tip4P, respectively. A side-by-side comparison
of AA and CG samples is shown in Figure c,d.The computed density of SPC-E
and Tip4P CG models is 0.993 ±
0.004 g/cm3, while that of the Tip3P CG model is 0.995
± 0.004 g/cm3, in excellent agreement with the corresponding
AA values (0.994 ± 0.003 g/cm3 for SPC-E and Tip4P
and 0.992 ± 0.004 g/cm3 for Tip3P). Furthermore, the
1-to-1 correspondence between the CG and AA representations allows
us to reintroduce the atomic detail into the CG trajectories without
loss of structural information (and vice-versa): the resulting reverse-mapped
trajectory was then used to compute the oxygen–oxygen (O–O),
oxygen–hydrogen (O–H), and hydrogen–hydrogen
(H–H) radial distribution functions (RDFs) for the CG samples.
The overlap between the RDFs of the AA and CG models in Figure further corroborates the structural
equivalence of the two models. The position and intensity of RDF peaks
from reverse-mapped CG trajectories closely match those from AA simulations.
The first and second peaks in the O–O distance are consistently
broader in the CG samples, resulting also in a smaller depth of the
first well, which is more pronounced for the TIP4P CG model. We attribute
this difference to the shifted reference of the axis of inertia, which
is centered in the center-of-mass in the AA model (Figure a) while in the center of the
bead in the CG one (Figure b). This leads to slightly different dynamics between the
AA and CG models. Such a difference is magnified in the TIP4P model
since it includes an additional interaction center producing a torque
on each molecule.
Figure 3
Radial distribution functions g(r) computed from all-atom (AA) and reverse-mapped coarse-grained
(CG)
samples for different water models and atoms.
Radial distribution functions g(r) computed from all-atom (AA) and reverse-mapped coarse-grained
(CG)
samples for different water models and atoms.The self-diffusion coefficient (D), dynamic viscosity
(μ), and surface free energy (γ) of both the AA and CG
water models evaluated with a timestep of 1 fs are summarized in Supporting
Information Table S1. The results of AA
simulations are in good agreement with reference modeling and experimental
values found in the literature,[10,30,31,34,36] without showing a system size dependence within the computed error
bars (see Supporting Information Note 3 for details). Nevertheless, we observe a progressive reduction of
the error bar of D with larger samples because of
the improved statistics. The self-diffusion coefficient and dynamic
viscosity derived from CG simulations are also plotted against the
AA reference, as shown in Figure a,b. Results show that the self-diffusion coefficient
computed from CG simulations is slightly lower than that from AA ones.
Consistently, μ of CG samples is higher than that of AA ones.
This evidence agrees with the Stokes–Einstein relation[31] (, where kB is
the Boltzmann constant and r is the radius of water
molecules, strictly valid for spherical particles), thus proving the
self-consistency of the proposed model (see Figure c). The transient values of μ and D are reported in Supporting Information Figures S2 and S3, highlighting the substantial convergence
of simulations. Furthermore, all obtained results remain in good agreement
with the experimental values.[30,39] Finally, in Figure d, we compare γ
computed with the AA and CG models. The results show an average reduction
with respect to the AA reference. However, in the specific case of
SPC-E and Tip4P CG models, the calculated γ still represents
a good approximation of the experimental value (72.0 mJ/m2),[40] making the CG water models perfectly
suitable for describing multiphase systems and interfacial phenomena.
The observed differences between the AA and CG results can be explained
by the fact that the moment of inertia of the CG model, based on a
finite-size sphere, is about 10 times larger than that of the AA model.
Figure 4
(a) Dynamic
viscosity, (b) self-diffusion coefficient, and (d)
surface free energy computed with an integration timestep of 1 fs
for different CG and AA water models and system sizes. Error bars
are reported in Supporting Information Table S1 for clarity. (c) Comparison between the self-diffusion coefficient
calculated from the Stokes–Einstein relation () and from the
mean square displacement
(D) for the CG models. Water models are represented
by different symbols, while system sizes are represented by colors.
Experimental values refer to bulk water properties at T = 298 K and p = 1 atm.
(a) Dynamic
viscosity, (b) self-diffusion coefficient, and (d)
surface free energy computed with an integration timestep of 1 fs
for different CG and AA water models and system sizes. Error bars
are reported in Supporting Information Table S1 for clarity. (c) Comparison between the self-diffusion coefficient
calculated from the Stokes–Einstein relation () and from the
mean square displacement
(D) for the CG models. Water models are represented
by different symbols, while system sizes are represented by colors.
Experimental values refer to bulk water properties at T = 298 K and p = 1 atm.The self-diffusion coefficient, dynamic viscosity, and surface
free energy computed from the CG-MD simulations integrated with a
longer timestep of 10 fs are summarized in Figure and fully reported in Supporting Information Table S2. Also, in this case, the evaluated physical
observables are not sensibly dependent on the system size. The viscosity
and self-diffusion coefficients obtained with a longer timestep (Figure a,b) are consistent
with those obtained with a timestep equal to 1 fs. The self-diffusion
coefficients D* computed via the
Stokes–Einstein relation from the μ values are also consistent
with those obtained with the mean square displacement (Figure c). No significant difference
is observed for the computed value of the surface free energy (Figure d). These results
show that no substantial degradation of the computed properties of
CG models is observed using a longer timestep of 10 fs. Furthermore,
for the specific case of SPC-E and Tip4P CG models, we recall the
good accuracy with respect to the experimental values.
Figure 5
(a) Dynamic viscosity,
(b) self-diffusion coefficient, and (d)
surface free energy computed with integration timesteps of 1 fs and
10 fs for different CG models and system sizes. Error bars are reported
in Supporting Information Table S2 for
clarity. (c) Comparison between the self-diffusion coefficient calculated
from the Stokes–Einstein relation () and from the mean square displacement
(D) for the CG models with a 10 fs timestep. Water
models are represented by different symbols, while system sizes are
represented by colors. Experimental values refer to bulk water properties
at T = 298 K and p = 1 atm.
(a) Dynamic viscosity,
(b) self-diffusion coefficient, and (d)
surface free energy computed with integration timesteps of 1 fs and
10 fs for different CG models and system sizes. Error bars are reported
in Supporting Information Table S2 for
clarity. (c) Comparison between the self-diffusion coefficient calculated
from the Stokes–Einstein relation () and from the mean square displacement
(D) for the CG models with a 10 fs timestep. Water
models are represented by different symbols, while system sizes are
represented by colors. Experimental values refer to bulk water properties
at T = 298 K and p = 1 atm.The computed enthalpy of vaporization for samples
of 5000 molecules
is reported in Supporting Information Table S3, showing a substantial agreement between the AA and CG models. A
comprehensive assessment of other thermodynamic properties, such as
the low-temperature density anomaly or the melting temperature,[19,41,42] will be the subject of future
investigations.Considering the Tip3P water as a representative
case study for
assessing the computational performance of the MOLC model, Figure a reports the wall
time ratio (wall time ratio = CG elapsed computational time/AA elapsed
computational time) between CG and AA simulations with the same trajectory
length. At a given timestep (1 fs), the computational cost of the
CG models is approximately the same as the AA ones, the wall time
ratio between CG and AA being in the range of 0.8–1.8. Indeed,
despite reducing the number of particles from 3 to 1, the electrostatic
interactions are identical in the AA and CG force fields and so is
the computational cost. On the other hand, the possibility of using
a timestep of 10 fs leads to a speed-up factor (speed-up factor =
CG performance/AA performance, the performance being evaluated as
nanoseconds of trajectory computed per day) of up to 6× (see Figure b).
Figure 6
(a) Wall time ratio between
the MOLC CG model and the corresponding
AA model considering different boxes of Tip3P water molecules and
a variable amount of computational cores (2 × 24 cores/node Intel
Xeon 8160 at 2.10 GHz), but a fixed timestep of 1 fs for both models.
(b) Speed-up factor of the CG model with respect to AA one, when the
former is simulated with a timestep of 10 fs and the latter of 1 fs.
(a) Wall time ratio between
the MOLC CG model and the corresponding
AA model considering different boxes of Tip3P water molecules and
a variable amount of computational cores (2 × 24 cores/node Intel
Xeon 8160 at 2.10 GHz), but a fixed timestep of 1 fs for both models.
(b) Speed-up factor of the CG model with respect to AA one, when the
former is simulated with a timestep of 10 fs and the latter of 1 fs.The use of a longer timestep has the obvious advantage
of significantly
speeding up the mesoscopic model with respect to the original atomistic
one. As a matter of fact, the timestep of AA water simulations typically
does not exceed 2 fs, even for accelerated MD simulations of biomolecules
where the scale of microseconds is routinely achieved.[43,44] Here, we investigate the longest timestep still able to guarantee
stable CG simulations and realistic water properties. For this, we
compute the viscosity of the Tip3P water model using timesteps of
1, 10, 15, 20, 40, and 80 fs. As the timestep is increased, numerical
instabilities arise from the accumulation of integration errors. To
mitigate the inaccuracies in sampling the higher frequency motions,
the mass of CG water molecules is progressively increased as to have
a larger moment of inertia and slower dynamics (see Supporting Information Figure S4), an expedient already adopted in other
works.[45] The higher mass of CG water molecules
leads to simulation stability of up to 80 fs timestep, but results
show a steady increase of dynamic viscosity with mass (see Supporting
Information Figure S5) in agreement with
literature values.[46] The scaled density
and RDFs, instead, are in excellent agreement with the reference values
even with the largest timestep tested.A core feature of the
MOLC model is to represent the intermolecular
interactions via a Gay–Berne potential and point charges, which
can be seamlessly mixed with standard AA force fields based on Lennard-Jones
potentials. Using this framework, water models based on four or more
point charges can be readily transformed into a CG model without any
reparameterization, and AA force fields can be mixed with CG models
in the same simulation. Another direction for the fine-tuning of CG
water models involves replacing the finite-size isotropic spheres
with finite-size anisotropic ellipsoids to reproduce the anisotropic
inertia tensor of the AA water model. Ultra-coarse-grained models,[47−49] i.e., where a single particle represents a cluster of water molecules,
are also natively supported by the MOLC model and easily implementable
by placing many electrostatic sites inside a large spheroidal particle,
whose position and strength should be optimized to simultaneously
reproduce the packing of pure water and its enthalpy of vaporization.
The implementation of new water models is beyond the scope of this
article but it is mentioned to encourage the translation of existing
force fields in the MOLC representation. Similarly, from an application
point of view, further studies will be necessary to validate the proposed
model in multiphase systems (e.g., for the description of wettability
or adsorption phenomena),[50−52] nanoconfined geometries,[53,54] or heat-transfer processes.[55,56]
Conclusions
In
this work, we have proposed, tested, and validated a new coarse
description of classic water models based on the MOLC force field.
We chose three popular all-atom models (SPC-E, Tip3P-Ew, and Tip4P-05)
and found that their corresponding coarse-grained representations
show accurate structural and dynamical properties: density, radial
distribution functions, self-diffusion coefficient, dynamic viscosity,
surface free energy, and enthalpy of vaporization. We observed a reduction
of the CG self-diffusion coefficient, which is matched by an increase
in the dynamic viscosity, consistent with the Stokes–Einstein
relation. The computed surface free energy is approximately 14% smaller
for all of the CG models. However, the computed surface free energy
is still reasonably close to the experimental value, confirming that
the CG models of water are good enough for describing interface problems
such as material wettability or adsorption phenomena. A speed-up factor
between 3 and 6 times is obtained with respect to the AA model, entirely
being due to the increase in the integration timestep unlocked by
coarse graining.
Authors: Levi C T Pierce; Romelia Salomon-Ferrer; Cesar Augusto F de Oliveira; J Andrew McCammon; Ross C Walker Journal: J Chem Theory Comput Date: 2012-07-27 Impact factor: 6.006