Chengwen Liu1, Rui Qi1, Qiantao Wang2, J-P Piquemal1,3,4, Pengyu Ren1. 1. Department of Biomedical Engineering, The University of Texas at Austin , Austin, Texas 78712, United States. 2. Key Laboratory of Drug Targeting and Drug Delivery System of Education Ministry, West China School of Pharmacy, Sichuan University , Chengdu, Sichuan 610041, China. 3. Laboratoire de Chimie Théorique, Sorbonne Universités, UPMC, UMR 7616 CNRS , Paris 75252, France. 4. Institut Universitaire de France, Paris Cedex 05, 75231, France.
Abstract
The nonadditive many-body interactions are significant for structural and thermodynamic properties of condensed phase systems. In this work we examined the many-body interaction energy of a large number of common organic/biochemical molecular clusters, which consist of 18 chemical species and cover nine common organic elements, using the Møller-Plesset perturbation theory to the second order (MP2) [ Møller et al. Phys. Rev. 1934 , 46 , 618 . ]. We evaluated the capability of Thole-based dipole induction models to capture the many-body interaction energy. Three models were compared: the original model and parameters used by the AMOEBA force field, a variation of this original model where the damping parameters have been reoptimized to MP2 data, and a third model where the damping function form applied to the permanent electric field is modified. Overall, we find the simple classical atomic dipole models are able to capture the 3- and 4-body interaction energy across a wide variety of organic molecules in various intermolecular configurations. With modified Thole models, it is possible to further improve the agreement with MP2 results. These models were also tested on systems containing metal/halogen ions to examine the accuracy and transferability. This work suggests that the form of damping function applied to the permanent electrostatic field strongly affects the distance dependence of polarization energy at short intermolecular separations.
The nonadditive many-body interactions are significant for structural and thermodynamic properties of condensed phase systems. In this work we examined the many-body interaction energy of a large number of common organic/biochemical molecular clusters, which consist of 18 chemical species and cover nine common organic elements, using the Møller-Plesset perturbation theory to the second order (MP2) [ Møller et al. Phys. Rev. 1934 , 46 , 618 . ]. We evaluated the capability of Thole-based dipole induction models to capture the many-body interaction energy. Three models were compared: the original model and parameters used by the AMOEBA force field, a variation of this original model where the damping parameters have been reoptimized to MP2 data, and a third model where the damping function form applied to the permanent electric field is modified. Overall, we find the simple classical atomic dipole models are able to capture the 3- and 4-body interaction energy across a wide variety of organic molecules in various intermolecular configurations. With modified Thole models, it is possible to further improve the agreement with MP2 results. These models were also tested on systems containing metal/halogen ions to examine the accuracy and transferability. This work suggests that the form of damping function applied to the permanent electrostatic field strongly affects the distance dependence of polarization energy at short intermolecular separations.
Molecular
simulations are widely used to study the structural,
energetic, and thermodynamic properties of condensed phase systems.
It is essential for underlying classical potentials, or force fields,
to accurately describe the intra- and intermolecular interactions. In the past decade, a
major advancement in classical force fields has been the explicit
incorporation of the many-body effect, to improve their accuracy and
transferability.[2] The many-body interactions
can significantly affect various chemical/physical properties of matters.
The importance of incorporating explicit polarization has been demonstrated
on studying the well-known many anomalous properties of water at various
physical conditions;[3−5] molecular structure and polarizability of organic
compounds;[6−10] solvation free energy of inorganic salt ions;[11−14] and binding affinity of ligand-protein
complexes.[11,15]In the current
polarizable force fields, the many-body interaction
energy is explicitly treated through the introduction of electronic
polarization mainly in three different ways: (1) fluctuating charge
models,[5,16−18] where charge fluctuating
is applied to represent the system response to the electrostatic potential;
(2) Drude oscillator models,[4,19−21] where the Drude particles on polarizable sites are used to describe
the response to the surroundings; and (3) atomic induced dipole models,[3,6,22−24] where induced
dipoles are employed to respond to surrounding electrostatic field.
Force fields based on these models have been developed for general
molecular simulations.[2,25−27]The AMOEBA
(atomic multipole optimized energetics for biomolecular
simulation) force field utilizes the induced dipole model, where a
point dipole is induced at each polarizable site according to the
electric field experienced by that site. Molecular polarization is
achieved via iterative dipole induction with distributed atomic polarizabilities
based on Thole’s damping scheme.[3,6] In this simple
yet effective scheme, damping is crucial as it prevents polarization
catastrophe at a close distance of two polarizable sites and allows
reproducing anisotropic molecular polarizability. The Thole-based
polarization model in the AMOEBA potential was parametrized to reproduce
molecular polarizability as well as total molecular interaction energy
(e.g., water cluster association energy).[3] It was shown for water clusters that the current AMOEBA polarization
energy gives a reasonable trend of many-body interactions while overestimating
3- and 4-body interaction energy (E3B and E4B) consistently.[28,29]Here
we systematically examine the performance of classical polarization
models to explore the possibility to improve the polarization energy
in the current AMOEBA potential. The nonadditive many-body interaction
energy can arise from induction (polarization), exchange-repulsion,
and dispersion.[30] The nonadditivity of
exchange-repulsion is often believed to be less prominent in most
cases[30] except for the hydrated metal cation
complexes.[31] While there have been efforts
to model the E3B with polarization and
dispersion in organic molecular systems,[32−34] improvements
of biomolecular force fields have been focusing on incorporating many-body
polarization. In this study, we directly compare the many-body
polarization energy from classical models to the MP2 total Eand Eof common molecular systems including ions. In addition to the current AMOEBA polarization model, two new models
were examined: (1) a variation of the current AMOEBA model where the
damping parameters have been directly optimized to MP2 many-body interaction
energy and (2) a model where the permanent field damping function
is modified from that of Thole.The rest of this paper is organized
as follows: in the Methods section, AMOEBA
polarization framework will
be briefly revisited. The many-body decomposition databases that are
used to parametrize and validate these models, together with two new
models, will also be described in this section. Model parametrization
and comprehensive comparisons of the polarization models with MP2
results are made in the Results and Discussion section. The Conclusion is drawn in the
last section.
Methods
AMOEBA
Polarization Framework
Different
from the simple fixed-point-charge (partial charge) force fields,
such as AMBER[35] and CHARMM,[36] the AMOEBA force field uses atomic point multipoles
(monopole, dipole, and quadrupole) to describe the electrostatic potential
and field on each atom site.Polarization is explicitly treated
by mutual induction of dipoles at polarizable sites (located at atomic
centers). A point dipole moment is induced at each polarizable site
according to the electric field experienced by that sitewhere α is the atomic polarizability on site i; E is the “direct” electric field generated by permanent multipoles
of other sites; and E is
the “mutual” field generated by induced
dipoles of other sites. The E and E are expressed
aswhere T in eq is the
multipole-multipole interaction matrix;[6] and M is the polytensor
of permanent multipoles (charge, dipole, and quadrupole). In eq T11 is the dipole–dipole interaction matrix, and μ is the induced dipole moment of site j. The induced dipole on each polarizable site is solved iteratively
to obtain the converged dipole values.With self-consistent
field (SCF) converged induced dipole, the
polarization energy can be obtained throughTo ensure the finite nature of the intermolecular
induction effect, Thole used a damping scheme in which a “smeared”
charge distribution replaces one of the point dipoles, and thus dipole
interactions are damped.[37] As a result,
the dipole interaction energy approaches a finite value instead of
becoming infinite as the atomic separation approaches zero. Thole’s
scheme is very successful in reproducing dipole molecular polarizability
tensors for a broad range of organic molecules using element-based
isotropic atomic polarizabilities.[37,38] This scheme
has been adopted by the AMOEBA force field to model polarization energy.[3,6] The electric fields due to both the permanent multipoles and the
induced dipoles are damped using the same function in the current
AMOEBA model. This is accomplished by modifying the interaction T matrices in the corresponding orders (only Tα is shown here; see the higher order T matrices in the Supporting Information, SI).The
form of λ3 that the current
AMOEBA uses iswhere u(r) = r/(αα)1/6 is the scaled distance between sites i and j; and r and α are the real distance
and atomic polarizability, respectively. The factor a is the dimensionless width parameter of the smeared charge distribution
and effectively controls the damping strength. The universal damping
factor was determined to be 0.39 for both the direct and mutual part in the current AMOEBA by considering
the molecular polarizabilities and total associate energy of water
clusters up to hexamers.[3]
Many-Body Decomposition Databases
Databases of a series
of molecular clusters were constructed to parametrize
and validate the polarization models. The E3B and E4B, obtained from MP2 based ab initio decomposition (see the SI), were employed in this work.
Chemical Species Involved
in Databases
In total 25 representative chemical species
are included in these
databases (Figure ), containing common solvents, organic compounds, amino acid side-chain
analogs, and inorganic salt ions. Nine common chemical elements (C,
H, O, N, P, S, F, Cl, and Br) and seven monoatom ions (Na+, K+, Mg2+, Ca2+, Zn2+, Cl–, and Br–) are covered.
Figure 1
An overview
of all the chemical species to form molecular clusters
of many-body decomposition databases. Trimers and tetramers were constructed
from these species. For water, larger clusters up to octamer were
included. The composition of molecular clusters can be seen in the
following sections and the SI. These chemical
formula images were generated via MOLVIEW (http://molview.org).
An overview
of all the chemical species to form molecular clusters
of many-body decomposition databases. Trimers and tetramers were constructed
from these species. For water, larger clusters up to octamer were
included. The composition of molecular clusters can be seen in the
following sections and the SI. These chemical
formula images were generated via MOLVIEW (http://molview.org).
Database Overview
Here we give
an overview of the databases (see Table ), and a detailed description can be seen
in the SI. The parametrization databases
include Water-4568 and Tetramers sets. They are composed of the first
11 chemical species listed in Figure . The equilibrium geometry clusters are formed in the
parametrization databases. The validation databases include the Trimers-Distance
set, where the intermolecular separations were systematically varied
to generate closer distance configurations (Figure ), as well as other molecular clusters formed
by the last 14 chemical species listed in Figure .
Table 1
Overview of the Many-Body
Decomposition
Databasesc
purpose of use
database
species
data point
brief description
parametrization
Water-4568
1
4
water clusters; cyclic tetramer, cyclic pentamer,
prism hexamer,
and S4-symmetric octamer
Tetramers
11
26
organic tetramers; 10 homo-
and 16 heteroclustersa
validation
Trimers-Distanceb
11
124
organic trimers; 11 homotrimers with various separationsa
Water6-Extra
1
7
water hexamers; 7 structures other than cyclic configuration
3B-69 setb
5
14
organic
trimers of Beran and co-workers[32,33]a
Trimers-Transferabilityb
8
15
organic trimers used
to test model transferabilitya
metal/halogen ionsb
8
39
metal and halogen ion–water clusters;
11 tetramers and
7 trimers with each having 4 intermolecular separations
Hereafter “organic”
means all species in Figure except for ions.
Only the E3B was provided for the trimers
in these sets.
The number
of chemical species
and data points (each point includes both E3B and E4B except for trimers) was listed.
A detailed description of the database, molecular graphics, and XYZ
coordinates were given in the SI.
Figure 2
A schematic illustration of water trimers with
different intermolecular
distances. Three oxygen atoms of waters were denoted as O1, O2, and O3. d1 is the distance between O1 and O2, and d2 is the distance between O3 and
the midpoint of O1 and O2. These two distances
were systematically varied.
Hereafter “organic”
means all species in Figure except for ions.Only the E3B was provided for the trimers
in these sets.The number
of chemical species
and data points (each point includes both E3B and E4B except for trimers) was listed.
A detailed description of the database, molecular graphics, and XYZ
coordinates were given in the SI.A schematic illustration of water trimers with
different intermolecular
distances. Three oxygen atoms of waters were denoted as O1, O2, and O3. d1 is the distance between O1 and O2, and d2 is the distance between O3 and
the midpoint of O1 and O2. These two distances
were systematically varied.
Procedure
To Generate Short-Distance Trimers
Trimers with short intermolecular
distances were generated according
to the following procedure. To take water as an example, water trimer
was first fully optimized to its MP2 equilibrium geometry (shown in Figure ). By stepwise reducing
the “equilibrium” intermolecular distances, d1 and d2, a series
of water trimers were generated. For other molecules, the same procedure
was applied to generate the configurations with varied intermolecular
distances. For the M(H2O)2 trimer (where M =
Na+, K+, Mg2+, Ca2+, and
Zn2+), d1 and d2 were selected as the two M···O distances.
Considering the computational cost of MP2 many-body decomposition
calculations, a smaller number of representative intermolecular separations
were selected to generate the trimers for other molecules than water.
Finally, 124 trimers were constructed in total for the Trimers-Distance
set.
Variation of AMOEBA Polarization
Models
The current AMOEBA polarization model emphasizes the
aspect of
molecular polarizability and can well capture the total interaction
energy of water clusters. However, it has not been systematically
examined beyond water clusters. In addition, the current implementation
was shown to overestimate E3B and E4B for water clusters according to MP2/aug-cc-pvtz
calculations.[29] Therefore, we explore possible
improvements of the current model that can better capture the many-body
interactions, including the distance and oriental dependence. We propose
two new variations of AMOEBA polarization model:Model
1: this model is the same as the current AMOEBA but with a
damping parameter (a in eq ) “reoptimized” to match the
MP2 E3B and E4B energy. A “universal”
damping factor of 0.34 is used by Model 1 (except for divalent metal
ions). The molecular polarizabilities are rechecked since the mutual damping is affected during this process.Model 2: this model utilizes a new damping function
for direct (permanent) field and keeps the current
Thole damping function of AMOEBA for the mutual induction.
This was done as the distance dependence of total polarization energy
is mostly determined by the direct induction. On
the other hand, in the Thole damping scheme, the anisotropic molecular
response is determined by the mutual induction alone,
with just a few isotropic atomic polarizabilities.For both
models, we retain the same atomic polarizability
parameters used by the current AMOEBA force field. The only
exceptions are from CH3COOH and CH3CONH2, where the atomic polarizabilities of carbon and oxygen on
the carbonyl group need to be slightly modified to better match the
molecular polarizability and MP2 many-body energy.By examining
the distance dependence behavior of the E3B on the Trimers-Distance set, we found that the following
function gave the best performance with an appropriate damping factor.
Thus, the damping function for the direct induction
in Model 2 iswhere u(r) is the scaled distance and the same as that
in eq . The damping
functions for higher
order multipole interactions can be easily derived through the chain
rule relationships, and their mathematical forms are given in the SI.These new models were trained on the
parametrization database described
above, to mainly identify a single best “universal”
damping factor. All these parameters are described in the Parametrization section and Table .
Table 2
Parameters Used in
the Three Polarization
Models for Organic Molecules and Metal/Halogen Ionsc
AMOEBAa
Model
1a
Model
2a
αatom
adir(=αmut)
αatom
adir
amut
αatom
adir
amut
organicsb
0.390
0.340
0.750
0.390
Na+
0.120
0.390
0.060
0.340
0.060
0.750
0.390
K+
0.780
0.390
0.780
0.340
0.780
0.750
0.390
Mg2+
0.080
0.095
0.040
0.060
0.340
0.040
0.350
0.390
Ca2+
0.550
0.159
0.750
0.280
0.340
0.750
0.750
0.390
Zn2+
0.260
0.210
0.260
0.340
0.260
0.750
0.390
Cl–
4.000
0.390
4.000
0.340
4.000
0.450
0.390
Br–
5.650
0.390
5.650
0.340
5.650
0.450
0.390
AMOEBA and Model
1 use the same
damping function in both direct and mutual induction (eq ); Model
2 uses a different damping function for the direct induction (eq ).
Molecules 1–18 in Figure . All atomic polarizabilities
were kept the same as AMOEBA09[6] except
for O and C in O=C, which were increased from 0.837/1.334 to
1.45/1.75 Å3 in Model 1s and 2.
These parameters include atomic
polarizability (α) (in Å3) and damping factors (a and a).
AMOEBA and Model
1 use the same
damping function in both direct and mutual induction (eq ); Model
2 uses a different damping function for the direct induction (eq ).Molecules 1–18 in Figure . All atomic polarizabilities
were kept the same as AMOEBA09[6] except
for O and C in O=C, which were increased from 0.837/1.334 to
1.45/1.75 Å3 in Model 1s and 2.These parameters include atomic
polarizability (α) (in Å3) and damping factors (a and a).
Computational Details
Structures
of the pure water clusters, including (H2O) (n = 3, 4, and 5, the cyclic configurations
were selected), (H2O)8 (S4-symmetric),
and eight (H2O)6, were taken from Wang et al.[39] without further optimization. The initial structures
of ammonia, benzene, and methanol trimers and tetramers and their
mixed tetramers were from the literature.[40−45] The remaining clusters were generated by duplicating and translating
monomers in three-dimensional Cartesian space. These structures were
optimized by using HF/3-21G followed by the MP2/6-31+G(d) level of
theory in the GAUSSIAN 09 package.[46] The
regularized SAPT(DFT) calculations were performed on 14 dimer pairs
using CAMCASP[47,48] and NWCHEM 4.2[49] packages. The PBE0[50] functional
and the aug-cc-pvtz[51] basis set were employed.
The regularized parameter η was chosen to be 3.0 as suggested
by Missquitta.[48] For water dimers, the
absolutely localized molecular orbital (ALMO)[52,53] EDA was carried out in the QCHEM 4.2 package[54] at the HF/6-311++G(2d,2p) level of theory.The many-body
interaction energy decomposition was performed in the QCHEM 4.2 package.[54] Single point energy of each cluster and its
subclusters at the MP2/aug-cc-pvtz[51] level
were calculated for pure water clusters. Due to the expensive computational
cost, for the clusters other than pure water, the RI-MP2 (TRIM)[55] method combining with the cc-pvtz[56] basis set was used. The auxiliary basis set
was chosen to be rimp2-cc-pvtz as recommended.[54] Due to the availability of the basis set, 6-311++G(2d,2p)
was used for K+-water and Ca2+-water clusters,
and the G3Large basis set[57] was used for
Zn2+-water clusters. The molecular polarizabilities for
the 36 molecules from the S101 database[58] were calculated using GAUSSIAN 09 at the ωB97XD[59]/aug-cc-pvtz[51] level
of theory. The exact polarizability tensors from
GAUSSIAN were diagonalized to obtain three eigenvalues.For
the classical models, the polarization energy
from AMOEBA calculations was used to compute the E3B and E4B as the remaining
contribution is pairwise additive. Model 2 was embedded in TINKER
source files by modifying the interaction T matrices
according to the damping functions we proposed. The ANALYZE program
was used to obtain the polarization energy of each molecular cluster.
The atomic multipole (charge, dipole, and quadrupole) and polarizability
parameters were taken from previous charge penetration work.[58] These parameters were constructed/assigned by
POLTYPE,[60] which is an automatic tool to
generate the AMOEBA force field parameters for small molecules. The
same molecular geometries were used in both MP2 and classical model
calculations to obtain the E3B and E4B.
Results and Discussion
The performance of
the three polarization models will be examined
based on the MP2 many-body interaction energy in section Parametrization and section Validations.
In section , the
polarization energy from our models and two quantum EDA methods will
be compared.
Parametrization
The parameters of
the current AMOEBA and two new models are tabulated in Table . The main parameter in these
classical polarization models is the damping factor (a), which is determined by comparing the E3B from the models to MP2 values of the parametrization databases (Water-4568
and Tetramers sets). Root mean square error (RMSE) of the E3B from each model with respect to the MP2 was
used to optimize the parameters. The atomic polarizabilities were
mostly kept the same as those in the current AMOEBA (Table ).In Model 1, the optimal
damping factor was determined to be 0.34, which means a slightly stronger
damping on the electrostatic field (both permanent and mutual) is
applied than that of the current AMOEBA. The molecular polarizabilities
of all 36 molecules from the S101 database[58] were re-examined with the new damping factor in the mutual induction. The molecular polarizabilities from ωB97XD/aug-cc-pvtz
were employed as the reference since they well reproduce the available
experimental data. For all 36 molecules, this damping factor (0.34)
gives an RMSE value of 1.03 Å3 in terms of the three
components with respect to those of DFT, while the RMSE value of the
current AMOEBA is 0.98 Å3 (see the SI for each component value). This is expected as the molecular
polarizability, unlike polarization energy, has a relatively weak
dependence on the damping factor.[3]Other parametrization strategies were also considered. In Model
1, instead of using the same damping parameter for both direct and mutual induction, we also experimented: 1)
two different damping parameters for direct and mutual induction and 2) the element-based damping factor
for the direct induction with a common damping parameter
for the mutual induction. Based on the Tetramers
set, these two strategies essentially gave the same results as the
simpler one-parameter Model 1, and thus only Model 1 is presented
here.In Model 2, the damping of mutual induction
(both
the function and parameter) was kept the same as the current AMOEBA,
and thus the molecular polarizabilities remain unchanged. Only the direct damping factor was parametrized, and the optimal
damping factor was determined to be 0.75.For the metal/halogen
ion systems, especially the divalent cations,
both the damping factors, of two models (Models 1 and 2) and atomic
polarizabilities, were optimized to capture the many-body interaction
energy.Using the MP2 E3B and E4B as the reference, the performance of three
polarization
models is compared in Table . The individual E3B and E4B are plotted in Figure . Model 1 and Model 2 both show better agreement
with the MP2 E3B and E4B than the current AMOEBA. For example, for the Water-4568
set, the E3B RMSE value is reduced from
1.34 (the current AMOEBA) to 0.17 kcal/mol for both Model 1 and Model
2, with an ∼8-fold of improvement. Other statistics such as
mean unsigned error (MUE) and mean signed error (MSE) are all reduced.
For the Tetramers set, the improvement is not as apparent, largely
due to the relatively weak many-body interaction energy. As for the
individual E3B and E4B in these two sets (Figure ), significant improvement of two new models can be
seen on the clusters containing water and/or methanol molecules. For
the E4B there is still noticeable deviation,
but they are very small in magnitude with the majority of them being
less than 0.5 kcal/mol. While it is not surprising that the two new
models did well as they are now explicitly optimized to the E3B and E4B, the
current AMOEBA model also shows a great overall trend across the organic
set.
Table 3
Statistics Evaluation of the Three
Polarization Models on the E3B and E4B of Water-4568 and Tetramers Setsb
E3B
E4B
database
statisticsa
AMOEBA
Model 1
Model 2
AMOEBA
Model 1
Model 2
Water-4568
MUE
1.27
0.12
0.17
0.56
0.35
0.36
MSE
–1.27
–0.11
–0.02
–0.56
–0.35
–0.36
RMSE
1.34
0.17
0.17
0.58
0.37
0.37
Tetramers
MUE
0.37
0.22
0.30
0.18
0.13
0.13
MSE
–0.25
0.09
0.19
–0.11
–0.08
–0.07
RMSE
0.45
0.30
0.36
0.23
0.17
0.17
MUE: mean unsigned error; MSE: mean
signed error; RMSE: root-mean-square error.
All statistics was calculated
using the E3B and E4B from MP2/aug-cc-pvtz (for Water-4568) and RI-MP2/cc-pvtz
(for Tetramers) as the references.
Figure 3
Plots of (a) the E3B and (b) E4B calculated using the three polarization models
and QM methods (MP2/aug-cc-pvtz for Water-4568 and RI-MP2/cc-pvtz
for the Tetramers set). Each (MeOH)4 and (MeOH)2(H2O)2 cluster has two different structures.
MUE: mean unsigned error; MSE: mean
signed error; RMSE: root-mean-square error.All statistics was calculated
using the E3B and E4B from MP2/aug-cc-pvtz (for Water-4568) and RI-MP2/cc-pvtz
(for Tetramers) as the references.Plots of (a) the E3B and (b) E4B calculated using the three polarization models
and QM methods (MP2/aug-cc-pvtz for Water-4568 and RI-MP2/cc-pvtz
for the Tetramers set). Each (MeOH)4 and (MeOH)2(H2O)2 cluster has two different structures.
Validations
Organic Clusters
Trimers-Distance Set
With the damping
parameters derived
above, the distance dependence of the E3B was examined using the three polarization models on the Trimers-Distance
set, which contains 124 configurations of 11 homotrimers. We found
here that the E3B distance dependence
strongly relies on the direct damping function. Both
the current AMOEBA and Model 1 have similar distance dependence behavior
of the E3B due to the same damping function
in both (Figure ).
The smaller damping factor (0.34) in Model 1 leads to less negative E3B than the current AMOEBA with a damping factor
of 0.39 for all intermolecular separations. Thus, Model 1 better captures
the E3B of equilibrium clusters than the
current AMOEBA does (consistent with Table and Figure ). Model 1 was parametrized on the MP2 E3B data of equilibrium-geometry clusters (Water-4568 and
Tetramers sets). Thus, the E3B deviates
from that of MP2 for the short intermolecular separations. Both AMOEBA
and model display incorrect distance dependence behavior. Model 2,
with a modified direct damping function (eq ), is able to reproduce
the MP2 (or RI-MP2) E3B with noticeable
improvement. This can be reflected by the smaller error and better
correlation between Model 2 and MP2 results than other two models
(Table ).
Figure 4
Plots of the E3B distance dependence
calculated from three polarization models, MP2/aug-cc-pvtz (for water)
and RI-MP2/cc-pvtz (for other molecules) for selected trimer systems:
(a) Water, (b) Ammonia, (c) Methanol, and (d) Imidazole. The right-most
structure indices represent the equilibrium structure for each trimer.
The left side of the x-axis indicates the smaller
intermolecular distances than the right.
Table 4
Statistics Evaluation of the Three
Polarization Models on the E3B and E4B of the Trimers-Distance Setb
statisticsa
AMOEBA
Model 1
Model 2
MUE
0.41
0.51
0.24
MSE
–0.02
0.43
0.09
RMSE
0.67
1.01
0.37
R2
0.95
0.92
0.98
R2:
Pearson correlation coefficient. All statistic errors are in kcal/mol.
All statistics was calculated
using the E3B from MP2/aug-cc-pvtz (for
water trimers) and RI-MP2/cc-pvtz (for other trimers) as the references.
Plots of the E3B distance dependence
calculated from three polarization models, MP2/aug-cc-pvtz (for water)
and RI-MP2/cc-pvtz (for other molecules) for selected trimer systems:
(a) Water, (b) Ammonia, (c) Methanol, and (d) Imidazole. The right-most
structure indices represent the equilibrium structure for each trimer.
The left side of the x-axis indicates the smaller
intermolecular distances than the right.R2:
Pearson correlation coefficient. All statistic errors are in kcal/mol.All statistics was calculated
using the E3B from MP2/aug-cc-pvtz (for
water trimers) and RI-MP2/cc-pvtz (for other trimers) as the references.It is interesting to note that
the experimented models use direct damping functions
with power to the distance being
2 or 1, and appropriate damping parameters will also provide good
distance dependence. The corresponding RMSE values are 0.55 and 0.60
kcal/mol for the power being 2 and 1, respectively (see the SI).
Water6-Extra Set
Seven additional water hexamers were
employed to evaluate the three polarization models. These isomers
have different hydrogen bonding networks that represent the majority
of the hydrogen bonding patterns in bulk water and thus are suitable
for testing the directionality and transferability of the models.
Comparing to the MP2/aug-cc-pvtz E3B data
(Table ), the RMSE
values of the current AMOEBA, Model 1, and Model 2 are 1.36, 0.58,
and 0.14, respectively, where an ∼10-fold of improvement can
be obtained from AMOEBA to Model 2. While Model 1 shows better agreement
than AMOEBA on the absolute magnitudes of E3B and E4B, both Models 1 and 2 show similar
correlation with MP2 results, due to the use of the same damping function
but a different damping factor. The data suggests that Model 2 systematically
improved the description of the MP2 E3B among different water hexamer configurations.
Table 5
E3B and E4B Obtained from MP2/aug-cc-pvtz and Three Polarization
Models on Seven Water Hexamersa
E3B
E4B
hexamers
MP2
AMOEBA
Model 1
Model 2
MP2
AMOEBA
Model 1
Model 2
cage
–9.22
–10.63
–9.92
–9.39
–0.65
–1.02
–0.94
–0.86
bag
–10.41
–11.59
–10.78
–10.37
–1.35
–1.73
–1.60
–1.52
cyclic-chair
–11.59
–12.85
–12.01
–11.36
–2.05
–2.41
–2.24
–2.10
book-1
–10.35
–11.78
–10.99
–10.40
–1.33
–1.72
–1.60
–1.49
book-2
–10.11
–11.58
–10.80
–10.23
–1.23
–1.64
–1.52
–1.42
cyclic-boat-1
–11.20
–12.55
–11.74
–11.06
–1.88
–2.27
–2.11
–1.98
cyclic-boat-2
–11.19
–12.58
–11.78
–11.10
–1.87
–2.26
–2.10
–1.97
RMSE(kcal/mol)
1.36
0.58
0.14
0.38
0.25
0.15
R2
0.984
0.981
0.995
0.999
0.999
0.999
The RMSE and Pearson correlation
coefficient (R2) values were calculated
using MP2 data as the reference.
The RMSE and Pearson correlation
coefficient (R2) values were calculated
using MP2 data as the reference.
3B-69 Set
Fourteen clusters consisting of 5 molecules
from the 3B-69 database were selected to test the new polarization
models. Note that the RMSD of the E3B given
by MP2 and CCSD methods[32,33] is only 0.06 kcal/mol
for the 14 clusters (Table ). Overall, all three models display excellent transferability
on this set. It is interesting to point out that several sets of trimers
show positive E3B, and all three Thole-based
classical models are able to capture the sign (Table , numbers in bold). Similar conclusions are
found when the CCSD values are used as the reference (see parentheses).
Our models fitted to MP2 E3B data capture
not only the E3B for low/medium dispersion
molecules but also high dispersion molecules (such as Benzene).[32,33] The performance of these three models on the 3B-69 set is consistent
with that on the Tetramers set.
Table 6
E3B of
Five Trimers in the 3B-69 Set Calculated from the Three Polarization
Models and Two Ab Initio Methods (MP2/CBS and CCSD/CBS)b
molecules
isomers
MP2
CCSD
AMOEBA
Model 1
Model 2
Water
a
–1.39
–1.39
–1.56
–1.40
–1.36
b
1.07
1.08
1.23
1.07
1.05
c
–2.47
–2.42
–2.90
–2.66
–2.48
AcOH
a
0.52
0.54
0.42
0.37
0.42
b
–0.94
–0.92
–1.01
–0.81
–1.06
c
–0.25
–0.21
–0.19
–0.13
–0.20
AcNH2
a
–0.24
–0.09
–0.16
–0.07
–0.10
b
0.53
0.58
0.79
0.68
0.69
c
–0.85
–0.86
–0.71
–0.62
–0.64
Imidazole
a
–0.77
–0.66
–0.76
–0.69
–0.70
b
0.21
0.27
0.20
0.20
0.20
c
–1.63
–1.64
–1.70
–1.53
–1.54
Benzene
a
–0.05
0.05
a
a
a
b
0.08
0.15
0.09
0.09
0.08
c
–0.06
–0.03
–0.09
–0.09
–0.07
RMSE(D) (kcal/mol)
0.06
0.16(0.17)
0.12(0.12)
0.10(0.09)
For isomer a of
the Benzene trimer, the coordinates are incomplete in the original
paper.[33]
RMSE values were calculated using
MP2 or CCSD (in parentheses) data as the reference. Isomers with positive E3B values were highlighted in bold.
For isomer a of
the Benzene trimer, the coordinates are incomplete in the original
paper.[33]RMSE values were calculated using
MP2 or CCSD (in parentheses) data as the reference. Isomers with positive E3B values were highlighted in bold.
Trimers-Transferability
Set
Transferability of the
polarization models, mainly the universal damping parameter, was evaluated
using equilibrium trimers of additional organic molecules. In addition
to the C, H, O, and N elements that appear in our parametrization
databases, S, P, F, Cl, and Br elements were also included in this
test. Polarizability parameters of these molecules were taken from
AMOEBA09 without changes, and thus the damping factor was tested for
transferability to these new elements. Charge states of the clusters
were not limited to neutral, and both positively and negatively charged
trimers were included (see Figure ). The RMSE values of three models with respect to
the MP2 E3B are 0.31, 0.25, and 0.22 kcal/mol
for the current AMOEBA, Model 1, and Model 2, respectively. The largest
error for all three models is on the anionic (AcO–)(H2O)2 cluster: 1.04, 0.85, and 0.73 kcal/mol
for three models (see the SI). For Model
2, it was found that using a smaller damping parameter (stronger damping)
between the anionic carbonyl oxygen and other sites can greatly improve
the agreement. For example, using a direct damping
of 0.45 (instead of 0.75) between the anionic oxygen atoms of AcO– and other atoms reduced the error to 0.06 kcal/mol.
The same trend was found for other monovalent anions Cl– and Br– as discussed next (also see Table ). In summary, the models developed
from a small set of molecular clusters show satisfactory accuracy
and transferability on a wide range of chemical species.
Figure 5
E3B calculated from three polarization
models and RI-MP2/cc-pvtz for the Trimers-Transferability set. The
solid blue dot on (AcO)−(H2O)2 shows the E3B calculated from Model
2 with a damping factor of 0.45 for the direct induction
between the oxygen in the carbonyl group and other sites.
E3B calculated from three polarization
models and RI-MP2/cc-pvtz for the Trimers-Transferability set. The
solid blue dot on (AcO)−(H2O)2 shows the E3B calculated from Model
2 with a damping factor of 0.45 for the direct induction
between the oxygen in the carbonyl group and other sites.
Metal/Halogen Ions
A previous study[61] based on a Gaussian
electron density force field
indeed demonstrated that for metal ions, where strong fields are involved,
the many-body interactions are from both electron polarization and
Pauli exclusion effects. It is, therefore, interesting to examine
the capability of our classical polarization models, with limited
modifications, to capture the many-body interactions in such systems.Here we focus on the clusters formed by metal/halogen ions and
water molecules. The current AMOEBA model overall matches MP2 results
on the E3B and E4B of these tetramers with better agreements on monovalent
ions than on divalent cations (Table ). The two new models, where the ionic polarizability
and damping factors were modified (see Table for parameters), improved the overall performance
on divalent ion systems. For example, Model 1 and Model 2 reduce the
RMSE of the E3B from 2.95 (the current
AMOEBA) to 2.44 and 1.81 kcal/mol, respectively. The only exception
lies in the Zn2+-water clusters, where our models give
less positive E3B and less negative E4B. The reason probably can be ascribed to the
lack of charge-transfer term in our models. Our previous study shows
that the charge-transfer of Zn(H2O) complexes decreases (less negative) as the number of water
molecules increases.[14] Based on the E3B and E4B decomposition
approaches described in the SI (eqs S4–S6),
there would be a positive correction for the E3B and a negative correction for the E4B if the charge-transfer contribution were included. However,
it was found that in the bulk simulations of the Zn-water system,
the many-body interactions are dominated by polarization. Without
the inclusion of an explicit CT term, one should avoid overfitting
to the QM charge-transfer energy in small clusters.[14]
Table 7
E3B and E4B Obtained from the RI-MP2 and Polarization
Modelsb
E3B (kcal/mol)
E4B (kcal/mol)
clusters
QM
AMOEBA
Model 1
Model 2
QM
AMOEBA
Model 1
Model 2
Na+(H2O)3
4.26
4.85
4.63
4.61
–0.08
–0.14
–0.13
–0.13
K+(H2O)3
3.04
2.98
3.09
2.85
–0.06
–0.07
–0.08
–0.07
Mg2+(H2O)3
26.17
23.36
22.41
21.58
–1.53
–0.92
–0.87
–0.84
Ca2+(H2O)3
13.10
10.89
12.90
12.27
–0.59
–0.32
–0.43
–0.42
Zn2+(H2O)3_a
46.66
31.73
34.48
32.05
–3.48
–1.55
–1.77
–1.65
Zn2+(H2O)3_b
43.87
24.70
25.63
24.24
–3.99
–1.11
–1.17
–1.11
Cl–(H2O)3
1.05
2.75
2.73
2.31
–0.03
–0.09
–0.11
–0.08
Br–(H2O)3
0.44
1.47
1.53
1.49
0.00
–0.01
–0.03
–0.02
(NaCl)2
26.69
34.25
32.19
24.58
–0.49
1.36
1.89
1.69
NaCl(H2O)2_a
5.57
6.29
6.55
5.81
–0.68
–0.41
–0.41
–0.39
NaCl(H2O)2_b
12.10
14.05
14.13
12.66
–0.67
–0.33
–0.38
–0.38
RMSE(kcal/mol)a
2.95
2.44
1.81
0.67
0.84
0.78
Zn-water clusters are excluded in
RMSE calculations; Zn2+(H2O)3 and
NaCl(H2O)2 each has two structures each; Zinc
ion is in the plane composed of three oxygen atoms in Zn2+(H2O)3_a while out of that plane in Zn2+(H2O)3_b. See the graphics of these
structures in the SI.
RMSE values were given using
QM (RI-MP2) data as the reference.
Zn-water clusters are excluded in
RMSE calculations; Zn2+(H2O)3 and
NaCl(H2O)2 each has two structures each; Zinc
ion is in the plane composed of three oxygen atoms in Zn2+(H2O)3_a while out of that plane in Zn2+(H2O)3_b. See the graphics of these
structures in the SI.RMSE values were given using
QM (RI-MP2) data as the reference.All three models well capture the distance dependence
of the E3B for monovalent ions, such as
Na+, K+, Cl–, and Br– (Figure ). It is
noted that in the linear structure of M(H2O)2 (M = Mg2+, Ca2+, and Zn2+) clusters (where O···M···O
is linear), the E3B can be well captured
by classical models for Mg2+ and Ca2+ but not for Zn2+. In our model, nearly zero induced dipole was found on the metal site in M(H2O)2 (M = Mg2+, Ca2+, and Zn2+) trimer. The difference between Zn2+ and Mg2+(Ca2+) is likely due to the lacking of hyperpolarizability
(such as induced quadrupole) and charge transfer of transition metals.[62]
Figure 6
Plot of the E3B distance dependence
calculated by RI-MP2 and the three polarization models for M(H2O)2 systems, where M = Na+, K+, Mg2+, Ca2+, Zn2+, Cl–, and Br– as labeled in the figure. For M(H2O)2 (M = Mg2+, Ca2+, and
Zn2+), the angle ∠OMO is 109°28′, which
is a tetrahedral angle.
Plot of the E3B distance dependence
calculated by RI-MP2 and the three polarization models for M(H2O)2 systems, where M = Na+, K+, Mg2+, Ca2+, Zn2+, Cl–, and Br– as labeled in the figure. For M(H2O)2 (M = Mg2+, Ca2+, and
Zn2+), the angle ∠OMO is 109°28′, which
is a tetrahedral angle.
Polarization Energy from EDA
Through
the above discussion of molecular clusters beyond dimer, we have seen
that the many-body interactions can be well captured by the classical
polarization models, with Model 2 showing a better distance dependence
behavior of the E3B. Via many EDA methods,
the intermolecular interaction energy is usually decomposed into physically
meaningful components, such as electrostatics, exchange-repulsion,
dispersion, and induction. For the induction component, it remains
debated to be further separated into polarization and charge transfer
term,[63] although there have been such attempts
through EDA methods such as ALMO (absolutely localized molecular orbital),[52,53] BLW (block-localized wave function),[64,65] CSOV (constrained
space orbital variation),[66,67] RVS (reduced variational
space),[68] and regularized SAPT.[48] It is interesting to compare, in retrospect,
the results of classical models with these EDA methods.Here
we compared the polarization energy calculated using our models with
the “true induction” from regularized SAPT(DFT) and
polarization energy from ALMO method. It is noted that the results
from regularized SAPT(DFT) and ALMO agree with each other in the near-equilibrium
intermolecular distances but deviate in short separations, where ALMO
gives more negative polarization energy than the regularized SAPT(DFT).
It should be pointed out that the δHF correction,
a term to capture some higher-order terms not explicitly evaluated
by SAPT,[69] is excluded in regularized SAPT(DFT)
calculations. In our models, the current AMOEBA and Model 1 give incorrect
distance dependence behavior especially at the short O···H
intermolecular distances (Figure ). Model 2 shows the best trend comparing to the results
from the regularized SAPT(DFT). The polarization energy from Model
2 includes part of the δHF correction energy, as
the δHF term from SAPT2+/aug-cc-pvtz (see the SI for details) can be as large as −5.70
kcal/mol for the water dimer whose O···H distance is
1.41 Å and 0.71 kcal/mol for equilibrium distance (2.01 Å).
Additional comparisons of the three models and regularized SAPT(DFT)
were made for small organic molecules and ion–water systems
(see the SI). Again, among the three models,
Model 2 shows the best trend over the regularized SAPT(DFT) “true
induction” energy on these dimers. In summary, Model 2 qualitatively
agrees with the regularized SAPT(DFT) on the polarization term, while
the current AMOEBA and Model 1 show much deviation to the regularized
SAPT(DFT) as well as ALMO EDA.
Figure 7
Plot of the polarization (or induction)
energy obtained from the
three polarization models and two QM EDA methods (regularized SAPT(DFT)
and ALMO) for water dimer at 7 separations. The δHF correction is excluded in the regularized SAPT(DFT) calculation.
Plot of the polarization (or induction)
energy obtained from the
three polarization models and two QM EDA methods (regularized SAPT(DFT)
and ALMO) for water dimer at 7 separations. The δHF correction is excluded in the regularized SAPT(DFT) calculation.
Conclusion
In this work, we show that the many-body interactions (E3B and E4B) of organic
molecular clusters can be well captured by classical polarization
models. Overall, the current AMOEBA model and two new models are able
to reasonably describe the three- and four-body energy for a wide
range of organic molecular clusters in different configurations. In
these simple models, universal parameters controlling the damping
strength perform well for all organic species tested in this study.
Comparing to the current AMOEBA, as expected, the two new models that
have been explicitly fitted to the E3B show better agreement with the MP2 results. Model 2, where the damping
function for direct induction due to permanent field
was modified, best reproduced the distance dependence of the E3B. These models are also able to capture all
the positive many-body interactions according to the MP2 calculations.
In a physical sense, these results clearly show that instead of using
the smeared charge distribution given by the damping function of the
current AMOEBA, a different charge distribution is needed to well
capture the distance dependence of many-body interactions.The
point dipole induction models were also applied to the molecular
clusters involving monovalent ions and divalent cations (Mg2+, Ca2+, and Zn2+). We found that, as with previous
studies, the divalent cations required different damping strengths
from those of organic clusters. Furthermore, results on Zn2+-water clusters indicate the importance of the charge-transfer and
hyperpolarizability (induced quadrupole) which are currently missing
in our treatment. For highly ionic species and molecular systems such
as van der Waals clusters, additional many-body contributions should
be considered explicitly.
Authors: Zhifeng Jing; Joshua A Rackers; Lawrence R Pratt; Chengwen Liu; Susan B Rempe; Pengyu Ren Journal: Chem Sci Date: 2021-06-02 Impact factor: 9.825