Hsing-Hsiang Huang1, Yi-Siang Wang2, Sheng D Chao1. 1. Institute of Applied Mechanics, National Taiwan University, Taipei 10617, Taiwan R.O.C. 2. School of Chemistry & Biochemistry, Georgia Institute of Technology, Atlanta, Georgia 30332, United States.
Abstract
We extend our previous quantum chemistry calculations of interaction energies for 31 homodimers of small organic functional groups (the SOFG-31 data set) by including 239 heterodimers with monomers selected from the SOFG-31 data set, resulting in the SOFG-31+239 data set. The minimum-level theoretical scheme contains (1) the basis set superposition error corrected supermolecule (BSSE-SM) approach for intermolecular interactions; (2) second-order Møller-Plesset perturbation theory (MP2) with Dunning's aug-cc-pVXZ (X = D, T, Q) basis sets for the geometry optimization and correlation energy calculations; and (3) single-point energy calculations with the coupled cluster method with single, double, and perturbative triple excitations at the complete basis set limit [CCSD(T)/CBS], using well-tested extrapolation methods for the MP2 energy calibrations. In addition, we have performed a parallel series of energy decomposition calculations based on symmetry-adapted perturbation theory (SAPT) in order to gain chemical insights. This procedure cannot be reduced further, and adhering to it has proven crucial for constructing reliable data sets of interaction energies. The calculated CCSD(T)/CBS interaction energy data can serve as a benchmark for testing or training less accurate but more efficient calculation methods, such as electronic density functional theory. As an application, we employ a segmental SAPT model previously developed for the SOFG-31 data set to predict binding energies of large heterodimer complexes. These model energy "quanta" can be used in coarse-grained molecular dynamics simulations, avoiding large-scale calculations.
Molecular modeling of complex materials has been a very useful
tool of computational chemistry in gaining better understanding of
intricate experimental observations. At the atomic level, the techniques
are mainly concerned with developing classical force fields to model
both chemical (covalent) bonding and intermolecular or noncovalent
interactions. Traditional empirical force fields (EFFs)[1−5] have long been used in molecular mechanics, Monte Carlo simulations,
and molecular dynamics simulations. Because many popular EFFs utilize
extensive experimental data in their model constructions, the efficacy
in reproducing experiments deteriorates very quickly once the models
are used outside the original training sets. More fundamental chemical
models (usually called ab initio force fields (AIFFs), to distinguish
them from EFFs) are mainly based on quantum chemistry calculations
with, hopefully, minimum inputs from experiments.[6−18] Most current generation force fields have employed various levels
of potential energy data from electronic structure calculations, which
are usually collected in the form of numerical data sets. These interaction
energy data sets not only are useful for designing universal force
fields but also serve as a benchmark for testing and/or training lower-level
but more computationally efficient calculation methods, such as the
electronic density functional theory (DFT).[19−24] Therefore, it is a continuing effort to develop comprehensive data
sets of accurate intermolecular interaction energies based on high-level
quantum chemistry calculations.[25−32]

A reliable quantum chemistry calculation for interaction energies
requires a size-consistent correlation method and a sizable basis
set for error tolerance. An improper combination of method and basis
set would render misleading, if not false, conclusions, making the
human efforts wasteful and the calculated data futile. The issue of
choosing a proper combination of a correlation method and a basis
set has been carefully examined by previous database constructions,
notably those by the Hobza group, the Sherrill group, and the Grimme
group, independently and respectively.[33−35] Thanks to these strenuous
efforts, a consensus has been reached among active researchers in
determining a minimum level theoretical scheme for the calculated
interaction energy to bear a “sub-chemical accuracy”
(ca. 0.1 kcal/mol).[36−38] It contains (1) the basis set superposition error[39,40] corrected supermolecule (BSSE-SM) approach[41−43] for intermolecular
interactions; (2) the second-order Møller–Plesset perturbation
theory (MP2)[44] with Dunning’s
aug-cc-pVXZ (X = D, T, Q) basis sets[45] for
the geometry optimization and correlation energy calculations; and
(3) the single-point energy calculations with the coupled cluster
with single, double, and perturbative triple excitations method at
the complete basis set limit [CCSD(T)/CBS] using the well-tested extrapolation
methods for the MP2 energy calibrations.[46,47] Complementary energy dissection methods, such as the symmetry adapted
perturbation theory (SAPT), are often required in order to gain physical
understanding of the calculated interaction energy.

One of the
earlier efforts of collecting the benchmark interaction
energy data into well-edited data sets was attributed to the Hobza
group.[48] For example, the S22 data set[49] and its subsequent refinements[37,50] have served as a paradigm of first initiating a data set at a minimum
theoretical level and subsequently extending the original scope. Indeed,
because interaction energies are small in magnitude, calculating them
accurately is a daunting task. For large noncovalently bound systems,
the above standard procedure is usually not feasible because of the
enormous increase in computational cost. Currently, for small
complexes with fewer than 50 atoms, this line of practice continues
and is gradually being revised.[51,52] For example, the Řezáč
group has recently launched the ATLAS project.[53] More comprehensive “super” data sets are
also collected and maintained, notably by the Head-Gordon group,[22,54] the Grimme group,[55,56] the Shaw group,[57] and the QCArchive database.[58]

In a previous study, we constructed an interaction energy data
set for the homodimers of 31 small organic functional groups (the
SOFG-31 data set).[59] The SOFG-31 data set
is a minimum CCSD(T)/CBS data set in the sense that these energies
are calculated at the minimum-level theoretical scheme described above.
In this paper, we extend the study to the heterodimers with the dimeric
pair monomers selected from the SOFG-31 data set. Because there are
239 (out of 465) heterodimers considered in this work, the resulting
data set is called the SOFG-31+239 data set. The remainder of this
paper is organized as follows. In section we briefly describe the theoretical
considerations and computational details. Our main results and discussions
are shown in section . We conclude this work in section , and numerical data of reference value are available
in the Supporting Information.
Quantum Chemistry Calculations
The theoretical scheme is
similar to that used in the construction
of the SOFG-31 data set.[59] Briefly, the
basis set superposition error corrected supermolecule (BSSE-SM) approach
was employed for calculating the interaction energies. The second-order
Møller–Plesset perturbation theory (MP2) with Dunning’s
aug-cc-pVXZ (X = D, T, Q) basis sets has been employed in the geometry
optimization and energy calculations. The MP2 calculated energies
have been calibrated by using the coupled cluster with single, double,
and perturbative triple excitations method at the complete basis set
limit [CCSD(T)/CBS]. All the molecular orbital calculations and the
Berny geometry optimization tasks were performed using the Gaussian
09 suite of programs.[60] No symmetry or
rigid molecule constraints were imposed in the geometry optimization
calculations. The normal-mode frequency analysis has been performed,
and the found equilibrium complexes were carefully checked to ensure
that all the obtained configurations are true energy minima on the
respective potential energy surfaces.

For benchmark data calibrations,
the CCSD(T)/CBS energy is the
well-recognized “gold standard”. However, directly calculating
the CCSD(T) energies with increasingly large basis sets is computationally
very demanding. It is more feasible to first optimize the dimer structure
using the MP2 method with a series of good-quality basis sets (such
as Dunning’s) and then use the well-tested extrapolation methods
to obtain the CCSD(T)/CBS values. There are two standard ways of
obtaining the complete basis set limit values. The first method of
Helgaker et al.[47] is based on the theoretically
justified power-law dependence of the energy on the cardinal number X
of the aug-cc-pVXZ (X = D, T, Q, etc.) basis set. Using the data calculated
with two basis sets of different X, one can extrapolate the energy
to the CBS value as X approaches infinity. On the other hand, the
focal-point extrapolation method[61] is used
to estimate the CBS value by considering the difference between the
CCSD(T) and the MP2 interaction energies calculated with the same (smaller)
basis set. It is assumed that although the absolute values of the interaction
energy converge very slowly, the difference between the values calculated
by the two correlation methods depends only negligibly on the basis
set size, as long as a sufficiently large basis set is used. This assumption
has been thoroughly tested in the previous database constructions
and is known to be reliable for a variety of noncovalently bonded
complexes. In this work, the MP2/CBS binding energies were obtained
from the extrapolation method of Helgaker et al.[47] with Dunning’s correlation consistent basis set
(aug-cc-pVXZ, X = D, T, and up to Q). The CCSD(T)/CBS binding energies
were obtained using the focal-point extrapolation method.[61]

The calculated interaction energies are
further analyzed by the
symmetry-adapted perturbation theory (SAPT0) with the jun-cc-pVXZ
(X = D, T) basis set[62] as implemented in
the PSI4 program.[63] The model segmental
SAPT analysis was discussed in our previous paper and is illustrated
and used here as an application of the SAPT data.[59]
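
The two extrapolation steps described in this section can be sketched in a few lines of Python. This is an illustrative sketch rather than the authors' script: it assumes the commonly used two-point X^-3 form of the Helgaker extrapolation, applies it directly to the interaction energies, and uses hypothetical function names. The sample numbers are the methane–pentylene entries (MP2/aDZ, MP2/aTZ, CCSD(T)/aDZ) reported for the Aa–Ae series in this paper.

```python
# Cardinal numbers X of the aug-cc-pVXZ basis sets.
CARDINAL = {"aDZ": 2, "aTZ": 3, "aQZ": 4}


def cbs_two_point(e_small, e_large, x_small, x_large):
    """Two-point X**-3 extrapolation to the CBS limit (assumed form)."""
    if x_large <= x_small:
        raise ValueError("x_large must exceed x_small")
    w_small, w_large = x_small ** 3, x_large ** 3
    return (w_large * e_large - w_small * e_small) / (w_large - w_small)


def focal_point(e_mp2_cbs, e_ccsdt_small, e_mp2_small):
    """Add the CCSD(T)-MP2 difference from a smaller basis to MP2/CBS."""
    return e_mp2_cbs + (e_ccsdt_small - e_mp2_small)


# Methane-pentylene binding energies in kcal/mol (from the Aa-Ae table).
mp2 = {"aDZ": -1.253, "aTZ": -1.422}
ccsdt_adz = -1.249

mp2_cbs = cbs_two_point(mp2["aDZ"], mp2["aTZ"], CARDINAL["aDZ"], CARDINAL["aTZ"])
ccsdt_cbs = focal_point(mp2_cbs, ccsdt_adz, mp2["aDZ"])
print(f"MP2/CBS = {mp2_cbs:.3f} kcal/mol, CCSD(T)/CBS = {ccsdt_cbs:.3f} kcal/mol")
```

Under these assumptions the sketch gives a CCSD(T)/CBS value that rounds to the tabulated −1.489 kcal/mol for this dimer, which suggests that the largest available MP2 pair and the largest available CCSD(T) basis set are used for each entry.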
Results and Discussion
The SOFG-31
data set contains 31 homodimers with monomers distributed
in three subsets. The alkane–alkene–alkyne (AAA) subset
contains 6 alkanes (methane to hexane), 4 alkenes (ethene to 1-pentene),
and 4 alkynes (ethyne to 1-pentyne). The alcohol–aldehyde–ketone
(AAK) subset includes 4 alcohols (methanol to 1-butanol), 4 aldehydes
(formaldehyde to butanal), and 3 ketones (acetone to 2-pentanone).
The carboxylic acid–amide (CAA) subset consists of 3 carboxylic
acids (formic acid to propanoic acid) and 3 amides (formamide to propanamide).
With the intended heterodimers in mind, we consider 239 cross-group
combinations with the pair monomers selected from respective subgroups.
More specifically, we classify the binary complexes according to the
following subsets. The AAA–AAA set contains 12 alkane–alkane
(Aa–Aa), 16 alkane–alkene (Aa–Ae), and 6 alkene–alkene
(Ae–Ae) heterodimers (34 in total). The AAA–AAK set
contains 16 alkane–alcohol (Aa–Ac), 16 alkane–aldehyde
(Aa–Ad), 12 alkane–ketone (Aa–K), 16 alkene–alcohol
(Ae–Ac), 16 alkene–aldehyde (Ae–Ad), and 12 alkene–ketone
(Ae–K) heterodimers (88 in total). The AAA–CAA set contains
12 alkane–carboxylic acid (Aa–Ca), 12 alkane–amide
(Aa–Am), 12 alkene–carboxylic acid (Ae–Ca), and
12 alkene–amide (Ae–Am) heterodimers (48 in total).
The AAK–AAK set contains 6 alcohol–alcohol (Ac–Ac),
and the AAK–CAA set contains 12 alcohol–carboxylic acid
(Ac–Ca), 12 alcohol–amide (Ac–Am), 12 aldehyde–carboxylic
acid (Ad–Ca), and 12 aldehyde–amide (Ad–Am) heterodimers
(54 in total). Finally, we consider 15 binary complexes in the CAA–CAA
set.
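
The bookkeeping of the subsets listed above can be checked with a short script. This is only a sketch: the dictionary layout and names are illustrative, and the counts are copied from the preceding paragraph, with the AAK–AAK and AAK–CAA sets grouped together as in the text.

```python
from math import comb

# Heterodimer counts per series, copied from the text above.
SUBSETS = {
    "AAA-AAA": {"Aa-Aa": 12, "Aa-Ae": 16, "Ae-Ae": 6},
    "AAA-AAK": {"Aa-Ac": 16, "Aa-Ad": 16, "Aa-K": 12,
                "Ae-Ac": 16, "Ae-Ad": 16, "Ae-K": 12},
    "AAA-CAA": {"Aa-Ca": 12, "Aa-Am": 12, "Ae-Ca": 12, "Ae-Am": 12},
    "AAK-AAK and AAK-CAA": {"Ac-Ac": 6, "Ac-Ca": 12, "Ac-Am": 12,
                            "Ad-Ca": 12, "Ad-Am": 12},
    "CAA-CAA": {"all": 15},
}

subtotals = {name: sum(series.values()) for name, series in SUBSETS.items()}
total = sum(subtotals.values())

# Number of distinct unordered pairs that 31 different monomers can form.
all_pairs = comb(31, 2)

for name, count in subtotals.items():
    print(f"{name:22s} {count:3d}")
print(f"total heterodimers: {total} of {all_pairs} possible pairs")
```

The subtotals reproduce the 34, 88, 48, 54, and 15 quoted in the text, and their sum is the 239 (out of 465) heterodimers that give the data set its name.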
Data Set for the AAA–AAA Heterodimers
Alkane–Alkane (Aa–Aa) Heterodimers
Figure shows the
optimized structures of the studied alkane–alkane heterodimers.
Notice that in the data set only the all-trans n-alkanes are considered. We expect that the complexes are
stabilized in a regular pattern due to their homology. Similar to
their homodimer counterparts, larger heterodimers exhibit a binding
pattern where the pair monomers are aligned in parallel with an inverse
zigzag (staggered) contact geometry to avoid the stereorepulsion frustrations.
This avoided stereorepulsion principle was first demonstrated clearly
by Tsuzuki et al.[64,65] for the alkane homodimers and
then verified by other groups, including ours.[66−69] Here we show that this principle
also works for heterodimers.
Figure 1
Optimized structures of the dimers in the Aa–Aa series.

In Table we summarize
the calculated MP2 and CCSD(T) energy data with the aug-cc-pVXZ (X
= D, T, Q) basis sets (denoted as aDZ, aTZ, and aQZ, respectively)
and the CBS extrapolation values for the alkane–alkane heterodimers.
We see that the MP2 energy exhibits a systematic converging trend
as the basis size increases. This indicates the good quality of Dunning’s
basis sets and the theoretically justified extrapolation rules. Our
calculated energy data are consistent with previous benchmark CCSD(T)/CBS
calculations for specific dimers. For example, the binding energy
of the methane–ethane dimer is −0.825 (in kcal/mol),
as compared to −0.827 (the A24 data set).[52]
Table 1
Binding Energies of the Dimers in the Aa–Aa Series

Alkane–Alkene (Aa–Ae) Heterodimers
Figure shows the
optimized structures of the studied alkane–alkene heterodimers.
Notice that for larger alkenes only the 1-alkenes are considered,
so we will omit the numeral tag for brevity’s sake. Overall,
the complexes are stabilized in a regular pattern. When both monomers
have short chains, as in the methane–ethene and ethane–propene
dimers, the alkane tends to incline toward the terminal double bond of
the paired alkene. As the chains become longer, they tend to align
in parallel as in the alkane–alkane (Aa–Aa) heterodimers.
In contrast to the latter, where the σ–σ interaction
(or dihydrogen bond) governs the stereorepulsion, the short-chain
alkane–alkene heterodimers employ the σ–π
interaction to avoid orbital overlap. For long-chain complexes,
the alkyl tails tend to stabilize again with the σ–σ
interactions, thus yielding the binding patterns as shown in Figure .
Figure 2
Optimized structures of the dimers in the Aa–Ae series.

Table summarizes
the MP2 and CCSD(T) energy data with different basis sets and their
CBS extrapolation values for the alkane–alkene heterodimers.
Looking at the specific values calculated from the aDZ to the aQZ
basis sets, the binding energy follows a systematic converging trend
as the basis size increases. This suggests the necessity of using
at least the aTZ basis set for this series. The MP2/aQZ energy
data are consistent with the MP2/CBS values for specific dimers.
For example, the MP2/aQZ binding energy of the methane–ethylene dimer
is −0.863 kcal/mol, as compared to −0.889 kcal/mol at the MP2/CBS level.
Table 2
Binding Energies of the Dimers in the Aa–Ae Series (kcal/mol)

| Dimer | MP2/aDZ | MP2/aTZ | MP2/aQZ | CCSD(T)/aDZ | CCSD(T)/aTZ | CCSD(T)/aQZ | CCSD(T)/CBS |
| --- | --- | --- | --- | --- | --- | --- | --- |
| methane–ethylene | −0.702 | −0.827 | −0.863 | −0.706 | −0.842 | −0.880 | −0.906 |
| methane–propylene | −0.920 | −1.041 | −1.081 | −0.870 | −0.988 | | −1.057 |
| methane–butylene | −1.016 | −1.168 | | −1.012 | −1.166 | | −1.230 |
| methane–pentylene | −1.253 | −1.422 | | −1.249 | | | −1.489 |
| ethane–ethylene | −1.111 | −1.302 | −1.356 | −1.074 | −1.280 | −1.323 | −1.362 |
| ethane–propylene | −1.543 | −1.745 | −1.808 | −1.380 | −1.573 | | −1.682 |
| ethane–butylene | −1.652 | −1.874 | | −1.581 | −1.806 | | −1.900 |
| ethane–pentylene | −1.886 | −2.129 | | −1.794 | | | −2.139 |
| propane–ethylene | −1.362 | −1.552 | −1.610 | −1.237 | −1.415 | | −1.515 |
| propane–propylene | −2.009 | −2.252 | | −1.760 | −1.974 | | −2.076 |
| propane–butylene | −2.106 | −2.353 | | −1.871 | | | −2.222 |
| propane–pentylene | −2.279 | −2.549 | | −2.159 | | | −2.543 |
| butane–ethylene | −1.615 | −1.806 | | −1.378 | | | −1.608 |
| butane–propylene | −2.355 | −2.628 | | −2.047 | | | −2.435 |
| butane–butylene | −2.469 | −2.737 | | −2.153 | | | −2.534 |
| butane–pentylene | −2.690 | −3.009 | | −2.542 | | | −2.995 |

Alkene–Alkene (Ae–Ae) Heterodimers
Figure shows the
optimized structures of the studied alkene–alkene heterodimers.
For the alkene–alkene series, the functional active sites (heads)
tend to form a T-shape cross pattern with respect to each other in
order to keep the π bonds as far apart as possible and minimize the
repulsion. This avoided stereorepulsion principle serves as a general
stabilization mechanism for hydrocarbons. In this case, it is the
π–π interaction which plays the role of stereorepulsion.[70−72] For long-chain complexes, such as the butene–pentene heterodimer,
the alkyl tails tend to stabilize using the σ–σ
interaction, thus competing with the functional heads.
Figure 3
Optimized structures of the dimers in the Ae–Ae series.

Table summarizes
the MP2 and CCSD(T) energy data with different basis sets and their
CBS extrapolation values for the Ae–Ae heterodimers. Similar
to the Aa–Aa and Aa–Ae series, the aTZ basis is suggested
for the binding energy calculations in this category. The MP2/aQZ
energy of the ethylene–propylene dimer differs by only 0.05 kcal/mol
from its MP2/CBS energy.
Table 3
Binding Energies of the Dimers in the Ae–Ae Series (kcal/mol)

| Dimer | MP2/aDZ | MP2/aTZ | MP2/aQZ | CCSD(T)/aDZ | CCSD(T)/aTZ | CCSD(T)/aQZ | CCSD(T)/CBS |
| --- | --- | --- | --- | --- | --- | --- | --- |
| ethylene–propylene | −1.687 | −1.893 | −1.962 | −1.438 | −1.618 | | −1.737 |
| ethylene–butylene | −1.734 | −1.937 | | −1.490 | −1.668 | | −1.753 |
| ethylene–pentylene | −2.008 | −2.261 | | −1.790 | −2.030 | | −2.150 |
| propylene–butylene | −2.409 | −2.665 | | −2.024 | | | −2.388 |
| propylene–pentylene | −2.810 | −3.125 | | −2.415 | | | −2.863 |

Data Set for the AAA–AAK Heterodimers
The molecules in the AAK groups contain an oxygen
atom in the functional
active site, which introduces the possibility of forming a hydrogen bond
in a heterodimer. Such hydrogen bonds are
expected to be weaker than those in the corresponding AAK–AAK homodimers.
The details are discussed below for each specific series of
complexes.
Alkane–Alcohol (Aa–Ac) Heterodimers
Figure shows the
optimized structures of the studied alkane–alcohol heterodimers.
In this series, the functional hydroxyl end in the alcohol group tends
to attract one carbon in the alkane group and the peripheral hydrogen
atoms around the carbon are repelled from each other in order to minimize
the repulsion. For long-chain complexes, the alkyl tails employ the
same avoided stereorepulsion principle for hydrocarbons. We see the
stabilized complexes are consistent with these two principles.
Figure 4
Optimized structures of the dimers in the Aa–Ac series.

Table summarizes
the MP2 and CCSD(T) energy data with different basis sets and their
CBS extrapolation values for the alkane–alcohol heterodimers.
We see the energy follows a systematic converging trend as the basis
size increases. This demonstrates the good quality of Dunning’s
basis set and the theoretically justified extrapolation rules, especially
for larger alkyl groups. For example, the MP2/aQZ energy of the methane–methanol dimer
differs by only 0.037 kcal/mol from its MP2/CBS energy.
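
The residual basis-set error of the MP2/aQZ energies quoted in the text can be recomputed from the tabulated aTZ and aQZ values. The sketch below assumes a two-point X^-3 extrapolation applied directly to the binding energies (the paper cites the method of Helgaker et al. without spelling out the formula), and the function and variable names are illustrative.

```python
def mp2_cbs_from_tq(e_atz, e_aqz):
    """Two-point X**-3 extrapolation from aTZ (X = 3) and aQZ (X = 4)."""
    return (4 ** 3 * e_aqz - 3 ** 3 * e_atz) / (4 ** 3 - 3 ** 3)


# (MP2/aTZ, MP2/aQZ) binding energies in kcal/mol, quoted from the tables.
MP2_TQ = {
    "methane-ethylene": (-0.827, -0.863),
    "ethylene-propylene": (-1.893, -1.962),
    "methane-methanol": (-1.185, -1.236),
}

gaps = {}
for dimer, (e_atz, e_aqz) in MP2_TQ.items():
    e_cbs = mp2_cbs_from_tq(e_atz, e_aqz)
    gaps[dimer] = abs(e_cbs - e_aqz)  # remaining aQZ-to-CBS difference
    print(f"{dimer:20s} MP2/CBS = {e_cbs:.3f}  gap = {gaps[dimer]:.3f} kcal/mol")
```

Under this assumption the three gaps come out near 0.026, 0.050, and 0.037 kcal/mol, in line with the differences quoted in the text for these dimers.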
Table 4
Binding Energies of the Dimers in the Aa–Ac Series (kcal/mol)

| Dimer | MP2/aDZ | MP2/aTZ | MP2/aQZ | CCSD(T)/aDZ | CCSD(T)/aTZ | CCSD(T)/aQZ | CCSD(T)/CBS |
| --- | --- | --- | --- | --- | --- | --- | --- |
| methane–methanol | −0.957 | −1.185 | −1.236 | −1.006 | −1.259 | −1.315 | −1.352 |
| methane–ethanol | −1.119 | −1.331 | −1.382 | −1.171 | −1.410 | | −1.498 |
| methane–propanol | −1.242 | −1.382 | −1.433 | −1.265 | −1.426 | | −1.514 |
| methane–butanol | −1.448 | −1.639 | | −1.458 | −1.666 | | −1.746 |
| ethane–methanol | −1.396 | −1.697 | −1.773 | −1.443 | −1.783 | −1.849 | −1.904 |
| ethane–ethanol | −1.481 | −1.687 | −1.751 | −1.477 | −1.710 | | −1.821 |
| ethane–propanol | −1.693 | −1.951 | −2.022 | −1.727 | −2.021 | | −2.144 |
| ethane–butanol | −1.882 | −2.127 | | −1.861 | −2.125 | | −2.228 |
| propane–methanol | −1.592 | −1.912 | −1.993 | −1.646 | −1.992 | | −2.132 |
| propane–ethanol | −2.059 | −2.333 | −2.410 | −2.018 | −2.317 | | −2.450 |
| propane–propanol | −2.212 | −2.462 | | −2.191 | −2.460 | | −2.565 |
| propane–butanol | −2.434 | −2.721 | | −2.374 | −2.663 | | −2.784 |
| butane–methanol | −1.722 | −2.050 | −2.133 | −1.767 | −2.116 | | −2.260 |
| butane–ethanol | −2.277 | −2.566 | | −2.220 | −2.528 | | −2.650 |
| butane–propanol | −2.454 | −2.808 | | −2.454 | −2.824 | | −2.973 |
| butane–butanol | −2.822 | −3.152 | | −2.732 | | | −3.201 |

Alkane–Aldehyde (Aa–Ad) Heterodimers
Figure shows the
optimized structures of the studied alkane–aldehyde heterodimers.
Compared to alcohols, the carbonyl oxygen in an aldehyde tends to
attract one hydrogen in the paired alkane group. Therefore, there
is one hydrogen in the alkane pointing to the oxygen in the aldehyde.
We see that the local −C=O–H structure appears in all
the stabilized complexes. Again, for long-chain complexes, the alkyl
tails tend to align in parallel. We see the stabilized complexes are
consistent with these observations.
Figure 5
Optimized structures of the dimers in the Aa–Ad series.

Table summarizes
the MP2 and CCSD(T) energy data with different basis sets and their
CBS extrapolation values for the alkane–aldehyde heterodimers.
The systematic convergence with increasing basis size can
also be seen here. Most MP2/aQZ data are presented here for the first
time, and their CCSD(T)/CBS data can serve as benchmark values for
comparison.
Table 5
Binding Energies of the Dimers in the Aa–Ad Series (kcal/mol)

| Dimer | MP2/aDZ | MP2/aTZ | MP2/aQZ | CCSD(T)/aDZ | CCSD(T)/aTZ | CCSD(T)/aQZ | CCSD(T)/CBS |
| --- | --- | --- | --- | --- | --- | --- | --- |
| methane–formaldehyde | −0.830 | −0.981 | −1.026 | −0.867 | −1.042 | −1.090 | −1.125 |
| methane–acetaldehyde | −0.966 | −1.111 | −1.156 | −0.959 | −1.106 | −1.149 | −1.182 |
| methane–propionaldehyde | −1.208 | −1.385 | −1.438 | −1.177 | −1.355 | | −1.430 |
| methane–butyraldehyde | −1.336 | −1.522 | | −1.297 | −1.482 | | −1.560 |
| ethane–formaldehyde | −1.074 | −1.257 | −1.319 | −1.011 | −1.203 | −1.251 | −1.296 |
| ethane–acetaldehyde | −1.477 | −1.712 | −1.782 | −1.394 | −1.630 | | −1.751 |
| ethane–propionaldehyde | −1.843 | −2.090 | −2.167 | −1.742 | −1.994 | | −2.127 |
| ethane–butyraldehyde | −1.911 | −2.174 | | −1.824 | −2.086 | | −2.197 |
| propane–formaldehyde | −1.456 | −1.702 | −1.776 | −1.402 | −1.651 | | −1.779 |
| propane–acetaldehyde | −1.866 | −2.141 | −2.221 | −1.742 | −2.007 | | −2.145 |
| propane–propionaldehyde | −2.121 | −2.400 | | −2.006 | −2.270 | | −2.387 |
| propane–butyraldehyde | −2.425 | −2.726 | | −2.284 | | | −2.712 |
| butane–formaldehyde | −1.582 | −1.843 | −1.923 | −1.464 | −1.711 | | −1.849 |
| butane–acetaldehyde | −2.150 | −2.432 | | −2.018 | −2.288 | | −2.407 |
| butane–propionaldehyde | −2.676 | −3.021 | | −2.502 | | | −2.992 |
| butane–butyraldehyde | −2.990 | −3.356 | | −2.785 | | | −3.305 |

Alkane–Ketone (Aa–K) Heterodimers
Figure shows the
optimized structures of the studied alkane–ketone heterodimers.
Similar to the aldehydes, the ketone functional oxygen also tends
to attract one hydrogen in the alkane group. The side methyl group
does not cause too much distortion of the local −C=O–H
structures, as this pattern is quite directional. As the alkyl chain
gets longer, the complexes tend to align in parallel.
Figure 6
Optimized structures of the dimers in the Aa–K series.

Table summarizes
the MP2 and CCSD(T) energy data with different basis sets and their
CBS extrapolation values for the alkane–ketone heterodimers.
For the methane–acetone and the ethane–acetone dimers,
where the MP2/aQZ optimization is converged, we see the energy follows
a systematic converging trend as the basis size increases. Similar
to the previous series, the improvement using the aTZ with respect
to the aDZ basis sets is more significant than that using the aQZ
with respect to the aTZ basis sets. For example, for the methane–acetone
dimer, the energy difference MP2/aDZ-aTZ is 0.160 kcal/mol, while
the MP2/aTZ-aQZ is only 0.049 kcal/mol. Thus, at least the aTZ basis
set should be used for the geometry optimization.
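
The basis-set increments discussed above can be tabulated directly from the MP2 entries for the two dimers with a converged aQZ optimization. This is a small illustrative sketch; the helper name is hypothetical and the energies are copied from the Aa–K table.

```python
def basis_increments(e_adz, e_atz, e_aqz):
    """Return the magnitudes of the aDZ->aTZ and aTZ->aQZ energy changes."""
    return abs(e_atz - e_adz), abs(e_aqz - e_atz)


# MP2 binding energies (aDZ, aTZ, aQZ) in kcal/mol from the Aa-K table.
MP2 = {
    "methane-acetone": (-1.186, -1.346, -1.395),
    "ethane-acetone": (-1.857, -2.104, -2.177),
}

increments = {dimer: basis_increments(*e) for dimer, e in MP2.items()}
for dimer, (d_dt, d_tq) in increments.items():
    print(f"{dimer:16s} aDZ->aTZ = {d_dt:.3f}  aTZ->aQZ = {d_tq:.3f}  "
          f"ratio = {d_dt / d_tq:.1f}")
```

For both dimers the aDZ-to-aTZ step is more than three times larger than the aTZ-to-aQZ step, which is the quantitative basis for recommending at least the aTZ basis set.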
Table 6
Binding Energies of the Dimers in the Aa–K Series (kcal/mol)

| Dimer | MP2/aDZ | MP2/aTZ | MP2/aQZ | CCSD(T)/aDZ | CCSD(T)/aTZ | CCSD(T)/aQZ | CCSD(T)/CBS |
| --- | --- | --- | --- | --- | --- | --- | --- |
| methane–acetone | −1.186 | −1.346 | −1.395 | −1.160 | −1.317 | | −1.402 |
| methane–butanone | −1.464 | −1.658 | | −1.417 | −1.608 | | −1.698 |
| methane–pentanone | −1.579 | −1.781 | | −1.523 | −1.716 | | −1.801 |
| ethane–acetone | −1.857 | −2.104 | −2.177 | −1.746 | −1.989 | | −2.115 |
| ethane–butanone | −2.259 | −2.534 | | −2.114 | −2.391 | | −2.507 |
| ethane–pentanone | −2.231 | −2.506 | | −2.119 | −2.390 | | −2.506 |
| propane–acetone | −2.529 | −2.854 | | −2.336 | −2.644 | | −2.781 |
| propane–butanone | −2.528 | −2.821 | | −2.366 | | | −2.782 |
| propane–pentanone | −2.672 | −2.979 | | −2.518 | | | −2.954 |
| butane–acetone | −2.838 | −3.175 | | −2.632 | −2.949 | | −3.091 |
| butane–butanone | −3.332 | −3.717 | | −3.077 | | | −3.624 |
| butane–pentanone | −3.342 | −3.733 | | −3.088 | | | −3.644 |

Alkene–Alcohol (Ae–Ac) Heterodimers
Figure shows the
optimized structures of the studied alkene–alcohol heterodimers.
For this series, the functional −OH end in the alcohol group
tends to attract the nucleophilic region in the alkene group. Therefore,
the local −OH−π pattern is found in all the stabilized
complexes. This hydrogen-mediated bonding is highly directional, similar
to a hydrogen bond, so when the chains get longer, the alkyl tails
yield to this dominant structural pattern but do not always align
in parallel. This subtle point can be seen very clearly in Figure .
Figure 7
Optimized structures of the dimers in the Ae–Ac series.

Table summarizes
the MP2 and CCSD(T) energy data with different basis sets and their
CBS extrapolation values for the alkene–alcohol heterodimers.
The systematic converging trend with increasing basis size lends
confidence to the CCSD(T)/CBS calculations. In general, the larger
the alkyl group is, the stronger the binding.
Table 7
Binding Energies of the Dimers in the Ae–Ac Series (kcal/mol)

| Dimer | MP2/aDZ | MP2/aTZ | MP2/aQZ | CCSD(T)/aDZ | CCSD(T)/aTZ | CCSD(T)/aQZ | CCSD(T)/CBS |
| --- | --- | --- | --- | --- | --- | --- | --- |
| ethylene–methanol | −2.614 | −2.934 | −3.018 | −2.394 | −2.711 | −2.784 | −2.835 |
| ethylene–ethanol | −2.786 | −3.129 | −3.217 | −2.537 | −2.874 | | −3.026 |
| ethylene–propanol | −2.917 | −3.262 | | −2.651 | −2.986 | | −3.131 |
| ethylene–butanol | −3.043 | −3.392 | | −2.776 | −3.119 | | −3.266 |
| propylene–methanol | −2.228 | −2.471 | −2.556 | −2.114 | −2.379 | | −2.526 |
| propylene–ethanol | −3.645 | −4.053 | | −3.356 | −3.762 | | −3.934 |
| propylene–propanol | −3.878 | −4.306 | | −3.545 | −3.962 | | −4.142 |
| propylene–butanol | −4.104 | −4.542 | | −3.737 | | | −4.359 |
| butylene–methanol | −3.602 | −4.045 | −4.167 | −3.384 | −3.835 | | −4.046 |
| butylene–ethanol | −3.864 | −4.321 | | −3.627 | −4.086 | | −4.278 |
| butylene–propanol | −4.049 | −4.511 | | −3.771 | | | −4.428 |
| butylene–butanol | −4.307 | −4.790 | | −4.015 | | | −4.701 |
| pentylene–methanol | −3.531 | −3.914 | | −3.288 | −3.673 | | −3.834 |
| pentylene–ethanol | −3.828 | −4.309 | | −3.619 | | | −4.303 |
| pentylene–propanol | −4.047 | −4.471 | | −3.695 | | | −4.298 |
| pentylene–butanol | −4.592 | −5.121 | | −4.324 | | | −5.076 |

Alkene–Aldehyde (Ae–Ad) Heterodimers
Figure shows the
optimized structures of the studied alkene–aldehyde heterodimers.
The aldehyde functional oxygen tends to attract one hydrogen in the
alkene group. However, the attraction is largely reduced by the confronting
π–π repulsion, which is similar to the alkene–alkene
heterodimers. The two double bonds thus tend to avoid each
other. Therefore, we see the local T-shape structure appear in all
the stabilized complexes. This bonding pattern is of high directionality,
so when the chains get longer, the alkyl tails yield to this dominant
structural pattern but do not always align in parallel. This subtle point
can be seen very clearly in Figure .
Figure 8
Optimized structures of the dimers in the Ae–Ad series.

Table summarizes
the MP2 and CCSD(T) energy data with different basis sets and their
CBS extrapolation values for the alkene–aldehyde heterodimers.
The converging trends of the energy with respect to both the basis size and
the alkyl group size can be seen in this series. In comparison to
the corresponding Ae–Ac series, the Ae–Ad series has
weaker binding. This might be due to the confronting π–π
repulsion described above.
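
The comparison between the two alkene series can be made concrete with the tabulated CCSD(T)/CBS values. The sketch below is illustrative only: pairing each alcohol with the aldehyde of the same carbon count (C1 to C4) is an assumption made here for the comparison, and the variable names are hypothetical.

```python
from statistics import mean

# CCSD(T)/CBS binding energies in kcal/mol; partners ordered C1..C4.
AE_AC = {  # alkene-alcohol (methanol, ethanol, propanol, butanol)
    "ethylene": (-2.835, -3.026, -3.131, -3.266),
    "propylene": (-2.526, -3.934, -4.142, -4.359),
    "butylene": (-4.046, -4.278, -4.428, -4.701),
    "pentylene": (-3.834, -4.303, -4.298, -5.076),
}
AE_AD = {  # alkene-aldehyde (formaldehyde, acetaldehyde, propanal, butanal)
    "ethylene": (-1.781, -2.226, -2.413, -2.517),
    "propylene": (-2.668, -2.884, -2.962, -3.208),
    "butylene": (-2.662, -3.105, -3.158, -3.359),
    "pentylene": (-2.620, -3.011, -3.313, -3.574),
}

mean_ac = mean(e for row in AE_AC.values() for e in row)
mean_ad = mean(e for row in AE_AD.values() for e in row)

# Pairs in which the aldehyde complex binds more strongly than the alcohol one.
exceptions = [
    (alkene, f"C{n}")
    for alkene in AE_AC
    for n, (e_ac, e_ad) in enumerate(zip(AE_AC[alkene], AE_AD[alkene]), start=1)
    if e_ad < e_ac
]

print(f"mean Ae-Ac = {mean_ac:.3f} kcal/mol, mean Ae-Ad = {mean_ad:.3f} kcal/mol")
print("aldehyde stronger than alcohol for:", exceptions)
```

On average the Ae–Ad complexes bind about 1 kcal/mol more weakly than the Ae–Ac complexes. With this pairing, the propylene C1 pair (methanol versus formaldehyde) is the only case in which the order is reversed.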
Table 8
Binding Energies of the Dimers in the Ae–Ad Series (kcal/mol)

| Dimer | MP2/aDZ | MP2/aTZ | MP2/aQZ | CCSD(T)/aDZ | CCSD(T)/aTZ | CCSD(T)/aQZ | CCSD(T)/CBS |
| --- | --- | --- | --- | --- | --- | --- | --- |
| ethylene–formaldehyde | −1.629 | −1.845 | −1.924 | −1.468 | −1.661 | −1.725 | −1.781 |
| ethylene–acetaldehyde | −2.038 | −2.287 | −2.370 | −1.851 | −2.082 | −2.165 | −2.226 |
| ethylene–propanal | −2.302 | −2.576 | | −2.049 | −2.298 | | −2.413 |
| ethylene–butanal | −2.437 | −2.719 | | −2.144 | −2.398 | | −2.517 |
| propylene–formaldehyde | −2.536 | −2.845 | −2.964 | −2.204 | −2.462 | | −2.668 |
| propylene–acetaldehyde | −2.767 | −3.091 | | −2.466 | −2.748 | | −2.884 |
| propylene–propanal | −2.842 | −3.158 | | −2.544 | −2.829 | | −2.962 |
| propylene–butanal | −3.027 | −3.360 | | −2.735 | | | −3.208 |
| butylene–formaldehyde | −2.515 | −2.816 | −2.930 | −2.213 | −2.465 | | −2.662 |
| butylene–acetaldehyde | −3.034 | −3.350 | | −2.701 | −2.974 | | −3.105 |
| butylene–propanal | −2.990 | −3.300 | | −2.717 | | | −3.158 |
| butylene–butanal | −3.217 | −3.535 | | −2.907 | | | −3.359 |
| pentylene–formaldehyde | −2.555 | −2.851 | | −2.248 | −2.495 | | −2.620 |
| pentylene–acetaldehyde | −2.924 | −3.228 | | −2.623 | −2.883 | | −3.011 |
| pentylene–propanal | −3.165 | −3.479 | | −2.867 | | | −3.313 |
| pentylene–butanal | −3.441 | −3.767 | | −3.111 | | | −3.574 |

Alkene–Ketone (Ae–K) Heterodimers
Figure shows the
optimized structures of the studied alkene–ketone heterodimers.
Similar to aldehydes, the ketone functional oxygen also tends to attract
one hydrogen in the alkene group but is hindered by the confronting
π–π repulsion. This is further complicated by the
side methyl group which tends to tilt the local perpendicular structures.
Similar to the alkene–aldehyde heterodimers, we observe the
local T-shape structure in all the stabilized complexes, and for longer
chains, the alkyl tails do not always align in parallel. We see in
this case there are several competing mechanisms for stabilizing the
overall conformations.
Figure 9
Optimized structures of the dimers in the Ae–K series.

Table summarizes
the MP2 and CCSD(T) energy data with different basis sets and their
CBS extrapolation values for the alkene–ketone heterodimers.
The computational cost when enlarging the alkyl groups on both the
alkene and ketone sites is significant for this series, mainly because
there are no regular expected configurations from which to start the optimization.
Therefore, a proper choice of a smaller basis size for balancing the
computational cost is necessary. For example, the MP2 energy of the
ethylene–acetone dimer follows a systematic converging trend
as the basis size increases, and the aTZ basis set is suggested.
Table 9
Binding Energies of the Dimers in the Ae–K Series

| Dimer | aDZ: MP2 [CCSD(T)] | aTZ: MP2 [CCSD(T)] | aQZ: MP2 [CCSD(T)] | CBS: CCSD(T) |
|---|---|---|---|---|
| ethylene–acetone | −2.618 [−2.334] | −2.914 [−2.603] | −3.011 | −2.771 |
| ethylene–butanone | −2.773 [−2.457] | −3.086 [−2.738] | | −2.870 |
| ethylene–pentanone | −2.880 [−2.542] | −3.194 [−2.820] | | −2.952 |
| propylene–acetone | −3.516 [−3.094] | −3.863 [−3.397] | | −3.543 |
| propylene–butanone | −3.896 [−3.394] | −4.280 | | −3.949 |
| propylene–pentanone | −4.050 [−3.518] | −4.432 | | −4.061 |
| butylene–acetone | −3.623 [−3.193] | −3.965 | | −3.679 |
| butylene–butanone | −4.010 [−3.505] | −4.387 | | −4.041 |
| butylene–pentanone | −4.171 [−3.638] | −4.546 | | −4.171 |
| pentylene–acetone | −3.731 [−3.290] | −4.073 | | −3.776 |
| pentylene–butanone | −4.048 [−3.554] | −4.420 | | −4.083 |
| pentylene–pentanone | −4.027 [−3.747] | −4.462 | | −4.365 |
Data Set for the AAA–CAA Heterodimers
The molecules in the CAA groups all contain
two functional active
sites for the hydrogen bond donor and acceptor, respectively. However,
the paired monomers are hydrocarbons, so the possibility of forming
the double hydrogen bonding pattern decreases as the chains get longer.
A more generally expected pattern would be the formation of a weak
hydrogen bond, similar to that in the AAA–AAK heterodimers discussed
above. The
strengths of interaction are expected to be stronger than the corresponding
AAA–AAK heterodimers but may compete with those of the AAK–AAK
homodimers. The details are discussed along with the following further
specific complexes.
Alkane–Amide (Aa–Am) and Alkane–Carboxylic Acid (Aa–Ca) Heterodimers
Figure 10 shows the optimized structures of the studied
Aa–Am and Aa–Ca heterodimers in the alkane–CAA
series. For this series, the functional hydrogen donor (acceptor)
site tends to attract one carbon (hydrogen) in the alkane group. Overall
the pattern is dominated by a major single hydrogen bond with a compromise
in balancing a weaker electrostatic interaction and a van der Waals
bond. The peripheral hydrogen atoms are repelled from each other so
as to minimize the repulsion. For long-chain complexes, the alkyl
tails employ the same avoided stereorepulsion principle for hydrocarbons.
We see the stabilized complexes are consistent with these principles.
Figure 10
Optimized structures of the dimers in the Aa–Am and Aa–Ca series.
Tables 10 and 11 summarize the MP2 and CCSD(T) energy data with different basis sets and their CBS extrapolation values for the Aa–Am and Aa–Ca heterodimers, respectively. The basis set effect can clearly be seen in these two categories. The energy follows a systematic converging trend as the basis size increases, as is most evident in the energy difference between the aDZ and the aTZ basis sets. This demonstrates the good quality of Dunning’s basis sets and the theoretically justified extrapolation rules. Because of the competition mechanism, the hydrogen bonding pattern is not significant in this series, so the binding energy is lower than the usual strength of a hydrogen bond. This implies that the electrostatic interaction is not the dominant attraction term. The MP2/aQZ energy of the methane–formic acid dimer differs from its MP2/CBS energy by only 0.047 kcal/mol.
Table 10
Binding Energies of the Dimers in the Aa–Am Series

| Dimer | aDZ: MP2 [CCSD(T)] | aTZ: MP2 [CCSD(T)] | aQZ: MP2 [CCSD(T)] | CBS: CCSD(T) |
|---|---|---|---|---|
| methane–formamide | −1.170 [−1.220] | −1.398 [−1.483] | −1.453 [−1.542] | −1.585 |
| methane–acetamide | −1.263 [−1.282] | −1.405 [−1.425] | −1.462 [−1.473] | −1.515 |
| methane–propanamide | −2.432 [−2.479] | −2.534 [−2.572] | −2.587 | −2.664 |
| ethane–formamide | −1.563 [−1.630] | −1.792 [−1.903] | −1.857 | −2.015 |
| ethane–acetamide | −1.849 [−1.816] | −2.073 [−2.044] | −2.146 | −2.174 |
| ethane–propionamide | −3.125 [−3.126] | −3.292 [−3.291] | | −3.361 |
| propane–formamide | −2.131 [−2.113] | −2.451 [−2.462] | −2.539 | −2.614 |
| propane–acetamide | −2.647 [−2.567] | −2.987 [−2.883] | | −3.026 |
| propane–propanamide | −3.604 [−3.561] | −3.808 | | −3.831 |
| butane–formamide | −2.376 [−2.326] | −2.709 [−2.683] | | −2.823 |
| butane–acetamide | −2.869 [−2.761] | −3.179 | | −3.202 |
| butane–propanamide | −4.305 [−4.225] | −4.593 | | −4.634 |
Table 11
Binding Energies of the Dimers in the Aa–Ca Series

| Dimer | aDZ: MP2 [CCSD(T)] | aTZ: MP2 [CCSD(T)] | aQZ: MP2 [CCSD(T)] | CBS: CCSD(T) |
|---|---|---|---|---|
| methane–formic acid | −1.238 [−1.274] | −1.564 [−1.644] | −1.628 | −1.759 |
| methane–acetic acid | −1.210 [−1.269] | −1.517 [−1.620] | −1.580 | −1.733 |
| methane–propanoic acid | −1.390 [−1.368] | −1.593 [−1.574] | −1.651 | −1.674 |
| ethane–formic acid | −1.598 [−1.661] | −1.904 [−2.018] | −1.979 | −2.148 |
| ethane–acetic acid | −1.607 [−1.560] | −1.850 [−1.808] | −1.922 | −1.933 |
| ethane–propanoic acid | −1.859 [−1.808] | −2.114 [−2.069] | | −2.176 |
| propane–formic acid | −1.950 [−2.005] | −2.287 [−2.382] | −2.376 | −2.536 |
| propane–acetic acid | −2.194 [−2.111] | −2.510 [−2.424] | | −2.557 |
| propane–propanoic acid | −2.237 [−2.156] | −2.525 | | −2.565 |
| butane–formic acid | −2.130 [−2.117] | −2.551 [−2.571] | | −2.748 |
| butane–acetic acid | −2.397 [−2.289] | −2.732 [−2.617] | | −2.758 |
| butane–propanoic acid | −2.927 [−2.804] | −3.308 | | −3.345 |
Alkene–Amide (Ae–Am) and Alkene–Carboxylic Acid (Ae–Ca) Heterodimers
The optimized heterodimers formed by the alkene–amide (Ae–Am) and the alkene–carboxylic acid (Ae–Ca) pairs are shown in Figure 11. The dimers are bound by an −H−π interaction together with a −C=O–H side interaction, where the former comes from the −NH in the Ae–Am and the −OH in the Ae–Ca groups, respectively. The functional hydrogen bond donor site (−OH for the carboxylic acid and −NH for the amide) tends to attract the electron-rich π region of the alkene group. Therefore, the local −OH−π or −NH−π pattern is found in all the stabilized complexes. This hydrogen bonding is highly directional, so when the chains get longer, the alkyl tails yield to this dominant structural pattern but do not always align in parallel.
Figure 11
Optimized structures of the dimers in the Ae–Am and Ae–Ca series.
Tables 12 and 13 summarize the MP2 and CCSD(T) energy data with different basis sets and their CBS extrapolation values for the Ae–Am and the Ae–Ca heterodimers, respectively. The improvement on going from the aDZ to the aTZ basis set is more significant than that on going from the aTZ to the aQZ basis set. Generally, the binding energy of an Ae–Ca dimer is slightly larger than that of the corresponding Ae–Am dimer.
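The basis-set trend stated above can be checked with simple differences. The following is a minimal sketch using the MP2 entries for the ethylene–formamide dimer from the Ae–Am table; the helper name is hypothetical.

```python
def basis_increments(energies: dict[str, float]) -> dict[str, float]:
    """Return the change in binding energy between successive basis sets."""
    order = ["aDZ", "aTZ", "aQZ"]
    present = [b for b in order if b in energies]
    return {
        f"{lo}->{hi}": energies[hi] - energies[lo]
        for lo, hi in zip(present, present[1:])
    }


# MP2 binding energies (kcal/mol) of ethylene-formamide.
mp2 = {"aDZ": -3.266, "aTZ": -3.630, "aQZ": -3.736}
steps = basis_increments(mp2)

for label, delta in steps.items():
    print(f"{label}: {delta:+.3f} kcal/mol")
```

The aDZ to aTZ step (about −0.36 kcal/mol) is roughly three times the aTZ to aQZ step (about −0.11 kcal/mol), in line with the convergence behavior described in the text.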
Table 12
Binding Energies of the Dimers in the Ae–Am Series

| Dimer | aDZ: MP2 [CCSD(T)] | aTZ: MP2 [CCSD(T)] | aQZ: MP2 [CCSD(T)] | CBS: CCSD(T) |
|---|---|---|---|---|
| ethylene–formamide | −3.266 [−3.081] | −3.630 [−3.470] | −3.736 | −3.653 |
| ethylene–acetamide | −3.327 [−3.169] | −3.673 [−3.538] | −3.779 | −3.721 |
| ethylene–propionamide | −4.261 [−4.149] | −4.502 [−4.411] | | −4.512 |
| propylene–formamide | −4.134 [−3.864] | −4.506 [−4.260] | −4.624 | −4.464 |
| propylene–acetamide | −4.092 [−3.868] | −4.499 [−4.293] | | −4.474 |
| propylene–propionamide | −5.043 [−4.842] | −5.279 [−5.090] | | −5.189 |
| butylene–formamide | −4.212 [−3.971] | −4.657 [−4.436] | −4.790 | −4.632 |
| butylene–acetamide | −4.311 [−4.095] | −4.733 | | −4.692 |
| butylene–propionamide | −5.288 [−5.109] | −5.603 | | −5.557 |
| pentylene–formamide | −4.300 [−4.012] | −4.664 [−4.397] | | −4.559 |
| pentylene–acetamide | −4.323 [−4.065] | −4.671 | | −4.560 |
| pentylene–propionamide | −5.788 [−5.559] | −6.143 | | −6.063 |
Table 13
Binding Energies of the Dimers in the Ae–Ca Series

| Dimer | aDZ: MP2 [CCSD(T)] | aTZ: MP2 [CCSD(T)] | aQZ: MP2 [CCSD(T)] | CBS: CCSD(T) |
|---|---|---|---|---|
| ethylene–formic acid | −3.875 [−3.581] | −4.392 [−4.134] | −4.516 | −4.348 |
| ethylene–acetic acid | −3.715 [−3.475] | −4.222 [−4.016] | −4.346 | −4.230 |
| ethylene–propanoic acid | −3.700 [−3.473] | −4.205 [−4.010] | | −4.223 |
| propylene–formic acid | −4.921 [−4.530] | −5.466 [−5.107] | −5.606 | −5.349 |
| propylene–acetic acid | −4.564 [−4.229] | −5.131 [−4.825] | | −5.064 |
| propylene–propanoic acid | −4.687 [−4.368] | −5.205 [−4.912] | | −5.130 |
| butylene–formic acid | −4.946 [−4.564] | −5.481 [−5.126] | −5.619 | −5.365 |
| butylene–acetic acid | −4.726 [−4.404] | −5.237 | | −5.130 |
| butylene–propanoic acid | −4.720 [−4.396] | −5.227 | | −5.116 |
| pentylene–formic acid | −5.212 [−4.834] | −5.867 | | −5.765 |
| pentylene–acetic acid | −5.118 [−4.783] | −5.754 | | −5.687 |
| pentylene–propanoic acid | −4.800 [−4.486] | −5.305 | | −5.204 |
Database for the AAK–AAK Heterodimers
Figure 12 shows
the optimized structures of the stabilized dimers in the AAK groups.
As expected, all bonding patterns show the single −O–H–O
hydrogen-bonded configurations. When the alkyl group gets longer,
there is a competition between the other attractive components and
the hydrogen bonding. The alcohol series is clearly dominated by the
electrostatic energy, while the dispersion energy due to the alkyl
group adds up to modify the configuration. As can be seen in Figure , the electrostatic
energy and the dispersion energy compete in these dimers.
Figure 12
Optimized structures of the dimers in the Ac–Ac series.
Table 14 summarizes the MP2 and CCSD(T) energy data with different basis sets and their CBS extrapolation values for the AAK groups. The data exhibit a systematic convergence trend, which again indicates the high quality of the calculations. Our MP2/aQZ energies are within 0.1 kcal/mol of the MP2/CBS values for specific dimers (e.g., the methanol–ethanol and methanol–propanol dimers).
Table 14
Binding Energies of the Dimers in the Ac–Ac Series

| Dimer | aDZ: MP2 [CCSD(T)] | aTZ: MP2 [CCSD(T)] | aQZ: MP2 [CCSD(T)] | CBS: CCSD(T) |
|---|---|---|---|---|
| methanol–ethanol | −5.852 [−5.759] | −6.226 [−6.208] | −6.402 | −6.512 |
| methanol–propanol | −5.629 [−5.555] | −6.008 [−5.984] | −6.186 | −6.292 |
| methanol–butanol | −5.721 [−5.646] | −6.121 [−6.098] | | −6.266 |
| ethanol–propanol | −5.997 [−5.905] | −6.382 [−6.356] | | −6.518 |
| ethanol–butanol | −6.528 [−6.436] | −6.968 [−6.938] | | −7.123 |
| propanol–butanol | −6.591 [−6.530] | −7.070 [−7.030] | | −7.232 |
Database for the AAK–CAA Heterodimers
Alcohol–Amide (Ac–Am) and Alcohol–Carboxylic Acid (Ac–Ca) Heterodimers
Figure 13 shows the optimized structures of the studied Ac–Am and Ac–Ca heterodimers. In these two series, each monomer carries one hydrogen bond donor and one hydrogen bond acceptor, which offers the opportunity of forming double hydrogen bonds within the pairs.
For the Ac–Am dimers, the two hydrogen bonds stem from an −OH
on the alcohol with an oxygen on the amide, and an −NH on the
amide with an oxygen on the alcohol. On the other hand, for the Ac–Ca
dimers, the two hydrogen bonds stem from an −OH on the alcohol
with an oxygen on the carboxylic acid and an −OH on the carboxylic
acid with an oxygen on the alcohol. If the alcohol is methanol, the
double hydrogen bond forms a planar ring. The short alkyl tail in
methanol does not alter the double hydrogen bond. However, as the
tail of alcohol gets longer, the alkyl group comes into play and leads
to more complicated structures.
Figure 13
Optimized structures of the dimers in the Ac–Am and Ac–Ca series.
From Tables 15 and 16, where we summarize the calculated MP2 and CCSD(T) energy data for the Ac–Am and the Ac–Ca heterodimers, respectively, we can see that the hydrogen bond dominates the binding energy of these heterodimers (ca. 10–11 kcal/mol). However, because of the competition mechanism, larger alkyl groups on the alcohols do not always yield larger binding energies. For example, the binding energy of propanol–formic acid is actually larger than that of butanol–formic acid. Similarly, longer alkyl groups on the carboxylic acids do not guarantee larger binding energies either. This is because the partial charges on the oxygen atoms, both in the alcohols and the carboxylic acids, are modified by the longer alkyl groups.
Table 15
Binding Energies of the Dimers in the Ac–Am Series

| Dimer | aDZ: MP2 [CCSD(T)] | aTZ: MP2 [CCSD(T)] | aQZ: MP2 [CCSD(T)] | CBS: CCSD(T) |
|---|---|---|---|---|
| methanol–formamide | −9.091 [−9.052] | −9.714 [−9.765] | −10.003 [−10.066] | −10.277 |
| methanol–acetamide | −9.443 [−9.429] | −10.064 [−10.139] | −10.360 | −10.651 |
| methanol–propionamide | −9.587 [−9.581] | −10.225 [−10.309] | | −10.578 |
| ethanol–formamide | −9.236 [−9.186] | −9.863 [−9.909] | −10.154 | −10.412 |
| ethanol–acetamide | −9.612 [−9.585] | −10.255 [−10.320] | −10.554 | −10.837 |
| ethanol–propionamide | −10.501 [−10.518] | −11.021 [−11.102] | | −11.321 |
| propanol–formamide | −9.468 [−9.404] | −10.158 [−10.176] | −10.451 [−10.469] | −10.683 |
| propanol–acetamide | −9.715 [−9.690] | −10.330 [−10.390] | | −10.649 |
| propanol–propionamide | −10.593 [−10.614] | −11.075 [−11.162] | | −11.365 |
| butanol–formamide | −9.349 [−9.292] | −10.208 [−10.207] | | −10.569 |
| butanol–acetamide | −9.749 [−9.716] | −10.374 | | −10.601 |
| butanol–propionamide | −10.779 [−10.762] | −11.421 | | −11.671 |
Table 16
Binding Energies of the Dimers in the Ac–Ca Series

| Dimer | aDZ: MP2 [CCSD(T)] | aTZ: MP2 [CCSD(T)] | aQZ: MP2 [CCSD(T)] | CBS: CCSD(T) |
|---|---|---|---|---|
| methanol–formic acid | −9.828 [−9.681] | −10.59 [−10.559] | −10.917 [−10.908] | −11.147 |
| methanol–acetic acid | −9.743 [−9.673] | −10.514 [−10.551] | −10.851 | −11.134 |
| methanol–propanoic acid | −9.69 [−9.643] | −10.455 [−10.514] | | −10.836 |
| ethanol–formic acid | −10.09 [−9.923] | −10.883 [−10.838] | −11.212 | −11.407 |
| ethanol–acetic acid | −10.094 [−9.949] | −10.822 [−10.791] | −11.158 | −11.372 |
| ethanol–propanoic acid | −9.954 [−9.890] | −10.759 [−10.805] | | −11.143 |
| propanol–formic acid | −10.09 [−10.021] | −10.883 [−10.997] | −11.365 | −11.831 |
| propanol–acetic acid | −10.094 [−10.000] | −10.822 [−10.978] | | −11.285 |
| propanol–propanoic acid | −9.954 [−9.991] | −10.759 [−10.963] | | −11.302 |
| butanol–formic acid | −10.163 [−9.996] | −10.914 [−10.862] | | −11.178 |
| butanol–acetic acid | −10.079 [−9.993] | −10.845 | | −11.081 |
| butanol–propanoic acid | −10.035 [−9.972] | −10.795 | | −11.052 |
Aldehyde–Amide (Ad–Am) and Aldehyde–Carboxylic Acid (Ad–Ca) Heterodimers
Figure 14 shows the optimized structures of the studied Ad–Am and Ad–Ca heterodimers. In the two series, an Ad–Ca or an Ad–Am dimer tends to form a planar double hydrogen bonded ring with the corresponding carbonyl functional groups. However, there is a subtle difference between these two series. The Ad–Am dimer indeed retains the planar pattern for shorter chains, but the carbonyl oxygen interacts with the other hydrogens on the longer alkyl groups (e.g., contrast panels 15, 17, and 19 with panels 16, 18, and 20 in Figure 14).
Figure 14
Optimized structures of the dimers in the Ad–Am and Ad–Ca series.
In Tables 17 and 18 we summarize the calculated MP2 and CCSD(T) energy data for the Ad–Am and the Ad–Ca heterodimers, respectively. We can see that the hydrogen bond dominates the binding energy of these heterodimers (ca. 8–9 kcal/mol). However, because of the competition mechanism, larger alkyl groups on both chains do not always yield larger binding energies. This is also because the partial charges on the involved atoms are modified by the longer alkyl groups. The binding energies in these series are generally smaller than those of the corresponding Ac–Am and Ac–Ca series.
Table 17
Binding Energies of the Dimers in the Ad–Am Series

| Dimer | aDZ: MP2 [CCSD(T)] | aTZ: MP2 [CCSD(T)] | aQZ: MP2 [CCSD(T)] | CBS: CCSD(T) |
|---|---|---|---|---|
| formaldehyde–formamide | −7.215 [−7.324] | −7.587 [−7.785] | −7.814 [−8.013] | −8.179 |
| formaldehyde–acetamide | −7.140 [−7.288] | −7.504 [−7.739] | −7.728 | −8.126 |
| formaldehyde–propionamide | −7.997 [−8.193] | −8.233 [−8.498] | | −8.597 |
| acetaldehyde–formamide | −7.173 [−7.226] | −7.513 [−7.649] | −7.724 | −8.014 |
| acetaldehyde–acetamide | −7.452 [−7.650] | −7.825 [−8.109] | −8.056 | −8.270 |
| acetaldehyde–propionamide | −7.889 [−8.034] | −8.088 [−8.296] | | −8.380 |
| propionaldehyde–formamide | −7.566 [−7.736] | −7.946 [−8.205] | −8.176 | −8.606 |
| propionaldehyde–acetamide | −7.181 [−7.216] | −7.636 [−7.701] | | −7.893 |
| propionaldehyde–propionamide | −8.304 [−8.331] | −8.682 | | −8.868 |
| butyraldehyde–formamide | −7.510 [−7.426] | −8.023 [−7.981] | | −8.197 |
| butyraldehyde–acetamide | −7.568 [−7.521] | −8.056 | | −8.214 |
| butyraldehyde–propionamide | −8.508 [−8.494] | −8.884 | | −9.028 |
Table 18
Binding Energies of the Dimers in the Ad–Ca Series

| Dimer | aDZ: MP2 [CCSD(T)] | aTZ: MP2 [CCSD(T)] | aQZ: MP2 [CCSD(T)] | CBS: CCSD(T) |
|---|---|---|---|---|
| formaldehyde–formic acid | −8.442 [−8.460] | −9.054 [−9.166] | −9.329 [−9.452] | −9.653 |
| formaldehyde–acetic acid | −8.064 [−8.160] | −8.654 [−8.842] | −8.927 [−9.125] | −9.332 |
| formaldehyde–propanoic acid | −7.994 [−8.109] | −8.576 [−8.783] | | −9.028 |
| acetaldehyde–formic acid | −9.220 [−9.257] | −9.871 [−10.006] | −10.156 | −10.499 |
| acetaldehyde–acetic acid | −8.717 [−8.851] | −9.341 [−9.569] | −9.623 | −10.057 |
| acetaldehyde–propanoic acid | −8.641 [−8.796] | −9.254 [−9.503] | | −9.761 |
| propanal–formic acid | −9.257 [−9.314] | −9.905 [−10.063] | −10.187 | −10.551 |
| propanal–acetic acid | −8.737 [−8.894] | −9.358 [−9.610] | | −9.871 |
| propanal–propanoic acid | −8.661 [−8.840] | −9.273 | | −9.710 |
| butanal–formic acid | −9.257 [−9.359] | −9.940 [−10.107] | | −10.395 |
| butanal–acetic acid | −8.766 [−8.933] | −9.385 | | −9.813 |
| butanal–propanoic acid | −8.690 [−8.881] | −9.301 | | −9.749 |
Database for the CAA–CAA Heterodimers
Figure 15 shows the optimized structures of the studied Am–Am, Am–Ca, and Ca–Ca heterodimers. In these dimers, each monomer clearly has one hydrogen bond donor and one hydrogen bond acceptor, so a double hydrogen bond pattern is expected. Three types of double hydrogen bonds are shown in Figure 15, namely, two N–H–O hydrogen bonds in the Am–Am dimers, two O–H–O hydrogen bonds in the Ca–Ca dimers, and one N–H–O hydrogen bond and one O–H–O hydrogen bond in the Am–Ca dimers. The double hydrogen bonded functional groups invariably form a planar ring structure, which represents a significant feature of such complexes.[73]
Figure 15
Optimized structures of the dimers in the Am–Am, Ca–Ca, and Am–Ca series.
From Table 19, where we summarize the calculated MP2 and CCSD(T) energy data for the CAA–CAA heterodimers, we can see that the double hydrogen bond dominates the binding energy of each heterodimer. Although a longer alkyl tail does not imply a larger binding energy, the contribution from the alkyl tails is less significant than in the other series discussed above.
Table 19
Binding Energies of the Dimers in the Am–Am, Ca–Ca, and Am–Ca Series

| Dimer | aDZ: MP2 [CCSD(T)] | aTZ: MP2 [CCSD(T)] | aQZ: MP2 [CCSD(T)] | CBS: CCSD(T) |
|---|---|---|---|---|
| formamide–acetamide | −13.692 [−13.761] | −14.318 [−14.518] | −14.677 | −15.139 |
| formamide–propionamide | −13.847 [−13.923] | −14.487 [−14.696] | −14.845 | −15.315 |
| acetamide–propionamide | −13.958 [−14.081] | −14.590 [−14.844] | | −15.110 |
| formamide–formic acid | −14.314 [−14.269] | −15.278 [−15.377] | −15.688 [−15.817] | −16.206 |
| formamide–acetic acid | −14.015 [−14.065] | −14.952 [−15.141] | −15.364 | −15.854 |
| formamide–propanoic acid | −13.919 [−13.994] | −14.846 [−15.060] | | −15.450 |
| acetamide–formic acid | −14.997 [−14.958] | −15.986 [−16.093] | −16.399 | −16.807 |
| acetamide–acetic acid | −14.598 [−14.669] | −15.557 [−15.764] | | −16.168 |
| acetamide–propanoic acid | −14.496 [−14.594] | −15.444 [−15.682] | | −16.081 |
| propionamide–formic acid | −15.270 [−15.232] | −16.283 [−16.395] | | −16.822 |
| propionamide–acetic acid | −15.431 [−15.546] | −16.273 [−16.492] | | −16.847 |
| propionamide–propanoic acid | −14.742 [−14.848] | −15.711 | | −15.924 |
| formic acid–acetic acid | −14.334 [−14.324] | −15.581 [−15.712] | −16.041 | −16.508 |
| formic acid–propanoic acid | −14.296 [−14.308] | −15.533 [−15.687] | | −16.208 |
| acetic acid–propanoic acid | −14.449 [−14.545] | −15.683 [−15.917] | | −16.437 |
Segmental SAPT Energy Decomposition Analysis
In order to gain chemical insights into the calculated (total) interaction
energies, we perform an energy decomposition analysis based on the
symmetry-adapted perturbation theory (SAPT0/jun-cc-pVXZ, X = D and
T).[74] Here, the full interaction energy
is decomposed into four components: electrostatic, induction, dispersion,
and exchange. Table S1 (see the Supporting Information) lists the four components
of the SAPT binding energies for all the studied dimers. The attractive
energy is composed of the electrostatic energy, the induction energy,
and the dispersion energy, whereas the repulsive energy stems from
the exchange term. To see the interplay of the attractive energy components, we also present the relative percentage contribution of each component in parentheses. We see that there is a crossing of the relative electrostatic and dispersion components around the AAK–AAK groups. The main stabilizing attractive energy contributions shift from the AAA–AAA groups (mainly dispersion bound) to the CAA–CAA groups (mainly hydrogen bonded).
As an application of the calculated SAPT energy data, we have proposed
a segmental model where we further dissect a functional group molecule
into chemically identified segments, each as an effective united atom.
To each segment we attribute electric features such as effective charges
and geometrical features such as molecular volumes, so that the pair
summed intersegment interactions can reproduce the SAPT component
energies for the dimers. For repulsion and electrostatic interactions,
formally the charge pair (+, −) is counted as electrostatic,
while the charge pairs (+, +) and (−, −) are counted
as exchange. The induction energy is modeled as a charge–dipole
interaction; that is, the charge at one segment interacts with the
closest dipole at the other segment. The dispersion interaction is
modeled by a power law with respect to the molecular volume[75,76] (see also the Supporting Information).
In this way our model is similar to the usual fragment-based energy
partition schemes, such as the recent A-SAPT and F-SAPT methods,[77−80] where the goal is to construct an effective two-body partition model of the SAPT energy components to localized chemically recognizable segments.
Let us illustrate the assignment of segments using a butane molecule (Figure S1). We dissect a butane molecule into four segments of two types, that is, A is the methyl radical (CH₃−) and B is the methylene radical (−CH₂−), so a butane molecule is represented by A⁺B⁻B⁺A⁻, where we have assigned symbolically the alternating (positive–negative) charges on each segment. Next, for each energy component, we simply count the suitable paired segments and list the interactions. Let us first consider the ethane–propane dimer. Here we can count the suitable pairs for the electrostatic, induction, and exchange energies (in kcal/mol) as follows:
Here we have used the previously determined segmental component energies from the SOFG-31 data set.[59] Only one unknown variable is to be determined, so we used the SAPT energy to obtain $E_{\mathrm{ind}}^{\mathrm{B,A\text{-}A}} = -0.032$. Continuing this procedure
we can list similar energy equations with unknown intersegment energies
to be determined sequentially. Please refer to the Supporting Information for the full list of supplementary
figures, tables, and equations. It is found that for the alkane heterodimer
series using the energy data up to the ethane–pentane dimer
(the training set) is sufficient to sort out all the intersegment
interactions for each energy component. The detailed analysis and
calculations for the other molecules are shown in the Supporting Information. In Table S6 we summarize the resulting segmental SAPT energies.
We can see that the model works surprisingly well. For most cases, we can reproduce the corresponding SAPT energies to within about 10%. As a further test of the validity of this model, let us predict the binding energies for larger heterodimers. For example, consider the undecane–dodecane heterodimer. We see that there are an additional 3 pairs of (A, B) and 17 pairs of (B, B) electrostatic interactions, 4 pairs of (A, A–B)/(B, A–B) and 33 pairs of (B, B–B) charge–dipole interactions, and 1 pair of (A, B) and 9 pairs of (B, B) exchange interactions. By simply counting the pairs we can list the energy equations as follows:
Our predicted SAPT energy is −8.62 kcal/mol, which can be compared with the MP2/CBS value of −8.33 kcal/mol calculated by the Hobza group.[81] In this case we obtain, perhaps fortuitously, an energy that deviates from the reference value by only about 3%. The application of this model to other heterodimers is shown and compared to their MP2/CBS values in Table 20. We see that the overall performance is very good. Therefore, it is promising to utilize this model in coarse-grained molecular modeling for larger molecules.
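The segment bookkeeping described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function names are hypothetical, charges alternate starting with "+" on the first segment, and every cross-monomer segment pair is counted. The selection of "suitable pairs" actually used in the model is given in the Supporting Information and may be more restrictive than this all-pairs count.

```python
from collections import Counter
from itertools import product


def alkane_segments(n_carbons: int) -> list[tuple[str, str]]:
    """Linear alkane as (segment, sign) pairs: A = methyl end, B = methylene."""
    if n_carbons < 2:
        raise ValueError("need at least two carbons (ethane)")
    labels = ["A"] + ["B"] * (n_carbons - 2) + ["A"]
    signs = ["+" if i % 2 == 0 else "-" for i in range(n_carbons)]
    return list(zip(labels, signs))


def count_cross_pairs(mono1, mono2) -> Counter:
    """Count all intermolecular segment pairs, classified by charge signs.

    Opposite signs are labeled electrostatic and like signs exchange,
    following the convention stated in the text. Counting every cross
    pair is an assumption made for illustration only.
    """
    counts = Counter()
    for (l1, s1), (l2, s2) in product(mono1, mono2):
        kind = "electrostatic" if s1 != s2 else "exchange"
        counts[(kind, tuple(sorted((l1, l2))))] += 1
    return counts


butane = alkane_segments(4)
print("".join(label + sign for label, sign in butane))  # A+B-B+A-

pairs = count_cross_pairs(alkane_segments(2), alkane_segments(3))
for (kind, segs), n in sorted(pairs.items()):
    print(kind, segs, n)
```

Multiplying each pair count by a fitted intersegment energy and summing over pair types would then give a model energy component; the fitted energies themselves are not reproduced here.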
Table 20
Comparison of the Reference MP2/CBS Energy Data[81] and the Model Predicted Energies (kcal/mol) Using the Segment SAPT Analysis

| Complex | MP2/CBS | Model | Error % |
|---|---|---|---|
| heptane–octane | −5.44 | −5.77 | −6.1% |
| octane–nonane | −5.81 | −6.13 | −5.5% |
| nonane–decane | −6.66 | −6.43 | +3.1% |
| decane–undecane | −6.98 | −6.74 | +3.4% |
| undecane–dodecane | −8.33 | −8.62 | −3.4% |
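For readers who want to recompute the error column, the sketch below assumes the signed error is defined as (model − reference)/|reference|, which reproduces the signs in the table. The function name is hypothetical, and because the inputs are the rounded two-decimal energies, the recomputed percentages can differ slightly from the printed ones.

```python
# (reference MP2/CBS, model) binding energies in kcal/mol, from Table 20.
energies = {
    "heptane-octane": (-5.44, -5.77),
    "octane-nonane": (-5.81, -6.13),
    "nonane-decane": (-6.66, -6.43),
    "decane-undecane": (-6.98, -6.74),
    "undecane-dodecane": (-8.33, -8.62),
}


def signed_percent_error(reference: float, model: float) -> float:
    """Negative when the model overbinds relative to the reference."""
    return 100.0 * (model - reference) / abs(reference)


errors = {name: signed_percent_error(ref, mod) for name, (ref, mod) in energies.items()}
mean_abs = sum(abs(e) for e in errors.values()) / len(errors)

for name, err in errors.items():
    print(f"{name:18s} {err:+.1f}%")
print(f"mean absolute error: {mean_abs:.1f}%")
```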
Conclusion and Outlook
We have constructed a minimum-level CCSD(T)/CBS-calculated
interaction
energy data set with the MP2/aug-cc-pVXZ (X = D, T, and up to Q) optimized
geometries for 239 heterodimers of small organic functional groups.
The monomers are selected from the SOFG-31 data set, including the
alkane, alkene, alkyne, alcohol, aldehyde, ketone, carboxylic acid,
and amide groups. Together with the SOFG-31 set, this extended set
is called the SOFG-31+239 (SOFG-270) data set. The MP2/aug-cc-pVTZ
level of theory is reliable for the geometry optimization, and the
CCSD(T)/CBS binding energies can serve as benchmark reference data
which supplements and/or complements existing data sets. Overall,
a chemical accuracy (∼0.1 kcal/mol), consistent for each individual
noncovalent complex, can be assigned with this data set. A comprehensive
SAPT analysis is also performed in order to gain more chemical insights
into the calculated full interaction energies. A further segment modeling
provides finer details of the segmental contributions for each molecule.
These segmental energy “quanta” can then be used to
predict intermolecular interaction energies for large molecules and
in the construction of coarse-grained force fields for molecular simulations.
This minimum set of energy data can be enlarged as computer resources become available. However, its scope is limited to the most stable conformation of each pair of monomers. To reach our goal of constructing a universal force field without empirical inputs, we need the full potential energy surfaces. One standard way is to sample a set of relative orientations of the paired monomers and scan the corresponding potential energy curves at a sequence of distance points along the dissociation coordinates. The computational cost is roughly proportional to the product of the number of sampled orientations and the number of scanning points. With computer capacity similar to that used in this work, this task can be carried out routinely. A further complication for larger organic functional groups is the issue of isomers, both for monomers and dimers. It is well-known that the most stable dimer is not necessarily formed by the most stable monomers. Therefore, there may exist several locally stable complexes which are related through isomerization pathways. The computational costs are expected to be quite high because the number of isomers for a specific pair of monomers increases combinatorially. Apparently we are just toeing the (starting) line. A considerable amount of computer resources and human collaboration is required in this fundamental and important subfield of computational chemistry.