
Semiempirical Quantum-Chemical Methods with Orthogonalization and Dispersion Corrections.

Pavlo O. Dral, Xin Wu, Walter Thiel

Abstract

We present two new semiempirical quantum-chemical methods with orthogonalization and dispersion corrections: ODM2 and ODM3 (ODMx). They employ the same electronic structure model as the OM2 and OM3 (OMx) methods, respectively. In addition, they include Grimme's dispersion correction D3 with Becke–Johnson damping and three-body corrections $E_{ABC}$ for Axilrod–Teller–Muto dispersion interactions as integral parts. Heats of formation are determined by adding explicitly computed zero-point vibrational energy and thermal corrections, in contrast to standard MNDO-type and OMx methods. We report ODMx parameters for hydrogen, carbon, nitrogen, oxygen, and fluorine that are optimized with regard to a wide range of carefully chosen state-of-the-art reference data. Extensive benchmarks show that the ODMx methods generally perform better than the available MNDO-type and OMx methods for ground-state and excited-state properties, while they describe noncovalent interactions with an accuracy similar to that of OMx methods with a posteriori dispersion corrections.


Year:  2019        PMID: 30735388      PMCID: PMC6416713          DOI: 10.1021/acs.jctc.8b01265

Source DB:  PubMed          Journal:  J Chem Theory Comput        ISSN: 1549-9618            Impact factor:   6.006


Introduction

Semiempirical quantum chemistry (SQC) methods based on the neglect of diatomic differential overlap (NDDO) integral approximation[1,2] enable computationally efficient calculations of ground-state and excited-state electronic structure properties.[3,4] They are widely used when computational time becomes a major issue, i.e., in calculations of very large systems, e.g., of fullerenes,[5−9] nanotubes,[5,8] long polyynes,[10] proteins,[3,11−17] and others,[5,18,19] in real-time quantum chemistry studies,[20−24] and in simulations requiring a very large number of electronic structure calculations. The latter applications include high-throughput screening in drug[5,25−33] and materials[34,35] design, high-throughput pKa calculations,[36,37] ground-state molecular dynamics (MD) simulations,[38,39] excited-state nonadiabatic MD simulations,[3] quantum mechanics/molecular mechanics (QM/MM) MD and Monte Carlo studies,[3,12−16,40] and mass spectra simulations.[41−44]

There are two classes of modern NDDO-based SQC methods: 1) orthogonalization-corrected methods (OMx),[45−50] which account for repulsive orthogonalization effects, attractive penetration effects, and repulsive core–valence interactions via explicit corrections; 2) MNDO-type methods without such corrections, which ignore the overlap matrix while solving the Roothaan–Hall equations and also ignore penetration integrals and core–valence interactions.
The first class comprises the OM1,[45,46,50] OM2,[47,48,50] and OM3[49,50] methods; somewhat related is the NO-MNDO method, which solves the Roothaan–Hall equations taking overlap into account explicitly.[51] Generally, the OMx methods are more accurate than the MNDO-type methods both for ground-state and excited-state properties, because they are based on a better physical model.[51−56] The MNDO-type methods include MNDO,[57,58] MNDO/d,[59−61] AM1,[62] RM1,[63] AM1*,[64] PM3,[65,66] the PDDG-variants of MNDO and PM3,[67,68] PM6,[69] and PM7.[70] They are popular and useful for many applications, especially because parameters are available for many elements and because they are often reasonably accurate thanks to an elaborate parametrization and fine-tuning via empirical core–core repulsion functions.

A common problem of SQC methods is that they do not properly describe noncovalent complexes with significant dispersion interactions.[71] This problem is often ameliorated by adding explicit empirical dispersion corrections.[18,72−80] OMx methods augmented with such explicit dispersion corrections describe various large noncovalent complexes with an accuracy comparable to density functional theory (DFT) methods with dispersion corrections[18,19] that are computationally much more expensive. Noncovalent interactions with hydrogen bonds are also often described poorly with SQC methods.
This issue has been addressed by including special hydrogen bond corrections in MNDO-type methods.[70,72−75,77] In contrast, the OMx methods treat hydrogen-bonding interactions reasonably well even without such corrections,[50,54,81,82] while inclusion of dispersion corrections generally further improves the accuracy.[50,54] One should note, however, that the addition of empirical attractive dispersion corrections to any semiempirical Hamiltonian parametrized without such corrections will inevitably deteriorate the accuracy of the computed heats of formation (which will become too small), while the computed relative energies may become more or less accurate.[52,54] Hence, it is more consistent to reparametrize the Hamiltonian with inclusion of dispersion corrections. This has so far been done only in PM7,[70] which however suffers from error accumulation in very large noncovalent complexes,[19,54] and in the proof-of-principle MNDO-F method,[83] which still has large errors in heats of formation.

Another problem of modern NDDO-based SQC methods is that all of them conventionally treat atomization energies calculated at the SCF level as atomization enthalpies at 298 K, i.e., heats of formation are obtained without explicitly computing zero-point vibrational energies (ZPVEs) and thermal enthalpic corrections from 0 to 298 K.[50,54,57,84] This convention was useful for parametrizing SQC methods against experimental heats of formation in early times, when accurate theoretical reference data were not yet available and when it was computationally unfeasible to calculate ZPVE and thermal corrections during parametrization.
It is debatable whether this convention contributes much to the errors in SQC methods.[84,85] Benchmark studies show that it often has only a small effect on reaction energies,[54] but it may be problematic when comparing ZPVE-exclusive energies at 0 K with differences in semiempirical heats of formation for reactions with large changes in bonding.[54] Nowadays this convention is no longer justified, and it should be avoided in new methods.[84]

As already mentioned, general-purpose SQC methods are often used for excited-state calculations, yet they are typically parametrized on ground-state properties only. On the other hand, there are special-purpose semiempirical methods such as INDO/S[86,87] and INDO/X[88] that were parametrized to reproduce electronic spectra. They can be applied for predicting such spectra but are less suitable for other purposes. It would clearly be desirable to develop general-purpose SQC methods that describe ground-state and excited-state properties in a balanced manner; this will require including both during parametrization.

In this work, we report two new orthogonalization- and dispersion-corrected SQC methods, ODM2 and ODM3 (ODMx). They are based on OM2 and OM3, respectively. They differ from the underlying OMx methods in the following aspects: (a) They include explicit dispersion corrections as an integral part. (b) They are parametrized against much larger sets of diverse, state-of-the-art reference properties, with special emphasis on a balanced treatment of both ground-state and excited-state properties as well as noncovalent interactions. (c) Atomization energies calculated from total energies are treated consistently as ZPVE-exclusive atomization energies at 0 K, while heats of formation are determined by adding ZPVE and thermal corrections obtained within the harmonic-oscillator and rigid-rotor approximations.

This Article is structured as follows.
First, we discuss the theoretical formalism of the ODMx methods (Methodology section). We then describe the parametrization procedure and present the optimized values of the ODM2 and ODM3 parameters (Parametrization section). Thereafter, we validate the new methods on a large collection of benchmark sets and compare their performance to that of the underlying OMx and dispersion-corrected OMx methods (Validation section). Finally, we offer conclusions.

Methodology

The ODM2 and ODM3 methods employ the same electronic structure model as OM2[47,48,50] and OM3,[49,50] respectively. The OM2 and OM3 electronic structure models have been described in detail elsewhere[50] and will therefore not be explained again here. Instead we focus on the formal differences between the ODMx and OMx methods.

The ODMx methods incorporate Grimme’s dispersion correction D3[89,90] with the Becke–Johnson (BJ) damping function[91−93] as an integral part (unlike the OMx methods). They also include explicit three-body corrections $E_{ABC}$ for the Axilrod–Teller–Muto dispersion interaction,[89] which are necessary for a better description of large dense systems.[54,94,95] We denote these D3(BJ)+$E_{ABC}$ corrections as D3T in the following. The ODMx total energy ($E_{tot}$) is defined as the SCF total energy plus the post-SCF D3T dispersion energy. The same definition holds for OMx methods with a posteriori D3T corrections (the OMx-D3T methods), which have been shown to describe noncovalent interactions well.[50,54]

Consistent with the definitions in ab initio methods, the ZPVE-exclusive atomization energy at 0 K ($\Delta E_{at}$) can be written as the difference between the sum of the total ODMx energies of the $N$ constituent atoms $A_i$ and the ODMx total energy of the molecule, $E_{tot}(\mathrm{mol})$:

$$\Delta E_{at} = \sum_{i=1}^{N} E_{tot}(A_i) - E_{tot}(\mathrm{mol})$$

This definition is different from that used in earlier NDDO-based SQC methods (including the OMx methods), where $\Delta E_{at}$ is assumed to be the atomization enthalpy at 298 K ($\Delta H_{at,298}$) and is directly used in evaluating the heat of formation at 298 K without ever calculating ZPVE and thermal corrections explicitly.[50,54,57,84] By contrast, in the ODMx methods, heats of formation ($\Delta H_{f,T}$) include ZPVE and thermal corrections from 0 K to a given temperature $T$ computed explicitly within the harmonic-oscillator and rigid-rotor approximations (as in ab initio methods).
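The distinction between a ZPVE-exclusive atomization energy and a heat of formation with explicit harmonic-oscillator/rigid-rotor corrections can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the choice of kcal/mol and cm^-1 as units, and the toy input numbers are all hypothetical and are not taken from the MNDO program; the thermal terms are the standard ideal-gas expressions.

```python
import math

R = 1.987204e-3            # gas constant in kcal/(mol K)
CM1_TO_KCAL = 2.859144e-3  # energy of 1 cm^-1 in kcal/mol

def atomization_energy(e_tot_mol, e_tot_atoms):
    """ZPVE-exclusive atomization energy at 0 K: sum of atomic E_tot minus E_tot(mol)."""
    return sum(e_tot_atoms) - e_tot_mol

def zpve(freqs_cm1):
    """Harmonic zero-point vibrational energy from wavenumbers (cm^-1)."""
    return 0.5 * sum(freqs_cm1) * CM1_TO_KCAL

def thermal_correction(freqs_cm1, T, is_atom=False, linear=False):
    """E_trans + E_rot + E_vib from 0 K to T (T > 0); atoms keep only E_trans."""
    e_trans = 1.5 * R * T
    if is_atom:
        return e_trans
    e_rot = R * T if linear else 1.5 * R * T
    e_vib = sum(nu * CM1_TO_KCAL / math.expm1(nu * CM1_TO_KCAL / (R * T))
                for nu in freqs_cm1)
    return e_trans + e_rot + e_vib

def absolute_enthalpy(e_tot, freqs_cm1=(), T=298.15, is_atom=False, linear=False):
    """H_T = E_tot + ZPVE + thermal correction + RT."""
    z = 0.0 if is_atom else zpve(freqs_cm1)
    return e_tot + z + thermal_correction(freqs_cm1, T, is_atom, linear) + R * T

def heat_of_formation(e_tot_mol, freqs_cm1, atoms, T=298.15, linear=False):
    """atoms: list of (E_tot of the atom, heat of formation of the atom at T)."""
    h_atoms = sum(absolute_enthalpy(e, T=T, is_atom=True) for e, _ in atoms)
    dh_at = h_atoms - absolute_enthalpy(e_tot_mol, freqs_cm1, T, linear=linear)
    return sum(dhf for _, dhf in atoms) - dh_at

# Toy diatomic; every number here is made up purely for illustration.
toy_atoms = [(-40.0, 50.0), (-40.0, 50.0)]
print(atomization_energy(-100.0, [e for e, _ in toy_atoms]))
print(heat_of_formation(-100.0, [2000.0], toy_atoms, linear=True))
```

In this sketch the atomization enthalpy differs from the 0 K atomization energy by the ZPVE, the vibrational thermal energy, and a few multiples of RT, which is exactly the contribution that the older convention folds implicitly into the parameters.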
More specifically, $\Delta H_{f,T}$ is defined in the ODMx methods as

$$\Delta H_{f,T}(\mathrm{mol}) = \sum_{i=1}^{N} \Delta H_{f,T}(A_i) - \Delta H_{at,T}$$

where $\Delta H_{f,T}(A_i)$ denotes the heats of formation of the constituent atoms at temperature $T$. At 298 K we use the same experimental heats of formation of atoms as in the OMx methods. The atomization enthalpy at temperature $T$ ($\Delta H_{at,T}$) is determined from the absolute enthalpies $H_T$ of a molecule and its constituent atoms:

$$\Delta H_{at,T} = \sum_{i=1}^{N} H_T(A_i) - H_T(\mathrm{mol})$$

Absolute enthalpies are defined as

$$H_T = U_T + RT = E_{tot} + \mathrm{ZPVE} + E_{therm,T} + RT, \qquad E_{therm,T} = E_{trans,T} + E_{rot,T} + E_{vib,T}$$

where $U_T$ is the internal energy, $E_{therm,T}$ denotes the thermal corrections from 0 K to $T$, $E_{trans,T}$, $E_{rot,T}$, and $E_{vib,T}$ are the translational, rotational, and vibrational contributions at $T$, and $R$ is the gas constant. For atoms, ZPVE, $E_{rot,T}$, and $E_{vib,T}$ vanish.

The chosen definition of the ODMx total energy also has implications for how ZPVE-exclusive proton affinities at 0 K (PAs) should be calculated at the ODMx level. The quantities $PA_{ODM}$ can be formally expressed as

$$PA_{ODM} = E_{tot}(\mathrm{M}) + E_{tot}(\mathrm{H^+}) - E_{tot}(\mathrm{MH^+})$$

However, this expression does not take into account that the electron-accepting properties of the bare proton, which are quantified by the ionization potential of the hydrogen atom IP(H), are often severely underestimated by SQC methods. This is also true for the ODMx methods. The ionization potential of hydrogen at the ODMx level, $IP_{ODM}(\mathrm{H})$, is equal to the negative of the $U_{ss}$ parameter of hydrogen, $-U_{ss}(\mathrm{H})$. This parameter is optimized for molecular reference systems (see the Parametrization section), and the resulting $IP_{ODM}(\mathrm{H})$ turns out to be much lower than the experimental value of the ionization potential of the hydrogen atom, $IP_{exp}(\mathrm{H})$ = 313.5873 kcal/mol,[96] by 25–29 kcal/mol. The impact of this underestimation becomes evident when considering the thermochemical cycle in Figure 1, which offers an alternative way to calculate $PA_{ODM}$:

$$PA_{ODM} = \Delta E_{diss} + IP_{ODM}(\mathrm{H}) - IP_{ODM}(\mathrm{M})$$
Figure 1

Thermochemical cycle for calculating proton affinities (PAs). IP(H) and IP(M) are the ionization potentials of the hydrogen atom and of molecule M, respectively, and ΔEdiss is the dissociation energy.

It is obvious from this cycle that $PA_{ODM}$ will be strongly underestimated. In a semiempirical context, it is more reasonable to substitute $IP_{ODM}(\mathrm{H})$ with $IP_{exp}(\mathrm{H})$ in the cycle to obtain corrected PAs ($PA_{corr}$), which do not suffer from the inadequate use of the same hydrogen $U_{ss}$ parameter in the hydrogen atom and in molecules:

$$PA_{corr} = \Delta E_{diss} + IP_{exp}(\mathrm{H}) - IP_{ODM}(\mathrm{M})$$

The correction $\Delta PA_{corr}$ required to calculate $PA_{corr}$ from $PA_{ODM}$ is thus given by

$$\Delta PA_{corr} = IP_{exp}(\mathrm{H}) - IP_{ODM}(\mathrm{H}) = IP_{exp}(\mathrm{H}) + U_{ss}(\mathrm{H})$$

Hence, we use the following expression to calculate corrected ZPVE-exclusive proton affinities at 0 K at the ODMx level:

$$PA_{corr} = PA_{ODM} + IP_{exp}(\mathrm{H}) + U_{ss}(\mathrm{H})$$

In the following we refer to these quantities simply as proton affinities. We note that this convention is consistent with that adopted in previous MNDO-type and OMx methods, which employ the experimental heat of formation of the proton when converting the computed heats of formation of the molecule and the protonated molecule to the corresponding proton affinity.
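The size of the proton-affinity correction can be checked numerically from values quoted in this article: the experimental hydrogen ionization potential (313.5873 kcal/mol) and the hydrogen one-center one-electron energies listed in Tables 1 and 2. The sketch below is an illustration only; the function names are hypothetical, the conversion factor 1 eV = 23.060548 kcal/mol is a standard value supplied here, and treating the total energy of the bare proton as zero by default is an assumption of this sketch.

```python
EV_TO_KCAL = 23.060548   # kcal/mol per eV (standard conversion, supplied here)
IP_EXP_H = 313.5873      # experimental IP of the hydrogen atom, kcal/mol (from the text)

# Hydrogen Uss parameters in eV, copied from Tables 1 and 2
USS_H_EV = {"ODM2": -12.48937475, "ODM3": -12.34522989}

def pa_correction(method):
    """Correction IP_exp(H) - IP_ODM(H), with IP_ODM(H) = -Uss(H), in kcal/mol."""
    ip_odm_h = -USS_H_EV[method] * EV_TO_KCAL
    return IP_EXP_H - ip_odm_h

def corrected_pa(e_tot_m, e_tot_mh_plus, method, e_tot_h_plus=0.0):
    """Corrected ZPVE-exclusive proton affinity at 0 K (energies in kcal/mol).

    e_tot_h_plus defaults to 0.0, assuming the bare proton carries no
    electronic, core-repulsion, or dispersion energy (assumption of this sketch).
    """
    pa_odm = e_tot_m + e_tot_h_plus - e_tot_mh_plus
    return pa_odm + pa_correction(method)

for name in ("ODM2", "ODM3"):
    print(name, round(pa_correction(name), 2))
```

With these inputs the correction comes out at about 25.6 kcal/mol for ODM2 and about 28.9 kcal/mol for ODM3, consistent with the 25–29 kcal/mol underestimation quoted above.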

Parametrization

In this Section we specify the chosen training sets, describe the parametrization procedure, and provide the list of final ODM2 and ODM3 parameters for the elements carbon, hydrogen, nitrogen, oxygen, and fluorine (CHNOF).

Training Sets

The quality of SQC methods strongly depends on the training sets, which should satisfy two requirements:
1. Molecules and properties of interest should be covered in a balanced manner.
2. Reference data should be very accurate.

In the present general-purpose parametrization, we aim at covering the entire space of CHNOF-containing molecules with regard to ground-state and excited-state properties as well as noncovalent interactions. Concerning ground-state energies we want to describe both ZPVE-exclusive atomization energies at 0 K and heats of formation at 298 K as accurately as possible. To satisfy these requirements, we have chosen the following training sets with state-of-the-art reference data:
- Our own CHNO set of energies (heats of formation at 298 K, ionization potentials, vibrational energies, relative energies, and barriers), geometries (bond lengths, bond angles, and dihedral angles), and dipole moments.[50]
- Our own FLUOR set of energies (heats of formation at 298 K, ionization potentials, and vibrational energies), geometries (bond lengths and bond angles), and dipole moments.[50]
- The MGAE109 set with ZPVE-exclusive atomization energies at 0 K,[97,98] which is part of the CE345 database.[99,100]
- The TAE140 set with ZPVE-exclusive atomization energies at 0 K, which is part of the W4-11 benchmark set.[101]
- Our own set of vertical excitation energies (called VEE set in the following)[102] with updated theoretical best estimates from ref (55).
- The S66 set of interaction energies and geometries of 66 noncovalent complexes.[103,104]

We found it beneficial to also include the ZPVE-exclusive atomization energy at 0 K of cubane calculated at the W2-F12 level.[105]

Parametrization Procedure

Parametrization of SQC methods is a very complicated task in itself and is as much an art as a science. One of the challenges is the large number of parameters (usually more than a dozen per element), which makes it difficult to find the optimum set of parameters. The large parameter space provides enormous flexibility: in the words of John von Neumann, ‘with four parameters I can fit an elephant, and with five I can make him wiggle his trunk’. However, this flexibility should not be mistaken for a sign that the parametrization can achieve perfect accuracy for general-purpose SQC methods. The underlying physical model strongly limits what can be achieved for different sets of molecules and properties. While special-purpose SQC parametrizations can yield highly accurate results for certain classes of molecules and/or specific properties,[106] they will often fail outside their range of validity. Extending the metaphor, one may specifically ‘fit an elephant’, but such ‘an elephant model’ would be useless for describing the locomotion of both an elephant and a car.

In our experience, there is plenty of subjective judgment involved in all stages of a general-purpose parametrization, up to evaluating and choosing among a number of reasonable candidate parameter sets. In the present work, we discarded well over 99.9% of the parameter sets considered. This underlines that a general-purpose parametrization is very demanding both in terms of human effort and computational costs.

As already mentioned above, we target a balanced treatment of a large number of diverse molecules and properties. There are two issues:
1. Errors in different properties with different units cannot be directly compared, e.g., the numerical errors of heats of formation at 298 K (in kcal/mol) cannot be directly compared to the errors in bond lengths (in Å).
2. Errors in different molecules and properties may be chemically of different importance, e.g., parameter sets giving a planar hydrogen peroxide geometry may be deemed unacceptable even if the total error for heats of formation is very low.

Usually, both these problems are addressed by weighting the errors (Err) for the properties and for specific molecules or types of molecules that enter the overall sum of squares (SSQ) of errors

$$\mathrm{SSQ} = \sum_{i=1}^{N_{set}} \sum_{j=1}^{N_{prop}} \sum_{k=1}^{N_{entry}} \left( w_{prop,ij} \, w_{entry,ijk} \, \mathrm{Err}_{ijk} \right)^2$$

where $N_{set}$ is the number of training sets, $N_{prop}$ is the number of properties in a training set, $N_{entry}$ is the number of entries for a property in a training set, and $w_{prop}$ and $w_{entry}$ are weighting factors that are specific for a property and an entry in a training set, respectively. The error (Err) is defined as

$$\mathrm{Err} = P_{ODM} - P_{ref}$$

where $P_{ODM}$ is the value calculated for a given property at the ODMx level with the current parameters, and $P_{ref}$ is the corresponding reference value.

In previous general-purpose parametrizations of MNDO-type and OMx methods, the SSQ value of weighted absolute errors was minimized during the optimization. In practice, this was found to be a slow, iterative, trial-and-error procedure, as described in detail elsewhere.[46,48,49,69] In our present work, we initially also applied this conventional approach, which however turned out to be too tedious for our broad diversity of training sets (much broader than in the OMx case). Thus, we designed an alternative parametrization procedure specifically tuned for ODMx methods to reach the following objectives:
1. Aim for an accuracy that is better than or close to the accuracy of the corresponding OMx methods after redefining the SQC total energy.
2. Aim for an accuracy that is better than or close to the accuracy of the corresponding OMx methods for ground-state properties.
3. Aim for an accuracy that is better than or close to the accuracy of the corresponding OMx-D3T methods for noncovalent interactions.
4. Improve the accuracy of the ODMx methods in comparison with the corresponding OMx methods for excited-state properties.

In short, our goal is to obtain unified methods, which preserve good and eliminate bad qualities of the OMx and OMx-D3T methods for ground-state properties and noncovalent interactions, while improving the description of the excited-state properties and using the proper definition of SQC total energies. This clear breakdown of objectives allows for a systematic step-by-step parametrization of the ODMx methods.

The key to simplifying the complicated optimization problem is to choose the proper error measure to be minimized. Since we deal with many diverse properties, we have chosen to focus on the ODMx errors relative to the corresponding errors of a reference SQC method (usually the corresponding OMx or OMx-D3T method)

$$\mathrm{Err} = \frac{P_{ODM} - P_{ref}}{P_{ref\,SQC} - P_{ref}}$$

where $P_{ref\,SQC}$ is the value of the property calculated with the reference SQC method. This approach obviously meets objectives 1–3 in a straightforward manner. Moreover, it also resolves the two issues discussed above: relative errors are unitless and normalized for each individual property by definition, and the parametrization with regard to relative errors will tend to retain the performance of the underlying OMx or OMx-D3T methods for chemically important molecules and properties. To allow for larger flexibility and specific tuning, we still keep the option to adjust the weights for individual properties and molecules, if necessary.
These conventions make parameter optimization much easier, because parameter changes that lead to very large errors of ODMx relative to OMx or OMx-D3T are easily identified and avoided by the optimizer. We note that in the final stage of the parametrization, we also ran conventional parameter optimizations minimizing the SSQ value of the absolute errors (starting from the best candidate parameter sets), which however did not lead to further improvement. The relative error definition requires that the denominator is not close to zero; the denominator was therefore set to a small value (typically 0.1) whenever the reference SQC errors were very small.

During the parametrization runs, the SQC calculations sometimes failed due to convergence problems, which is usually an indication of entering an unphysical region of parameter space. In such cases, the SSQ value was set to an arbitrarily huge number, and the parametrization was continued with a modified set of parameter estimates. This procedure was repeated until there were no remaining convergence problems (or otherwise the parametrization was terminated). Such a fail-safe approach is necessary for a numerically stable parametrization algorithm.

A good initial guess for the parameters is very important. In our case, the corresponding standard OMx-D3T parameters are expected to provide an excellent guess. For ODM2 we optimized three element-independent parameters and 17 parameters per element (8 for hydrogen). ODM3 has two fewer parameters per element. We decided to retain the one-center two-electron integrals derived from experimental atomic spectra, which are used in all OMx methods[50] and in MNDO.[57] We also decided to keep the standard OMx parameters for the effective core potentials.[50] During parametrization we paid special attention to large changes in the parameters to make sure that they do not assume unphysical values.
For this purpose we normally imposed strict limits on the allowed range of parameter values. For parameter optimization we employed our own in-house parametrization program, which calls the development version of the MNDO program[107] for the ODMx calculations. The detailed step-by-step protocol that was adopted for optimizing the ODMx parameters is documented in the Supporting Information. The final parameter values are presented in the next subsection.
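The error measure described in this subsection (errors relative to a reference SQC method, a floor on the denominator, and a huge penalty when a calculation fails) can be sketched as follows. This is a minimal illustration and not the in-house parametrization program: the function and key names are hypothetical, the `compute` callable stands in for an actual SQC calculation, signalling failure with `RuntimeError` is an assumption, and squaring the product of weight and error is one plausible reading of the weighted sum of squares.

```python
PENALTY = 1.0e30  # arbitrarily huge SSQ returned when a calculation fails

def relative_error(p_odm, p_ref, p_ref_sqc, floor=0.1):
    """Error of the new method relative to the error of the reference SQC method."""
    denom = p_ref_sqc - p_ref
    if abs(denom) < floor:
        denom = floor  # avoid dividing by a near-zero reference error
    return (p_odm - p_ref) / denom

def ssq(entries, compute):
    """Weighted sum of squared relative errors over all training entries.

    entries: iterable of dicts with keys 'key', 'p_ref', 'p_ref_sqc',
             'w_prop', 'w_entry' (hypothetical layout).
    compute: callable returning the property for entry['key'] with the
             current parameters; assumed to raise RuntimeError on SCF failure.
    """
    total = 0.0
    for entry in entries:
        try:
            p_odm = compute(entry["key"])
        except RuntimeError:
            return PENALTY  # fail-safe: steer the optimizer away from this region
        err = relative_error(p_odm, entry["p_ref"], entry["p_ref_sqc"])
        total += (entry["w_prop"] * entry["w_entry"] * err) ** 2
    return total

# Toy usage with made-up numbers: the new method halves the reference error.
toy = [{"key": "mol1", "p_ref": 0.0, "p_ref_sqc": 2.0, "w_prop": 1.0, "w_entry": 1.0}]
print(ssq(toy, lambda key: 1.0))
```

A relative error below 1 in magnitude means the candidate parameters beat the reference method for that entry, which makes regressions easy to spot regardless of the property's units.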

Parameter Values

The values of the final ODM2 and ODM3 parameters are listed in Tables 1 and 2, respectively. The largest changes relative to the corresponding standard OMx-D3T parameters are found in the orthogonalization correction parameters F1, F2, G1, and G2. The smallest changes (all below 10%) occur for the Uss, Upp, and ζ parameters. The dispersion correction parameters are also changed only slightly in ODM2 relative to OM2-D3T, but there are larger shifts in ODM3 relative to OM3-D3T. However, detailed numerical tests show that the actual values of the dispersion corrections are similar for noncovalent complexes at the ODM3 and OM3-D3T levels. Moderate changes are observed in the resonance integral parameters. In the Supporting Information (SI) we present plots of resonance integrals of various types and for all combinations of diatomics as a function of the internuclear distance (Figures S1–S55 for ODM2 vs OM2 and Figures S56–S110 for ODM3 vs OM3). Inspection of these plots reveals that most of the ODMx resonance integrals are very similar to their OMx counterparts.
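For orientation, the role of the three element-independent dispersion parameters s8, a1, and a2 can be seen in the standard two-body D3 energy expression with Becke–Johnson damping. The sketch below uses that general published form with the ODM2 values as defaults; it is not the MNDO program's implementation. The function name and the C6/C8 inputs are hypothetical, the three-body $E_{ABC}$ term is omitted, and all quantities must be supplied in one consistent unit system (the unit of a2 is not stated in this section).

```python
import math

def d3bj_pair_energy(r, c6, c8, s8=0.56467295, a1=0.66492067, a2=3.34510640, s6=1.0):
    """Two-body D3(BJ) dispersion energy for one atom pair.

    r      : interatomic distance
    c6, c8 : pair dispersion coefficients (hypothetical inputs here)
    s8, a1, a2 default to the ODM2 values; s6 = 1.0 is an assumption.
    """
    r0 = math.sqrt(c8 / c6)   # pair-specific cutoff radius
    f = a1 * r0 + a2          # Becke-Johnson damping length
    return -(s6 * c6 / (r**6 + f**6) + s8 * c8 / (r**8 + f**8))

# Made-up coefficients, only to show the limiting behaviour.
print(d3bj_pair_energy(0.0, 10.0, 200.0))    # finite at r = 0 thanks to the damping
print(d3bj_pair_energy(100.0, 10.0, 200.0))  # approaches -c6/r**6 at long range
```

Because the damping keeps the correction finite and attractive at all distances, it contributes a negative energy even in covalently bound molecules, which is why the Hamiltonian parameters have to be refitted together with it.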
Table 1

ODM2 Parameters^a

| Parameter | Unit | H | C | N | O | F |
|---|---|---|---|---|---|---|
| Orbital Exponent: Scale Factor | | | | | | |
| ζ | au | 1.38028450 (−6.3%) | 1.38113662 (−2.8%) | 1.29958384 (−2.4%) | 1.49880729 (−3.4%) | 1.48240692 (2.1%) |
| One-Center One-Electron Energies | | | | | | |
| Uss | eV | −12.48937475 (−1.3%) | −51.23186063 (−0.8%) | −73.38299795 (−1.3%) | −102.59234481 (0.8%) | −117.89864612 (−2.3%) |
| Upp | eV | | −39.34045880 (−1.0%) | −57.70164912 (0.2%) | −78.80113288 (−0.2%) | −107.04309769 (−0.2%) |
| Resonance Integrals | | | | | | |
| βs | eV bohr^(−1/2) | −3.40625426 (−0.4%) | −7.50866135 (4.1%) | −13.71974340 (26.5%) | −10.44716267 (−1.9%) | −3.58269305 (−42.7%) |
| βp | eV bohr^(−1/2) | | −4.13979929 (−0.1%) | −8.25662188 (8.3%) | −9.52283396 (10.3%) | −13.71612319 (−1.6%) |
| βπ | eV bohr^(−1/2) | | −6.21901456 (4.2%) | −10.06577228 (8.5%) | −10.00369376 (8.6%) | −15.32274646 (−18.2%) |
| αs | au | 0.06556888 (−0.8%) | 0.09431728 (4.3%) | 0.12110139 (34.9%) | 0.13492482 (3.3%) | 0.25992551 (−2.4%) |
| αp | au | | 0.05372380 (−1.5%) | 0.08742445 (−0.2%) | 0.09874026 (2.6%) | 0.12102605 (−1.3%) |
| απ | au | | 0.10370542 (1.6%) | 0.13827427 (5.0%) | 0.13861427 (6.0%) | 0.20389243 (−6.0%) |
| βs(X–H) | eV bohr^(−1/2) | | −6.58983035 (4.6%) | −9.28605650 (−2.2%) | −6.12279073 (−6.4%) | −7.73941491 (23.8%) |
| βp(X–H) | eV bohr^(−1/2) | | −4.53309066 (12.1%) | −9.59598789 (12.7%) | −9.78113133 (−3.3%) | −11.60040200 (−16.8%) |
| αs(X–H) | au | | 0.10047864 (3.9%) | 0.11512537 (0.7%) | 0.13949313 (25.5%) | 0.74053556 (65.6%) |
| αp(X–H) | au | | 0.05754390 (8.9%) | 0.11625375 (8.9%) | 0.10684454 (−10.2%) | 0.13290605 (−15.1%) |
| Orthogonalization Factors | | | | | | |
| F1 | | 0.25711330 (−13.0%) | 0.48173972 (−3.6%) | 0.67114023 (4.7%) | 1.19835125 (−5.2%) | 2.32598335 (10.0%) |
| F2 | | 1.29097121 (−7.9%) | 0.88486717 (22.5%) | 0.29297257 (49.6%) | 0.98518300 (−14.2%) | 1.29045770 (18.2%) |
| G1 | | 0.64932010 (−0.5%) | 0.18362749 (−13.7%) | 0.12275808 (−12.0%) | 0.53895482 (90.4%) | 0.66523038 (109.8%) |
| G2 | | 0.46314833 (−49.0%) | 0.99145627 (−0.1%) | 0.76190787 (−9.7%) | 1.80028762 (129.6%) | 0.02575130 (20.3%) |

Element-Independent Dispersion-Correction Parameters: s8 = 0.56467295 (6.3%); a1 = 0.66492067 (−3.6%); a2 = 3.34510640 (−2.9%).

^a Listed are only those parameters whose values differ from the standard OM2-D3T parameters. Relative deviations of the ODM2 from the OM2-D3T parameters are given in parentheses. See ref (50) for the description and the values of OM2-D3T parameters kept in ODM2 without change.

Table 2

ODM3 Parameters^a

| Parameter | Unit | H | C | N | O | F |
|---|---|---|---|---|---|---|
| Orbital Exponent: Scale Factor | | | | | | |
| ζ | au | 1.19376951 (−5.2%) | 1.24063783 (−2.9%) | 1.31794708 (0.6%) | 1.20420353 (−0.3%) | 1.11606475 (−7.4%) |
| One-Center One-Electron Energies | | | | | | |
| Uss | eV | −12.34522989 (−0.9%) | −50.53809829 (0.0%) | −76.86911508 (1.2%) | −105.32281913 (−0.4%) | −119.94597476 (−0.6%) |
| Upp | eV | | −39.18893465 (−1.0%) | −57.30532511 (−0.1%) | −78.61741113 (−0.4%) | −107.72538552 (0.2%) |
| Resonance Integrals | | | | | | |
| βs | eV bohr^(−1/2) | −3.55853909 (4.6%) | −7.22299220 (1.0%) | −11.27163531 (−16.0%) | −13.99112485 (−3.0%) | −6.21626110 (0.3%) |
| βp | eV bohr^(−1/2) | | −3.89503707 (−2.9%) | −5.19814028 (−8.7%) | −9.57907838 (9.2%) | −14.17975990 (2.6%) |
| βπ | eV bohr^(−1/2) | | −5.98389263 (6.1%) | −9.42644204 (14.2%) | −13.14461173 (1.5%) | −17.97416645 (−5.2%) |
| αs | au | 0.06732426 (−2.9%) | 0.08775193 (−4.6%) | 0.08380547 (−11.4%) | 0.12829059 (−1.0%) | 0.36826212 (18.3%) |
| αp | au | | 0.05302831 (0.5%) | 0.06387304 (−8.0%) | 0.09616309 (3.7%) | 0.12592688 (1.2%) |
| απ | au | | 0.10244561 (3.9%) | 0.12706496 (20.9%) | 0.16588421 (3.1%) | 0.21676922 (0.4%) |
| βs(X–H) | eV bohr^(−1/2) | | −6.67027186 (7.6%) | −9.43111309 (−17.3%) | −13.89205019 (2.4%) | −10.28397098 (27.5%) |
| βp(X–H) | eV bohr^(−1/2) | | −4.66942670 (10.3%) | −7.96127622 (1.1%) | −10.28574783 (9.2%) | −14.01190662 (0.6%) |
| αs(X–H) | au | | 0.10456205 (4.3%) | 0.09898195 (−12.8%) | 0.15947641 (9.9%) | 0.40117629 (23.0%) |
| αp(X–H) | au | | 0.06145722 (11.9%) | 0.09655514 (4.4%) | 0.11563129 (5.3%) | 0.15563457 (0.4%) |
| Orthogonalization Factors | | | | | | |
| F1 | | 0.24779546 (−2.4%) | 0.37629902 (−8.6%) | 0.49200495 (−15.5%) | 0.56899924 (3.0%) | 0.95888963 (−7.4%) |
| G1 | | 0.36151043 (1.5%) | 0.08125059 (−21.9%) | 0.19511376 (229.1%) | 0.19048647 (205.9%) | 0.15019006 (7.0%) |

Element-Independent Dispersion-Correction Parameters: s8 = 0.78131768 (56.0%); a1 = 0.69959980 (14.1%); a2 = 3.05404380 (−6.3%).

^a Listed are only those parameters whose values differ from the standard OM3-D3T parameters. Relative deviations of the ODM3 from the OM3-D3T parameters are given in parentheses. See ref (50) for the description and the values of OM3-D3T parameters kept in ODM3 without change.


Validation

In this Section, we compare the ODMx results with the corresponding OMx and OMx-D3T results for our large collection of benchmark sets covering ground-state properties and noncovalent interactions.[50,54] We also evaluate the results for the most important excited-state benchmark sets from our previous work.[55] We refer the reader to the cited literature for the description of these sets. Compared with our previous work[50,54] we made only a few minor modifications to the CHNO, OVS7-CHNOF, and PDDG sets to correct some erroneous or outdated reference data (see the SI) or to use more appropriate symmetry definitions of molecules. In the following, we report only a statistical analysis of the performance of the methods considered; the underlying individual numerical results for energies are documented in the SI.

All calculations were done using our developmental version of the MNDO program.[107] We applied the same computational settings as in our previous studies.[50,54,55] Generally we used very tight convergence criteria. In the ground-state calculations, we applied the half-electron (HE) approach for open-shell molecules,[108] because the OMx[46,48,49] and ODMx methods were parametrized using this approach and because it is known that the HE-SQC treatment gives results that are generally superior to those from unrestricted Hartree–Fock SQC calculations.[109] We had to loosen the convergence criteria only in very few difficult cases of ground-state calculations.
Excited-state properties were computed using multireference configuration interaction (MRCI) calculations with SQC Hamiltonians including single (S), double (D), and optionally also triple (T) and quadruple (Q) substitutions: specifically, CISDTQ for vertical excitation energies and MRCISD for excited-state geometry optimizations; in some cases, we had to use MRCISDT or MRCISDTQ instead of MRCISD (or different starting geometries) to achieve convergence of the geometry optimizations. The quoted OMx and OMx-D3T results for ground-state properties were taken from our previous benchmarks,[50,54] except those for the updated sets (see above), which were recalculated. We used the same conventions as previously[50,54,55] for relative energies calculated at the OMx and OMx-D3T levels, i.e. they are based on heats of formation at room temperature (rather than ZPVE-exclusive energies at 0 K) unless mentioned otherwise. The quoted OMx/MRCI results for excited-state properties were taken from our previous benchmarks of electronically excited states.[55]

Ground-State Properties

Ground-state properties in the CHNO and FLUOR sets[45−50,110,111] were used for training both the ODMx and OMx methods. It is evident from Tables 3 and 4 that the ODMx methods are somewhat better than the OMx methods for heats of formation at 298 K. The inclusion of a posteriori D3T corrections in the OMx methods significantly increases the mean absolute errors (MAEs) in the heats of formation at 298 K: the computed heats of formation become systematically too small because the dispersion corrections are intrinsically negative. This highlights the importance of a consistent parametrization of SQC methods with dispersion corrections as an integral part. Other properties, including geometries, ionization potentials, dipole moments, relative enthalpies, and activation enthalpies, are described similarly well by all SQC methods considered here. We note that ODM2 and ODM3 reproduce the bond lengths and bond angles in the CHNO set statistically somewhat better than their OMx and OMx-D3T counterparts.
Table 3

Mean Absolute Errors in Heats of Formation, Enthalpy Changes, and Activation Enthalpies at 298 K (kcal/mol), Bond Lengths (Å), Bond Angles (deg), Ionization Potentials (eV), and Dipole Moments (D) Calculated with the OMx, OMx-D3T, and ODMx Methods for the CHNO Set and Its Subsets

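| subset | N | OM2 | OM2-D3T | ODM2 | OM3 | OM3-D3T | ODM3 |
|---|---|---|---|---|---|---|---|
| Heats of Formation | | | | | | | |
| overall | 138 | 3.05 | 5.10 | 2.64 | 3.00 | 6.95 | 2.74 |
| CH | 57 | 1.72 | 4.98 | 1.60 | 1.63 | 7.72 | 1.46 |
| CHN | 32 | 3.92 | 4.88 | 2.88 | 3.80 | 6.77 | 3.36 |
| CHO | 37 | 4.40 | 6.24 | 4.01 | 4.05 | 6.87 | 3.83 |
| CHNO | 4 | 1.96 | 2.13 | 2.38 | 3.24 | 3.45 | 3.39 |
| HNO | 8 | 3.28 | 3.12 | 2.86 | 4.53 | 4.43 | 4.01 |
| Bond Lengths | | | | | | | |
| overall | 242 | 0.016 | 0.016 | 0.015 | 0.019 | 0.019 | 0.015 |
| CH | 113 | 0.010 | 0.010 | 0.009 | 0.009 | 0.009 | 0.009 |
| CHN | 49 | 0.015 | 0.015 | 0.014 | 0.027 | 0.027 | 0.015 |
| CHO | 57 | 0.018 | 0.018 | 0.018 | 0.022 | 0.022 | 0.021 |
| CHNO | 5 | 0.018 | 0.018 | 0.023 | 0.033 | 0.033 | 0.019 |
| HNO | 18 | 0.049 | 0.049 | 0.044 | 0.043 | 0.043 | 0.033 |
| Bond Angles | | | | | | | |
| overall | 101 | 2.24 | 2.24 | 2.04 | 1.85 | 1.86 | 1.70 |
| CH | 38 | 1.46 | 1.48 | 1.48 | 1.23 | 1.25 | 1.27 |
| CHN | 22 | 2.30 | 2.28 | 2.11 | 1.82 | 1.80 | 1.64 |
| CHO | 31 | 2.45 | 2.44 | 2.11 | 2.03 | 2.04 | 1.96 |
| HNO | 10 | 4.42 | 4.42 | 3.85 | 3.76 | 3.75 | 2.69 |
| Ionization Potentials | | | | | | | |
| overall | 52 | 0.26 | 0.26 | 0.23 | 0.44 | 0.44 | 0.41 |
| CH | 22 | 0.24 | 0.24 | 0.19 | 0.37 | 0.37 | 0.30 |
| CHN | 13 | 0.22 | 0.22 | 0.22 | 0.39 | 0.39 | 0.39 |
| CHO | 14 | 0.34 | 0.34 | 0.29 | 0.61 | 0.61 | 0.61 |
| HNO | 3 | 0.23 | 0.23 | 0.22 | 0.45 | 0.45 | 0.36 |
| Dipole Moments | | | | | | | |
| overall | 63 | 0.25 | 0.25 | 0.26 | 0.26 | 0.26 | 0.23 |
| CH | 20 | 0.11 | 0.11 | 0.10 | 0.10 | 0.10 | 0.09 |
| CHN | 16 | 0.27 | 0.27 | 0.27 | 0.33 | 0.33 | 0.32 |
| CHO | 19 | 0.31 | 0.31 | 0.38 | 0.26 | 0.26 | 0.24 |
| HNO | 6 | 0.49 | 0.49 | 0.41 | 0.58 | 0.58 | 0.44 |
| Enthalpy Changes at 298 K | | | | | | | |
| overall | 17 | 1.96 | 1.94 | 2.05 | 2.83 | 2.65 | 3.34 |
| CH | 9 | 0.52 | 0.48 | 0.41 | 1.08 | 0.76 | 0.85 |
| CHN | 3 | 4.09 | 4.07 | 3.57 | 5.65 | 5.63 | 8.11 |
| CHO | 3 | 3.63 | 3.67 | 4.67 | 3.91 | 3.98 | 4.20 |
| Activation Enthalpies at 298 K | | | | | | | |
| overall | 60 | 1.55 | 1.55 | 1.77 | 1.53 | 1.48 | 2.01 |
| CH | 20 | 1.63 | 1.62 | 1.88 | 1.96 | 1.94 | 2.33 |
| CHN | 10 | 1.92 | 1.88 | 2.06 | 1.51 | 1.21 | 2.63 |
| CHO | 25 | 1.35 | 1.36 | 1.49 | 1.31 | 3.2 | 1.52 |
| CHNO | 3 | 1.43 | 1.44 | 2.40 | 0.68 | 0.68 | 2.19 |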
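As a consistency check on Table 3, the overall MAE of a set should equal the N-weighted mean of its subset MAEs when the subsets partition the set. The sketch below applies this to the OM2 heats of formation in the CHNO set, using the counts and MAEs listed in Table 3; the function and variable names are illustrative assumptions.

```python
# (N, MAE in kcal/mol) for OM2 heats of formation, CHNO subsets from Table 3.
om2_heats_of_formation = {
    "CH": (57, 1.72),
    "CHN": (32, 3.92),
    "CHO": (37, 4.40),
    "CHNO": (4, 1.96),
    "HNO": (8, 3.28),
}


def pooled_mae(subsets: dict) -> float:
    """Combine subset MAEs into an overall MAE, weighting each by its entry count."""
    total_n = sum(n for n, _ in subsets.values())
    return sum(n * mae for n, mae in subsets.values()) / total_n


total_entries = sum(n for n, _ in om2_heats_of_formation.values())
print(total_entries, round(pooled_mae(om2_heats_of_formation), 2))
```

The pooled value can be compared with the "overall" row of the table. Small deviations are expected because the tabulated subset MAEs are already rounded, and the check does not apply to properties whose listed subsets do not add up to the overall count.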
Table 4

Mean Absolute Errors in Heats of Formation at 298 K (kcal/mol), Bond Lengths (Å), Bond Angles (deg), Ionization Potentials (eV), and Dipole Moments (D) Calculated with the OMx, OMx-D3T, and ODMx Methods for the FLUOR Set and Its Subsets

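| subset | N | OM2 | OM2-D3T | ODM2 | OM3 | OM3-D3T | ODM3 |
|---|---|---|---|---|---|---|---|
| Heats of Formation | | | | | | | |
| overall | 48 | 3.41 | 4.12 | 3.35 | 3.70 | 5.15 | 3.49 |
| CHF | 39 | 3.72 | 4.61 | 3.42 | 3.88 | 5.71 | 3.42 |
| HNOF | 9 | 2.08 | 1.99 | 3.07 | 2.93 | 2.75 | 3.80 |
| Bond Lengths | | | | | | | |
| overall | 125 | 0.023 | 0.023 | 0.021 | 0.024 | 0.024 | 0.022 |
| CHF | 104 | 0.019 | 0.019 | 0.018 | 0.021 | 0.021 | 0.019 |
| CHNOF | 3 | 0.020 | 0.020 | 0.017 | 0.014 | 0.014 | 0.010 |
| HNOF | 17 | 0.043 | 0.043 | 0.040 | 0.044 | 0.044 | 0.040 |
| Bond Angles | | | | | | | |
| overall | 69 | 2.23 | 2.22 | 2.15 | 1.78 | 1.78 | 1.87 |
| CHF | 56 | 2.06 | 2.06 | 2.04 | 1.61 | 1.61 | 1.71 |
| CHNOF | 3 | 2.44 | 2.45 | 89 | 1.75 | 1.75 | 1.86 |
| HNOF | 9 | 2.91 | 2.91 | 2.64 | 2.68 | 2.67 | 2.66 |
| Ionization Potentials | | | | | | | |
| overall | 39 | 0.26 | 0.26 | 0.25 | 0.32 | 0.32 | 0.36 |
| CHF | 29 | 0.25 | 0.25 | 0.26 | 0.29 | 0.29 | 0.33 |
| HNOF | 9 | 0.29 | 0.29 | 0.21 | 0.42 | 0.42 | 0.48 |
| Dipole Moments | | | | | | | |
| overall | 39 | 0.31 | 0.31 | 0.31 | 0.25 | 0.25 | 0.23 |
| CHF | 30 | 0.33 | 0.33 | 0.33 | 0.24 | 0.25 | 0.24 |
| HNOF | 8 | 0.26 | 0.26 | 0.26 | 0.29 | 0.29 | 0.21 |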
Turning to the independent OVS7-CHNOF validation set[54] (Table 5), which was not used for parametrization, the ODMx methods outperform their OMx counterparts for heats of formation of large molecules in the BIGMOL20 subset,[46,112] of anions in ANIONS24,[46] of various conformers in CONFORMERS30,[48] and of F-containing molecules in FLUORINE91.[113] They are, however, inferior for heats of formation of radicals in RADICALS71[109] and of cations in CATIONS41.[46] The ODMx and OMx methods perform similarly well for heats of formation of isomeric molecules in ISOMERS44.[48] The OMx-D3T methods again systematically underestimate the heats of formation (because of the uniformly attractive dispersion interactions included a posteriori) and thus suffer from larger errors in the heats of formation. For most other properties considered in the OVS7-CHNOF validation set, the ODMx methods and their OMx counterparts show similar errors; the ODM3 method performs better than OM3 and OM3-D3T for the ionization potentials in RADICALS71.[109]
Table 5

Mean Absolute Errors in Heats of Formation and Enthalpy Changes at 298 K (kcal/mol), Bond Lengths (Å), Bond Angles (deg), and Ionization Potentials (eV) Calculated with the OMx, OMx-D3T, and ODMx Methods for the OVS7-CHNOF Set and Its Subsets

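| subset | N | OM2 | OM2-D3T | ODM2 | OM3 | OM3-D3T | ODM3 |
|---|---|---|---|---|---|---|---|
| Heats of Formation | | | | | | | |
| RADICALS71 | 42 | 5.07 | 5.81 | 6.21 | 5.75 | 7.23 | 6.56 |
| ANIONS24 | 24 | 8.37 | 9.11 | 8.12 | 9.56 | 11.63 | 8.08 |
| CATIONS41 | 33 | 7.20 | 8.17 | 8.83 | 7.21 | 7.62 | 8.08 |
| BIGMOL20 | 20 | 4.41 | 10.84 | 4.03 | 4.26 | 15.01 | 3.51 |
| CONFORMERS30 | 11 | 2.95 | 6.56 | 2.21 | 3.05 | 9.81 | 1.61 |
| ISOMERS44 | 27 | 1.05 | 4.75 | 1.16 | 1.81 | 7.80 | 1.77 |
| FLUORINE91 | 91 | 7.15 | 7.59 | 6.65 | 7.34 | 7.49 | 6.03 |
| Bond Lengths | | | | | | | |
| FLUORINE91 | 455 | 0.016 | 0.016 | 0.015 | 0.022 | 0.022 | 0.019 |
| Bond Angles | | | | | | | |
| FLUORINE91 | 355 | 2.04 | 2.03 | 2.02 | 1.78 | 1.78 | 1.92 |
| Ionization Potentials | | | | | | | |
| RADICALS71 | 25 | 0.42 | 0.42 | 0.44 | 0.59 | 0.60 | 0.47 |
| Enthalpy Changes at 298 K | | | | | | | |
| RADICALS71 | 4 | 3.66 | 3.66 | 1.76 | 3.27 | 3.27 | 2.15 |
| CATIONS41 | 5 | 6.36 | 6.37 | 6.11 | 6.19 | 6.20 | 6.56 |
| CONFORMERS30 | 17 | 1.00 | 1.07 | 1.36 | 1.10 | 1.20 | 1.34 |
| ISOMERS44 | 17 | 0.80 | 0.69 | 0.70 | 2.07 | 1.65 | 1.99 |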
The MAEs in the heats of formation for the independent G2G3-CHNOF set[49,54,114−116] and in the enthalpy changes for its ALKANES28 subset[49,116] (Table 6) are in the same range for all methods considered; in the ALKANES28 subset, ODM2 outperforms OM2 in heats of formation, while ODM3 has the lowest MAE. It is also encouraging that the MAEs in the heats of formation at 298 K for the independent PDDG, PM7-CHNOF, and C7H10O2 sets are generally lower at the ODMx levels than at the corresponding OMx levels (Tables S1–S3). Other properties in the PDDG and PM7-CHNOF sets (geometries, ionization potentials, and dipole moments) are described similarly well by all methods.
Table 6

Mean Absolute Errors in Heats of Formation and Enthalpy Changes at 298 K (kcal/mol) Calculated with the OMx, OMx-D3T, and ODMx Methods for the G2G3-CHNOF Set and Its Subsets

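| subset | N | OM2 | OM2-D3T | ODM2 | OM3 | OM3-D3T | ODM3 |
|---|---|---|---|---|---|---|---|
| Heats of Formation | | | | | | | |
| G2 | 93 | 3.37 | 4.01 | 3.52 | 3.83 | 5.04 | 3.93 |
| G3 | 52 | 3.18 | 6.30 | 3.11 | 3.71 | 9.24 | 3.06 |
| ALKANES28 | 22 | 1.91 | 9.24 | 1.15 | 0.72 | 15.76 | 0.63 |
| Enthalpy Changes at 298 K | | | | | | | |
| ALKANES28 | 6 | 0.61 | 0.21 | 0.34 | 1.48 | 0.90 | 1.06 |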
The benefits of redefining the SQC total energy in the ODMx methods are clearly seen in the evaluation of the W4-11-CHNOF set[101] (Table 7). The reference ZPVE-exclusive atomization energies at 0 K and the relative energies derived therefrom are well reproduced by the ODMx methods (without any corrections), while the TAE140, TAE_nonMR124, and BDE99 subsets can be properly described by the OMx and OMx-D3T methods only after removing the ZPVE and thermal contributions from their heats of formation at 298 K.
Table 7

Mean Absolute Errors in ZPVE-Exclusive Atomization Energies at 0 K (kcal/mol) Calculated with the OMx, OMx-D3T, and ODMx Methods for the W4-11-CHNOF Set and Its Subsets

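| subset | N | OM2 | OM2-D3T | ODM2 | OM3 | OM3-D3T | ODM3 |
|---|---|---|---|---|---|---|---|
| TAE140 | 88 | 14.93 | 14.27 | 4.89 | 15.21 | 14.22 | 6.22 |
| | | 4.81 (a) | 4.64 (a) | | 6.47 (a) | 6.19 (a) | |
| TAE_nonMR124 | 80 | 15.63 | 14.94 | 4.79 | 15.25 | 14.22 | 5.63 |
| | | 4.84 (a) | 4.66 (a) | | 6.05 (a) | 5.79 (a) | |
| BDE99 | 79 | 8.15 | 7.95 | 6.46 | 9.65 | 9.29 | 7.31 |
| | | 6.25 (a) | 6.18 (a) | | 7.51 (a) | 7.31 (a) | |
| HAT707 | 394 | 8.92 | 8.91 | 8.49 | 9.44 | 9.40 | 8.47 |
| | | 9.17 (a) | 9.16 (a) | | 9.73 (a) | 9.68 (a) | |
| ISOMER20 | 19 | 8.54 | 8.56 | 8.00 | 8.32 | 8.35 | 9.03 |
| | | 8.34 (a) | 8.35 (a) | | 8.13 (a) | 8.17 (a) | |
| SN13 | 13 | 5.55 | 5.35 | 5.20 | 4.31 | 4.15 | 4.94 |
| | | 5.36 (a) | 5.24 (a) | | 4.98 (a) | 5.13 (a) | |

(a) The OMx and OMx-D3T energies are corrected by excluding ZPVE and thermal contributions.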
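The link between ZPVE-exclusive atomization energies at 0 K and heats of formation at 298 K can be made explicit with the standard atomization-energy route used in ab initio thermochemistry. The relations below are a sketch under that assumption and are not quoted from the ODMx working equations; the symbols $E$ (ZPVE-exclusive total energy), $\Sigma D_0$, and the subscript "st" (element in its standard state) are notation introduced here.

```latex
\begin{align}
\Sigma D_0(M) &= \sum_{A \in M} E(A) - E(M) - \mathrm{ZPVE}(M) \\
\Delta H_f^{0\,\mathrm{K}}(M) &= \sum_{A \in M} \Delta H_f^{0\,\mathrm{K}}(A) - \Sigma D_0(M) \\
\Delta H_f^{298\,\mathrm{K}}(M) &= \Delta H_f^{0\,\mathrm{K}}(M)
  + \left[ H^{298\,\mathrm{K}}(M) - H^{0\,\mathrm{K}}(M) \right]
  - \sum_{A \in M} \left[ H^{298\,\mathrm{K}}(A) - H^{0\,\mathrm{K}}(A) \right]_{\mathrm{st}}
\end{align}
```

Read in the forward direction, these relations add ZPVE and thermal terms to a ZPVE-exclusive energy to obtain a heat of formation at 298 K. Read in reverse, they correspond to removing ZPVE and thermal contributions from a heat of formation at 298 K, which is the kind of correction applied to the OMx and OMx-D3T values marked in Table 7.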

The evaluation of the diverse reference data in the large GMTKN30-CHNOF set[117] leads to the impression that overall the ODMx methods perform somewhat better than the OMx and OMx-D3T methods (Table 8). Again, in the MB08-165,[118] W4-08,[119] W4-08woMR,[119] and BSR36[120] subsets, the MAEs can be reduced substantially for OMx and OMx-D3T by removing the ZPVE and thermal contributions, while the ODMx methods have reasonably small MAEs without any need for additional corrections. Encouragingly, the ODMx methods perform better than the other methods for the WATER27[121] subset (with updated reference values for four large complexes from ref (122)). They also outperform their OMx and OMx-D3T counterparts for the RSE43,[123] O3ADD6,[124] and PCONF[125] subsets but have higher MAEs for the G21IP,[126] G21EA,[126] BH76,[127,128] and ACONF[129] subsets. ODM2 performs better than OM2 and OM2-D3T for the G2RC,[114] ISO34,[130] and SCONF[131,132] subsets, worse for the PA,[133,134] SIE11,[132] and DARC[135] subsets, and similarly to OM2 and/or OM2-D3T for the BHPERI,[119,136−139] BH76RC,[127,128] ISOL22,[140] IDISP,[130,141,142] S22,[143,144] and ADIM6[120,145] subsets. ODM3 performs worse than OM3 and OM3-D3T for the BH76RC and G2RC subsets and similarly to OM3 and/or OM3-D3T for the PA, SIE11, BHPERI, ISO34, ISOL22, DARC, IDISP, S22, ADIM6, and SCONF subsets. All SQC methods considered here fail to reproduce the isomerization energy of C20 in the DC9 subset.[132,138,146−151]
Table 8

Mean Absolute Errors (kcal/mol) in Properties Calculated with the OMx, OMx-D3T, and ODMx Methods for the GMTKN30-CHNOF Set and Its Subsets

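| subset | N | OM2 | OM2-D3T | ODM2 | OM3 | OM3-D3T | ODM3 |
|---|---|---|---|---|---|---|---|
| overall | 480 | 7.94 | 7.76 | 7.33 | 7.17 | 7.24 | 6.88 |
| overall* (a) | 454 | 6.95 | 6.77 | 6.64 | 6.30 | 6.35 | 6.32 |
| MB08-165 | 25 | 22.47 | 22.36 | 16.10 | 19.46 | 20.11 | 13.55 |
| | | 12.20 (b) | 11.51 (b) | | 15.03 (b) | 15.72 (b) | |
| W4-08 | 50 | 4.19 (b) | 4.41 (b) | 4.60 | 6.20 (b) | 6.26 (b) | 6.05 |
| W4-08woMR (c) | 43 | 4.12 (b) | 4.41 (b) | 4.34 | 5.37 (b) | 5.51 (b) | 5.00 |
| G21IP | 15 | 12.00 | 12.00 | 13.74 | 11.45 | 11.45 | 13.61 |
| G21EA | 12 | 11.39 | 11.39 | 13.95 | 9.31 | 9.31 | 12.65 |
| PA | 8 | 14.82 | 14.88 | 16.62 | 11.99 | 11.69 | 11.56 |
| SIE11 | 5 | 7.78 | 8.07 | 8.42 | 4.31 | 4.70 | 4.26 |
| BHPERI | 22 | 8.21 | 6.69 | 6.49 | 8.25 | 6.78 | 6.80 |
| BH76 | 54 | 9.72 | 9.71 | 11.08 | 10.66 | 10.93 | 11.24 |
| BH76RC | 22 | 4.29 | 4.22 | 4.24 | 5.37 | 5.48 | 6.82 |
| RSE43 | 34 | 4.31 | 4.24 | 3.64 | 5.24 | 5.12 | 3.94 |
| O3ADD6 (d) | 6 | 12.24 | 12.61 | 10.54 | 10.97 | 11.38 | 7.12 |
| G2RC | 15 | 8.23 | 7.75 | 5.58 | 4.16 | 3.62 | 4.63 |
| ISO34 | 34 | 4.44 | 4.55 | 3.88 | 4.37 | 4.48 | 4.35 |
| ISOL22 | 18 | 5.31 | 4.95 | 5.17 | 6.05 | 6.17 | 6.01 |
| DC9 | 7 | 25.02 | 24.93 | 26.90 | 24.69 | 23.30 | 23.42 |
| DC9woC20 (e) | 6 | 13.59 | 13.94 | 14.43 | 13.20 | 12.36 | 12.06 |
| C20 (f) | 1 | 93.63 | 90.89 | 101.74 | 93.61 | 88.94 | 91.61 |
| DARC | 14 | 7.24 | 9.38 | 10.08 | 4.91 | 9.03 | 8.36 |
| BSR36 | 36 | 10.77 | 13.99 | 11.90 | 3.46 | 7.05 | 5.39 |
| | 36 | 7.08 (b) | 10.28 (b) | | 1.90 (b) | 3.40 (b) | |
| IDISP | 6 | 7.34 | 9.86 | 7.42 | 6.19 | 8.00 | 6.12 |
| WATER27 | 27 | 12.28 | 7.13 | 5.24 | 9.19 | 6.81 | 7.25 |
| WATER27 (upd) (g) | | 11.49 | 6.34 | 4.45 | 8.40 | 7.38 | 6.46 |
| S22 | 22 | 3.05 | 0.94 | 0.84 | 3.54 | 0.95 | 0.93 |
| ADIM6 | 6 | 3.13 | 0.09 | 0.15 | 4.09 | 0.39 | 0.62 |
| PCONF | 10 | 1.28 | 1.02 | 0.73 | 1.33 | 1.39 | 1.12 |
| ACONF | 15 | 0.64 | 0.22 | 0.80 | 0.86 | 0.31 | 1.05 |
| SCONF | 17 | 1.67 | 1.62 | 1.35 | 1.32 | 1.34 | 1.36 |

(a) Without MB08-165 and C20.
(b) The OMx and OMx-D3T energies are corrected by excluding ZPVE and thermal contributions.
(c) Subset W4-08 without multireference cases.
(d) The adduct O3+C2H2 is treated as an open-shell singlet in all cases.
(e) Subset DC9 without C20 bowl/cage isomerization energy.
(f) C20 bowl/cage isomerization energy.
(g) WATER27 subset with four reference dissociation energies of (H2O)20 clusters taken from ref (122).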

Similar observations are made in the evaluation of the CE345-CHNOF set[99,100] (Table 9). The ODMx methods outperform the corresponding OMx and OMx-D3T methods for the ZPVE-exclusive atomization energies at 0 K for the MGAE109/11[97,98] subset, which has been used in the ODMx parametrization. The ODMx methods are better than their OMx and OMx-D3T counterparts for the IsoL6[152] and HC7/11[153] subsets but worse for the IP21,[97,154−158] EA13/03,[97,154−156] NHTBH38/08,[97,128,159,160] and ABDE12[97,153,161,162] subsets; in the latter case, the OMx and OMx-D3T energies were corrected by removing the ZPVE and thermal corrections. ODM2 is better than OM2 and OM2-D3T for the PA8/06[134] and NCCE31/05[155,163] subsets but worse for the πTC13[134,154,161] and HTBH38/08[97,128,159,160] subsets. ODM3 is better than OM3 and OM3-D3T for the πTC13 and HTBH38/08 subsets but worse for the PA8/06 and NCCE31/05 subsets.
Table 9

Mean Absolute Errors (kcal/mol) in Properties Calculated with the OMx, OMx-D3T, and ODMx Methods for the CE345-CHNOF Set and Its Subsets

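| subset | N | OM2 | OM2-D3T | ODM2 | OM3 | OM3-D3T | ODM3 |
|---|---|---|---|---|---|---|---|
| overall (a) | 186 | 6.40 | 6.18 | 6.68 | 6.89 | 6.52 | 6.72 |
| MGAE109/11 | 74 | 4.26 (b) | 4.19 (b) | 3.60 | 4.73 (b) | 4.53 (b) | 4.41 |
| IsoL6 | 6 | 1.99 | 2.16 | 1.40 | 3.22 | 3.28 | 2.52 |
| IP21 | 4 | 13.24 | 13.24 | 15.59 | 11.91 | 11.91 | 15.92 |
| EA13/03 | 4 | 9.80 | 9.80 | 12.70 | 9.18 | 9.18 | 13.26 |
| PA8/06 | 4 | 25.04 | 24.97 | 24.11 | 17.89 | 17.77 | 18.20 |
| ABDE12 | 12 | 8.98 (b) | 7.78 (b) | 9.54 | 10.52 (b) | 8.91 (b) | 11.39 |
| HC7/11 | 7 | 8.66 | 7.16 | 5.37 | 6.75 | 3.21 | 2.83 |
| πTC13 | 13 | 2.54 | 2.73 | 6.61 | 5.17 | 4.77 | 2.68 |
| HTBH38/08 | 26 | 4.96 | 4.96 | 5.94 | 5.99 | 6.61 | 5.73 |
| NHTBH38/08 | 23 | 13.64 | 13.58 | 15.35 | 14.18 | 14.07 | 15.66 |
| NCCE31/05 | 13 | 2.15 | 1.11 | 0.97 | 2.61 | 1.26 | 1.35 |

(a) The NH3···F2 complex in the NCCE31 subset was excluded from these statistics because the SCF calculations did not converge with the OM2 and OM3 methods.
(b) The OMx and OMx-D3T energies are corrected by excluding the ZPVE and thermal contributions.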

Concerning barrier heights, the performance of the ODMx methods is generally similar to that of the OMx and OMx-D3T methods. For example, the MAEs in 60 activation enthalpies (298 K) in the CHNO set are slightly higher for ODM2 and ODM3 (1.77 and 2.01 kcal/mol) than for the OMx methods (1.53–1.55 kcal/mol). On the other hand, the MAEs in 22 barrier heights of pericyclic reactions (BHPERI subset) are lower for ODM2 and ODM3 (6.49 and 6.80 kcal/mol) than for their OMx counterparts (8.21–8.25 kcal/mol) and of similar magnitude as those for the OMx-D3T methods (6.69–6.78 kcal/mol). Compared to their OMx and OMx-D3T counterparts, ODM2 and ODM3 perform somewhat worse for the BH76 subset (54 barriers of hydrogen and heavy-atom transfers, nucleophilic substitutions, unimolecular and association reactions), comparably poorly for the O3ADD6 subset (only 2 barriers of ozone addition to unsaturated hydrocarbons), similarly for the HTBH38/08 subset (26 hydrogen transfer barriers), and somewhat worse for the NHTBH38/08 subset (23 non-hydrogen transfer barriers). Judging from the single-point results for the GMTKN30-CHNOF and CE345-CHNOF subsets (Tables 8 and 9), the MAEs of the ODMx methods for barriers and reaction energies seem to be overall of similar magnitude, while those for energy differences between conformers and isomers are lower.

Noncovalent Interactions

The evaluation of the results for the A24-CHNOF,[164] S22,[143,144,165] S66,[103,104] S66×8,[103] S66a8,[104] JSCH-2005-CHNOF,[143] S7L,[166] S30L-CHNOF,[19] and AF6[167] benchmark sets for noncovalent interactions shows that statistically ODM2 and ODM3 are rather similar to OM2-D3T and OM3-D3T, respectively, for energies at the reference geometries (Table 10) and for the optimized geometries (Table 11). The OMx methods without dispersion corrections are known to perform much worse (as expected). The ODMx methods are generally somewhat better than their OMx-D3T counterparts for predicting interaction energies in the hydrogen-bonded complexes of the A24-CHNOF, S22, S66, and JSCH-2005-CHNOF sets (Table 10). ODM2 and ODM3 have similar MAEs of 4.36–4.86 kcal/mol for the S30L set with very large noncovalent complexes, which lie within the range of the MAEs for OM2-D3T (5.01 kcal/mol) and OM3-D3T (3.59 kcal/mol). The MAEs for the folding energies of alkanes (AF6 set) are reasonably low at the ODMx level (1.87–3.22 kcal/mol) but still higher than those at the OMx-D3T level (0.34–1.17 kcal/mol). On the other hand, the ODMx methods perform better than their OMx-D3T counterparts for the large stacked complexes in the S7L set (MAE values in kcal/mol: ODM2 1.62, OM2-D3T 2.38, ODM3 0.44, OM3-D3T 0.95).
Table 10

Mean Absolute Errors in Interaction Energies (kcal/mol) for the A24-CHNOF, S22, S66, S66×8, S66a8, JSCH-2005-CHNOF, S7L, and S30L-CHNOF Sets and Their Subsets and in Folding Enthalpies and Folding Energies for the AF6 Set at the Reference Geometries As Calculated with the OMx, OMx-D3T, and ODMx Methods

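| subset | N | OM2 | OM2-D3T | ODM2 | OM3 | OM3-D3T | ODM3 |
|---|---|---|---|---|---|---|---|
| A24-CHNOF | | | | | | | |
| overall | 21 | 0.89 | 0.56 | 0.50 | 1.11 | 0.67 | 0.72 |
| hydrogen bonded | 5 | 1.79 | 1.40 | 1.15 | 1.86 | 1.32 | 1.28 |
| mixed | 10 | 0.70 | 0.17 | 0.13 | 1.05 | 0.34 | 0.43 |
| dispersion | 6 | 0.44 | 0.52 | 0.57 | 0.58 | 0.67 | 0.72 |
| S22 | | | | | | | |
| overall | 22 | 3.07 | 0.94 | 0.82 | 3.58 | 0.97 | 0.94 |
| hydrogen bonded | 7 | 3.63 | 2.15 | 1.91 | 4.20 | 2.28 | 1.81 |
| mixed | 8 | 3.68 | 0.47 | 0.34 | 3.92 | 0.24 | 0.35 |
| dispersion | 7 | 1.80 | 0.28 | 0.29 | 2.57 | 0.50 | 0.74 |
| S66 | | | | | | | |
| overall | 66 | 2.66 | 0.85 | 0.76 | 3.11 | 0.81 | 0.83 |
| electrostatic | 23 | 2.96 | 1.80 | 1.76 | 3.24 | 1.75 | 1.68 |
| mixed | 20 | 1.82 | 0.27 | 0.22 | 2.50 | 0.39 | 0.48 |
| dispersion | 23 | 3.10 | 0.39 | 0.23 | 3.52 | 0.23 | 0.29 |
| S66×8 | | | | | | | |
| overall | 528 | 1.93 | 0.79 | 0.75 | 2.23 | 0.71 | 0.72 |
| electrostatic | 184 | 2.27 | 1.43 | 1.40 | 2.39 | 1.30 | 1.32 |
| mixed | 160 | 1.31 | 0.37 | 0.34 | 1.78 | 0.34 | 0.37 |
| dispersion | 184 | 2.13 | 0.52 | 0.47 | 2.45 | 0.43 | 0.42 |
| S66a8 | | | | | | | |
| overall | 528 | 1.96 | 0.61 | 0.58 | 2.29 | 0.60 | 0.66 |
| electrostatic | 184 | 2.35 | 1.34 | 1.32 | 2.63 | 1.33 | 1.42 |
| mixed | 160 | 1.39 | 0.22 | 0.19 | 1.88 | 0.26 | 0.29 |
| dispersion | 184 | 2.07 | 0.21 | 0.17 | 2.30 | 0.18 | 0.21 |
| JSCH-2005-CHNOF | | | | | | | |
| overall | 134 | 4.97 | 1.81 | 1.59 | 5.15 | 1.37 | 1.17 |
| overall* (a) | 128 | 4.81 | 1.61 | 1.39 | 4.95 | 1.13 | 0.91 |
| hydrogen bonded base pairs | 31 | 5.70 | 3.05 | 2.93 | 5.72 | 2.47 | 1.33 |
| interstrand base pairs | 32 | 1.73 | 0.72 | 0.75 | 1.80 | 0.70 | 0.69 |
| stacked base pairs | 54 | 6.42 | 1.50 | 1.02 | 6.58 | 0.75 | 0.89 |
| amino acid pairs | 17 | 5.13 | 2.55 | 2.49 | 5.88 | 2.62 | 2.63 |
| amino acid pairs* (a) | 11 | 3.29 | 0.65 | 0.64 | 3.97 | 0.52 | 0.49 |
| S7L | | | | | | | |
| overall | 7 | 9.69 | 2.38 | 1.62 | 10.08 | 0.79 | 0.35 |
| π–π | 5 | 10.67 | 2.72 | 1.73 | 10.25 | 0.95 | 0.44 |
| S30L-CHNOF | | | | | | | |
| overall | 24 | 21.33 | 5.01 | 4.86 | 24.14 | 3.59 | 4.36 |
| π–π stacking | 7 | 31.15 | 3.04 | 2.37 | 32.54 | 3.17 | 3.30 |
| hydrogen bonded (b) | 8 | 16.23 | 6.49 | 6.36 | 20.41 | 4.62 | 5.00 |
| charged complexes (b) | 8 | 16.22 | 8.41 | 8.85 | 19.40 | 4.56 | 6.27 |
| AF6 | | | | | | | |
| folding enthalpies (c) | 6 | 3.38 | 0.34 | | 5.11 | 1.00 | |
| folding energies (d) | 6 | 3.55 | 0.34 | 1.87 | 5.28 | 1.17 | 3.22 |

(a) Charged amino acids excluded.
(b) Two complexes are attributed to both H-bonded and charged complexes subsets.
(c) Folding enthalpies were not calculated at the ODMx levels, because this would require geometry optimizations at these levels.
(d) Errors in folding energies were calculated using uncorrected changes in heats of formation at 298 K calculated at the OMx and OMx-D3T levels.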

Table 11

Mean Absolute Errors in Selected Distances (Å) and Angles (deg) for the A24-CHNOF, S22, S66, S7L, and AF6 Sets and Their Subsets As Calculated with the OMx, OMx-D3T, and ODMx Methods

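| subset | N | OM2 | OM2-D3T | ODM2 | OM3 | OM3-D3T | ODM3 |
|---|---|---|---|---|---|---|---|
| A24-CHNOF: Selected Interatomic Distances | | | | | | | |
| overall | 23 | 0.790 | 0.500 | 0.261 | 0.871 | 0.262 | 0.274 |
| hydrogen bonded | 5 | 0.189 | 0.198 | 0.114 | 0.336 | 0.335 | 0.371 |
| mixed | 13 | 0.419 | 0.413 | 0.386 | 0.254 | 0.326 | 0.326 |
| dispersion | 5 | 2.357 | 1.027 | 0.083 | 3.009 | 0.023 | 0.040 |
| A24-CHNOF: Selected Angles | | | | | | | |
| overall | 40 | 13.89 | 9.97 | 4.91 | 8.59 | 8.80 | 8.94 |
| hydrogen bonded | 13 | 11.47 | 11.39 | 4.67 | 11.50 | 11.08 | 11.30 |
| mixed | 21 | 10.78 | 6.45 | 6.00 | 4.27 | 9.56 | 9.77 |
| dispersion | 6 | 30.00 | 19.18 | 1.60 | 17.40 | 1.20 | 0.88 |
| S22: Selected Interatomic Distances | | | | | | | |
| overall | 105 | 0.708 | 0.424 | 0.300 | 1.996 | 0.285 | 0.295 |
| S22: Selected Angles | | | | | | | |
| overall | 14 | 2.50 | 2.60 | 2.40 | 1.76 | 0.85 | 1.12 |
| S66: Selected Interatomic Distances | | | | | | | |
| overall | 172 | 1.010 | 0.348 | 0.339 | 1.942 | 0.317 | 0.294 |
| electrostatic | 28 | 0.175 | 0.173 | 0.226 | 0.286 | 0.283 | 0.313 |
| mixed | 63 | 0.572 | 0.415 | 0.383 | 0.762 | 0.361 | 0.327 |
| dispersion | 81 | 1.639 | 0.356 | 0.345 | 3.432 | 0.295 | 0.263 |
| S66: Selected Angles | | | | | | | |
| overall | 141 | 21.53 | 12.32 | 12.89 | 20.69 | 12.99 | 11.62 |
| electrostatic | 28 | 10.84 | 13.30 | 13.74 | 9.76 | 9.97 | 7.31 |
| mixed | 52 | 15.66 | 14.54 | 17.21 | 20.07 | 19.73 | 17.92 |
| dispersion | 61 | 31.44 | 9.97 | 8.82 | 26.23 | 8.63 | 8.22 |
| S7L: Selected Interatomic Distances | | | | | | | |
| overall | 28 | 15.579 | 0.420 | 0.389 | 10.540 | 0.393 | 0.386 |
| C···C | 20 | 21.662 | 0.421 | 0.409 | 14.479 | 0.468 | 0.450 |
| H···H | 8 | 0.370 | 0.416 | 0.340 | 0.694 | 0.207 | 0.227 |
| AF6: Selected Interatomic Distances | | | | | | | |
| overall | 27 | 0.502 | 0.168 | 0.163 | 0.566 | 0.165 | 0.198 |
| AF6: Selected Angles | | | | | | | |
| overall | 74 | 15.44 | 6.42 | 6.28 | 16.33 | 7.03 | 7.54 |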
The ODM3 method suffers from one particular problem that also plagues OM3 and OM3-D3T:[50,54] geometry optimization of carboxylic acid dimers leads to symmetric cyclic structures with equal O–H bond distances, i.e. the methods fail to differentiate between covalent and noncovalent O–H bonds in these dimers. We did find ODM3 parameter sets that fix this problem, but their overall performance for other properties was less satisfactory than that of OM3 or OM3-D3T, and hence they were discarded. This underlines again how difficult it is to achieve an overall balanced treatment of a large variety of target properties during parametrization.
Another problem common to many SQC methods[54] is the poor description of the HF dimer. The OMx and OMx-D3T methods give a cyclic structure with two equal H···F hydrogen bonds and strongly underestimate the interaction energy. ODM3 suffers from the same problem, while ODM2 yields a qualitatively correct geometry (Figure 2) and an interaction energy of −2.2 kcal/mol that is still too small in magnitude but much closer to the reference value of −4.6 kcal/mol than the values obtained otherwise (−1.2, −1.4, 0.5, 0.2, and 0.3 kcal/mol at OM2, OM2-D3T, OM3, OM3-D3T, and ODM3). Mainly because of the improved description of the HF dimer geometry, the MAE for selected angles in the A24-CHNOF set is much lower at the ODM2 level (4.9°) than at any other level (more than 8.5°).
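The HF dimer comparison can be summarized numerically. The sketch below uses the interaction energies quoted in the text and, as an assumption about how such values are obtained, the usual supermolecular definition (energy of the complex minus the energies of the isolated fragments); the function and variable names are illustrative.

```python
def interaction_energy(e_complex: float, e_fragments: list) -> float:
    """Supermolecular interaction energy: E(complex) minus the sum of E(fragment)."""
    return e_complex - sum(e_fragments)


# HF dimer interaction energies in kcal/mol, as quoted in the text.
reference = -4.6
hf_dimer = {
    "OM2": -1.2,
    "OM2-D3T": -1.4,
    "ODM2": -2.2,
    "OM3": 0.5,
    "OM3-D3T": 0.2,
    "ODM3": 0.3,
}

# Absolute deviation of each method from the reference value.
abs_errors = {method: abs(value - reference) for method, value in hf_dimer.items()}
closest = min(abs_errors, key=abs_errors.get)
print(closest, round(abs_errors[closest], 1))
```

Positive interaction energies, as obtained at the OM3, OM3-D3T, and ODM3 levels, indicate a dimer that is predicted to be unbound relative to two separate HF molecules.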
Figure 2

Geometries of the HF dimer at the reference coupled cluster level (CCSD(T) with complete basis set extrapolation) and at the ODM2 and ODM3 levels. Distances in Å.

Since the above sets contain only a few fluorine-containing noncovalent complexes, we also performed benchmarking on the X40×10-CHNOF set. This set is the subset of the X40×10 set obtained by excluding complexes containing elements beyond CHNOF. Reference geometries and interaction energies were taken from the original publication[168] and from subsequent higher-level calculations,[169] respectively. It is clear from the statistical analysis of the errors (Table S4) that all tested methods have similar accuracy with MAEs of 1.37–1.89 kcal/mol (lowest for ODM2, highest for OM3).

Excited-State Properties

In Table 12 we provide a statistical evaluation of the results for the vertical excitation energies of the VEE set that was included in the ODMx parametrization. The OMx/MRCI and OMx-D3T/MRCI results are trivially identical since the D3T correction term does not affect the excitation energy at a given geometry. The ODMx/MRCI results are generally superior to their OMx/MRCI counterparts: the overall MAE for the excitation energies is reduced by ca. 25% in the ODM2 case (0.35 vs 0.47 eV) and by ca. 20% in the ODM3 case (0.33 vs 0.42 eV). Singlet and triplet excitations are described with similar accuracy by all methods considered.
Table 12

Mean Absolute Errors in Vertical Excitation Energies (eV) for the VEE Set and Its Subsets As Calculated with the OMx/MRCI, OMx-D3T/MRCI, and ODMx/MRCI Methods

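| subset | N | OM2 | OM2-D3T | ODM2 | OM3 | OM3-D3T | ODM3 |
|---|---|---|---|---|---|---|---|
| overall | 167 | 0.47 | 0.47 | 0.35 | 0.42 | 0.42 | 0.33 |
| singlet | 104 | 0.47 | 0.47 | 0.35 | 0.41 | 0.41 | 0.32 |
| triplet | 63 | 0.46 | 0.46 | 0.35 | 0.44 | 0.44 | 0.35 |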
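The relative MAE reductions quoted for the VEE set follow directly from the overall row of Table 12. A short arithmetic sketch is given below; the helper name is an illustrative assumption.

```python
def relative_reduction(old_mae: float, new_mae: float) -> float:
    """Percentage by which new_mae is lower than old_mae."""
    return (old_mae - new_mae) / old_mae * 100.0


# Overall MAEs in vertical excitation energies (eV) from Table 12.
odm2_vs_om2 = relative_reduction(0.47, 0.35)
odm3_vs_om3 = relative_reduction(0.42, 0.33)
print(f"ODM2 vs OM2: {odm2_vs_om2:.1f}%  ODM3 vs OM3: {odm3_vs_om3:.1f}%")
```

Both values are consistent with the rounded figures of ca. 25% and ca. 20% given in the text.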
We also performed benchmarking on a previously introduced set of excited-state equilibrium geometries (called ExGeom)[55] and compared the results from the ODMx/MRCI, OMx/MRCI, and OMx-D3T/MRCI methods with reference results from time-dependent density functional theory (TDDFT) and coupled cluster theory (CC2). ODM2/MRCI performs very similarly to OM2/MRCI, as seen from the MAEs in bond lengths and bond angles given in Table 13 (comparison to TDDFT; see Table S5 for comparison to CC2). ODM3/MRCI is slightly superior to OM3/MRCI for bond lengths, consistent with the observations for ground-state covalent bonds computed at the ODM3/SCF and OM3/SCF levels. The accuracy for bond angles is similar across all methods. As expected, dispersion corrections have practically no effect on the excited-state geometries of these small molecules (compare the OMx/MRCI with the OMx-D3T/MRCI results in Table 13).
Table 13

Mean Absolute Errors in Bond Lengths (Å) and Bond Angles (deg) Calculated with the OMx/MRCI, OMx-D3T/MRCI, and ODMx/MRCI Methods for the ExGeom Set and Its Subsets Relative to the TDDFT Level of Theory

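| subset | N | OM2 | OM2-D3T | ODM2 | OM3 | OM3-D3T | ODM3 |
|---|---|---|---|---|---|---|---|
| Bond Lengths | | | | | | | |
| overall | 527 | 0.016 | 0.016 | 0.016 | 0.022 | 0.022 | 0.019 |
| singlet | 394 | 0.018 | 0.018 | 0.018 | 0.025 | 0.025 | 0.021 |
| triplet | 133 | 0.013 | 0.013 | 0.011 | 0.016 | 0.016 | 0.013 |
| C–C bonds | 291 | 0.016 | 0.016 | 0.015 | 0.018 | 0.018 | 0.017 |
| C–H bonds | 71 | 0.011 | 0.011 | 0.011 | 0.017 | 0.017 | 0.015 |
| C–O bonds | 58 | 0.017 | 0.017 | 0.015 | 0.020 | 0.020 | 0.019 |
| C–N bonds | 68 | 0.025 | 0.025 | 0.024 | 0.040 | 0.039 | 0.032 |
| N–H bonds | 33 | 0.009 | 0.009 | 0.017 | 0.030 | 0.029 | 0.008 |
| Bond Angles | | | | | | | |
| overall | 278 | 1.88 | 1.89 | 1.78 | 1.85 | 1.86 | 1.92 |
| singlet | 201 | 1.82 | 1.83 | 1.76 | 1.81 | 1.82 | 1.93 |
| triplet | 77 | 2.04 | 2.04 | 1.82 | 1.96 | 1.97 | 1.88 |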
Some brief remarks on specific molecules that had been addressed in our previous benchmarking are as follows:[55] The singlet and triplet excited-state geometries of formaldehyde are better described by the ODMx/MRCI methods than by their OMx/MRCI counterparts (Table S86). More generally, the excited-state C=O bond lengths in formaldehyde, acetaldehyde, and acetone from ODMx/MRCI are closer to the experimental and TDDFT values than those from OMx/MRCI (Table S87), but they are still underestimated. The pyramidalization of these carbonyl compounds upon excitation is reproduced well by the ODMx/MRCI methods, similarly to TDDFT, CC2, and OMx/MRCI (Table S88). The nonlinear equilibrium geometries of acetylene in several excited states are qualitatively well described both at the ODMx/MRCI and OMx/MRCI levels (see the ∠CCH angles in Table S89); for two states (2 1A2 and 2 3A2) acetylene is still predicted to be linear, whereas the reference TDDFT and CC2 calculations give slightly bent structures. Both the ODMx/MRCI and OMx/MRCI methods give excited-state structures of 9H-adenine, aniline, cytosine, and 9H-guanine with out-of-plane bending angles of the amino groups that are much too small compared to the reference TDDFT and CC2 results (Table S90).
Finally, we assess the performance of the ODMx/MRCI methods on the SKF (Send–Kühn–Furche) set of experimental 0–0 transition energies.[170] In view of the technical problems encountered in MRCI calculations of ZPVEs,[55] we compare theoretical values without ZPVEs to experimental values back-corrected using ΔZPVEs from (TD)DFT calculations.[55,170] As seen from Table 14, the ODM2/MRCI method is marginally better than OM2/MRCI for the 0–0 transition energies of the SKF set, while ODM3/MRCI performs statistically basically the same as OM3/MRCI. Again, as expected, the dispersion corrections do not have any significant effect for this set.
Table 14

Mean Absolute Errors (MAEs) in 0–0 Transition Energies (eV) Calculated with the OMx/MRCI, OMx-D3T/MRCI, and ODMx/MRCI Methods for the SKF Set Relative to Back-Corrected Experiment (a)

| | OM2 | OM2-D3T | ODM2 | OM3 | OM3-D3T | ODM3 |
|---|---|---|---|---|---|---|
| count (b) | 68 | 67 | 65 | 66 | 66 | 65 |
| MAE | 0.26 | 0.26 | 0.25 | 0.27 | 0.27 | 0.27 |

(a) Experimental values were back-corrected with (TD)DFT ΔZPVE values to directly compare them with theoretical calculations, which do not include ZPVE corrections; see ref (55) for the reasons.
(b) Some calculations could not be converged, which reduces the total count of successful computations to less than 68. Tests have confirmed that this practically does not affect the overall statistics.

To conclude, we note that our current excited-state validations employ an MRCI treatment, which may no longer be feasible for the larger active spaces that are often required for larger molecules. To deal with such cases, our code includes efficient implementations of the CIS and SF-XCIS (spin-flip extended CIS) methods[171,172] that allow for practical SQC explorations of large systems (with little loss of accuracy).

Conclusions

In this work we present two new semiempirical quantum-chemical methods with integrated orthogonalization and dispersion corrections, ODM2 and ODM3 (ODMx). The electronic structure formalism is the same as in the established NDDO-based orthogonalization-corrected methods (OMx). In addition, the ODMx methods include D3 dispersion corrections with Becke–Johnson damping and with three-body corrections E_ABC as an integral part, for a proper description of noncovalent interactions. Moreover, the total energy in the ODMx methods is defined in complete analogy to ab initio methods, and the traditional convention in NDDO-based methods of using SQC total energies directly in calculating heats of formation at room temperature is abandoned. Instead, ODMx heats of formation at 298 K are determined by explicitly computing ZPVE and thermal corrections within the harmonic-oscillator and rigid-rotor approximations.
Compared with the previous OMx development, the parametrization of the ODMx methods targeted a much broader range of reference properties, covering in particular also vertical excitation energies. To ensure a balanced description of a large variety of ground-state and excited-state properties as well as noncovalent interactions, we employed a novel robust parametrization procedure and a carefully chosen selection of representative training sets.
The performance of the ODMx methods was evaluated for a large and diverse collection of accurate reference data. The ODMx methods are found to perform overall somewhat better than the OMx methods for ground-state and excited-state properties, while their accuracy is similar to that of the dispersion-corrected OMx-D3T methods for noncovalent interactions.
They are also formally more consistent: since they were parametrized with integrated dispersion corrections, there are no problems arising from the a posteriori addition of attractive dispersion terms to SQC methods parametrized without them. Therefore, heats of formation at 298 K are well described by the ODMx methods but are systematically too small for the dispersion-corrected OMx-D3T methods. Moreover, the redefinition of the total energy (in analogy to ab initio methods) removes ambiguities caused by associating SQC total energies directly with heats of formation at 298 K (as traditionally done in the NDDO-based SQC methods).
Thus, we recommend the ODM2 and ODM3 methods as standard tools for fast electronic structure calculations. To widen their scope we plan to extend them to heavier main-group elements. The ODM2 method is the most complete model, shows good performance in our benchmarks, and would thus normally be the method of choice. Of course, SQC application studies should generally begin with a careful validation, and it may turn out that another SQC method is more appropriate for a particular problem, which should then be chosen for the actual production work. The benchmark results reported here and in our previous studies[50,54,55] may be helpful for choosing the most appropriate method.