Literature DB >> 24803865

Parameterization of DFTB3/3OB for Sulfur and Phosphorus for Chemical and Biological Applications.

Michael Gaus1, Xiya Lu1, Marcus Elstner2, Qiang Cui1.   

Abstract

We report the parametrization of the approximate density functional tight binding method, class="Chemical">DFTB3, for class="Chemical">n class="Chemical">sulfur and phosphorus. The parametrization is done in a framework consistent with our previous 3OB set established for O, N, C, and H, thus the resulting parameters can be used to describe a broad set of organic and biologically relevant molecules. The 3d orbitals are included in the parametrization, and the electronic parameters are chosen to minimize errors in the atomization energies. The parameters are tested using a fairly diverse set of molecules of biological relevance, focusing on the geometries, reaction energies, proton affinities, and hydrogen bonding interactions of these molecules; vibrational frequencies are also examined, although less systematically. The results of DFTB3/3OB are compared to those from DFT (B3LYP and PBE), ab initio (MP2, G3B3), and several popular semiempirical methods (PM6 and PDDG), as well as predictions of DFTB3 with the older parametrization (the MIO set). In general, DFTB3/3OB is a major improvement over the previous parametrization (DFTB3/MIO), and for the majority cases tested here, it also outperforms PM6 and PDDG, especially for structural properties, vibrational frequencies, hydrogen bonding interactions, and proton affinities. For reaction energies, DFTB3/3OB exhibits major improvement over DFTB3/MIO, due mainly to significant reduction of errors in atomization energies; compared to PM6 and PDDG, DFTB3/3OB also generally performs better, although the magnitude of improvement is more modest. Compared to high-level calculations, DFTB3/3OB is most successful at predicting geometries; larger errors are found in the energies, although the results can be greatly improved by computing single point energies at a high level with DFTB3 geometries. There are several remaining issues with the DFTB3/3OB approach, most notably its difficulty in describing phosphate hydrolysis reactions involving a change in the coordination number of the phosphorus, for which a specific parametrization (3OB/OPhyd) is developed as a temporary solution; this suggests that the current DFTB3 methodology has limited transferability for complex phosphorus chemistry at the level of accuracy required for detailed mechanistic investigations. Therefore, fundamental improvements in the DFTB3 methodology are needed for a reliable method that describes phosphorus chemistry without ad hoc parameters. Nevertheless, DFTB3/3OB is expected to be a competitive QM method in QM/MM calculations for studying phosphorus/sulfur chemistry in condensed phase systems, especially as a low-level method that drives the sampling in a dual-level QM/MM framework.

Entities:  

Year:  2014        PMID: 24803865      PMCID: PMC3985940          DOI: 10.1021/ct401002w

Source DB:  PubMed          Journal:  J Chem Theory Comput        ISSN: 1549-9618            Impact factor:   6.006


Introduction

class="Chemical">Phosphorus aclass="Chemical">nd class="Chemical">n class="Chemical">sulfur are richly featured in chemistry and biology.[1] Sulfur is part of amino acids Cys and Met and therefore involved in redox sensing; sulfur is the third most abundant mineral element in the human body. Sulfur is also part of many important biological cofactors such as ironsulfur clusters, coenzyme A, and several vitamins. As another example, sulfonation states on heparan sulfate chains are known to govern crucial signaling pathways and molecular-recognition event.[2] Sulfur is also involved in many chemical and materials applications. For example, sulfonic acids are used in many detergents. Nafion,[3] a sulfonated tetrafluoroethylene based fluoropolymer-copolymer, is an important material used for the proton exchange membrane in fuel cells. Sulfur-containing heterocycles are broadly used in the field of organic electronics.[4] Phosphorus is essential in biology because it is part of phospholipids, nucleic acids, and many vital small molecules such as various phosphates (e.g., ATP/GTP) as well as bone (hydroxyapatite). The phosphoryl transfer reaction, for example, arguably represents the most important chemical transformation in biology.[1,5−7] Perturbations in phosphoryl transfer enzymes are involved in many serious human diseases such as cancer.[8,9] Protein kinases and phosphatases are among the most important drug targets;[10−15] there are ∼2000 protein kinases and ∼1000 phosphatases in the human genome, and these enzymes are essential to key cellular processes such as the control of cell cycles and division. In the chemical industry, phosphorus compounds are predominantly consumed as fertilizers, while organophosphorus compounds are also used in detergents, pesticides, and nerve agents. To describe the rich (bio)chemistry that involve class="Chemical">phosphorus aclass="Chemical">nd class="Chemical">n class="Chemical">sulfur in complex condensed phase environments, computational studies in the framework of QM/MM methods are essential.[16] Although ab initio based QM/MM simulations have become increasingly powerful thanks to developments in both computational hardware and theoretical algorithms,[17−20] they remain computationally demanding and therefore not ideally suited when multiple reaction mechanisms need to be analyzed. This is particularly the case when sampling is critical, such as in the study of intrinsically flexible systems (e.g., signaling proteins)[21,22] or enzymes that feature rather solvent-accessible active sites;[23−25] adequate sampling is also important to processes that occur at the solid/liquid interface.[26,27] Therefore, despite the tremendous progress in ab initio QM and QM/MM methods, it remains meaningful to explore further development of the semiempirical type of QM methods.[28−31] In this class="Chemical">coclass="Chemical">ntext, the Self-class="Chemical">n class="Chemical">Consistent Charge Density Functional Tight-Binding (SCC-DFTB) method proves to be a promising approach.[32−34] It is an approximate Density Functional Theory (DFT) and is derived by expanding the DFT total energy functional up to second order around a reference charge density. The resulting perturbative series is further approximated by applying a minimal basis LCAO expansion of the Kohn–Sham orbitals, by using a monopole approximation for the charge density fluctuations in the second order terms and by approximating the zero-th order terms, which resemble the DFT double counting contributions for the reference density, with a sum of atomic pair potentials.[32,35] The resulting approximate total energy terms have to be parametrized, and two classes of parameters can be distinguished: (i) the electronic parameters, which determine the atomic minimal basis set and the atomic reference densities as well as the chemical class="Disease">hardness values of the iclass="Chemical">nvolved atoms—the determiclass="Chemical">natioclass="Chemical">n of these parameters is quite straightforward; (ii) the repulsive eclass="Chemical">nergy parameters, which are class="Chemical">necessary to determiclass="Chemical">ne the atomic pair poteclass="Chemical">ntials modeliclass="Chemical">ng the zero-th order class="Chemical">n class="Chemical">contributions in the density expansion. Although these terms can in principle be computed based on DFT calculations, to achieve good general accuracy and partially compensate for approximations made in the other terms, an empirical fit to larger test sets is necessary, and therefore their determination is usually more involved. SCC-class="Chemical">DFTB has beeclass="Chemical">n parametrized for orgaclass="Chemical">nic molecules class="Chemical">n class="Chemical">containing O, N, C, and H, for which extensive tests have been performed,[36−38] as well for molecules containing S,[39] Zn,[40−42] P,[43] and a few other elements.[44−48] The resulting parameter set is referred to as the “MIO” set and can be downloaded from the DFTB Web site: www.dftb.org. To improve the description of class="Chemical">hydrogen-boclass="Chemical">nded class="Chemical">n class="Chemical">complexes and proton affinities, which are important to biological applications, DFTB has been extended to include third-order contributions and a modified second-order Coulomb scaling law,[33,49,50] leading to the DFTB3 approach. In its first implementation, the “MIO” parameters, originally derived for the second-order method, were also used for DFTB3.[43,50,51] Recently, two of us have reparametrized DFTB3 for O, N, C, and H, resulting in a parameter set called “3OB,”[52] emphasizing the application to organic and biologically relevant molecules. The main goal was to improve heats of formation and reaction energies, as well as nonbonded interactions. As benchmark calculations indicated, DFTB3–3OB generally gives results comparable to the most successful semiempirical methods such as the OMx methods.[30] In this work, motivated by the importance of P and S in (bio)chemistry, we extend the “3OB” parametrization to these elements. This is a worthwhile effort because our parametrization inclclass="Chemical">udes d orbitals, which have beeclass="Chemical">n kclass="Chemical">nowclass="Chemical">n to be esseclass="Chemical">ntial to the proper descriptioclass="Chemical">n of chemical species iclass="Chemical">nvolviclass="Chemical">ng P aclass="Chemical">nd S, especially hypervaleclass="Chemical">nt class="Chemical">n class="Chemical">compounds[53] (see, for example, refs (54) and (55)). By contrast, many popular semiempirical methods such as PM3[56] and PDDG[28,57] do not include d orbitals (PM6[58] does include d orbitals). We test the new parameters for P and S with a large set of biologically relevant molecules, focusing in particular on geometries, reaction energies, proton affinities, and hydrogen-bonding interactions. These results show that the 3OB set represents a notable overall improvement over the “MIO” set[39,43] parameters for P/S containing compounds for both structural and energetic properties. For example, the deficiency concerning the S···N interaction as reported in ref (38) is removed. Moreover, due to major improvement in the atomization energies, reaction energies are generally better described with the 3OB parameters than the “MIO” parameters. For the majority S/P-containing compounds tested here, DFTB3/3OB also out-performs PM6 and PDDG, especially for structural properties, vibrational frequencies, hydrogen bonding energies, and proton affinities. Nevertheless, several limitations remain, such as too short hydrogen bonding distances for complexes that involve charged S/P species; problems also remain for the energetics of phosphoryl transfer reactions that involve changes in the coordination number of the phosphorus. Some of these limitations are likely due to the intrinsic deficiencies of the underlying functional (PBE[59]), while others might be alleviated via continuing development of the DFTB3 framework, such as including three-center terms and multipoles for the charge fluctuation.

Theory

The class="Chemical">DFTB3 parametrizatioclass="Chemical">n of class="Chemical">n class="Chemical">sulfur and phosphorus follows the protocol we outlined previously[52,60] and used to parametrize for C, H, N, and O.[52] The theory of DFTB3 has been described in detail in ref (51); for a recent review, see ref (61). Here, we only briefly discuss the parameters that enter the DFTB3 total energy: To determine the Hamilton matrix elements Hμ0, the atomic orbital basis functions ημ and the atomic reference densities ρa0 have to be determined. This is done by solving the Kohn–Sham equations for the atom in an additional harmonic potential with a class="Chemical">coclass="Chemical">nficlass="Chemical">niclass="Chemical">ng radius rwf for the atomic basis set aclass="Chemical">nd a differeclass="Chemical">nt radius rdeclass="Chemical">ns for the declass="Chemical">nsity. Siclass="Chemical">nce the d orbitals of S aclass="Chemical">nd P are uclass="Chemical">noccupied, we use aclass="Chemical">nother value for the class="Chemical">n class="Chemical">confinement radius, denoted by rwfd. The off-diagonal matrix elements Hμ0 are then precomputed and tabulated using an exchange-correlation functional (PBE[59]), the atomic basis set consisting of Slater functions, and the initial atomic density. The diagonal elements Hμμ0 are equal to the atomic eigenvalues ε (x = s, p, or d orbital). Here, no confinement of the orbitals is used to ensure the proper dissociation limit. In the case of sulfur and phosphorus, however, we make an exception to this standard procedure and optimize εd as discussed in the following section. The Slater functions are defined by lmax as the highest angular momentum taken into account, nmax; another parameter that determines the size of the basis set; and α0, α1, ..., and α4, the exponents of the Slater functions. For the seclass="Chemical">coclass="Chemical">nd-order term of the eclass="Chemical">nergy Eγ, the atomic Hubbard parameters Ua are class="Chemical">needed, which are class="Chemical">n class="Gene">calculated from DFT as the first derivative of the highest occupied orbital with respect to its occupation number. [Ua describes the electron interaction on-site of an atom and enters the γ function, which interpolates between the on-site and the long-range electron interaction of atom pairs. As a consequence of the form of the γ function in the interpolating region, which is well within the covalent and noncovalent bond distances (e.g., hydrogen bonds), the Hubbard parameter determines the size of the atom in an inverse relation. While this was found to be reasonable within one period of the system of elements, it is not for interactions between atoms of different periods. The most significant difference from the inverse relation was found for hydrogen.[33] Therefore, the original γ function of DFTB2[32] is modified for DFTB3[50,51] (called γh in ref (51)) for all pairs that include at least one hydrogen atom. Essentially one additional parameter (called ζ) was introduced to damp the influence of the Hubbard parameter such that γ is closer to Coulomb behavior at short distances (see also Figure 3 of ref (51)) and therefore accounts for the small covalent radius of hydrogen. While the 3OB parameter set includes elements of the first (H) and second period (CNO), we now extend it to third period elements S and P. Naturally, the question arises as to whether one can improve performance by applying a similar modification to γ. Since the Hubbard parameters for S and P are larger than they should be to fulfill the inverse relation of second period elements, we would have to damp the γ function such that γ approximates the Coulomb behavior at larger distances. We have tested this idea but found insignificant improvement over the standard γ and therefore decided not to introduce additional complexity to our DFTB model. Note that an alternative to the γ function has been suggested that intrinsically contains such effects; however, it has not been applied and tested yet.[62]] For the third-order term EΓ, the derivative of the Hubbard parameter with respect to charge, Uad has to be determined. For CHNO it is calculated as the second derivative of the highest occupied orbital with respect to its occupation number. For P and S, as discussed below, we find a significant advantage in optimizing it. The repulsive energy Erep is described as pair potentials. Their parametrization is done by fitting to a selected set of reference atomization energies, molecular geometries, vibrational frequencies, and barrier or reaction energies. A careful benchmark of a resulting fit is the most time-n class="Chemical">coclass="Chemical">nsumiclass="Chemical">ng part of the procedure, despite the progress of partially automatized fitticlass="Chemical">ng procedures.[60,63] Specifically for the description of atomic energies and atomization energies (Eat), a further parameter was introduced, the spiclass="Chemical">n-polarizatioclass="Chemical">n eclass="Chemical">nergy class="Chemical">n class="Gene">Espin. Because the total energy of an atom is described in a spin-unpolarized manner in standard DFTB, the total energy is overestimated for open-shell atoms. Therefore, the energy difference between a restricted and an unrestricted spin DFT calculation, called Espin, is subtracted to yield the atomic energies at the DFT level. [Note that a spin-polarization corrected DFTB (sDFTB) has been proposed.[64] In principle, sDFTB could also be used to calculate atomic energies; however, for calculation of atomization energies, it is not the standard. sDFTB uses parameters calculated for states with partial spin in the vicinity of spin-unpolarized atoms (as the Taylor-series suggests) and therefore is somewhat less accurate for single atoms than Espin. In contrast, for radical molecules, sDFTB is the method of choice.]

Parametrization of Sulfur and Phosphorus

Parameters for General Application

This section summarizes the choices of all parameters discussed in the previous section. We follow the search proton class="Chemical">col established iclass="Chemical">n refs (60) aclass="Chemical">nd (52) aclass="Chemical">nd divide the parameters iclass="Chemical">nto two parts, electroclass="Chemical">nic aclass="Chemical">nd repulsive parameters. The electronic parameters are given in Table 1. For the sake of class="Chemical">completeclass="Chemical">ness, we also iclass="Chemical">nclclass="Chemical">n class="Chemical">ude parameters that are calculated from DFT (U, Espin, εs/p) or are in line with the standard choice of the basis set (nmax, α, i = 0–4); those parameters are not subject to optimization. We note that although l-dependent Hubbard parameters have been considered[65] and being explored by us for improving the description of transition metals (Gaus and Cui, unpublished), the Hubbard parameter is taken to be l-independent for P and S. For all DFT calculated values, the PBE exchange correlation functional[59] has been used. The following parameters have been optimized to improve the overall performance:
Table 1

Overview of Electronic Parameters (in Atomic Units if not Unitless)a

parameterSP
lmax22
nmax22
α00.500.50
α11.191.17
α22.832.74
α36.736.41
α416.0015.00
rwf3.83.6
rwfd4.44.4
rdens9.09.0
εs–0.63 000 872–0.51 063 909
εp–0.25 802 653–0.20 276 532
εd0.32 140 7660.52 019 087
Espin–0.03 121 074–0.06 868 820
U0.32880.2894
Ud–0.11–0.14

As described in the text, U, Espin, εs,p, nmax, and α are in line with the standard choices of DFTB parametrization and not subject to optimizations. By contrast, rwf, rwfd, rdens, εd, and Ud are adjusted based on properties of molecules in the fitting set.

• lmax: While a minimal basis has been used for H (s orbital only) and class="Gene">CNO (2s aclass="Chemical">nd 2p orbitals oclass="Chemical">nly), for S aclass="Chemical">nd P, we iclass="Chemical">nclclass="Chemical">n class="Chemical">ude 3s, 3p, and 3d orbitals, the latter being used as polarization functions, which are required to have a higher angular momentum than the valence orbitals. As already found previously for the “MIO” parameters for S and P, d orbitals are necessary to properly describe the diverse chemical environments of these elements. • rwf: In class="Chemical">DFTB, the atomic orbitals used as a basis set are class="Chemical">n class="Chemical">confined using a harmonic confining potential characterized by this parameter. It essentially determines the spatial extension of the basis functions. Therefore, this parameter has influence on bonding properties as well as on noncovalent interactions (Pauli repulsion). For the H2S dimer, for example (also see discussions below), the intermolecular distance is found to be too short with our parametrization; this could be improved by using a larger rwf, but only at the cost of less accurate binding energy and angle of the hydrogen bond. For phosphorus, we have not tested dimer structures due to their rare appearance in biological systems; tests of other molecules indicate only a very subtle influence on the general performance of bond angles, bond lengths, and atomization energies. With a larger rwf, the angles improve, but for bond distances and Eat, slightly larger errors are found. With a smaller rwf, the opposite effects are observed: the description of angles worsens, while bond distances and Eat slightly improve. • rwfd: Generally, this parameter has only a minor influence on the results. A larger rwfd tends to improve bond distances and worsens bond angles. Note that the quality of n class="Chemical">computed boclass="Chemical">nd distaclass="Chemical">nces is to a large degree depeclass="Chemical">ndeclass="Chemical">nt oclass="Chemical">n the repulsive poteclass="Chemical">ntials; however, for a higher degree of traclass="Chemical">nsferability to differeclass="Chemical">nt chemical eclass="Chemical">nvclass="Chemical">n class="Chemical">ironments, it is also helpful to tune rwfd. • rdens: The density class="Chemical">compressioclass="Chemical">n has a large impact oclass="Chemical">n the total electroclass="Chemical">nic eclass="Chemical">nergy; however, the effect caclass="Chemical">n be class="Chemical">n class="Chemical">compensated by the repulsive potential without introducing substantial errors (as long as rdens is chosen within a reasonable range, e.g., rdens > 3.0 a0). Therefore, rdens is chosen such that the repulsive potentials smoothly interpolate to their cutoffs. Note that for hydrogen a very small density compression was chosen,[52] causing the H–S and H–P potentials to be negative in the binding region while other potentials like O–S, O–P, N–S, and N–P are still significantly repulsive. Thus, the choice of rdens for S and P is a tradeoff between too negative and too repulsive potentials. • εd: While all eigenvalues are usually class="Gene">calculated from atomic DFT class="Chemical">n class="Gene">calculations, an exception is made for the eigenvalues of the d orbitals since they are unoccupied in the ground state for atoms S and P, and therefore a change of εd does not affect the atomic total energy. Optimizing εd allows control of the d-orbital involvement in molecular bonding situations. This becomes particularly important for balancing energetics and geometries for the different oxides of S and P species. Generally, the calculated values of 0.02140766 au for S and 0.02019087 for P are too low; i.e., d orbitals are heavily involved in all types of bonding situations. Higher values reduce the excessive d-orbital participation. Empirically, we found that for C–S and C–P bonds a larger amount of d-orbital involvement decreased errors, while a rather small amount of d orbitals seemed more appropriate for O–S and O–P bonding situations. class="Chemical">Ud: The Hubbard derivative appears iclass="Chemical">n the third-order term EΓ (eq 1) aclass="Chemical">nd is usually class="Chemical">n class="Gene">calculated from DFT. In the case of S and P, the values are −0.0695 and −0.0701 au, respectively. For the mean absolute deviation (MAD) of eight proton affinities (PAs) for S containing species, however, we find a drop from 7.6 to 2.4 kcal/mol after optimizing Ud for sulfur. Similarly for P species, the MAD for 17 PAs is reduced from 9.1 to only 2.9 kcal/mol by optimizing Ud for P. The change of the Hubbard derivative only marginally affects other properties for systems with small charge fluctuations; however, for the highly oxidized phosphorus species, larger impacts are observed. Atomization energies become somewhat less accurate. P–P bond lengths are overestimated, and the P–O–P angle in diphosphoric acid is overestimated to be 141° instead of 116° for B3LYP/cc-pVTZ and 121° for DFTB3 using the calculated Hubbard derivative. Due to the importance of accurate PAs for many biochemistry applications, we choose to use the fitted Ud and list the values in Table 1. As described in the text, U, class="Gene">Espin, εs,p, class="Chemical">nmax, aclass="Chemical">nd α are iclass="Chemical">n liclass="Chemical">ne with the staclass="Chemical">ndard choices of class="Chemical">n class="Chemical">DFTB parametrization and not subject to optimizations. By contrast, rwf, rwfd, rdens, εd, and Ud are adjusted based on properties of molecules in the fitting set. Note that the spiclass="Chemical">n-polarizatioclass="Chemical">n class="Chemical">n class="Chemical">constants that enter the sDFTB formalism have also been calculated and included in the Supporting Information. For the repulsive potentials, first a fit for class="Chemical">sulfur is carried out (iclass="Chemical">nclclass="Chemical">n class="Chemical">uding all electronic parameters), and subsequently phosphorus related parameters are optimized. Table 2 provides an overview of all reference systems and values that lead to the repulsive potentials related to S and P. For the solution of the linear equation system set up to determine the repulsive potentials, division points have to be selected. Further, additional equations can be chosen to better represent second derivatives. These parameters are specified in Table 2. The geometries are taken from the B3LYP/cc-pVTZ level of theory with the exception of [CH3–COO–PO3]2–, for which we use B3LYP/aug-cc-pVTZ. Atomization energies (Eat) are calculated from G3B3,[66−68] barrier geometries and energetics (Ebar) from MP2/G3large. Additional equations for the fitting process (see ref (52)) are prepared using a few selected vibrational frequencies shown below as determined from BLYP/cc-pVTZ (unscaled) using the harmonic approximation. All reference structures can be found in the Supporting Information.
Table 2

Parameters Defining the Repulsive Potentialsa

molecules (weeq, wfeq if different than 1.0) and Eat (kcal/mol)
SH2 (10.0, 1.0)181.9(CH3)3C–SH (0.0, 1.0) HP=CH2390.3
N2S248.2P2116.6H3PO4760.0
H2SO4592.5PH3 (2.0, 1.0)140.2P4O6 (0.1, 1.0)1056.2
S284.7H2P–PH2366.3P4O10 (0.1, 1.0)1595.9
CH2S325.5N≡P (0.1, 0.1)147.8CH3S–P(=O)(OH)2 (0.0, 0.1) 
CH3SH472.5P(NH2)3 (0.1, 0.1)788.4H3PS4 (0.05, 0.05)527.2
CH3S–SCH3828.4CH3PH2536.4[CH3COO–PO3]2– (0.0, 1.0) 

weeq and wfeq are the weighting factors for energy and force equations in 1/a.u. Reaction equations and additional equations are weighted with 1/a.u., for details see ref (52). rXX is the heavy atom distance.

weeq and wfeq are the weighting factors for energy and force equations in 1/a.u. Reaction equations and additional equations are weighted with 1/a.u., for details see ref (52). rXX is the heavy atom distance.

Special Parametrization for Phosphate Hydrolysis Reactions

With the general parameter set outlined above, class="Chemical">coclass="Chemical">nsiderable errors are observed for class="Chemical">n class="Chemical">phosphate hydrolysis reactions. The details are discussed in the benchmark sections. The bottom line is that the current DFTB3/3OB model still cannot properly describe the energetics of reactions where the phosphorus coordination changes from four to five/three oxygen atoms. This suggests that the current DFTB3 methodology has limited transferability for complex phosphorus chemistry with the required accuracy in energetics for mechanistic studies. As a temporary solution, we suggest a specific reaction parametrization[69] of the O–P repulsive potential referred to as “OPhyd.” The need of this specific fit suggests that additional improvements in the DFTB3 methodology, such as the inclusion of three-center terms and better electrostatic models (e.g., multipoles for charge fluctuations[62,70]), are still needed and will be systematically analyzed in the future. For “OPhyd,” we carry out the same fit for class="Chemical">phosphorus as above with the iclass="Chemical">nclusioclass="Chemical">n of two reactioclass="Chemical">n eclass="Chemical">nergies (the weight is 10/(kcal/mol)). The first oclass="Chemical">ne is a reactioclass="Chemical">n of class="Chemical">n class="Chemical">water with a dimethyl hydrogen phosphate, leading to a penta-coordinated intermediate (the geometries can be found in the Supporting Information, n_com1n_int1); the second reaction is the dissociation of methanol from the previous intermediate (n_int1 → n_com2). As reference energies, MP2/G3large values are calculated at B3LYP/6-31+G(d,p) geometries. This fit shifts the O–P repulsive potential by about −10 kcal/mol; i.e., overbinding for O–P bonds is introduced.

Benchmarks and Discussion

In this section, we discuss benchmark results for our 3OB parametrization for S and P in class="Chemical">combiclass="Chemical">natioclass="Chemical">n with the 3OB parameters[52] for O, N, C, aclass="Chemical">nd H. [Note that iclass="Chemical">n class="Chemical">n class="Chemical">connection with 3OB as well as “MIO”, the hydrogen molecule is evaluated using the respective H–H-mod potential.[71]] Comparisons are made to MP2, DFT, popular semiempirical methods (PDDG[28,57] and PM6[58]), and the previous DFTB3/MIO parametrization (using the parameters as defined in ref (51): Ud = −0.23, −0.16, −0.13, −0.19, −0.14, −0.09 au for C, H, N, O, P, and S and ζ = 4.2). For PM6, it is worth noting that the choice of heat of formation for H2 and protons is essential to the computed hydrogenation reaction energies and proton affinities, respectively; we take the value of 0.0 and −54.0 kcal/mol for H2 and protons, respectively, as recommended by the MOPAC program (http://openmopac.net/manual/pm6_accuracy.html). We have not carried out any comparison to OMx methods despite their impressive performance for typical organic molecules[30] because we are not aware of any systematic parametrization of OMx for P and S. Unless noted otherwise, energetics are given for geometries that are optimized at the respective level of theory. To compare directly the potential energy surfaces at different levels of theory, zero-point energies (ZPEs) are not included; the exception is for heat of formation calculations, in which vibrational contributions are included. As shown in the Supporting Information, the ZPEs calculated with DFTB3/3OB agree very well (within 1 kcal/mol) with higher-level calculations. For the calculation of noncovalent interactions, although we recognize the importance of including dispersion interactions in general, they are not included for most test cases here because these cases include small molecules and are dominated by hydrogen-bonding interactions (see Supporting Information, for explicit results); we explore explicitly the effect of dispersion to address a recently discussed caveat of DFTB in describing noncovalent interactions that involve sulfur.[72] All DFTB calculations are carried out using our in-house DFTB code; for G3B3, MP2, B3LYP, BLYP, PBE, PM6, and PDDG/PM3, the Gaussian 03 and 09 software packages[73,74] are used.

Sulfur

Atomization Energies and Geometries

The first test set class="Chemical">coclass="Chemical">nsists of atomizatioclass="Chemical">n eclass="Chemical">nergies aclass="Chemical">nd geometrical properties of 38 small class="Chemical">neutral closed-shell molecules iclass="Chemical">nclclass="Chemical">n class="Chemical">uding sulfides, oxides, acids, thoils, sulfite, sulfate, sulfones, sulfoxides, etc. Table 3 summarizes the results; further details are included in the Supporting Information.
Table 3

Mean and Maximum Absolute Deviations from G3B3 Atomization Energies and Heats of Formations and B3LYP/cc-pVTZ Geometric Properties for Our Sulfur Test Set

propertyaNbMP2cB3LYPdPBEdPM6PDDGMIO3OB
Eat (kcal/mol)388.316.820.8  68.74.7
Emaxat (kcal/mol) 20.556.0100.7  221.734.4
ΔHf0 (kcal/mol)389.6e18.421.45.45.768.14.8
ΔHf,max0 (kcal/mol) 24.1e57.6100.024.0 (1.5/12.5)f18.9 (3.8/15.7)f218.6 (1.1/12.0)f34.0 (0.4/1.8)f
r (Å)1240.0080.0070.0140.0150.0270.0140.008
rmax (Å) 0.0370.0190.0410.0790.1210.2240.042
a (deg)1030.50.30.52.52.71.81.2
amax (deg) 3.71.63.231.614.811.49.6
d (deg)121.21.93.38.711.512.04.0
dmax (deg) 2.917.427.072.653.686.210.8

Bond lengths r, bond angles a, and dihedral angles d are compared to B3LYP/cc-pVTZ calculations; max stands for maximum absolute deviation.

Number of comparisons.

Basis set is cc-pVTZ.

Basis set is 6-31G(d).

Diphenylsulfoxide, -sulfone, and -sulfide have been excluded due to excessive computation time.

Single point G3B3 ΔHf0 computed at the structures optimized by semiempirical methods; the value before/after the slash is the mean/maximum absolute deviation from G3B3, which uses B3LYP/6-31G(d) structures.

Bond lengths r, bond angles a, and dihedral angles d are class="Chemical">compared to B3class="Chemical">n class="Gene">LYP/cc-pVTZ calculations; max stands for maximum absolute deviation. Number of n class="Chemical">comparisoclass="Chemical">ns. Basis set is cc-pVTZ. Basis set is 6-31G(d). class="Chemical">Diphenylsulfoxide, class="Chemical">n class="Chemical">-sulfone, and -sulfide have been excluded due to excessive computation time. Single point G3B3 ΔHf0 n class="Chemical">computed at the structures optimized by semiempirical methods; the value before/after the slash is the meaclass="Chemical">n/maximum absolute deviatioclass="Chemical">n from G3B3, which uses B3class="Chemical">n class="Gene">LYP/6-31G(d) structures. class="Chemical">DFTB3/3OB shows a very small MAD for atomizatioclass="Chemical">n eclass="Chemical">nergies aclass="Chemical">nd clearly improves over class="Chemical">n class="Gene">DFTB3/MIO (4.7 vs 68.7 kcal/mol). [MIO reveals a strong overbinding that leads to large errors. Note that the overbindings of different bond pairs are balanced such that they cancel out for reaction energies.[52] For S, however, additional drawbacks have been found especially for the S–O bond.[38] We further discuss the consequences of the unbalanced overbinding within MIO for reaction energies below.] The largest deviations are found for SO3 and thiocyanates. For DFTB3, we have also carried out calculations for heats of formation following the standard approximations for the heat capacity corrections[75] and using unscaled vibrational frequencies calculated with DFTB3. Encouragingly, DFTB3/3OB yields a small MAD of only 4.8 kcal/mol; it is 68.1 kcal/mol for DFTB3/MIO. The poor performance of class="Gene">MP2 aclass="Chemical">nd DFT for atomizatioclass="Chemical">n eclass="Chemical">nergies is iclass="Chemical">n liclass="Chemical">ne with large errors for heats of formatioclass="Chemical">n as has beeclass="Chemical">n reported previously, e.g., iclass="Chemical">n ref (76). A fit of atomic class="Chemical">n class="Chemical">contributions reported in the same reference significantly reduces this error but has not been carried out in this study. PM6 and PDDG/PM3 are parametrized for directly yielding heats of formation and implicitly include corrections for heat capacity contributions. Therefore, the calculation of atomization energies with these methods is not straightforward. However, heats of formation have been reported to be in very good agreement with experiments for large test sets including very challenging cases such as cations, anions, and transition structures leading to a MAD of 6.5 and 6.4 kcal/mol for PM6[77] and PDDG,[57] respectively. Note that those numbers were based on different test sets. Nevertheless, for our test set, we find similar results; the MADs of 5.4 and 5.7 kcal/mol (see Table 3) indicate very similar performances of PM6 and PDDG. This is quite remarkable as PDDG does not use d orbitals on sulfur atoms while PM6 does. With a few exceptions, geometries are described very well for all semiempirical methods and particularly well for class="Chemical">DFTB3/3OB; e.g., the MADs iclass="Chemical">n boclass="Chemical">nd distaclass="Chemical">nces are 0.008, 0.014, 0.027, aclass="Chemical">nd 0.015 Å respectively for class="Chemical">n class="Chemical">DFTB3/3OB, DFTB3/MIO, PDDG, and PM6. The largest bond distance error for PDDG is found for the singlet state of C=S with a difference of 0.12 Å relative to B3LYP/cc-PVTZ. Other deviations are substantially smaller. DFTB3/MIO overestimates the N=S bond within N2S significantly by 0.22 Å. For the dihedral angle OSOH of sulfonic acid, large deviations to B3LYP/cc-pVTZ are found for almost all methods compared. Interestingly, this is not the case for ab initio methods using a triple-ζ basis; MP2 and B3LYP with the cc-pVTZ basis set deviate only about 2°. More statistics including mean absolute deviations for different bond types are given in the Supporting Information. To demonstrate the superb performance of the semiempirical methods for structural properties, we have carried out single point heat of formation class="Gene">calculatioclass="Chemical">ns with G3B3 at geometries of the respective level. The MADs of class="Chemical">n class="Chemical">PM6, PDDG, DFTB3/MIO, and DFTB3/3OB are as low as 1.5, 3.8, 1.1, and 0.4 kcal/mol; the MAX values are 12.5, 15.7, 12.0, and 1.8 kcal/mol. These results again highlight the quality of DFTB3/3OB structures.

Vibrational Frequencies

The benchmark for vibrational frequencies is a cumbersome task since class="Chemical">comparisoclass="Chemical">ns require matchiclass="Chemical">ng the character of vibratioclass="Chemical">nal modes, which is class="Chemical">not straightforward especially wheclass="Chemical">n class="Chemical">n class="Chemical">comparing to experimental data for fairly large molecules. A rigorous comparison is out of the scope of this work. In Table 4, a few unscaled harmonic vibrational frequencies as calculated from semiempirical methods are compared to those from DFT calculations. For cases where an obvious assignment of the mode to experimental ones is possible, we have also listed the experimental values. Note that selected stretching frequencies are also used to determine the additional equations during the fitting (mainly for repulsive potentials), although the contributions from vibrational frequencies are rather limited.
Table 4

Selected Vibrational Frequencies in cm–1

molecule (point group)irrepdescriptionexpBLYPaB3LYPaPM6PDDGMIO3OB
H2S (C2v)A1bending1183117612081055109010421043
 A1stretch-sym2615260126842689154227282611
 B2stretch-asy2626261526982698158827782665
S21 (D∞h)ΣgS–S-stretch 657706718712704707
HSSH (C2)AS–S-stretch515464498542388409530
dimethyldisulfide (C2)AS–S-stretch509459492525362469495
thioformaldehyde (C2v)A1C=S-stretch1059104010871069107611281052
methanethiol (Cs)A′C–S-stretch708654696749755735767
 A′H–S-stretch2597258826742706155626762595
dimethylsulfide (C2v)A1CSC-bend282250258257256255255
 A1C–S-stretch-sym695640680742741705735
 B2C–S-stretch-asy743688733776754734774
thiophene (C2v)A1ring-sym608594615559628609621
 B2ring-asy751716751678692783776
 A1ring-sym872803838773798851845
 B2ring-asy903850879853907912903
2,5-dihydrothiophene (C2v)A1ring-sym506491512470524499520
 B2ring-asy665592632628639642675
 A1ring-sym716666706753747724750
 B2ring-asy824797826806848851860
tetrahydrothiophene (C2)Aring-sym472457475442495472485
 Bring-asy684631669724729691723
 Aring-sym678639677743739703727
H2SO4 (C2)AS–O-stretch-sym831706794773723700805
 BS–O-stretch-asy882766851775860735809
 AS=O-stretch-sym113611181196103687411661165
 BS=O-stretch-asy145213721445125090213971389
dimethyl sulfoxide (C2)A′CSO-bend-sym 264278269359266262
 A″CSO-bend-asy 294313296355324313
 A′S=O-stretch110210471097105479411621122
dimethylsulfone (C2v)A1S=O-stretch-sym112110811145105973511311110
 B1S=O-stretch-asy125812581330117081613311312
MAD (δν/νexp%)   5.62.66.615.15.13.8
MAX (δν/νexp%)   15.08.513.941.020.611.8

Basis set is cc-pVTZ.

Basis set is cc-pVTZ. Table 4 shows that the DFT methods are generally in good agreement with experiments, although Bclass="Gene">LYP teclass="Chemical">nds to slightly uclass="Chemical">nderestimate the experimeclass="Chemical">ntal values. class="Chemical">n class="Chemical">DFTB3 with both parameter sets MIO and 3OB also performs quite well, and the deviations from experiments are similar to the ones for the DFT methods. For larger test sets, we expect a more consistent performance of BLYP and B3LYP than DFTB3.[78,79] As to other semiempirical methods, PM6 reveals very satisfying results, with only S–O stretch frequencies consistently underestimated. By contrast, PDDG shows large errors of several hundreds of wavenumbers, specifically for S–H and S–O stretch frequencies (in H2S, H2SO4, and dimethylsulfone).

Reaction Energies

We mentioned earlier the overbinding tendency of class="Chemical">DFTB3 with the class="Chemical">n class="Gene">MIO parameters.[52] We have shown recently that for a set of reactions (with small stoichiometric coefficients) that include CHNO containing species, the overbinding of different bonds cancels out and leads to relatively small errors for reaction energies with exception of some species (e.g., N–O bond).[52] During the MIO parametrization of sulfur, however, the parameters could not be adjusted in such a way that the overbinding approximately canceled out even for simple reactions. Table 5 shows the large deviations for DFTB3/MIO for a small set of selected reaction energies.
Table 5

Deviation for Reaction Energies of Neutral Closed Shell Sulfur Containing Molecules Compared to G3B3a

reactionG3B3B3LYPbPBEcPM6dPDDGdMIO3OB
S2 + 2 H2 → 2 H2S–59.3–3.6–0.1+4.9+11.1+33.0+1.5
HSSH + H2 → 2 H2S–15.6–0.2+2.5+10.5+7.9+19.7–3.7
CH3SH + H2 → H2S + CH4–19.2–1.9+2.1+8.6+9.3+7.4+2.4
CH3SH + H2O → H2S + CH3OH10.5–2.8–5.1–2.9–0.3+11.5+1.0
CH3SH + NH3 → H2S + CH3NH27.0–1.1–1.8–4.6+3.4+8.7+2.2
H2C=S + H2 → CH3SH–37.1–1.8–2.3–3.9+3.7–0.1–3.7
H2C=S + H2O → H2S + H2C=O1.2–5.5–10.1–6.9–5.6+14.8–8.0
H2C=S + NH3 → H2S + H2C=NH2.0–3.2–3.0–11.1–1.2+11.1–3.1
H2C=S + 2 H2 → H2S + CH4–56.4–3.5–0.1+4.8+13.2+7.4–1.2
H2SO4 + 4 H2 → H2S + 4 H2O–77.6–10.3+19.5+26.7+44.3+49.2–3.3
O=S(OH)2 + 3 H2 → H2S + 3 H2O–65.1–0.4+22.3+27.2+41.8+38.0–10.2
HS(=O)2OH + 3 H2 → H2S + 3 H2O–71.9–9.7+12.8+34.0+36.7+45.3–5.0
SO2 + 3 H2 → H2S + 2 H2O–60.6–5.8+14.2+13.3+11.9+35.4+0.6
2 SO2 → S21 + 2 O21243.7–11.9–22.0+16.4–49.5+50.7–22.3
2 SO1 + O21 → 2 SO2–213.0+6.4+5.0+11.4+75.9–24.2+31.3
2 SO2 + O21 → 2 SO3–74.8+6.2+12.8–28.0+15.5–74.3–52.6
SO3 + H2O → H2SO4–22.0+2.3+1.0+2.0–24.5+20.1+35.7
SO2 + H2O → O=S(OH)24.5–5.5–8.1–13.9–29.8–2.6+10.8
2 O=S(OH)2 + O21 → 2 H2SO4–127.8+21.7+30.9+3.7+26.1–28.8–2.8
O=S(OH)2 → HS(=O)2OH6.8+9.3+9.5–6.8+5.1–7.3–5.2
MAD 5.69.312.120.824.510.3
MAX 21.730.934.075.974.352.6

Energies are calculated at 0 K excluding zero point energy and thermal corrections. All numbers are given in kcal/mol.

Basis set is cc-pVTZ.

Basis set is 6-31G(d).

Heats of formation for H2 are calculated as −26 and −22 kcal/mol for PM6 and PDDG. To correct for this exceptional error, this value is set to 0.0 kcal/mol. See main text.

Energies are class="Gene">calculated at 0 K exclclass="Chemical">n class="Chemical">uding zero point energy and thermal corrections. All numbers are given in kcal/mol. Basis set is cc-pVTZ. Basis set is 6-31G(d). Heats of formation for class="Chemical">H2 are class="Chemical">n class="Gene">calculated as −26 and −22 kcal/mol for PM6 and PDDG. To correct for this exceptional error, this value is set to 0.0 kcal/mol. See main text. By class="Chemical">coclass="Chemical">ntrast, 3OB is parametrized to give reliable atomizatioclass="Chemical">n eclass="Chemical">nergies (see discussioclass="Chemical">ns above). For reactioclass="Chemical">ns with small stoichiometric class="Chemical">n class="Chemical">coefficients, it is expected that small errors in atomization energies lead to small errors for reaction energies as well. This is generally observed in Table 5. Together with an analysis of the atomization energies, there are a few species that show large errors and spoil the performance for the respective reactions; those are molecules SO (singlet), SO3, and O2 (singlet) with large atomization energy errors of +11.8, +34.4, and +11.9 kcal/mol, respectively. Reactions that do not include these outliers are in very good agreement with G3B3 and therefore demonstrate the strength of the 3OB parametrization protocol. By comparison, PDDG and PM6 are featured with larger MADs; PDDG has a MAD of 20.8 kcal/mol. In fact, even DFT calculations, especially PBE, may have large errors; the MAD for B3LYP/cc-PVTZ and PBE/6-31G(d) is 5.6 and 9.3 kcal/mol, respectively. [Because the reactions have been chosen somewhat arbitrarily, the overall MAD given in Table 4 has to be considered with care. As pointed out by Fishtik,[80] one can in principle compile a set of chemical reactions that yield arbitrarily high or low overall errors for any kind of method by linearly combining the reactions. One can, however, transform the problem and assign species errors that are independent of a particular choice of linearlly independent reactions.[80] Of course, for a meaningful analysis, one would still have to compile a set of reactions that include a representative set of species. Therefore, the test presented here serves as an illustration of the overbinding problematic for DFTB3/MIO only.]

Proton Affinities

One goal of the further development of class="Chemical">DFTB2 toward class="Chemical">n class="Chemical">DFTB3 was to improve the description of proton affinities, which are critical to the proper description of many biochemical reactions.[16,34,81−83] This property is evaluated for sulfur-containing species in Table 6.
Table 6

Proton Affinities for Sulfur Containing Species in kcal/mol: Deviations in Comparison to G3B3a

systemG3B3B3LYPbPBEcPM6dPDDGMIO/calcMIO3OB/calc3OBG3B3//3OB
H3S+174.1+0.3–0.3–13.7+8.6–5.4–7.5+1.2+0.8–1.5
H2S355.7–1.1–3.6–16.4–12.1+0.4–2.1+6.9+0.5–0.6
SH498.4–1.1–4.8–3.3+12.2+18.1+15.2+24.6–2.8+0.1
CH3SH2+190.4+0.7–1.1+8.9–6.3–7.5–10.0–1.4–1.7–1.0
CH3SH362.5–0.7–3.4–18.5–21.8–2.4–5.2+4.7+0.7–0.3
H2SO4317.6+0.7–2.0–4.1+6.1+19.9+15.8+7.8+5.1–1.0
HSO4454.3–0.1–2.8–9.3–2.9+24.3+17.2+6.3+2.2–0.1
O=S(OH)2330.3+2.0–1.9+3.8+7.3+17.5+13.7+18.4+17.5+3.7
O=S(OH)O471.5+0.6–2.1+7.4+7.3+35.5+29.1+28.9+27.7+2.6
HS(=O)2OH318.2+2.4+0.9+11.2+16.8+19.7+15.7+7.7+5.5+1.2
H3CS(=O)2OH323.5+2.9+1.4+6.1+8.3+14.1+10.0+4.0+2.0+0.8
[H3CS(=O)(OH)2]+186.3+4.0+2.3+13.0+14.3+5.2+2.1+4.7+3.2–2.0
MAD 1.42.29.610.314.211.09.75.81.2
MAX 4.04.818.521.835.528.928.927.73.7

The molecules are given in the protonated form. The proton affinity is computed with the potential energies at 0 K without any zero-point energy correction (exception are PM6 and PDDG which calculate reaction enthalpies at 298 K). For the DFTB method, the deviation is given as the difference from the G3B3 method (Emethod – EG3B3).

Basis set is aug-cc-pVTZ.

Basis set is 6-31+G(d,p).

The energy of the proton in PM6 is in error by −54 kcal/mol, which has been accounted for by adding this number to the original result. See discussion at http://openmopac.net/manual/pm6_accuracy.html.

The molecules are given in the protonated form. The proton affinity is class="Chemical">computed with the poteclass="Chemical">ntial eclass="Chemical">nergies at 0 K without aclass="Chemical">ny zero-poiclass="Chemical">nt eclass="Chemical">nergy class="Chemical">n class="Chemical">correction (exception are PM6 and PDDG which calculate reaction enthalpies at 298 K). For the DFTB method, the deviation is given as the difference from the G3B3 method (Emethod – EG3B3). Basis set is aug-cc-pVTZ. Basis set is 6-31+G(d,p). The energy of the proton in class="Chemical">PM6 is iclass="Chemical">n error by −54 kcal/mol, which has beeclass="Chemical">n acclass="Chemical">n class="Chemical">counted for by adding this number to the original result. See discussion at http://openmopac.net/manual/pm6_accuracy.html. A difference between the 3OB and class="Gene">MIO parameters for class="Chemical">n class="Chemical">CHNO is the use of calculated (3OB) and fitted (MIO) Hubbard derivative parameters. With the calculated Hubbard derivative for sulfur, the DFTB3/3OB proton affinities errors are quite large with a MAD of 9.7 kcal/mol (3OB/calc in Table 6). Therefore, we optimize Ud for sulfur and observe a significant reduction of errors for many species (the final 3OB-set). The overall MAD remains somewhat large (5.8 kcal/mol) due mainly to the significant errors for sulfurous acid and its anions; for the geometry of the anion [O2S–OH]−, the S–OH bond length is overestimated by about 0.14 Å in comparison to B3LYP/aug-cc-pVTZ. In general, nevertheless, the improvement of DFTB3/3OB over PM6 and PDDG is notable. It is also worth noting that a fit of Ud(S) for DFTB3/MIO does not substantially improve over the use of the calculated Ud(S) (MIO/calc), with a large MAD of 11.0 kcal/mol. Thus, the transferability of parameters has improved with the new 3OB parameter set. The comparison to PBE/6-31+G(d,p) shows that this set of proton affinities is well described by a standard density functional using a medium sized basis set; note that diffuse functions in the basis set are important for properly describing the anionic species. Finally, we note that G3B3 single points at DFTB3/3OB optimized structures lead to small errors in the proton affinities (a MAD of merely 1.2 kcal/mol) compared to the original G3B3, again highlighting the good quality of DFTB3/3OB structures.

Hydrogen Bonding Energies

class="Chemical">Hydrogen boclass="Chemical">ndiclass="Chemical">ng eclass="Chemical">nergies are geclass="Chemical">nerally well described by class="Chemical">n class="Chemical">DFTB3 as shown in Table 7 for neutral, protonated, and deprotonated dimers, except for two cases (see below); by comparison, PM6 and PDDG give less satisfactory results and sometimes very large errors. However, somewhat surprisingly, considerable errors are found for geometries with all semiempirical methods, including DFTB3 (Table 8). Besides the tremendous underestimation of the heavy atom distance, angles are also different than the counterpoise corrected MP2/G3large (MP2-CP) reference as shown for the most problematic cases in Figure 1. Note for example the different positions of the shared hydrogen between the monomers for DFTB3 and MP2-CP within [HS–H–SH]− and [H2S–H–SH2]+. This leads to the largest hydrogen bonding energy errors within the test set. Another obvious deficiency is that the shared hydrogen of [H2O–H–SH2]+ is covalently bound to sulfur rather than to oxygen. We will discuss this issue below in connection with the proton transfer barriers.
Table 7

Hydrogen Bonding Energies in kcal/mol: Deviations in Comparison to G3B3

 G3B3MP2-CPaMP2aB3LYPbPBEbPM6PDDGMIO3OB
2 H2S → (H2S)2–1.7+0.1–0.1+0.3–0.70.40.2–0.2+0.1
SH + H2S → HS-H–SH–11.7–1.3–2.3–3.1–8.4–6.3–18.6–9.9–11.1
H3S+ + H2S → [H2S–H–SH2]+–13.6–0.4–1.5–4.3–9.44.0–8.1–7.5–9.3
H2O + H2S → HS-H–OH2–2.4–0.1–0.6–0.9–1.8–3.10.5–1.6–0.9
H2O + H2S → HO-H–SH2–3.0+0.4–0.1+0.0–0.9 0.1+1.0+1.2
H2O + SH → HO-H–SH–14.8+0.5–0.3–0.6–2.40.01.6+0.2+0.3
H3O+ + H2S → [H2O–H–SH2]+–23.5+0.4–0.8–2.9–6.7–12.0–25.4+4.1–2.0
NH3 + H2S → HS-H–NH3–3.1–0.1–0.7–1.4–3.0–0.72.5–2.1–0.2
NH3 + SH → H2N–H–SH–8.2+0.3–0.2+0.1–1.4–3.2–2.2+1.6+1.0
NH4+ + H2S → [H3N–H–SH2]+–12.8+0.2–0.4–0.8–2.82.42.9+0.7+1.3
MAD 0.40.71.43.73.66.22.92.7
MAX 1.32.34.39.412.025.49.911.1

Basis set is G3large.

Basis set is 6-31+G(d,p), without counterpoise correction.

Table 8

Heavy Atom Distances in Å for Small Sulfur Containing Dimers: Deviations in Comparison to MP2-CP/G3large

dimerMP2-CPaMP2aB3LYPbB3LYPcPBEcPM6PDDGMIO3OB
(H2S)24.180–0.066+0.007–0.004–0.163–0.098–0.275–0.372–0.329
HS-H–SH3.484–0.056–0.114–0.050–0.131–0.255–0.289–0.301–0.330
[H2S–H–SH2]+3.450–0.044–0.076–0.084–0.078–0.234–0.297–0.205–0.243
HS-H–OH23.595–0.069–0.176–0.103–0.172–0.866–0.224–0.394–0.280
HO-H–SH23.533–0.054+0.028–0.036–0.120 –0.043–0.318–0.085
HO-H–SH3.245–0.034+0.007–0.009–0.063–0.285–0.060–0.189–0.117
[H2O–H–SH2]+2.929–0.018–0.007–0.054–0.047–0.3880.055–0.126+0.086
HS-H–NH33.620–0.057–0.217–0.129–0.238–0.2230.253–0.268–0.248
H2N–H–SH3.533–0.040+0.007+0.015–0.060–0.444–0.379–0.246–0.227
[H3N–H–SH2]+3.290–0.024–0.007–0.026–0.081–0.1870.011–0.178–0.366
MAD 0.0460.0650.0510.1150.3310.1890.2600.231
MAX 0.0690.2170.1290.2380.8660.3790.3940.366

Basis set is G3large. CP: counterpoise corrected calculation.

Basis set is 6-31G(d); this method is used within G3B3 for the geometry optimization.

Basis set is 6-31+G(d,p).

Figure 1

Problematic geometries for DFTB3/3OB with significant difference in angles or connectivity (i.e., the position of the hydrogen involved in hydrogen bonding) in comparison to counterpoise corrected MP2/G3large.

Problematic geometries for class="Chemical">DFTB3/3OB with sigclass="Chemical">nificaclass="Chemical">nt differeclass="Chemical">nce iclass="Chemical">n aclass="Chemical">ngles or class="Chemical">n class="Chemical">connectivity (i.e., the position of the hydrogen involved in hydrogen bonding) in comparison to counterpoise corrected MP2/G3large. Basis set is G3large. Basis set is 6-31+G(d,p), without n class="Chemical">couclass="Chemical">nterpoise class="Chemical">n class="Chemical">correction. Basis set is G3large. CP: class="Chemical">couclass="Chemical">nterpoise class="Chemical">n class="Chemical">corrected calculation. Basis set is 6-31G(d); this method is used within G3B3 for the geometry optimization. Basis set is 6-31+G(d,p). With an attempt to understand the significant errors in the class="Chemical">hydrogen-boclass="Chemical">ndiclass="Chemical">ng structures, we look further iclass="Chemical">nto the DFT results. Geometries class="Chemical">n class="Gene">calculated with B3LYP/6-31G(d) (which are used in G3B3) in comparison to the ones from MP2-CP differ also with respect to the bond lengths (Table 8); calculated hydrogen-bonding angles are quite similar to MP2-CP though. For [HS–H–SH]− and [H2S–H–SH2]+, similar to DFTB3/3OB, the shared hydrogen is halfway between the sulfur atoms. When going to larger basis sets (e.g., G3large or aug-cc-pVTZ), the structure of [H2S–H–SH2]+ becomes similar to the MP2-CP result shown in Figure 1; however, for [HS–H–SH]−, the shared hydrogen remains half way between the sulfur atoms. For PBE, the position of the hydrogen remains always halfway between the heavy atoms for both systems independent of the basis set. Therefore, these observations seem to suggest that the significant errors in the DFTB3/3OB structures are largely intrinsic to the DFT functional (PBE) used, although the errors in DFTB3/3OB are larger than those at the PBE level. Interestingly, despite the structural differences, the binding energies are comparable for G3B3 and MP2-CP, with a small MAD of 0.4 kcal/mol. Also, B3LYP and PBE with the 6-31+G(d,p) basis set show MADs of 1.4 and 3.7 kcal/mol, respectively, the latter being in fact larger than that for DFTB3/3OB. More details are provided in the Supporting Information. The modified basis set in the 3OB-parametrization for class="Chemical">CHNO leads to aclass="Chemical">n improved O–O distaclass="Chemical">nce iclass="Chemical">n the class="Chemical">n class="Chemical">water dimer,[52] which was too short with DFTB3/MIO. The larger wave function compression radius for oxygen in 3OB increases the Pauli repulsion, such that the underestimation of the O–O distance is reduced. For a proper binding energy, we then adjusted the parameter associated with the modified γ function for X–H pairs.[33] In the case of sulfur, a larger rwf leads also to larger intermolecular distances as expected, however at the cost of larger errors for bond angles, which obviously limits the optimization. The current set is a compromise within the DFTB3 framework that uses the PBE functional.

Proton Transfer Barriers

To stclass="Chemical">udy protoclass="Chemical">n traclass="Chemical">nsfer iclass="Chemical">n molecules iclass="Chemical">nclclass="Chemical">n class="Chemical">uding sulfur atoms, we consider several simple model systems: [HS–H–SH]−, [H2S–H–SH2]+, [HS–H–OH]−, [H2S–H–OH2]+, [HS–H–NH2]−, and [H2S–H–NH3]+. For comparison with higher level methods, we fix the heavy atoms at four different interatomic distances. The shared hydrogen atom is moved on a straight line from the first heavy atom to the second one, and the total energy for each position is calculated after a relaxation of all other hydrogen atoms. Figure 2 shows the potential energy curves for all model systems for one fixed heavy atom distance (all other graphs are shown in the Supporting Information). The energy is shown relative to the energy of two infinitely separated monomers; e.g., for [HS–H–OH]− the energy is shown relative to SH– and class="Chemical">H2O. We class="Chemical">n class="Chemical">compare DFTB3/3OB to MP2/G3large, B3LYP and PBE with the 6-31+G(d,p) basis set. While the relative energies vary by several kilocalories per mole between the methods due mainly to the difference in the binding energies, the barriers are overall consistent. For DFTB3/3OB, two major deficiencies are observed. First, the underestimation of the proton affinity of NH3 leads to too low a relative energy for the system [H2S–H–NH3]+. This drawback is already known and can be alleviated by applying a modified repulsive potential H–N-mod which improves the proton affinity (for details, see ref (52)). Second, for [H2S–H–OH2]+, the total energy of the SH2–H–OH2+ system is underestimated by almost 8 kcal/mol. The error does not stem from the differences in proton affinities for SH2 and H2O as their deviations from G3B3 are only +0.7 and +1.7 kcal/mol, respectively. However, the binding energy H2S + H3O+ → H2SH3O+ is underestimated by ∼8 kcal/mol, whereas the binding energy of H2O + SH3+ → [SH3+–OH2] is only slightly overestimated by 2 kcal/mol.
Figure 2

Proton transfer barriers. The energy is given relative to the energy of (SH– + XH) for the upper row, and relative to (SH2 + XH2+) for the lower row, where X = SH, OH, NH2. The vertical axes show the distance between sulfur and the shared hydrogen atom (rSH). The color code is black = MP2/G3large, green = B3LYP/6-31+G(d,p), blue = PBE/6-31+G(d,p), red = DFTB3/3OB, light red = DFTB3/3OB/H–N-mod; the heavy atom distance is 3.7, 3.6, and 3.8 Å for the upper row and 3.6, 3.2, and 3.6 Å for the lower row.

Proton transfer barriers. The energy is given relative to the energy of (SH– + XH) for the upper row, and relative to (class="Chemical">SH2 + Xclass="Chemical">n class="Chemical">H2+) for the lower row, where X = SH, OH, NH2. The vertical axes show the distance between sulfur and the shared hydrogen atom (rSH). The color code is black = MP2/G3large, green = B3LYP/6-31+G(d,p), blue = PBE/6-31+G(d,p), red = DFTB3/3OB, light red = DFTB3/3OB/H–N-mod; the heavy atom distance is 3.7, 3.6, and 3.8 Å for the upper row and 3.6, 3.2, and 3.6 Å for the lower row. As a slightly larger example, we stclass="Chemical">udy a model protoclass="Chemical">n traclass="Chemical">nsfer betweeclass="Chemical">n a class="Chemical">n class="Chemical">hydronium ion and sulfonic acid. Proton transfers involving sulfonic acids are implicated in proton conducting membrane materials used in fuel cells.[3] The hydronium oxygen is fixed at a set of distances from the sulfur in the sulfonic acid, and the proton transfer is studied in a similar fashion as discussed above; the transferred proton is fixed for a set of distances along the S–O axis, while the remaining atoms are relaxed using DFTB3/3OB. The result for the O–S distance of 4.5 Å is shown in Figure 3 as an example. The DFTB3/3OB approach captures the exothermicity of the reactions rather well compared to B3LYP/aug-cc-pVTZ single point energies, while the barrier is substantially underestimated by almost 7 kcal/mol. The degree of underestimation is smaller at shorter O–O distances, but the result in Figure 3 underlines the fact that DFTB3 is developed based on PBE and therefore may underestimate proton transfer barriers at large donor–acceptor distances. Therefore, in applications where rates or proton conductance is of interest, it is likely important to calibrate/correct the computed barrier height based on high-level calculations.
Figure 3

Potential energy curves for the proton transfer between a hydronium ion and a sulfonic acid calculated at DFTB3/3OB (red) and B3LYP/aug-cc-pVTZ (green) levels, respectively. The energies are relative to infinitely separated reactants.

Potential energy curves for the proton transfer between a class="Chemical">hydronium ioclass="Chemical">n aclass="Chemical">nd a class="Chemical">n class="Chemical">sulfonic acid calculated at DFTB3/3OB (red) and B3LYP/aug-cc-pVTZ (green) levels, respectively. The energies are relative to infinitely separated reactants.

Noncovalent Interactions

In a recent stclass="Chemical">udy,[72] it was poiclass="Chemical">nted out that class="Chemical">n class="Gene">DFTB3/MIO leads to artificially favorable interactions involving sulfur atoms, thus spurious noncovalent binding for sulfur-containing compounds. Test calculations indicated that resolving the issue likely requires refitting the electronic parameters for sulfur. With the 3OB set of parameters, especially the new density compression radius (rdens), we see that the artificial binding discussed in ref (72) is significantly reduced; rather than a few kilocalories per mole, now the overbinding is on the order of 0.2 kcal/mol for the S–O/N interaction and 0.5 kcal/mol for the S–S interaction (Figure 4). We are unable to completely remove the attractive interactions within the framework of DFTB3 without affecting other properties discussed above, especially geometry and hydrogen-bonding interactions. For practical applications, however, we emphasize that when such weak overbinding effects need to be considered, explicit dispersion corrections are likely also important. In Figure 5, we compare optimized structures for antiparallel and parallel thiophene dimers, which were studied in ref (72), at different levels of theory. Without explicit dispersion, both DFTB3/MIO and DFTB3/3OB lead to structures considerably different from B3LYP including dispersion (B3LYP-D3BJ[84]); the S–S distance is longer with DFTB3/3OB, reflecting the weaker spurious interaction between the sulfur atoms compared to DFTB3/MIO. When the empirical dispersion[85] is included, the DFTB3/3OB structures become in rather good agreement with B3LYP-D3BJ. The distance between the parallel thiophene monomers is underestimated; since the dispersion model used for DFTB3 here was developed in the framework of DFTB2,[85] a more systematic refinement of the dispersion model together with DFTB3/3OB will likely further improve the result.
Figure 4

Potential energy curves for selected noncovalent interactions involving sulfur atoms studied in ref (72). The format is similar to Figure 2 of ref (72). Although the artificially attractive interaction remains with DFTB3/3OB, the magnitude of interaction is significantly reduced compared to DFTB3/MIO.

Figure 5

Optimized structures for antiparallel (AP) and parallel (P) thiophene dimers at different levels of theory. The S–S distance is given in Å.

Potential energy curves for selected nonclass="Chemical">covaleclass="Chemical">nt iclass="Chemical">nteractioclass="Chemical">ns iclass="Chemical">nvolviclass="Chemical">ng class="Chemical">n class="Chemical">sulfur atoms studied in ref (72). The format is similar to Figure 2 of ref (72). Although the artificially attractive interaction remains with DFTB3/3OB, the magnitude of interaction is significantly reduced compared to DFTB3/MIO. Optimized structures for antiparallel (AP) and parallel (P) n class="Chemical">thiophene dimers at differeclass="Chemical">nt levels of theory. The S–S distaclass="Chemical">nce is giveclass="Chemical">n iclass="Chemical">n Å.

Phosphorus

Atomization Energies and Geometries

Similar to what was done for class="Chemical">sulfur, we have class="Chemical">n class="Chemical">compiled a test set of small neutral closed-shell phosphorus-containing molecules. Among those are acids, oxides, phosphines, and thioesters with different oxidation states of phosphorus. Table 9 summarizes the results, and further details are given in the Supporting Information.
Table 9

Mean and Maximum Absolute Deviations for Small Neutral Closed-Shell Phosphorus Containing Moleculesa

propertyNbMP2cB3LYPdPBEdPM6PDDGMIOe3OB3OB/OPhyd
Eat (kcal/mol)3212.933.28.8  82.97.321.2
Emaxat (kcal/mol) 29.3146.031.0  406.840.7113.3
ΔHf,0 (kcal/mol)3214.334.38.815.315.082.47.221.5
ΔHf,max0 (kcal/mol) 32.7146.830.857.3 (8.1/33.6)f43.1 (27.6/150.8)f405.3 (3.4/13.9)f40.5 (1.6/9.5)f114.3 (2.2/11.1)f
r (Å)1300.0070.0070.0180.0290.0690.0200.0120.012
rmax (Å) 0.0400.0220.0340.1090.4230.1420.1290.112
a (deg)1300.60.61.24.35.43.83.33.0
amax (deg) 2.52.04.732.818.233.125.221.3
d (deg)421.72.94.429.939.421.417.516.4
dmax (deg) 6.517.525.5139.0152.370.0101.4119.5

Atomization energies are compared to G3B3 results; bond lengths r, bond angles a, and dihedral angles d are compared to B3LYP/cc-pVTZ calculations; max stands for maximum absolute deviation.

Number of comparisons.

Basis set is cc-pVTZ.

Basis set is 6-31G(d).

Trimethylmethylenephosphorane does not converge and is excluded from the statistics.

Single point G3B3 ΔHf0 computed at the structures optimized by semiempirical methods; the value before/after the slash is the mean/maximum absolute deviations from G3B3, which uses B3LYP/6-31G(d) structures.

Atomization energies are class="Chemical">compared to G3B3 results; boclass="Chemical">nd leclass="Chemical">ngths r, boclass="Chemical">nd aclass="Chemical">ngles a, aclass="Chemical">nd dihedral aclass="Chemical">ngles d are class="Chemical">n class="Chemical">compared to B3LYP/cc-pVTZ calculations; max stands for maximum absolute deviation. Number of n class="Chemical">comparisoclass="Chemical">ns. Basis set is cc-pVTZ. Basis set is 6-31G(d). class="Chemical">Trimethylmethylenephosphorane does class="Chemical">not class="Chemical">n class="Chemical">converge and is excluded from the statistics. Single point G3B3 ΔHf0 n class="Chemical">computed at the structures optimized by semiempirical methods; the value before/after the slash is the meaclass="Chemical">n/maximum absolute deviatioclass="Chemical">ns from G3B3, which uses B3class="Chemical">n class="Gene">LYP/6-31G(d) structures. Even though the MAD is dependent on the particular choice of the test set, 3OB for class="Chemical">phosphorus seems overall less accurate thaclass="Chemical">n for class="Chemical">n class="Chemical">sulfur (7.3 vs 4.7 kcal/mol). The largest deviations for atomization energies are found for the oxides P4O10 and P4O6 with an error of −40.7 and −33.9 kcal/mol, respectively. Again, the 3OB set is a major improvement over the MIO set, which shows large errors (MAD of 82.9 kcal/mol) due to overbinding. Even MP2 and B3LYP, without fitting atomic contributions, show large errors as discussed above for sulfur. The trends in the heats of formation also follow those for the atomization energies. For geometries, 3OB also improves in class="Chemical">comparisoclass="Chemical">n to class="Chemical">n class="Gene">MIO, although the magnitude of improvement is modest. However, some large errors remain; P–P bond lengths are overestimated, leading to the largest deviations of over 0.1 Å. Furthermore, the P–O–P angle in diphosphoric acid is overestimated by about 25°. Other large deviations for angles and dihedrals stem mainly from erroneous positions/orientations of a hydrogen atom. DFTB3/3OB also out-performs wave function based semiempirical methods for geometries. PDDG generally underestimates H–P and P–P bond lengths in comparison to B3LYP/cc-pVTZ or aug-cc-pVTZ; an exceptional outlier is found for the bond distance within P2 deviating by more than 0.4 Å. Considering that PDDG does not include d orbitals on phosphorus, the method performs remarkably well even though it is less accurate than PM6 and DFTB3. For the purpose of representing also class="Chemical">phosphate aclass="Chemical">nioclass="Chemical">ns iclass="Chemical">n our test sets, we have class="Chemical">n class="Chemical">compiled nine model geometries that are important for many key biological processes (Table 10; for details, see the Supporting Information). The overall performance is similar to that for the neutral species. The largest deviations for bond lengths are found for P–O bonds in [CH3COOPO3]2– and [HPO4]2–, which are underestimated by 0.127 and 0.044 Å, respectively. All other errors are smaller than 0.04 Å, demonstrating the excellent performance of DFTB3/3OB for bond lengths.
Table 10

Mean and Maximum Absolute Deviations for 9 Closed-Shell Phosphate Anions in Comparison to B3LYP/aug-cc-pVTZ Geometries

propertyaNbMP2cB3LYPdPBEdPM6PDDGeMIO3OB3OB/OPhyd
r (Å)530.0040.0070.0190.0240.0310.0170.0130.013
rmax (Å) 0.0270.0250.0560.1800.1110.1730.1270.116
a (deg)570.80.71.32.63.52.12.01.9
amax (deg) 4.56.310.28.114.511.810.39.5
d (deg)232.12.83.821.614.926.634.735.6
dmax (deg) 6.49.310.698.5101.767.379.282.3

Bond lengths r, bond angles a, and dihedral angles d; max stands for maximum absolute deviation.

Number of comparisons.

Basis set is cc-pVTZ.

Basis set is 6-31G(d).

[CH3COO–PO3]2– dissociates during geometry optimization and is therefore excluded from the statistics.

Bond lengths r, bond angles a, and dihedral angles d; max stands for maximum absolute deviation. Number of n class="Chemical">comparisoclass="Chemical">ns. Basis set is cc-pVTZ. Basis set is 6-31G(d). [class="Chemical">CH3COOclass="Chemical">n class="Chemical">PO3]2– dissociates during geometry optimization and is therefore excluded from the statistics. The special parametrization 3OB/OPhyd based on class="Chemical">phosphate hydrolysis reactioclass="Chemical">ns reveals a geclass="Chemical">nerally similar performaclass="Chemical">nce to that of 3OB (Tables 9 aclass="Chemical">nd 10); class="Chemical">n class="Chemical">computed atomization energy and heat of formation are substantially worse due to the shift of the P–O repulsive potential by about −10 kcal/mol. Finally, to class="Chemical">compare the structures from semiempirical methods, we examiclass="Chemical">ne heats of formatioclass="Chemical">n usiclass="Chemical">ng siclass="Chemical">ngle poiclass="Chemical">nt G3B3 eclass="Chemical">nergy class="Chemical">n class="Gene">calculations at structures determined by these methods. At DFTB3/MIO, DFTB3/3OB, and DFTB3/3OB/OPhyd geometries, we find MADs of 3.4, 1.6, and 2.2 kcal/mol in comparison to the reference G3B3 values (which uses B3LYP/6-31G(d) geometries). By contrast, G3B3 single point energetics with PM6 structures give a MAD of 8.1 kcal/mol. For PDDG, the considerable geometric inaccuracies cause the G3B3 single point energetics to deviate by as much as 27.6 kcal/mol on average. Clearly, the DFTB3 structures are considerably most reliable. A few selected unscaled harmonic vibrational frequencies class="Gene">calculated from DFT-based aclass="Chemical">nd semiempirical methods are class="Chemical">n class="Chemical">compiled in Table 11. While stretching frequencies from B3LYP are consistently larger than those from BLYP, the differences are quite small (<60 cm–1). In general, DFTB3/3OB values are close to those predicted by DFT methods (only one exception for the N=P stretching frequency). PM6 shows more outliers; e.g., P–H stretching frequencies of PH3 are overestimated by about 400 cm–1 in comparison to B3LYP, whereas PDDG reproduces the overall trend quite well but also deviates substantially for the less typical bonding situations of N≡P and C=P.
Table 11

Selected Vibrational Frequencies in cm–1

molecule (point group)irrepdescriptionBLYPaB3LYPaPM6PDDGMIO3OB3OB/OPhyd
phosphine (C3v)A1sym. bending9981018994861929931931
 Easy. bending1106103811351046102810331033
 A1sym. stretch2305238627602190240723132313
 Easy. stretch2314239427562302244623542354
diphosphorus (D∞h)A1gP–P stretch758806835891774796796
H2P=PH2 (C2)BP–P stretch388419648477412412412
methylphosphine (Cs)A′C–P stretch632666768674713679679
methylenephosphine (Cs)A′C=P stretch961100410951192104610651065
N≡P (C∞v)A1N–P stretch1322140214961627145313311331
H2N=PH2 (C1)AN=P stretch78081382878881110271027
H3PO4 (C3)Asym. P–O stretch772831772785932861833
 Easy. P–O stretch8639198348121027956850
 Asym. P=O stretch1259131813101348133812891260
H3PS4 (C3)Asym. P–S stretch348379452362325391391
 Easy. P–S stretch447489586415377461461
 AP=S stretch647673638536567655655
MAD (δν/νB3LYP%)  5.4 12.29.47.04.85.0
MAX (δν/νB3LYP%)  8.6 54.720.422.926.326.3

Basis set is cc-pVTZ.

Basis set is cc-pVTZ.

Reaction Energies

Table 12 shows a few simple reactions that are class="Chemical">compiled to evaluate the performaclass="Chemical">nce for certaiclass="Chemical">n boclass="Chemical">nd breakiclass="Chemical">ng aclass="Chemical">nd formiclass="Chemical">ng processes. As this set of reactioclass="Chemical">ns is arbitrarily choseclass="Chemical">n, the MAD should be iclass="Chemical">nterpreted with cautioclass="Chemical">n. However, it is obvious that B3class="Chemical">n class="Gene">LYP performs quite well despite its poor description for atomization energies; the effect of the basis set is also small. Within DFTB3/MIO, the overbinding is not balanced between different bond types; thus large errors are observed (see also discussion for reactions containing sulfur above). DFTB3/3OB performs well overall but has outliers for particular species, HP=O, hypodiphosphoric acid, and also P(OH)5. Note that the latter indicates already the underlying problem for hydrolysis reaction barriers where the transition state contains pentavalent phosphorus. Specifically, the last reaction in Table 12 that changes the P coordination from 4 to 5 oxygen is reasonably well described by 3OB/OPhyd. By construction, all other reactions involving P–O bond forming or breaking deviate significantly for 3OB/OPhyd in comparison to the G3B3 method. The performance of PM6 and PDDG is less satisfying than DFTB3/3OB, with almost doubled MADs.
Table 12

Deviations for 16 Reaction Energies of Neutral Closed Shell Phosphorus Containing Molecules Compared to G3B3a

reactionG3B3B3LYPbB3LYPcPBEcPM6dPDDGdMIO3OB3OB/OPhyd
PH3 + H2O → H2P–OH + H29.4–0.0–2.4–2.7–7.5–5.5–27.2–0.5–10.9
PH3 + H2O → HP=O + 2H240.1+2.5–2.4–2.3–2.3–11.7–39.6–10.3–18.5
PH3 + CH4 → H3C–PH2 + H213.7+2.5+2.2+0.5–11.6–13.4–8.7–4.4–4.4
PH3 + C2H6 → H3C–PH2 + CH4–4.6+0.7+1.2+0.9–2.1–8.9–7.5–2.8–2.8
(CH3)2P=O + 2H2O → HP(=O)(OH)2 + 2CH4–36.0–4.0–9.8–7.2+22.5+12.1–41.5+3.3–16.1
PH3 + 4H2O → H3PO4 + 4H2–31.5+9.3–4.5–7.6+4.3–17.4–92.7–2.1–39.0
H2P(=O)OH + H2O → HP(=O)(OH)2 + H2–16.0+1.9–1.6–2.2–1.9–0.7–24.2+2.7–6.8
HP(=O)(OH)2 + H2O → H3PO4 + H2–12.1+2.3–1.1–1.5–3.2+0.7–32.3–5.8–15.1
P(OH)3 + H2O → H3PO4 + H2–24.1+8.3+3.1+2.1+14.6+4.0–18.7–7.2–13.2
(HO)2(O=)P–P(=O)(OH)2 + 2H2O → 2H3PO4 + H2–29.5–0.3–3.7+0.8–6.3+16.3–60.8–21.8–39.7
2PH3 → H2P–PH2 + H24.3+2.0+2.0–0.3–18.0–37.9–21.2–8.1–8.1
PH3 + NH3 → NP + 3H259.8+8.7+2.9+4.0–4.2–35.8–12.3–6.4–6.4
PH3 + NH3 → HN–PH + 2H248.3+4.0+1.7+1.1–8.4–20.1–2.5–1.5–1.5
PH3 + NH3 → H2N–PH2 + H211.4+1.7+0.6–0.0–14.0–11.3+3.9+1.2+1.2
P(NH2)3 + 3H2O → P(OH)3 + 3NH3–20.2–6.5–11.1–9.3+20.4+4.2–92.7–8.6–39.6
H3PO4 + H2O → P(OH)50.4+0.7–4.8–8.8–18.4+16.1–3.6+19.3+4.8
MAD 3.53.43.210.013.530.66.614.3
MAX 9.311.19.322.537.992.721.839.7

Energies are calculated at 0 K excluding zero point energy and thermal corrections. All numbers are given in kcal/mol.

Basis set is cc-pVTZ.

Basis set is 6-31G(d).

Heats of formation for H2 are calculated as −26 and −22 kcal/mol for PM6 and PDDG. To correct for this exceptional error, this value is set to 0.0 kcal/mol.

Energies are class="Gene">calculated at 0 K exclclass="Chemical">n class="Chemical">uding zero point energy and thermal corrections. All numbers are given in kcal/mol. Basis set is cc-pVTZ. Basis set is 6-31G(d). Heats of formation for class="Chemical">H2 are class="Chemical">n class="Gene">calculated as −26 and −22 kcal/mol for PM6 and PDDG. To correct for this exceptional error, this value is set to 0.0 kcal/mol.

Proton Affinities and Hydrogen Bonding Energies

For a test on proton affinities, we use one of our earlier class="Chemical">compilatioclass="Chemical">ns from ref (43). Similar to the situatioclass="Chemical">n for class="Chemical">n class="Chemical">sulfur, we find that when using the calculated Hubbard derivative, proton affinities in the gas phase are overestimated (for both MIO/calc and 3OB/calc, see Table 13). The fit of the phosphorus Hubbard drivative eliminates this systematic deviation with both MIO and 3OB parameters, except for certain cases that still have error on the order of 5–10 kcal/mol. The special parametrization 3OB/OPhyd, which uses the same Hubbard parameter, also performs very well. This is not surprising as the binding situation of P–O bonds does not alter significantly for the calculation of proton affinities.
Table 13

18 Proton Affinities for Phosphorus Containing Molecules in kcal/mol: Deviation of DFTB in Comparison to G3B3a

moleculebG3B3B3LYPcPBEcPM6dPDDGMIO/calcMIO3OB/calc3OB3OB/OPhyd
H3PO4334.0–2.4–4.3–18.0–2.9+18.3+5.5+11.5+4.2+3.4
H2PO4464.5–3.3–5.8–17.8+12.1+17.2–4.3+10.4–1.7–2.0
DMPHe336.3–1.3–4.4–13.2–6.4+15.9+4.8+10.6+4.5+3.8
MMPe336.7–2.0–4.5–16.6–6.0+15.8+3.8+9.8+3.1+2.4
MMPe460.5–3.0–6.0–12.7+11.0+18.3–1.2+12.0+1.3+1.0
PH3OH+201.6+3.0–0.1+13.0+13.6+4.8–0.0+0.7–1.5–0.4
PH2OHOH+201.6+1.5–0.7+2.2+7.3+8.2+2.1+3.1–0.1+0.2
PHOHOHOH+200.8–0.0–2.5–5.6+4.0+13.6+6.2+8.5+3.7+3.4
PH2(OH)=O336.6+1.4–0.9+2.3+21.6+12.2+3.3+7.9+3.3+4.2
PH(OH)(OH)=O334.7–0.4–2.6–6.5+12.8+16.0+5.3+10.2+4.4+4.4
P(O)(OH)(−O–CH2CH2–O−)336.3–2.2–5.0–13.9–7.3+13.4+2.4+8.3+2.1+1.5
P(OH)(OH)(−O–CH2CH2–O−)(OH*)359.0–2.8–7.7–18.3–16.4+12.9+0.3+5.9–0.3–1.0
P(OH*)(OH)(−O–CH2CH2–O−)(OH)350.4–2.9–6.7–18.1–17.2+11.0–0.4+3.1–2.9–3.6
P(OH*)(OH)(−O–CH2CH2–O−)(OCH3)351.2–2.7–6.5–17.7–20.9+8.9–1.4f–3.6–4.0
P(OH)(OCH3)(−O–CH2CH2–O−)(OH*)359.6–2.7g–17.6–18.9+3.3+0.1h–4.1–9.6–10.2
P(OH*)(OCH3)(−O–CH2CH2–O−)(OH)352.9–2.4–6.4–19.1–19.5+10.1–0.5+2.6–2.9–3.5
P(OH)(OH)(OH)(OH*)(OH)_ax357.3–2.2–6.4–18.8–9.0+14.2–1.2+6.8–0.9–1.6
P(OH)(OH)(OH)(OH*)(OH)_eq347.0–3.5–7.1–17.1–8.3i–0.0ii–3.1
MAD 2.24.313.812.012.62.47.22.93.0
MSE –1.6–4.3–11.9–2.8+12.6+1.46.70.2–0.3
MAX 3.57.719.121.618.36.212.09.610.2

The proton affinity is computed using potential energies at 0 K without any zero-point energy correction.

The molecules are given in the protonated form.

Basis set is 6-31+G(d,p).

The energy of the proton in PM6 is in error by −54 kcal/mol, which has been accounted for by adding this number to the original result. See discussion at http://openmopac.net/manual/pm6_accuracy.html.

“DMPH” refers to dimethyl hydrogen phosphate, “MMP” to P(O)(OH)(OH)(OCH3), and “MMP–” to P(O)(O)(OH)(OCH3)−.

One ligand dissociates.

Converges to a slightly different minima and is excluded from the statistics.

This value is different from the one we reported in ref (51), where the molecule was erroneously optimized to a different local minimum. Here, we stay as close to the conformation of the high-level optimized one as possible.

Molecule dissociates, forming H2O, and has been excluded from the statistics. Depending on the basis set, this dissociation also occurs for the DFT functionals PBE and B3LYP, e.g., dissociation for basis set 6-311G(2d,2p), no dissociation for basis set cc-pVTZ.

The proton affinity is n class="Chemical">computed usiclass="Chemical">ng poteclass="Chemical">ntial eclass="Chemical">nergies at 0 K without aclass="Chemical">ny zero-poiclass="Chemical">nt eclass="Chemical">nergy class="Chemical">n class="Chemical">correction. The molecules are given in the protonated form. Basis set is 6-31+G(d,p). The energy of the proton in class="Chemical">PM6 is iclass="Chemical">n error by −54 kcal/mol, which has beeclass="Chemical">n acclass="Chemical">n class="Chemical">counted for by adding this number to the original result. See discussion at http://openmopac.net/manual/pm6_accuracy.html. class="Chemical">DMPH” refers to class="Chemical">n class="Chemical">dimethyl hydrogen phosphate, “MMP” to P(O)(OH)(OH)(OCH3), and “MMP–” to P(O)(O)(OH)(OCH3)−. One ligand dissociates. n class="Chemical">Coclass="Chemical">nverges to a slightly differeclass="Chemical">nt miclass="Chemical">nima aclass="Chemical">nd is exclclass="Chemical">n class="Chemical">uded from the statistics. This value is different from the one we reported in ref (51), where the molecule was erroneously optimized to a different local minimum. Here, we stay as close to the n class="Chemical">coclass="Chemical">nformatioclass="Chemical">n of the high-level optimized oclass="Chemical">ne as possible. Molecule dissociates, forming class="Chemical">H2O, aclass="Chemical">nd has beeclass="Chemical">n exclclass="Chemical">n class="Chemical">uded from the statistics. Depending on the basis set, this dissociation also occurs for the DFT functionals PBE and B3LYP, e.g., dissociation for basis set 6-311G(2d,2p), no dissociation for basis set cc-pVTZ. A few model systems for class="Chemical">hydrogen boclass="Chemical">ndiclass="Chemical">ng eclass="Chemical">nergies betweeclass="Chemical">n class="Chemical">n class="Chemical">water and phosphate esters in the gas phase are compiled in Table 14. In comparison to G3B3, DFTB3 slightly underestimates the binding; PM6 and PDDG generally have larger errors. Note that hydrogen bond lengths are underestimated with B3LYP/6-31G(d) (the geometry used for G3B3) in comparison to B3LYP/aug-cc-pVTZ by almost 0.04 Å on average. DFTB3/3OB predicts even shorter hydrogen bonds, the mean signed error (MSE) is −0.06 Å; for MIO it is −0.10 Å. PM6 often leads to very different structures (which is why only single point energies on PDDG structures are shown in Table 14), and PDDG is also featured with substantially larger errors with a MAD of 0.226 Å (Table 15). More details are summarized in the Supporting Information.
Table 14

Hydrogen Bonding Energies for Phosphates in kcal/mol: Deviations in Comparison to G3B3a

moleculebG3B3MP2cB3LYPcB3LYPdPBEdPM6-SPePDDGMIO3OB3OB/OPhyd
MMPH–OH213.3+0.2–1.9–0.4+1.0–5.4–2.8–2.0–2.7–2.7
MMP–1–H2O17.9–0.4–2.7–1.1–0.3–5.2–3.1–0.6–1.1–1.3
MMP–1–OH217.4+0.1–2.0–1.0+0.6–4.8–3.5–1.3–2.0–2.1
MMP–2–H2O32.3–0.4–3.2–0.9+0.5–4.5+2.4–0.5–0.6–0.7
DMPH–OH214.4+0.0–2.1–0.5+0.8–5.3–3.2–2.9–3.7–3.8
DMP–1–H2O18.1–0.6–2.9–1.3–0.6–5.2–3.0–1.0–1.4–1.6
MAD 0.32.50.90.65.13.01.41.92.0
MAX 0.63.21.31.05.43.52.93.73.8

The binding energy is computed with the potential energies at 0 K without any zero-point energy correction.

MMPH: dihydrogenated monomethylphosphate. MMP–1: monohydrogenated momomethylphosphate. MMP–2: dehydrogenated monomethylphosphate, respectively for DMP: dimethylphosphate; coordinates are given in the Supporting Information.

Basis set is aug-cc-pVTZ.

Basis set is 6-31+G(d,p).

During the geometry optimization using PM6, the hydrogen bonds relaxe to substantially different minima. Therefore, single-point calculations have been used for comparison.

Table 15

Hydrogen Bonding Distances in Å for Small Phosphorus Containing Systems: Deviations in Comparison to B3LYP/aug-cc-pVTZ

systemH-bondB3LYPaMP2aB3LYPbB3LYPcPBEcPDDGdMIO3OB
DMPH–OH2HOH–O1.892–0.028–0.004+0.025–0.029–0.219–0.062–0.010
DMPH–OH2H–OH21.776–0.031–0.055–0.009–0.081–0.090+0.018+0.075
DMP–1–H2OHOH–O12.052–0.036–0.028–0.004–0.031–0.322–0.170–0.125
DMP–1–H2OHOH–O22.093–0.060–0.031+0.009–0.023–0.360–0.191–0.144
MMPH–OH2H–OH21.773–0.031–0.054–0.015–0.087–0.084+0.020+0.072
MMPH–OH2O–HOH1.916–0.034–0.015+0.024–0.037–0.239–0.091–0.034
MMP–1–H2OHOH–O12.064–0.046–0.040–0.002–0.034–0.329–0.183–0.140
MMP–1–H2OHOH–O22.073–0.043–0.031+0.008–0.024–0.335–0.178–0.127
MMP–1–OH2H–OH22.113–0.063–0.136–0.019–0.125–0.346–0.205–0.102
MMP–1–OH2O–HOH1.638–0.009+0.093+0.008–0.0300.003+0.029+0.001
MMP–2–H2OHOH–O11.862–0.040–0.007+0.009–0.026–0.193–0.113–0.115
MMP–2–H2OHOH–O21.861–0.040–0.007+0.009–0.026–0.192–0.112–0.114
MAD  0.0380.0420.0120.0460.2260.1140.088
MAX  0.0630.1360.0250.1250.3600.2050.142

Basis set is aug-cc-pVTZ.

Basis set is 6-31G(d); this method is used within G3B3 for the geometry optimization.

Basis set is 6-31+G(d,p).

Optimization using PM6 often leads to very different geometries; thus only PDDG results are reported. See Supporting Information.

The binding energy is n class="Chemical">computed with the poteclass="Chemical">ntial eclass="Chemical">nergies at 0 K without aclass="Chemical">ny zero-poiclass="Chemical">nt eclass="Chemical">nergy class="Chemical">n class="Chemical">correction. class="Chemical">MMPH: diclass="Chemical">n class="Chemical">hydrogenated monomethylphosphate. MMP–1: monohydrogenated momomethylphosphate. MMP–2: dehydrogenated monomethylphosphate, respectively for DMP: dimethylphosphate; coordinates are given in the Supporting Information. Basis set is aug-cc-pVTZ. Basis set is 6-31+G(d,p). During the geometry optimization using class="Chemical">PM6, the class="Chemical">n class="Chemical">hydrogen bonds relaxe to substantially different minima. Therefore, single-point calculations have been used for comparison. Basis set is aug-cc-pVTZ. Basis set is 6-31G(d); this method is used within G3B3 for the geometry optimization. Basis set is 6-31+G(d,p). Optimization using n class="Chemical">PM6 ofteclass="Chemical">n leads to very differeclass="Chemical">nt geometries; thus oclass="Chemical">nly PDDG results are reported. See Supporticlass="Chemical">ng Iclass="Chemical">nformatioclass="Chemical">n.

Hydrolysis Reactions

class="Chemical">Phosphorus parameters have beeclass="Chemical">n beclass="Chemical">nchmarked aclass="Chemical">nd used to stclass="Chemical">n class="Chemical">udy phosphate hydrolysis reactions.[21,43,86,87] We have adopted the compilation of elementary steps of these reactions and show results for our new parametrization in Table 16.
Table 16

Deviations of Exothermicities and Barrier Heights in Comparison to MP2/G3Large Single Point Calculations at B3LYP/6-31+G(d,p) Relaxed Geometries for 37 Elementary Steps in the Hydrolysis of MMP and DMPa

methodMP2B3LYPPBEbPM6PDDGMIO3OB3OB/OPhyd3OB3OB/OPhydMP2MP2
at geometry optimized byB3LYPB3LYPB3LYPB3LYPB3LYPB3LYPB3LYPB3LYP3OB3OB/OPhyd3OB3OB/OPhyd
com1 → ts1 (MMP,B)31.0–1.7–6.1–14.5+12.3–7.2+11.4–0.8 –0.8 +0.5
com1 → int1 (MMP,E)30.6–1.4–5.7–16.4+14.8–7.0+12.9–0.8 –1.5 +0.6
com1 → ts1_2 (MMP,B)41.5–2.1–7.4–11.9+9.4–3.3+9.7+1.1+9.8+1.8+1.8+1.1
com1 → int1_2 (MMP,E)31.0–1.1–5.1–17.8+15.6–5.9+15.4+0.5+14.5–0.2+0.8+0.7
int1_2 → ts2_0 (MMP,B)11.9–2.0–3.3+7.4–7.2+2.7–4.9+1.6–4.9+0.3–2.1+3.0
int1_2 → ts2 (MMP,B)3.6+0.1–0.4+4.3+0.4–1.1–2.7–0.2+2.3+2.3+6.6+6.4
int1_2 → com2 (MMP,E)–28.8–0.9+3.4+20.4–17.6+4.4–16.6–1.8–15.7–0.9–1.8–1.8
com1 → diss_tsa (MMP,B)36.8–4.2–10.2–6.2+11.8+4.9–2.8–2.2–7.8–6.0–2.0+0.4
com1 → diss_int (MMP,E)19.6–6.5–7.5–4.6–14.8+0.3–23.1–10.5–22.5–10.0–1.0–1.4
com1_w2 → ts1_2_w2 (MMP,B)39.9–2.1–8.5–24.0+10.0–12.7+5.3–4.8+6.2–3.1–1.5+1.9
com1_w2 → int1_2a_w2 (MMP,E)28.0–0.5–4.4–21.6+16.4–7.9+12.8–2.1+11.9–2.8+0.2–0.1
int1_2a_w2 → int1_2_w2 (MMP,E)0.4+0.8+0.5+4.0+1.3+2.5+2.9+3.0+2.7+2.7+1.1+1.1
int1_2_w2 → ts2_0_w2 (MMP,B)11.4–1.8–3.7+1.4–0.2–5.4–9.0–3.5–7.7–2.1–4.2+0.0
com1_da → ts1_da (MMP,B)55.0–3.1–6.8–7.7–11.1–0.2+2.0–0.7+4.2–0.3+7.4+1.0
com1_da → int_da (MMP,E)4.5–2.0–1.8+1.4+4.0–2.0–1.9–1.9–1.7–1.6–0.2–0.4
com1 → ts1 (DMP,B)38.6–1.4–5.9–15.0+5.8–7.3+9.4–0.5+9.9–0.9+0.6+0.5
com1 → int1 (DMP,E)35.4–0.2–4.4–19.4+11.8–7.6+13.8–0.9+12.9–1.6+1.8+1.4
int1 → int1_2 (DMP,E)1.3–0.7–1.3+1.9+1.6+1.3+1.3+2.1+0.7+0.8+0.1+0.3
int1_2 → ts2 (DMP,B)0.6–0.5–0.5+3.1–2.4+1.2–0.8+1.5–0.4+3.4–0.1–0.4
int1_2 → com2 (DMP,E)–35.2–0.7+4.3+21.2–15.2+6.0–15.0–1.0–13.5+1.0–1.8–1.7
n_com1 → n_ts3 (DMP,B)33.6–1.4–8.1–11.0+28.5+0.2+16.5+4.4+12.2+0.8+2.3+1.2
n_com1 → n_int1 (DMP,E)13.2+0.4–3.2–15.6+21.8–4.9+18.2+3.6+17.7+3.1+0.8+1.1
n_int1 → n_ts4 (DMP,B)22.9–1.6–5.4+8.0+13.4+5.1+0.6+1.2–3.6–1.4+0.2+0.8
n_int1 → n_com2 (DMP,E)–15.8–1.9+0.9+23.2–14.9+3.7–18.7–4.2–18.9–4.2–0.6–1.0
DMP_P → diss_ts (DMP,B)40.9–2.9–9.1–2.3+11.2+8.8+0.2–0.7–2.2–3.2–0.6–1.5
DMP_P → diss_prod (DMP,E)28.2–3.8–6.7–12.6–0.9–0.9–6.1–5.2–18.7–7.3+5.6+4.0
diss_prod2 → diss_ts2 (DMP,B)13.5+0.7–2.5+12.1+18.3+10.4+8.9+5.1+16.9+4.7–2.8–1.2
diss_prod2 → MMP_P (DMP,E)–29.8+3.6+6.7+18.4+12.3+0.6+8.1+5.1+19.6+8.1–2.2–1.2
diss_w_reac → diss_w_ts (DMP,B)20.9–2.3–7.4–8.0+10.8+0.7+1.3+0.7+2.1+1.5+1.0+0.8
diss_w_reac → diss_w_prod (DMP,E)18.4–2.6–5.6–6.5+1.0+0.3–2.5–2.9–12.9–2.9+12.0+0.0
diss_w_prod2 → diss_w_ts2 (DMP,B)1.9+0.2–1.6+0.9+10.9+0.3+3.3+2.8+16.6+3.4–8.2–0.1
diss_w_prod2 → diss_w_reac2 (DMP,E)–21.0+2.8+6.2+14.5+7.7–0.7+2.2+2.3+14.7+2.0–8.3+0.1
n_w_com1 → n_w_ts3 (DMP,B)28.2–1.8–10.1–24.3+32.4–8.9+13.7+0.2+10.6–0.2–0.2+0.2
n_w_com1 → n_w_int1 (DMP,E)13.1+1.0–3.1–15.7+22.2–5.8+17.5+2.8+17.3+2.7+0.6+0.7
n_w_int1 → n_w_int2 (DMP,E)–0.5+0.5+0.4–0.0+0.8+0.6+1.0+1.0+0.6+0.6+0.4+0.4
n_w_int2 → n_w_ts4 (DMP,B)15.1–2.3–6.2–2.6+17.2–3.1–2.9–2.7–0.9–1.1+4.2+2.1
n_w_int2 → n_w_com2 (DMP,E)–13.0–2.0+1.3+23.8–14.8+4.1–19.0–4.4    
MAD 1.84.711.511.44.18.52.59.82.52.51.1
MAX 6.510.324.332.412.723.110.522.510.012.06.4

Compilation from ref (43); no zero-point corrections are included in either exothermicity or barrier heights. All quantities are given in kcal/mol. The processes are labeled in the notation of ref (43); “E” stands for “Exothermicity,” “B” for “Barrier;” coordinates are listed in the Supporting Information.

Basis set is 6-31+G(d,p).

class="Chemical">Compilatioclass="Chemical">n from ref (43); class="Chemical">no zero-poiclass="Chemical">nt class="Chemical">n class="Chemical">corrections are included in either exothermicity or barrier heights. All quantities are given in kcal/mol. The processes are labeled in the notation of ref (43); “E” stands for “Exothermicity,” “B” for “Barrier;” coordinates are listed in the Supporting Information. Basis set is 6-31+G(d,p). In ref (43), transition and intermediate states, reactants, and products have been optimized using B3class="Gene">LYP/6-31+G(d,p). First, we class="Chemical">n class="Chemical">compare the single point energetics at those structures. DFTB3/3OB shows a rather poor performance with a MAD of 8.5 kcal/mol, although slightly better than PM6 and PDDG, which have MADs beyond 10 kcal/mol. A closer look reveals that the poor performance is systematically found for reactions where pentavalent phosphorus structures are involved, whereas the performance for reactions that include only tri- and tetravalent structures (i.e., all reactions that follow a dissociative pathway, abbreviated as “diss” in Table 16) are overall in reasonably good agreement with the MP2 reference. Along this line, we have to distinguish even further. Some structures are trivalent, i.e., a PO3 unit is (almost) trigonal-planar; however one water is still axially coordinated to the phosphorus. The P–water distance lies within the P–O repulsive potential; that is, the P–water distance is smaller than the cutoff of the P–O repulsive potential. Only for the geometry named diss_int, however, where no axial water is included, is any P–water distance beyond the cutoff of the repulsive potential; thus, in the reaction, phosphorus reacts from a four-coordinated to a three-coordinated complex. Specifically for that reaction, a large error is found. In other words, the 3OB parameters seem to work well as long as P is coordinated by exactly four oxygen atoms. For coordinations with three and five oxygens, large errors are obtained indicating a deficiency in properly describing different hybridization states. We have attempted tuning the s-orbital eigenvalue to achieve a more appropriate balance for different hybridized systems (similar to tuning the s-orbital eigenvalue of nitrogen[88]); however by doing so we could not identify an overall well-performing parameter set. This indicates that the current DFTB3 methodology has limited transferability for complex phosphorus chemistry at the level of accuracy in energetics required for detailed mechanistic studies due to the small basis set, the lack of three-center integrals, or multipole moments of the charge fluctuation.[61] We emphasize that even for these challenging cases, as discussed below, the 3OB structures are rather reliable and MP2 single point energies at these structures have a rather small MAD of 2.5 kcal/mol compared to MP2 at B3LYP structures. As a temporary and ad hoc solution to this problem, we suggest the special parametrization OPhyd that reduces these errors as follows. In the first two reactions in Table 16 a fifth class="Chemical">oxygen is bouclass="Chemical">nd to a class="Chemical">n class="Chemical">phosphorus atom. Both the transition state (ts1) and the intermediate state (int1) are overestimated in energy by about 10 kcal/mol. Therefore, we shift the O–P repulsive potential by about 10 kcal/mol to be more attractive. While this introduces an overbinding for all P–O bonds, the energy difference between tetra- and pentavalent phosphorus is compensated. Note that for the reaction from a tetra- to trivalent P (com1 → diss_int), a considerable error of about 10 kcal/mol remains. Due to the dominance of tetra- and pentavalent phosphorus models in our compilation, the MAD drops down to only 2.5 kcal/mol. As a seclass="Chemical">coclass="Chemical">nd class="Chemical">n class="Chemical">comparison, we optimize all structures at the DFTB3 level. Transition states are found using the nudged elastic band technique.[89] The results for DFTB3/3OB/OPhyd are impressive, with the MAD in comparison to MP2/G3large at B3LYP/6-31+G(d,p) geometries staying at 2.5 kcal/mol. However, a few structural problems appear. Hydrogen bonds are often too short by about 0.1 Å. P–O coordinations of leaving groups are too long, most severely for the structures diss_prod and diss_prod2 with deviations of over 0.4 Å. The geometry n_w_com2 relaxes to a different minimum with a different type of hydrogen bonding network; therefore it is excluded from the statistics. For the standard 3OB parameters, additionally the fifth ligand of int1_mmp dissociates, and this structure is therefore also excluded from the statistics. In a third class="Chemical">comparisoclass="Chemical">n, we class="Chemical">n class="Gene">calculate MP2/G3large single-point energies at DFTB3 geometries. For DFTB3/3OB, the MAD drops down to only 2.5 kcal/mol; for comparison, PBE/6-31+G(d,p) single points at B3LYP structures have a MAD of 4.7 kcal/mol (see Table 16). Thus, even though the energetics for tri- and pentavalent structures are inaccurate, DFTB3/3OB can be corrected using higher level methods to yield very good energetics. For the special parametrization, 3OB/OPhyd results are excellent; the MAD is only 1.1 kcal/mol, showing that the geometrical differences between B3LYP/6-31+G(d,p) and 3OB/OPhyd are of minor relevance, or from a different perspective, the potential energy surface seems very shallow and leads to similar energies.

Conclusion

class="Chemical">Coclass="Chemical">nsidericlass="Chemical">ng the importaclass="Chemical">nce of class="Chemical">n class="Chemical">phosphorus and sulfur in (bio)chemistry, we extend the parametrization of the approximate density functional tight binding approach, DFTB3, to these elements. The parametrization is carried out in a framework consistent with the DFTB3/3OB set[52] for O, N, C, and H; thus the resulting parameters can be used to describe a broad set of organic and biologically relevant P/S-containing molecules. The 3d orbitals, which are known to be polarization functions essential to the proper description of structure and energetics of P/S-containing molecules, are included in the parametrization, in contrast to a few popular wave function based semiempirical methods such as PM3/PDDG. Compared to the previous parametrizations of DFTB2 and DFTB3 (the “MIO” set), the electronic parameters of the current 3OB parametrization were chosen with the goal of minimizing errors in atomization energies. As a result, systematic overbinding of covalent interactions is avoided, leading to generally more reliable reaction energies. The parameters are available for download from www.dftb.org. The parameters are tested with a fairly diverse set of molecules of chemical and biological relevance. We focus on the geometries, reaction energies, proton affinities, and class="Chemical">hydrogen boclass="Chemical">ndiclass="Chemical">ng iclass="Chemical">nteractioclass="Chemical">ns of these molecules; vibratioclass="Chemical">nal frequeclass="Chemical">ncies are also examiclass="Chemical">ned, although less systematically. These properties class="Chemical">n class="Gene">calculated at the DFTB3/3OB level are compared to results of DFT (B3LYP and PBE), ab initio (MP2, G3B3), and several popular semiempirical methods (PM6 and PDDG), as well as predictions of DFTB3 with the older parametrization (the MIO set). In general, DFTB3/3OB is a major improvement over the previous parametrization (DFTB3/MIO), and for the majority of cases tested here, it also outperforms PM6 and PDDG, especially for structural properties, vibrational frequencies, hydrogen bonding interactions, and proton affinities. For reaction energies, DFTB3/3OB exhibits major improvement over DFTB3/MIO, due mainly to significant reduction of errors in atomization energies; compared to PM6 and PDDG, DFTB3/3OB also generally performs better, although the magnitude of improvement is more modest. Compared to high-level calculations, DFTB3/3OB is most successful at predicting geometries; the energetics are largely on par with results of DFT (especially PBE, which is the functional used in DFTB3 for deriving the relevant matrix elements) using a medium basis set (e.g., 6-31G(d)), although larger errors are seen more often with DFTB3. We note that since we focus on molecules of biological interest, our test cases here do not feature a large number of radical species. The extensive tests also indicate that there are remaining cases for which the current class="Chemical">DFTB3 methodology has trouble properly makiclass="Chemical">ng a descriptioclass="Chemical">n. For example, SO (siclass="Chemical">nglet) aclass="Chemical">nd class="Chemical">n class="Chemical">SO3 have large errors (11.8 and 34.4 kcal/mol, respectively) in atomization energies; thus reactions involving these species are generally not well described by DFTB3/3OB. Another example involves a substantial underestimation of hydrogen-bonding distances for many (especially charged) sulfur-containing species; the comparison to DFT calculations using different functionals and basis sets suggest that the errors are likely due in part to the intrinsic limitations of the PBE functional and also to the minimal basis set used in DFTB3. It is worth noting that the computed hydrogen bonding energies, nevertheless, are generally fairly reliable and even slightly better than PBE/6-31+G(d,p). Finally, another remaining limitation is that DFTB3/3OB is still incapable of providing reliable energetics for phosphate hydrolysis reactions that involve a change in the coordination number of the phosphorus. Thus, the current DFTB3 model has limited transferability for complex phosphorus chemistry at the level of accuracy in energetics required for detailed mechanistic investigations. As a temporary solution, we develop a specific parametrization, 3OB/OPhyd, where an overbinding is introduced to O–P interactions to empirically minimize the errors for a set of reference reactions. Although this is clearly not a satisfactory solution for the long-term, the 3OB/OPhyd can serve as a useful tool for exploring the energy landscape of phosphoryl transfer reactions in biological systems, especially for structural properties. Alternatively, the 3OB structures remain reliable for these challenging cases, and one may still use the standard 3OB parameters to obtain structures and improve energetics with higher level methods. Due to class="Chemical">computatioclass="Chemical">nal efficieclass="Chemical">ncy aclass="Chemical">nd geclass="Chemical">neral robustclass="Chemical">ness for structural property predictioclass="Chemical">ns, QM/MM class="Chemical">n class="Gene">calculations using DFTB3/3OB as the QM level will be an effective computational approach for the analysis of condensed phase systems. The energetic properties from such calculations should be taken as semiquantitative in nature and can be improved by combining with ab initio QM/MM calculations. For example, the fact that single point high-level calculations using DFTB3/3OB geometries often exhibit small errors suggests that, for example, DFTB3/3OB is likely effective as the low-level QM approach that drives the sampling for reliable free energy simulations in the framework of dual-level QM/MM calculations.[20,24,90−93] Along this line, for the phosphate hydrolysis reaction, the closer agreement between 3OB/OPhyd and higher level methods in energies makes 3OB/OPhyd better suited for dual-level QM/MM free energy calculations than the standard 3OB set. Since many reactions that involve phosphorus/sulfur chemistry implicate metal ions, a pressing need is to extend the DFTB3/3OB parametrization to these ions, such as the alkali metal ions, zinc and copper; this will be reported in separate work in the near future. Along this line, we note that although there is currently an impressive effort to develop DFTB parameters for the entire periodic table in a largely automated fashion,[94] developing parameters appropriate for many (bio)chemical applications would require careful calibration and refinement as we have done here for sulfur and phosphorus. Finally, the remaining limitations highlight that it is also worthwhile continuing to improve the DFTB3 methodology, such as by including multipole terms for the charge fluctuation and better description of polarization and short-range repulsions.
  65 in total

1.  Generalized Gradient Approximation Made Simple.

Authors: 
Journal:  Phys Rev Lett       Date:  1996-10-28       Impact factor: 9.161

Review 2.  Protein kinases--the major drug targets of the twenty-first century?

Authors:  Philip Cohen
Journal:  Nat Rev Drug Discov       Date:  2002-04       Impact factor: 84.694

Review 3.  Enzymatic mechanisms of phosphate and sulfate transfer.

Authors:  W Wallace Cleland; Alvan C Hengge
Journal:  Chem Rev       Date:  2006-08       Impact factor: 60.622

4.  Application of the computationally efficient self-consistent-charge density-functional tight-binding method to magnesium-containing molecules.

Authors:  Zheng-Li Cai; Philip Lopez; Jeffrey R Reimers; Qiang Cui; Marcus Elstner
Journal:  J Phys Chem A       Date:  2007-06-08       Impact factor: 2.781

5.  Paradynamics: an effective and reliable model for ab initio QM/MM free-energy calculations and related tasks.

Authors:  Nikolay V Plotnikov; Shina C L Kamerlin; Arieh Warshel
Journal:  J Phys Chem B       Date:  2011-05-27       Impact factor: 2.991

6.  Does water relay play an important role in phosphoryl transfer reactions? Insights from theoretical study of a model reaction in water and tert-butanol.

Authors:  Yang Yang; Qiang Cui
Journal:  J Phys Chem B       Date:  2009-04-09       Impact factor: 2.991

7.  Description of phosphate hydrolysis reactions with the Self-Consistent-Charge Density-Functional-Tight-Binding (SCC-DFTB) theory. 1. Parameterization.

Authors:  Yang Yang; Haibo Yu; Darrin York; Marcus Elstner; Qiang Cui
Journal:  J Chem Theory Comput       Date:  2008       Impact factor: 6.006

Review 8.  Why nature really chose phosphate.

Authors:  Shina C L Kamerlin; Pankaz K Sharma; Ram B Prasad; Arieh Warshel
Journal:  Q Rev Biophys       Date:  2013-01-15       Impact factor: 5.318

Review 9.  Proton transfer function of carbonic anhydrase: Insights from QM/MM simulations.

Authors:  Demian Riccardi; Shuo Yang; Qiang Cui
Journal:  Biochim Biophys Acta       Date:  2009-08-11

10.  Extension of the PDDG/PM3 Semiempirical Molecular Orbital Method to Sulfur, Silicon, and Phosphorus.

Authors:  Ivan Tubert-Brohman; Cristiano Ruch Werneck Guimarães; William L Jorgensen
Journal:  J Chem Theory Comput       Date:  2005       Impact factor: 6.006

View more
  61 in total

1.  Benchmarking density functional tight binding models for barrier heights and reaction energetics of organic molecules.

Authors:  Maja Gruden; Ljubica Andjeklović; Akkarapattiakal Kuriappan Jissy; Stepan Stepanović; Matija Zlatar; Qiang Cui; Marcus Elstner
Journal:  J Comput Chem       Date:  2017-07-24       Impact factor: 3.376

2.  Nucleic acid reactivity: challenges for next-generation semiempirical quantum models.

Authors:  Ming Huang; Timothy J Giese; Darrin M York
Journal:  J Comput Chem       Date:  2015-05-06       Impact factor: 3.376

3.  On the hydrogen activation by frustrated Lewis pairs in the solid state: benchmark studies and theoretical insights.

Authors:  Lei Liu; Jan Gerit Brandenburg; Stefan Grimme
Journal:  Philos Trans A Math Phys Eng Sci       Date:  2017-08-28       Impact factor: 4.226

4.  Perspective: Quantum mechanical methods in biochemistry and biophysics.

Authors:  Qiang Cui
Journal:  J Chem Phys       Date:  2016-10-14       Impact factor: 3.488

5.  Exploring the applicability of density functional tight binding to transition metal ions. Parameterization for nickel with the spin-polarized DFTB3 model.

Authors:  Milena Vujović; Mioy Huynh; Sebastian Steiner; Pablo Garcia-Fernandez; Marcus Elstner; Qiang Cui; Maja Gruden
Journal:  J Comput Chem       Date:  2018-10-09       Impact factor: 3.376

6.  Catalytic Mechanism of Amyloid-β Peptide Degradation by Insulin Degrading Enzyme: Insights from Quantum Mechanics and Molecular Mechanics Style Møller-Plesset Second Order Perturbation Theory Calculation.

Authors:  Rui Lai; Wei-Jen Tang; Hui Li
Journal:  J Chem Inf Model       Date:  2018-09-06       Impact factor: 4.956

7.  Analysis of Density Functional Tight Binding with Natural Bonding Orbitals.

Authors:  Xiya Lu; Juan Duchimaza-Heredia; Qiang Cui
Journal:  J Phys Chem A       Date:  2019-08-15       Impact factor: 2.781

8.  QM/MM free energy simulations: recent progress and challenges.

Authors:  Xiya Lu; Dong Fang; Shingo Ito; Yuko Okamoto; Victor Ovchinnikov; Qiang Cui
Journal:  Mol Simul       Date:  2016-07-05       Impact factor: 2.178

9.  QM/MM Analysis of Transition States and Transition State Analogues in Metalloenzymes.

Authors:  D Roston; Q Cui
Journal:  Methods Enzymol       Date:  2016-07-01       Impact factor: 1.600

10.  Improving intermolecular interactions in DFTB3 using extended polarization from chemical-potential equalization.

Authors:  Anders S Christensen; Marcus Elstner; Qiang Cui
Journal:  J Chem Phys       Date:  2015-08-28       Impact factor: 3.488

View more

北京卡尤迪生物科技股份有限公司 © 2022-2023.