Literature DB >> 25178644

Parametrization of DFTB3/3OB for magnesium and zinc for chemical and biological applications.

Xiya Lu1, Michael Gaus, Marcus Elstner, Qiang Cui.   

Abstract

We report the parametrization of the approximate depan class="Chemical">nsity functional theory, n class="Chemical">DFTB3, for n class="Chemical">magnesium and zinc for chemical and biological applications. The parametrization strategy follows that established in previous work that parametrized several key main group elements (O, N, C, H, P, and S). This 3OB set of parameters can thus be used to study many chemical and biochemical systems. The parameters are benchmarked using both gas-phase and condensed-phase systems. The gas-phase results are compared to DFT (mostly B3LYP), ab initio (MP2 and G3B3), and PM6, as well as to a previous DFTB parametrization (MIO). The results indicate that DFTB3/3OB is particularly successful at predicting structures, including rather complex dinuclear metalloenzyme active sites, while being semiquantitative (with a typical mean absolute deviation (MAD) of ∼3-5 kcal/mol) for energetics. Single-point calculations with high-level quantum mechanics (QM) methods generally lead to very satisfying (a typical MAD of ∼1 kcal/mol) energetic properties. DFTB3/MM simulations for solution and two enzyme systems also lead to encouraging structural and energetic properties in comparison to available experimental data. The remaining limitations of DFTB3, such as the treatment of interaction between metal ions and highly charged/polarizable ligands, are also discussed.

Entities:  

Mesh:

Substances:

Year:  2014        PMID: 25178644      PMCID: PMC4306495          DOI: 10.1021/jp506557r

Source DB:  PubMed          Journal:  J Phys Chem B        ISSN: 1520-5207            Impact factor:   2.991


Introduction

n class="Chemical">Magnesiumpan> and zinc are two of the most n class="Chemical">common metal cofactors of numerous n class="Chemical">metalloenzymes and other biomolecules.[1−3] The binding of Mg2+ to DNA and RNA helps to maintain their structures by stabilizing highly negatively charged phosphates. ATP, the main source of energy in cells, is bound to Mg2+ in its biologically active form.[4] In addition, a large number of enzymes involved in the biochemistry of nucleic acids bind Mg2+ for catalytic activity; for example, Mg2+ is essential for DNA replication and phosphoryl transfers.[5] Zinc is the second most abundant cation in the human body, so Zn2+ has important biological functions in numerous biomolecular systems[6,7] such as zinc-finger class of transcription factors, signaling proteins, transport/storage proteins, as well as enzymes. The zinc ion contributes to catalysis in more than 300 enzymes and serves a structural role in many proteins and enzymes.[8−10] To describe the structural properties of n class="Chemical">Mg2+pan> and n class="Chemical">Zn2+ in solution and biomolecules, molecular mechanical (MM) force field parameters have been developed.[11−15] Some of the advanced force field models such as SIBFA,[16] AMOEBA,[17,18] and Drude oscillator models[19] treat electronic polarization and charge transfer[20] terms explicitly and have been demonstrated to describe potential energy surfaces in close agreement with quantum mechanics (QM) calculations.[21,22] For chemical reactions that involve these n class="Chemical">metal ions, however, MM models are no longer appropriate. Even for structural properties, when the binding mode of ligands is flexible,[23−25] it is challenging to describe the metal site with a MM model. An appropriate modelipan class="Chemical">ng approach in tn class="Chemical">his n class="Chemical">context is the QM/MM framework, where the metal site of interest is treated with QM while the rest of the enzyme/solution is described by an empirical force field. For relatively rigid metal sites, ab initio QM and QM/MM methods based on either reaction path or free energy simulations have been demonstrated to be powerful for mechanistic analysis.[26−30] The cost of these calculations for flexible systems, however, remains very high. For instance, one key feature of zinc coordination is its flexibility:[23−25] a zinc ion can adopt multiple binding modes with the coordination number ranging from 4 to 6, underlining the importance of developing efficient QM(/MM) methods that complement ab initio based methods. The dynamical nature of the metal binding mode is particularly important to enzymes that feature solvent accessible active sites.[31−33] For these applications, semiempirical QM-based QM/MM simulations can potentially strike the balance between computational accuracy and conformational sampling. The traditional semiempirical QM methods based opan class="Chemical">n the n class="Chemical">Neglect-of-Diatomic-Differenpan>tial-Overlap (npan> class="Chemical">NDDO) approximation, such as MNDO,[34,35] MNDO/d,[36] AM1,[37] and PM3,[38] however, give in general rather poor geometries for Mg/Zn-containing molecules. For example, metal–ligand lengths computed from AM1,[39,40] PM3, MNDO, and MNDO/d[40] deviate by roughly 0.1 Å on average in comparison to B3LYP/6-311++G** for small Mg complexes.[41] Recently developed AM1/d parameters[42] for magnesium provide a clear improvement in accuracy compared to AM1 by incorporating d orbitals. According to a recent test,[43] the mean absolute deviations (MADs) of metal–ligand distances for AM1,[44] PM3, PM3(ZnB),[45] MNDO/d,[46] and PM6[47] are more than 0.05 Å compared to CCSD(T) values for small zinc-containing compounds. The importance of a reliable treatment for zinc sites to the description of enzyme catalysis is illustrated by our recent study of alkaline phosphatase (AP),[48,59] where we showed that previous AM1/d-PhoT/MM calculations[50,51] likely led to incorrect transition states due to distortions of the bimetallic zinc motif. The Self-n class="Chemical">Copan>nsistenpan>t-Charge Density-Functional Tight-Binding (SCC-DFTB) method provides a promising alternative to NDDO approaches.[52−54] It is derived by expanding the DFT total energy up to sen class="Chemical">cond order around a reference density, which is usually taken to be the sum of atomic charge densities. The computational speed of SCC-DFTB is comparable to NDDO, and it has been rather extensively benchmarked for reaction energies, geometries, and vibrational frequencies for a large number of small molecules.[55−61] The fact that SCC-DFTB is formulated in a DFT framework suggests that it is potentially more reliable for metal ions than NDDO-based methods, which are based on Hartree–Fock. Specifically for Mg[41] and Zn,[62,63] SCC-DFTB(/MM)[64−66] has been successfully applied to a rather diverse set of biological and chemical applications that include these ions.[48,49,67−74] More recently, DFTB has been extended to include third-order contributions, termed as DFTB3.[75] Compared to the second-order formulation, DFTB3 considers the charge dependence of the atomic Hubbard parameter and therefore provides an improved description of chemical properties such as proton affinity,[53,75,76] which is important in many biological applications. With the most recent parametrization referred to as 3OB,[77,78] the electronic properties and therefore atomization energies have been substantially improved; as a result, the DFTB3 approach is shown to have comparable performance as DFT with double-ζ-plus-polarization basis sets for many systems of biological and chemical interest, although certain limitations remain.[78,79] n class="Chemical">Copan>nsidering the recenpan>t success of n class="Chemical">DFTB3/3OB, here we extend the 3OB parametrization of DFTB3 to magnesium and zinc, again focusing on chemical and biological applications. In the next section, we briefly summarize the DFTB3 methodology, the procedures for parametrization, and simulation details for benchmark calculations. We then present data from multiple levels of test sets in the gas phase, which are followed by additional benchmark calculations in both solution and enzymes. Finally, we summarize tn class="Chemical">his work with several conclusions.

Computational Methods

Theory

The n class="Chemical">DFTB3pan> parametrization of n class="Chemical">magnesium and zinc follows the protocol outlined in the previous publications[77,80] and extends the 3OB parameter set that so far includes elements C, H, n class="Chemical">N, O,[77] S, and P.[78] We first briefly review the formulation of DFTB3.[54,75] The total energy of n class="Chemical">DFTB3 is given by Here H0 is the effective Kohn–Sham Hamiltopan class="Chemical">nian evaluated at the atomic reference density ρ0, which is determined by solving the Kohn–Sham equations for the atom in the presence of an additional harmonic n class="Chemical">confining potenpan>tial with a n class="Chemical">confining radius rwf for the atomic basis set and a different radius rdens for the density. Off-diagonal H0 integrals are computed in a two-center approximation together with the PBE[81] exchange-correlation functional and an atomic basis set consisting of Slater functions. The Slater functions are defined by lmax as the highest orbital angular momentum included, nmax, and α0, α1, ..., α4, the exponents of the Slater functions. In principle, basis functions for each orbital angular momentum can be compressed differently; in this work, however, s and p orbitals are confined with the same compression radius for Mg, and s, p, and d orbitals (compression radius for d-orbitals is defined by the parameter rwfd) are confined with the same compression radius for Zn. The diagonal elements H0 are equal to the calculated atomic eigenvalues ε (x = s, p, or d orbital). No confinement of the orbitals is applied here in order to yield the correct dissociation limit. We optimize εp as discussed in the following section. The atomic Hubbard parameter Ua in the sepapan class="Chemical">n class="Chemical">conpan>d-order term of the energy Eγ describes the electron–electron interaction within one atom and enpan>ters the γ-function, which interpolates the on-site and the long-range electronic interaction of atom pairs. The Hubbard parameters are calculated from DFT as the first derivative of the highest occupied orbital with respect to its occupation number. For the third-order term EΓ, an additional parameter per elemenpan>t Ud, the derivative of the Hubbard parameter with respect to chn class="Chemical">arge has to be determined. For CHNO it was calculated as the second derivative of the highest occupied orbital with respect to its occupation number;[77] for Mg and Zn we find a significant advantage to optimize it.

Reference Systems and Parameter Fitting

As emphasized earlier,[54] most parameters in the papan class="Chemical">n class="Chemical">DFTB3 approach are directly npan> class="Chemical">computed based on atoms or atom pairs at the DFT (PBE) level, while some parameters are optimized based on specific reference systems. In this way, accuracy appropriate for chemical applications can be obtained without significantly reducing the transferability of the DFTB3 model. The parametrization for Mg/Zn used herein follows that specified previously for the parametrization of CHNOPS;[77,78] two distinct groups of parameters have to be optimized, the electronic parameters appearing in the first three terms of the DFTB3 total energy (eq 1) and the repulsive potential. The electronic parameters are summarized in Table 1. The following parameters are optimized to improve the overall performance:
Table 1

Overview of Electronic Parameters (in Atomic Units if Not Unitless)a

parameterMgZn
lmax12
nmax22
α00.500.50
α11.111.39
α22.453.87
α35.4210.78
α412.0030.00
rwf5.54.2
rwfd0.04.2
rdens14.09.0
εs–0.17267741–0.21408495
εp–0.020.02
εd–0.38831893
Espin0.00.0
U0.22460.2663
Ud–0.02–0.03

As described in the text, U, Espin, εs,d, nmax, and α are consistent with the standard choices of DFTB parametrization and not subject to optimizations. By contrast, rwf, rwfd, rdens, εp, and Ud are adjusted based on properties of molecules in the fitting set.

•rwf and rdepan class="Chemical">ns: The general procedure was to start from the electronic parameters used in the n class="Chemical">MIO set[75,77] (which are rn class="Chemical">Mgwf = 5.0, rMgdens = 12.0, rZnwf = 4.9, rZndens = 10.0). rwf and rdens are fitted in a “brute force” manner to achieve good performance on bond (metal–ligand), angle, vibrational frequencies, as well as reaction energies. •εp: The eigenvalues are ipan class="Chemical">n principle n class="Chemical">computed directly from atomic DFT calculations and enpan>ter as diagonal elemenpan>ts into the Hamiltonian matrix without the additional n class="Chemical">confining potential. Exceptions are made here to magnesium and zinc since the p orbitals are often not heavily involved in the coordination of Mg2+ and Zn2+. εp is calculated to be −0.04914 au for Mg and −0.04261 au for Zn. εp are fitted in the way that p orbitals are less involved in molecular bonding, so as to improve sequential ligation energies and proton affinities. •Ud: The Hubbard derivative is generally papan class="Chemical">n class="Chemical">computed from atomic DFT calculationpan>s. In the case of n class="Chemical">Mg and Zn, they are −0.0496 au and −0.0544 au, respectively. We found a significant improvement of proton affinities after optimizing Ud for n class="Chemical">Mg and Zn. It is important to note that fitting of the Hubbard derivatives does not significantly alter most properties of neutral molecules like equilibrium geometries, so with all the other parameters held fixed, Ud is tuned as the last step in the parametrization procedure. Although the repulsive potential, which papan class="Chemical">n class="Chemical">conpan>tains nuclear–nuclear repulsion and double-n class="Chemical">counting terms that depend on the reference density, can be directly computed, practical application to molecular systems with reasonable accuracy requires fitting the repulsive potential. The repulsive potential between atom type A and type B (Vabrep in eq 1) is described by a spline function in the covalent binding region. A cutoff is introduced at which the function and first three derivatives reach zero. The remaining degrees of freedom for the spline function are fitted to high-level ab initio values of geometries, reaction energies, and vibrational stretching frequencies. The overall quality of the repulsive potential is sensitive to the chosen division points. Therefore, establishing reliable repulsive potential parameters remains a time-consuming step despite the progress of partially automatized fitting procedures.[80,82] As described in the text, U, Espipan class="Chemical">n, εs,d, nmax, and α are n class="Chemical">consistenpan>t with the standard choices of npan> class="Chemical">DFTB parametrization and not subject to optimizations. By contrast, rwf, rwfd, rdens, εp, and Ud are adjusted based on properties of molecules in the fitting set. The repulsive potentials are fitted to several properties as summarized ipan class="Chemical">n Table 2. Reference values are calculated from ab initio methods that are known to produce results close to experimental data. For n class="Chemical">magnesium, optimized geometries of small closed-shell molecules are calculated from B3LYP[83−85]/aug-cc-pVTZ, and reaction enpan>ergies are calculated from the G3B3[86−88] method. For zinc, both optimized geometries and reaction enpan>ergies are calculated from B3LYP/aug-cc-pVTZ. Additional equations for the fitting process are prepared using a few vibrational stretching frequenpan>cies determined from BLYP/aug-cc-pVTZ (unscaled) using the harmonic approximation. B3LYP with good quality basis set has beenpan> shown to describe structures of n class="Chemical">Zn complexes in good agreement with X-ray[89] and CCSD(T)[43,90] data. However, heats of formation (HOF) predicted from B3LYP/aug-cc-pVTZ deviate ∼10 kcal/mol on average from experimental data of eight small complexes even though CCSD(T)/aug-cc-pVTZ does not do much better with a mean unsigned error of ∼7 kcal/mol.[91] B97-1,[92] B97-2,[93] and B98[94,95] have also been tested because they have been reported to be reliable for reproducing experimental data for small transition metal complexes;[96] geometries from the B97 family functionals and B3LYP are very similar, and reaction energies differ by ∼2 kcal/mol (data shown in the Supporting Information). Despite shortcomings in HOF, we have chosen to use the B3LYP functional because it is computationally less intensive so that large sets of consistent reference data are accessible, and it systematically gives reliable results for binding energies and proton affinities, which are of major importance to our work. Since the systems studied here are relatively small in size, dispersion interactions are not expected to impact the structural and energetic properties of interest. For a fairly large active site model for a zinc enzyme, we do discuss briefly the impact of dispersion. For even larger cases, we note that empirical dispersion models have been developed and tested for both SCC-DFTB[97,98] and DFTB3.[99]
Table 2

Parameters Defining the Repulsive Potentialsa

reference geometries for force equations
MgH2, Mg(CH3)2, MgO, [Mg(H2O)2]2+, [Mg(H2O)6]2+, [Mg(NH3)4]2+, MgS, [Mg(SH2)3(SH)]+, Mg(PH2)2, [Mg(PH3)4]2+, ZnH2, [Zn(CH3)]+, [Zn(CO)]2+, [Zn(H2O)3]2+, Zn(OH)2, Zn(H)(NH2), [Zn(NH3)4]2+, [Zn(SH)]+, [Zn(SH2)3]2+, [Zn(SH2)3(SH)]+, [Zn(PH2)]+, [Zn(PH3)4]2+, [Zn(PH3)3(PH2)]+
reference potential energy differences in kcal/mol
[Mg(H2O)2]2+[Mg(H2O)]2+ + H2O73.3
[Mg(H2O)3]2+[Mg(H2O)2]2+ + H2O59.2
[Mg(CO)4]2+[Mg(CO)3]2+ + CO30.9
[Mg(NH3)]2+Mg2+ + NH397.7
[Mg(NH3)2]2+[Mg(NH3)]2+ + NH383.9
[Mg(NH3)3]2+[Mg(NH3)2]2+ + NH363.1
[Mg(NH3)4]2+[Mg(NH3)3]2+ + NH350.1
[Mg(NH3)5]2+[Mg(NH3)4]2+ + NH328.6
[Mg(NH3)6]2+[Mg(NH3)5]2+ + NH326.3
[Mg(SH2)2]2+[Mg(SH2)]2+ + SH262.4
[Mg(SH2)3]2+[Mg(SH2)2]2+ + SH243.9
[Mg(SH2)4]2+[Mg(SH2)3]2+ + SH234.5
[Mg(SH2)5]2+[Mg(SH2)4]2+ + SH220.0
[Mg(PH3)2]2+[Mg(PH3)]2+ + PH366.9
[Mg(PH3)4]2+[Mg(PH3)3]2+ + PH334.8
[Zn(H2O)]2+Zn2+ + H2O104.3
[Zn(H2O)4]2+[Zn(H2O)3]2+ + H2O43.0
[Zn(CO)]2+Zn2+ + CO81.6
[Zn(CO)2]2+[Zn(CO)]2+ + 2CO149.8
[Zn(CO)3]2+[Zn(CO)]2+ + 3CO190.1
[Zn(SH2)]2+Zn2+ + SH2126.1
[Zn(SH2)2]2+Zn2+ + 2SH2210.5
[Zn(PH3)4]2+[Zn(PH3)3]2+ + PH330.5

See text for the methods used as references for different properties.

Condensed-Phase Benchmark Calculations

To supplement the gas-phase calculatiopan class="Chemical">ns, simulations are carried out with n class="Chemical">DFTB3/MM to investigate if the parametrized model works in anpan> explicit n class="Chemical">condensed-phase environment. They are important tests because the ultimate goal is to use n class="Chemical">DFTB3/3OB in enzyme simulations.

Mg2+ and Zn2+ Ion Solvation and pKa Calculations

Both MM and QM/MM simulatiopan class="Chemical">ns are performed using CHARMM.[100] The generalized solvent boundary potential (n class="Chemical">GSBP)[101,102] is used to treat long-range electrostatic interactions in MD simulations. The system is set up with an inpan>ner region of 20 Å. All molecules are located in the inner region, and the outer region is described with a n class="Chemical">constant dielectric continuum with εw = 80. To be consistent with the GSBP protocol, electrostatic interactions among inner region atoms are treated with the extended electrostatic model[103] in which interactions beyond 12 Å are modeled with multipolar expansions, including the dipolar and quadrupolar terms. The reaction field matrix M is evaluated using 400 spherical harmonics for GSBP simulations. The ion is fixed at the center of the sphere. Water molecules are described with the modified version of TIP3P.[104] In the classical MM simulations, standard CHARMM force field[105] parameters are used for Mg2+ and Zn2+. The QM/MM calculations employ the same n class="Chemical">water sphere as in MM simulations. The QM region includes the metal ion together with the nearest 6 water molecules (referred to as small QM region) or the nearest 25 water molecules (referred to as large QM region) which include the first and the second solvation shells. All the remaining water molecules are represented by the TIP3P water model. In the large QM region simulations, to ensure the QM water molecules are always closest to the metal ion, Flexible Inner Region Ensemble Separator (FIRES[106]) restraining potential is imposed to any outer sphere MM water molecules that are closer to the ion than the most distant inner sphere QM water molecule. The FIRES potential takes the form EFIRES = (1/2)∑kFIRES(r – Rinner)2 where Rinner is defined as the distance between the metal ion and the most distant inner QM water molecule from it. kFIRES is set to be 100 kcal/(mol·Å2). FIRES restraints are not necessary for the small QM region simulations because water molecules in the first solvation shell are strongly bound to the metal ion. All bonds involving hydrogen are constrained with the SHAKE algorithm,[107] allowing a 1 fs time step for MD propagation. Simulations are equilibrated for 500 ps followed by a production run of 500 ps. See text for the methods used as references for differepan class="Chemical">nt properties. To further evaluate the performance of papan class="Chemical">n class="Chemical">DFTB3/3OB for npan> class="Chemical">Mg and Zn from the energetics perspective, we perform calculations of relative pKa of Mg2+/Zn2+-bound water in a water sphere using the thermodynamic integration (TI)[108,109] approach within the dual topology single coordinate (DTSC) scheme; absolute pKa results are discussed in the Supporting Information.[66,110] In this approach, the dominant contribution to the total free energy of deprotonation is from the electrostatic free energy change (ΔFM) associated with converting the acidic proton to a dummy atom (D), i.e., transforming M2+·(H2O)6 to M2+·(H2O)5(HOD)−, where M represents the metal ion, Mg2+ or Zn2+. The corresponding free energy derivative is given bywhich represents the QM/MM energy difference averaged for a specific coupling parameter λ using the same set of coordinates for both protonation states. The total electrostatic energy contribution (ΔFM) is determined by integrating the free energy derivatives (∂F/∂λ) over λ from 0 to 1. Estimation of the absolute pKa is in principle possible with the DTSC-TI protocol, but the solvation free energy of a proton has to be taken into account. To avoid the difficulty of getting accurate experimental/calculated values for this term, we compute the relative pKa between Mg2+·(H2O)6 and Zn2+·(H2O)6 in water so that other contributions are canceled out to a great extent, such as the zero-point energy difference between the protonated and deprotonated states and van der Waals interactions involving the acidic proton.[66,110] Since ΔFM is electrostatic in nature, we expect that the free energy derivative depends largely linearly on the coupling parameter λ in this simple system; this is supported by actual calculations (see below). We carry out pKa calculations with both small QM region and large QM region as defined above. To avoid dissociation of hydroxide from the metal ion in the fully deprotonated state (λ = 1 window M2+·(H2O)5(HOD)−), a NOE potential is added to the distance between M2+ and OH–. The NOE potential takes the formin which rmin and rmax set the interval between which the restraining potential is zero; they are taken to be 1.8 and 2.5 Å, respectively. kmax is set to be 1000 kcal/(mol·Å2). To prevent the dummy atom D from being too close to the metal site in the fully deprotonated state, a weak angular constraint is applied to the M–O–D angle to keep it no smaller than 120°.

Myosin-II Enzyme Setup: Test for a Mg Site

The enzyme model is built from a high-resolutiopan class="Chemical">n X-ray structure for the n class="Chemical">Mg·n class="Chemical">ADP·vanadata bound myosin[111] (PDB code 1VOM), which corresponds to the hydrolyzing state of the D. discoideum myosin-II motor domain.[112,113] As described in our earlier work,[68,114,115] starting from the PDB structure, the ADP·vanadata is replaced by ATP or ADP·Pi. The QM region includes the triphosphate and part of the ribose group of ATP, three water molecules (include the lytic water) in the active site, Ser 181, Ser 236, the Mg2+ ion, and all its ligands (Thr 186 and Ser 237 and two water molecules). Only side chains of protein residues are included in the QM region, and link atoms are introduced to saturate the valence of the QM boundary atoms. ATP is assumed to be fully deprotonated, and the titratable groups are kept in their normal protonation states (i.e., Lys, Arg are protonated, Asp, Glu are deprotonated) which are consistent with a simple estimate of pKa using the Poisson–Boltzmann (PB) approach.[116] GSBP is used as the simulation protocol. The system is divided into a 24 Å spherical inner region centered at the Mg ion. Newtonian equations-of-motion are solved for the MD region (within 20 Å), and Langevin equations-of-motion are solved for the buffer region (20–24 Å) with a temperature bath of 300 K. Protein atoms in the buffer region are harmonically constrained with force constants determined from the crystallographic B-factors.[117] Protein atoms in the MM region are described by the all-atom CHARMM22 force field,[105] and water molecules are described with the TIP3P model. All bonds involving hydrogen are constrained using SHAKE, and the time step is set to 1 fs. All the water molecules in the inner region are subject to a weak GEO type of restraining potential to keep them inside the inner sphere with the MMFP module in CHARMM. The static field due to outer-region atoms, ϕs(io), is evaluated with the linear PB equations. The reaction field matrix M is evaluated with 625 spherical harmonics. In the PB calculations, the protein dielectric constant is set as 1; the water dielectric constant is set as 80; and 0.0 M salt concentration is used. Our previous[118] and recent[119] analyses indicated that the GSBP protocol is reliable for a site well shielded from the bulk solvent. To study the structural flexibility in the active site of myosipan class="Chemical">n before and after n class="Chemical">ATP hydrolysis, potential of mean force (PMF) simulations have been carried out. The reaction coordinate is defined as the distance between Mg2+ and atom O in Ser 237 (see Figure 5). The umbrella sampling approach[120] is used to restrain the system along the reaction coordinate with a force constant of 150 kcal/mol·Å–2. Thirty-two windows are used for PMF, and a 500 ps simulation is performed for each window. Convergence of PMF is monitored by examining the overlap of reaction coordinate distributions sampled in different windows. Weighted histogram analysis (WHAM)[121] is used to postanalyze the probability distribution to obtain the PMF along the reaction coordinate. Errors are calculated from block average.
Figure 5

Structural properties of the Myosin active site during equilibrium QM/MM MD simulations with different nucleotide chemical states (ATP or ADP·Pi). (a) Overlay of average DFTB3/MM structure and crystal structure (gray) of Myosin-II motor domain active site in the ATP state. The VO4 moiety in the crystal structure (PDB code 1VOM) is replaced by a PO3 phosphate group and a presumptive lytic water molecule. Instantaneous distances between Mg2+ and oxygen atoms in its four nonwater ligands at (b) ATP state and (c) ADP·Pi state. Ser237 refers to OSer237 and Thr186 refers to OThr186. (d) PMF calculations for the ATP/ADP·Pi states. The reaction coordinate is the distance between Mg2+ and OSer237.

Alkaline Phosphatase Enzyme Setup: Test for a Bimetallic Zinc Site

The enzyme model is papan class="Chemical">n class="Chemical">conpan>structed based on the X-ray structures for the n class="Species">E. coli Alkaline Phosphatase (AP) mutant R166S with bound n class="Chemical">inorganic phosphate at 2.05 Å resolution (PDB code 3CMR(122)). The substrate methyl p-nitrophenyl phosphate (MpNPP–) is first “mutated” to the orientation with the −OMe group oriented toward the magnesium ion (denoted as α orientation) starting from the PDB structure. All basic and acidic amino acids are kept in their physiological protonation states except for Ser102 in AP, which is accepted to be the nucleophile and deprotonated in the reactive complex. Water molecules are added following the standard protocol of superimposing the system with a water droplet of 25 Å radius centered at Zn12+ (see Figure 6 for atomic labels) and removing water molecules within 2.8 Å from any heavy atoms resolved in the crystal structure. GSBP is used to treat long-range electrostatic interactions in MD simulations. In the QM/MM simulations, as described in details in our previous work,[48,49] the QM region includes the two zinc ions and their six ligands (Asp51, Asp369, His370, Asp327, His412, His331), Ser102, and the substrate MpNPP–. Only side chains of protein residues are included in the QM region, and link atoms are added between Cα and Cβ atoms. A NOE potential is added to the C–O bonds in Asp51 in AP to prevent artificial polarization.[48] The integration time step is 1 fs, and all bonds involving hydrogens are constrained using SHAKE. The DFTB3/MM results are also compared to MM simulations using a conventional nonbonded zinc model[14] (referred to as a Coulomb scheme) or short–long effective functions (SLEF1)[25] model developed by Zhang and co-workers. Protein atoms in the MM region are described by the all-atom CHARMM22 force field, and water molecules are described with the TIP3P model.
Figure 6

Structural properties of the AP active site during equilibrium QM/MM MD Simulations. A snapshot for the reactant state with (a) Coulomb MM force field, (b) SLEF1 zinc MM force field, (c) DFTB3/3OB, and (d) overlay of crystal and DFTB3/MM structure (blue), with average key distances in angstroms labeled. Asp369, His370, and His412 are omitted for clarity. Substrate inorganic phosphate in the crystal structure (3CMR) is not shown.

To further benchmark the applicability of papan class="Chemical">n class="Chemical">DFTB3/3OB to the reactionpan> of interest, we also study an active site model that includes all QM atoms in the QM/MM enpan>zyme model. The Cβ n class="Chemical">carbons are fixed at their positions in the crystal structure during geometry optimization; the positions of the link atoms used to saturate the valence of the Cβ atoms in the active site model are also fixed. The reactant (Michaelis) complex and transition state are located for MpNPP– (methyl p-nitrophenyl phosphate), MmNPP– (methyl m-nitrophenyl phosphate), and n class="Chemical">MPP– (methyl phenyl phosphate) using DFTB3 and B3LYP with the 6-31+G(d,p) basis set. The minimum energy path (MEP) calculations are carried out by one-dimensional adiabatic mapping at the DFTB3 level; the reaction coordinate is the antisymmetric stretch involving the breaking and forming P–O bonds. Subsequently, the saddle point is further refined by conjugated peak refinement (CPR[123]) to obtain precise transition state structure. Single-point energy calculations at DFTB3 and B3LYP geometries are then carried out using B3LYP, M06,[124] PBE, and MP2 methods using a larger basis set of 6-311++G(d,p). The D3[125] dispersion corrections are added for B3LYP, M06, and PBE methods.

Results and Discussions

In the followipan class="Chemical">ng subsections, we present benchmark calculations for n class="Chemical">DFTB3/3OB regarding geometries, n class="Chemical">metal–ligand dissociation energies, and proton affinities for Mg/Zn containing molecules and compare the results with commonly used ab initio methods in the gas phase. We also report DFTB3 results when the old set of MIO parameters is used:[75] DFTB3/MIO use parameters as defined in ref (75), Ud = −0.23, −0.16, −0.13, −0.19, −0.14, and −0.09 au for C, H, N, O, P, and S and ζ = 4.2; Ud for Mg/Zn is fitted to −0.1 to achieve the best sequential ligand dissociation energies and proton affinities. Unless noted otherwise, energies are given for geometries that are optimized at the respective level of theory. For PM6, we note that the choice of heat of formation for protons is essential to proton affinities; we take −54.0 kcal/mol for protons as suggested by the MOPAC program. To compare the potential energy surfaces between different computation levels, zero-point energies (ZPEs) are not included. For the calculations of noncovalent interactions, dispersion interactions are not included because most test molecules discussed here are small, although we are aware of the importance of dispersion in general. We explore explicitly the effects of dispersion to a nontrivial active site model of AP that involves zinc. For DFTB3 calculations, we have used our in-house DFTB code, and all PM6, DFT, and MP2 calculations are performed using the Gaussian09 software package.[126] Following the gas-phase benchmarks, we show results on Mg2+/Zn2+-water structural properties and relative pKa in solution. Finally we apply DFTB3/3OB to enzyme systems to further test the new parameters.

Magnesium

Geometries

The “small” n class="Chemical">magnesiumpan> test set is n class="Chemical">composed of 55 molecules, which are simplified biologically relevant ligated n class="Chemical">magnesium species of the form Mg2+[X] with X = H2O, NH3, SH2, PH3, CO for n = 1–4/6 and X = OH–, SH–, NH2–, and PH2– for n = 1–2. Table 3 summarizes the results, and further details are included in the Supporting Information. Good agreement is seen between MP2 and B3LYP with a small or large basis set. Of the semiempirical methods, geometries are described very well by DFTB3 models and particularly well by DFTB3/3OB. The MADs in metal–ligand distances are 0.013, 0.017, and 0.050 Å for DFTB3/3OB, DFTB/MIO, and PM6, respectively. 3OB improves over MIO, although the magnitude of improvement is modest. The largest metal–ligand distance error of 0.072 Å from DFTB3/3OB is found in Mg–S distance in [Mg(SH2)]2+, which is likely due to the minimal basis nature of DFTB3 for a polarizable ligand. Other deviations in metal–ligand distances are substantially smaller in DFTB3/3OB. Molecules containing S or P are described poorly by PM6 within our test case;[78] PM6 systematically underestimates Mg–S distance by 0.2 Å and Mg–P distance by 0.1 Å. For angles, DFTB3/3OB differs from the reference data by a MAD of 2.9°, which is of the same order with other semiempirical methods. The largest angle derivation of 45.3° is for angle Mg–O–H in [Mg(OH)]+, for which DFTB3/3OB predicts a linear rather than bent structure; interestingly, MP2 also predicts a structure almost linear, while PM6 is close to the B3LYP result (see Supporting Information). Future development of DFTB3 to include multipoles may improve the geometry in this case. It is worth pointing out that the Mg–S–H angle is generally overestimated by ∼13° in DFTB3/3OB. More statistics including MAD for different metal–ligand types are given in the Supporting Information.
Table 3

Mean and Maximum Absolute Deviation of Metal–Ligand Lengths and Angles in Comparison to B3LYP/aug-cc-pVTZ for the Magnesium Test Set

propertyaNbMP2cB3LYPdPM6DFTB3/MIODFTB3/3OB
r (Å)1200.0090.0090.0500.0170.013
rmax (Å) 0.0510.0240.2840.1410.072
a (deg)1170.90.53.43.62.9
amax (deg) 23.710.238.741.345.3

Max stands for maximum absolute deviation.

Number of comparisons.

Basis set aug-cc-pVTZ.

Basis set 6-31+G(d,p).

Max stands for maximum absolute deviation. n class="Chemical">Number of n class="Chemical">comparisons. Basis set aug-cc-pVTZ. Basis set 6-31+G(d,p).

Sequential Metal–Ligand Dissociation Energies

We define sequepan class="Chemical">ntial ligand dissociation energies (n class="Chemical">sBDEs) n class="Chemical">corresponding to the reactionwhere M is Mg2+/Zn2+ and L is a neutral or charged ligand. Table 4 shows the sequential ligapan class="Chemical">nd dissociation energies for compounds n class="Chemical">containing Mg2+. Values obtained for DFTB3/3OB and other methods are compared to those obtained using G3B3. The differences between B3LYP, MP2, and G3B3 are relatively small; however, complexes with charged ligands give several deviations greater than 5 kcal/mol. DFTB3/3OB shows a MAD of 2.4 kcal/mol for complexes with neutral ligands and 13.8 kcal/mol for complexes with charged ligands. Compared to DFTB3/MIO, DFTB3/3OB improves substantially for complexes involving P, S elements and charged ligands. For the other semiempirical method PM6, the deviation is 9.9 kcal/mol compared to G3B3. The large error is anticipated because geometries containing S and P are often poor. Note that we compare potential energy differences for all the methods except PM6 which intrinsically considers heats of formation. We have also carried out single-point G3B3 calculations at DFTB3/3OB geometries, and the MAD for Mg2+ compounds with neutral and charged ligands is less than 1 kcal/mol. These results highlight the similarities between DFTB3 and B3LYP optimized structures, suggesting that DFTB3/3OB is of particular value for the study of biologically relevant magnesium-containing molecules.
Table 4

Sequential Ligand Dissociation Energies of Magnesium -Containing Molecules Compared to G3B3a

moleculebG3B3MP2cB3LYPcPM6DFTB3/MIODFTB3/3OBG3B3//DFTB3d
[Mg(H2O)2]2+73.3–3.5–0.1–10.9–3.4+0.5+0.6
[Mg(H2O)3]2+59.2–2.0–1.1–11.7–1.5+1.4–0.3
[Mg(H2O)4]2+49.1–2.3–3.0–11.0–3.3–1.9+1.2
[Mg(H2O)5]2+34.7–2.2–5.0–7.4–6.7–6.4–0.2
[Mg(H2O)6]2+32.9–2.5–5.9–11.0–6.8–6.4+0.4
[Mg(CO)2]2+46.3–0.9–0.3+10.0–2.9–3.8–0.6
[Mg(CO)3]2+36.1+0.5–0.9+1.3–0.3–2.3–0.7
[Mg(CO)4]2+30.9–0.1–2.8–1.5–0.1–2.6–0.9
[Mg(NH3)2]2+83.9–3.4–0.3–4.4–13.8–6.4+0.5
[Mg(NH3)3]2+63.1–1.8–2.0–9.2–7.5–1.4+0.2
[Mg(NH3)4]2+50.1–1.8–3.7–8.8–3.9+1.2+0.2
[Mg(NH3)5]2+28.6–1.0–5.3–2.9–5.8–3.5–0.2
[Mg(NH3)6]2+26.3–1.5–6.2–5.5–5.6–2.5–0.2
[Mg(SH2)2]2+62.4–3.0–0.3–7.0–7.9–2.9–1.2
[Mg(SH2)3]2+43.9–1.1–2.6–12.1–4.5+0.1–0.6
[Mg(SH2)4]2+34.5–0.8–4.4–13.3–2.8+1.2+1.7
[Mg(SH2)5]2+20.0+0.2–6.1–12.6+0.0+1.1–0.8
[Mg(SH2)6]2+20.7–0.5–7.6–14.0–1.8–0.1+0.5
[Mg(PH3)2]2+66.9–2.3–0.5–2.4–8.6–0.1+0.7
[Mg(PH3)3]2+44.5–0.8–2.9–18.6–5.4+2.3+0.7
[Mg(PH3)4]2+34.8–0.6–4.5–17.9–3.8+3.2+0.6
[Mg(OH)2]243.1–7.2–4.6–7.1–21.0–14.0+0.1
[Mg(NH2)2]237.9–7.3–6.6–4.6–9.4–12.0+0.9
[Mg(SH)2]201.2–3.1–5.0–30.3–16.1–15.8+0.1
[Mg(PH2)2]188.8–1.8–5.2+11.0–20.9–13.3+0.6
MAD 2.13.59.96.64.30.6
MAX 7.37.630.321.015.81.7

Energies except PM6 are calculated at 0 K excluding zero-point energy and thermal corrections. All numbers are given in kcal/mol.

The nondissociated is listed.

Basis set aug-cc-pVTZ.

G3B3 single-point calculations on top of DFTB3/3OB geometries.

Proton Affinities

Table 5 lists the proton affipan class="Chemical">nities for 22 biological relevant molecules n class="Chemical">Mg2+(XH) where X = O, n class="Chemical">N, S, and P obtained from G3B3, DFTB3/3OB, and other methods. MADs for MP2 and B3LYP are generally small (∼1 kcal/mol). The discrepancy between DFTB3/3OB and G3B3 is 5 kcal/mol, whereas DFTB3/MIO and PM6 yield slightly larger MADs of 7.8 and 6.6 kcal/mol, respectively. We note that G3B3 single points at DFTB3/3OB optimized structures lead to small MAD of 1.5 kcal/mol and MAX of 3.9 kcal/mol (due to the unsatisfactory deprotonated [Mg(SH2)]2+ structures) compared to the original G3B3, again highlighting the good quality of DFTB3/3OB structures. Future improvements should focus on the nitrogen-containing species, an issue we noted in previous studies.[75,77]
Table 5

Gas-Phase Proton Affinities of Magnesium-Containing Molecules Compared to G3B3a

moleculebG3B3MP2cB3LYPcPM6DFTB3/MIODFTB/3OBG3B3//DFTB3d
[Mg(H2O)1]2+95.6+0.6–3.5+0.1–5.5–13.9–0.8
[Mg(H2O)2]2+115.6+0.0–1.4–5.4–2.2–9.0–0.9
[Mg(H2O)3]2+133.1+0.6+0.2–7.1+3.0–1.2–0.4
[Mg(H2O)4]2+149.2–0.6–0.1–11.4+4.8+2.3+0.8
[Mg(H2O)5]2+160.3–2.2–1.9–15.7+3.2+2.0+1.1
[Mg(H2O)6]2+170.5–3.0–2.2–24.3+0.2–0.2+0.8
[Mg(NH3)1]2+120.2+1.5–3.5+2.6–10.5–15.7+0.0
[Mg(NH3)2]2+144.1+0.8–0.4+1.5–10.2–14.1+0.2
[Mg(NH3)3]2+163.9+0.5+0.3+1.9–10.6–13.6+0.5
[Mg(NH3)4]2+181.3–0.1+0.6+1.0–6.4–7.5+1.0
[Mg(NH3)5]2+190.5–0.4+0.9+4.9–5.2–6.1+1.4
[Mg(NH3)6]2+200.7+1.8–0.5+5.0–6.2–5.4+2.3
[Mg(SH2)1]2+83.4+2.1–1.8+2.4–6.7–0.7–3.6
[Mg(SH2)2]2+107.0+0.8+0.5+4.6–6.2+0.8–2.9
[Mg(SH2)3]2+123.1+0.4+1.5+3.3–6.2+2.1–3.5
[Mg(SH2)4]2+135.4+0.0+2.0+1.0–5.7+4.2–1.7
[Mg(SH2)5]2+143.1–0.2+2.0–5.7–6.9+3.9–3.9
[Mg(SH2)6]2+151.1–0.6+1.8–12.4–8.7+3.5–3.0
[Mg(PH3)1]2+96.8+4.1–1.8+12.6–12.8–1.1+0.6
[Mg(PH3)2]2+129.4+1.7+0.4+10.2–15.3–0.4+1.4
[Mg(PH3)3]2+150.1+1.0+1.2+8.2–17.6–1.7+1.4
[Mg(PH3)4]2+164.2+0.7+1.7+3.0–18.1–0.7+1.2
MAD 1.11.46.67.85.01.5
MAX 4.13.524.318.115.73.9

Energies except PM6 are calculated at 0 K excluding zero-point energy and thermal corrections. All numbers are given in kcal/mol.

The protonated form is listed.

Basis set aug-cc-pVTZ.

G3B3 single-point calculations on top of DFTB3/3OB geometries.

Energies except papan class="Chemical">n class="Chemical">PM6 are calculated at 0 K excluding zero-poinpan>t energy and thermal n class="Chemical">corrections. All numbers are given in kcal/mol. The nondissociated is listed. Basis set aug-cc-pVTZ. G3B3 single-point calculations on top of n class="Chemical">DFTB3/3OB geometries. Energies except papan class="Chemical">n class="Chemical">PM6 are calculated at 0 K excluding zero-poinpan>t energy and thermal n class="Chemical">corrections. All numbers are given in kcal/mol. The protonated form is listed. Basis set aug-cc-pVTZ. G3B3 single-point calculations on top of n class="Chemical">DFTB3/3OB geometries.

Large Test Set

To further test n class="Chemical">DFTB3pan>/3OB parameters we have n class="Chemical">compiled a “ln class="Chemical">arge test set” that models biologically relevant ligands of magnesium with larger molecules than those discussed above. The set mainly consists of model magnesium compounds that are simplified active sites of crystal structures collected in a survey.[127] Although Mg2+ is generally 6-coordinated, complexes with coordination number of 4 and 5 are also investigated. All complexes are constructed in the way that all ligands are coordinated to the central magnesium ion rather than via hydrogen bonds. CH3COO–/HCOO– is used to model the Asp/Glu side chain, CH3OH to represent the Ser/Thr side chain, and OCHNH2 to represent carbonyls. In total, the set contains 58 magnesium compounds (see Supporting Information). n class="Chemical">Metalpan>–ligand distances calculated with n class="Chemical">DFTB3/3OB deviate only by 0.018 Å on average and 0.125 Å at most (Table 6). The case with the largest error is the Mg–S distance in [n class="Chemical">Mg(SH2)4(SH)]−. Besides this outlier, all other metal–ligand length errors are substantially smaller within 0.1 Å. DFTB3/3OB outperforms DFTB3/MIO in the large test set. B3LYP with the small basis set 6-31+G(d,p) deviates on average by 0.011 Å, which is comparable with DFTB3/3OB, and the maximum deviation is 0.033 Å, which is much better than 3OB. PM6 strongly underestimates Mg–O distance by 0.04 Å. More statistics including mean absolute deviations for different metal–ligand types are given in the Supporting Information. For the angle, the MADs are 3.1, 4.5, 0.7, and 11.2°, respectively, for DFTB3/3OB, DFTB3/MIO, B3LYP/6-31+G(d,p), and PM6. In the O–Mg–O angles calculated with DFTB3/3OB, about 6% out of 505 angles deviate more than 10°; the large errors stem mainly from erroneous orientation of hydroxide.
Table 6

Mean and Maximum Absolute Deviation of Metal–Ligand Lengths and Angles in Comparison to B3LYP/aug-cc-pVTZ for the Magnesium Large Test Set

propertyaNbB3LYPcPM6DFTB3/MIODFTB3/3OB
r (Å)3290.0110.0630.0410.018
rmax (Å) 0.0331.4140.4810.125
a (deg)6580.711.24.53.1
amax (deg) 17.464.037.530.8

Max stands for maximum absolute deviation.

Number of comparisons.

Basis set 6-31+G(d,p).

Due to the relatively high affinity of papan class="Chemical">n class="Chemical">ATP/n class="Chemical">ADP to Mg2+ and the importance of this metal chelate in the ATP-binding enzyme, we have constructed model molecules involving Mg and phosphate/model AT(D)P as an additional test (Figure 1). The precise binding mode of Mg2+ to ATP remains unclear despite extensive spectroscopic studies. NMR studies proposed several models for the Mg·ATP complex: β-monodentate,[128] β,γ-bidentate,[129] α,β,γ-tridentate,[130,131] and a mixture of α,β-, β,γ-, and α,γ-bidentate.[132] Raman and infrared spectra suggested a mixture of β,γ-bidentate and α,β,γ-tridentate.[133] Here two stable binding modes are tested, one with Mg coordinating β- and γ-phosphates and the other with Mg coordinating α-, β-, and γ-phosphates.[134] Geometries are described very well by DFTB3/3OB. The Mg–O distance agrees on average within 0.04 Å with B3LYP/aug-cc-pVTZ calculations. The O–Mg–O, O–P–O, and P–O–P angles differ on average 5, 1, and 7°, respectively.
Figure 1

Optimized structures of Mg·AT(D)P·water model molecules at different QM levels. Distances are given in angstroms. The numbers without parentheses were obtained at the B3LYP aug-cc-pVTZ level; those in the parentheses are DFTB3 values. The B3LYP optimized structures are depicted in colors, whereas DFTB3 optimized structures are colored in blue.

Max stands for maximum absolute deviation. n class="Chemical">Number of n class="Chemical">comparisons. Basis set 6-31+G(d,p). Optimized structures of n class="Chemical">Mgpan>·AT(D)P·n class="Chemical">water model molecules at different QM levels. Distances are given in angstroms. The numbers without parentheses were obtained at the B3LYP aug-cc-pVTZ level; those in the parentheses are n class="Chemical">DFTB3 values. The B3LYP optimized structures are depicted in colors, whereas DFTB3 optimized structures are colored in blue. Ligand dissociatiopan class="Chemical">n energies are compared in Table 7 (for detailed data, see Supporting Inclass="Chemical">pan>formation). One neutral n class="Chemical">oxygen ligand is removed from the magnesium complexes selected from the large test set. Because the reactions are chosen somewhat arbitrarily, the statistics given in Table 7 have to be n class="Chemical">considered with care. MADs are reasonably small for all semiempirical methods; the MAD for DFTB3/3OB is 4.6 kcal/mol, even smaller than B3LYP/aug-cc-pVTZ with 5.3 kcal/mol. Interestingly, DFTB3/3OB and B3LYP turn out to systematically underestimate the dissociation energy of the oxygen ligand; the errors seem to be inherited from the sBDE reactions where one water molecule is taken away from [Mg(H2O)5]2+ or [Mg(H2O)6]2+. Further fitting Ud or εp cannot improve these reaction energies significantly. To demonstrate the good performance of DFTB3 for structural properties, we have carried out single-point G3B3 energies at DFTB3 geometries, and the MAD is only 0.8 kca/mol.
Table 7

Error Statistics for the Ligand Dissociation Energies and Proton Affinities of Magnesium Containing Large Molecules Compared to G3B3a

 B3LYPbPM6DFTB3/MIODFTB3/3OBG3B3//DFTB3c
Ligand DissociationEnergies    
MAD5.35.13.54.60.8
MAX7.913.06.67.72.2
Proton Affinities     
MAD1.917.22.72.60.9
MAX4.825.08.06.12.3

For detailed data, see Supporting Information.

Basis set aug-cc-pVTZ.

G3B3 single-point calculations on top of DFTB3/3OB geometries.

Protopan class="Chemical">n affinities are also n class="Chemical">compiled in Table 7. The MAD for n class="Chemical">DFTB3/3OB is 2.6 kcal/mol, similar to B3LYP/aug-cc-pVTZ with 1.9 kcal/mol. Despite the structural difference for the deprotonated structures mentioned above, the G3B3 single-point calculations at DFTB3/3OB geometries show excellent statistics for proton affinities, demonstrating the strength of DFTB3/3OB in reproducing B3LYP/aug-cc-pVTZ geometries or at least reproducing relative energies for multiple points on the potential energy surface. Similar to findings from our previous studies,[78] PM6 appears to be less reliable for proton affinities with large errors. For detailed data, see Supporting Information. Basis set aug-cc-pVTZ. G3B3 single-point calculations on top of n class="Chemical">DFTB3/3OB geometries.

Zinc

Typical biological ligands for papan class="Chemical">n class="Chemical">Zn2+ inpan>clude n class="Chemical">water, n class="Chemical">cysteine, histidine, and glutamic/aspartic acids. Therefore, we performed calculations for a series of complexes that involve the zinc ion and small molecule models for these ligands. Similar to what has been done for magnesium, we have compiled a test set in the form of Zn2+[X] with X = H2O, NH3, SH2, PH3, CO, NH=CH2, and SH–CH3 for n = 1–4/6 and X = OH–, SH–, NH2–, PH2–, NCH2–, and SCH3– for n = 1–2. Here two extra ligands NH=CH2 and SH–CH3 (and their deprotonated form) are added to model histidine and cysteine side chains. The structural comparisons are given in Table 8.
Table 8

Mean and Maximum Absolute Deviation of Metal–Ligand Lengths and Angles in Comparison to B3LYP/aug-cc-pVTZ for the Zinc Test Set

propertyaNbMP2cB3LYPdPM6DFTB3/MIOeDFTB3/3OB
r (Å)1630.0170.0040.0650.0270.012
rmax (Å) 0.0770.0110.5530.1390.091
a (deg)1800.60.24.23.02.3
amax (deg) 4.34.264.016.214.6

Max stands for maximum absolute deviation.

Number of comparisons.

Basis set aug-cc-pVTZ.

Basis set 6-31+G(d,p).

Molecules containing PH2– or H– do not converge and are excluded from statistics.

The n class="Chemical">DFTB3pan>/3OB results are tested against B3LYP/aug-cc-pVTZ for geometries and enpan>ergetics. Overall good geometries are obtained from n class="Chemical">DFTB3/3OB with metal–ligand distance MAD of 0.012 Å and angle MAD of 2.3°, which are notable improvements over DFTB3/MIO. The largest deviation 0.091 Å of n class="Chemical">metal–ligand length in DFTB3/3OB is for [Zn(N=CH2)]+. The distance between Zn and N is underestimated by ∼0.1 Å if r(ZnN) is shorter than 1.95 Å due to the imperfect ZnN repulsive potential at the short-range. Similar to cases for magnesium, angles Zn–S–H are uniformly shifted up by ∼10° in complexes containing ligand SH2 or SH–CH3 in DFTB3/3OB optimized structures. B3LYP/6-31+G(d,p) is structurally very similar to B3LYP/aug-cc-pVTZ with a distance MAD of merely 0.004 Å. Nevertheless, MP2/aug-cc-pVTZ is rather different from B3LYP/aug-cc-pVTZ with a MAD of 0.017 Å; metal–ligand distances between zinc and C, N, O, P, S are uniformly underestimated by 0.02–0.03 Å. Molecular geometries are described rather poorly by PM6 in our test set; for example, the metal–ligand lengths between zinc and phosphorus deviate on average 0.3 Å. More statistics for different metal–ligand types are given in the Supporting Information. Max stands for maximum absolute deviation. n class="Chemical">Number of n class="Chemical">comparisons. Basis set aug-cc-pVTZ. Basis set 6-31+G(d,p). Molecules n class="Chemical">containpan>inpan>g PH2– or H– do not pan> class="Chemical">converge and are excluded from statistics.

Sequential Ligand Dissociation Energies

Sequential ligapan class="Chemical">nd dissociation energies for zinc species with neutral and negatively charged ligands are summarized in Table 9. n class="Chemical">DFTB3/3OB shows a MAD of 3.6 kcal/mol compared to B3LYP/aug-cc-pVTZ. Again, deviations close to or greater than 10 kcal/mol are found for charged ligands. The difference between the small and large basis set in B3LYP is rather small, and even charged ligands produce errors no larger than 2.7 kcal/mol, indicating B3LYP with small basis set may be useful in ab initio QM/MM calculations. There is a slightly large discrepancy between MP2 and B3LYP; it may be attributed to the differences in structures between these two methods as discussed above. PM6 shows large deviations for sBDE with a MAD of 16.8 kcal/mol. With the unphysical Zn–P geometries removed from statistics, the MAD for PM6 drops to 9.1 kcal/mol, still substantially larger than that for DFTB3. We have also performed single-point calculations of B3LYP/aug-cc-pVTZ on top of DFTB3 geometries. The results are remarkably good with a MAD of only 0.5 kcal/mol.
Table 9

Sequential Ligand Dissociation Energies of Zinc Containing Molecules Compared to B3LYP/aug-cc-pVTZa

moleculebB3LYPcMP2cB3LYPdPM6DFTB3/MIODFTB3/3OBB3LYP//DFTB3e
[Zn(H2O)2]2+89.7–2.9+2.0–15.7–5.2+4.3+0.2
[Zn(H2O)3]2+56.4+1.2+1.5+1.4–1.1–2.2–0.1
[Zn(H2O)4]2+43.0+2.2+1.9+3.2–2.0–0.9+0.8
[Zn(H2O)5]2+25.3+3.2+2.6+1.8–3.8–1.7–0.4
[Zn(H2O)6]2+23.4+3.7+2.3–3.2–3.9–1.5+0.3
[Zn(CO)2]2+68.2+0.4–0.4+28.0–5.8–3.7+0.4
[Zn(CO)3]2+40.3+3.6+0.1+15.3+1.2–7.3+0.3
[Zn(CO)4]2+30.6+4.5+0.3+16.9+3.9–4.6–0.2
[Zn(NH3)2]2+108.3–0.6+1.9–8.1–7.8–4.3+0.4
[Zn(NH3)3]2+59.7+2.9+2.4+8.0+0.2–2.6+0.3
[Zn(NH3)4]2+43.8+3.6+2.0+9.4+4.1+2.8+0.1
[Zn(NH3)5]2+15.3+3.5+2.1+1.7–2.7+3.2–0.2
[Zn(NH3)6]2+13.1+4.3+1.8+1.4–2.3+4.1–0.4
[Zn(SH2)2]2+84.4+0.0+0.5–5.8–2.0+1.0–1.4
[Zn(SH2)3]2+42.9+3.9+0.5–1.6+3.5+1.8–0.1
[Zn(SH2)4]2+29.1+5.2+1.2+0.4+4.8+4.9+2.0
[Zn(PH3)2]2+89.4+1.9+1.1+55.7–7.1+1.2+0.7
[Zn(PH3)3]2+43.0+4.9+0.8+51.9+0.6+3.7+0.6
[Zn(PH3)4]2+30.5+7.0+0.9+65.9+3.0+5.5+0.5
[Zn(NH=CH2)2]2+111.9+0.5+0.9–2.2–1.7+0.1+0.1
[Zn(NH=CH2)3]2+60.1+4.4+1.2+10.5+1.3–1.5–0.4
[Zn(NH=CH2)4]2+43.5+5.9+1.1+9.8+3.7+2.8+0.4
[Zn(SH–CH3)2]2+92.2+2.1+0.1–8.2–5.8–2.8–0.3
[Zn(SH–CH3)3]2+46.0+6.4+0.5–1.2+0.5–0.4+0.4
[Zn(SH–CH3)4]2+31.1+8.1+1.1–0.1+2.3+2.9+0.3
[Zn(OH)2]253.8+1.9+2.7+12.0–15.1–3.4+0.5
[Zn(NH2)2]248.6+4.9+2.6+15.9+11.5–4.5+3.1
[Zn(SH)2]213.2+6.1+0.2–22.1–6.7–9.1–0.2
[Zn(PH2)2]199.3+7.1+0.8+108.8-f–11.7+0.9
[Zn(OCH3)2]235.4+7.6+2.2+6.9–10.0–0.5+0.2
[Zn(NCH2)2]220.6+10.9+0.6+25.7+13.8+8.0+1.1
[Zn(SCH3)2]207.4+8.2+0.5–18.7–7.8–7.5+0.0
MAD 4.21.316.84.73.60.5
MAX 10.92.7108.815.111.73.1

Energies except PM6 are calculated at 0 K excluding zero-point energy and thermal corrections. All numbers are given in kcal/mol.

The nondissociated is listed.

Basis set aug-cc-pVTZ.

Basis set 6-31+G(d,p).

B3LYP/aug-cc-pVTZ single-point calculations on top of DFTB3/3OB geometries.

Molecule does not converge and is excluded from the statistics.

Table 10 shows the protonatiopan class="Chemical">n energies calculated for 24 biologically relevant ligand model molecules obtained using B3LYP, n class="Species">MP2, and semiempirical methods. As expected, B3LYP/6-31+G(d,p) performs very well n class="Chemical">compared to B3LYP with a large basis set. The MAD for DFTB3 is 4.2 kcal/mol, which is substantially better than DFTB3/MIO and PM6; the errors are small enough to render this approach useful. High-level single-point correction at DFTB3/3OB geometries once again delivers excellent proton affinities except the ones involving the SH– ligand whose geometries are less well described as discussed above.
Table 10

Gas-Phase Proton Affinities of Zinc Containing Molecules Compared to B3LYP/aug-cc-pVTZa

moleculebB3LYPcMP2cB3LYPdPM6DFTB3/MIODFTB/3OBB3LYP//DFTB3e
[Zn(H2O)1]2+61.7+5.4–0.6+7.8–1.2–7.7+0.8
[Zn(H2O)2]2+94.2+0.6–0.3–2.5–1.1–2.8+0.5
[Zn(H2O)3]2+121.3–0.7–0.5–9.6+1.9–1.2+0.1
[Zn(H2O)4]2+141.5–1.6–0.7–16.9+2.3–1.3+0.7
[Zn(NH3)1]2+86.9+4.1+0.1+6.5–13.6–10.8+1.8
[Zn(NH3)2]2+127.3–0.8+0.7–2.9–18.5–10.7+0.6
[Zn(NH3)3]2+155.0–1.7+1.0–9.7–18.0–10.6+0.2
[Zn(NH3)4]2+178.1–2.2+1.0–4.8–10.3–8.4+0.7
[Zn(SH2)1]2+56.6+1.6–1.7+16.1–13.0–5.4–4.0
[Zn(SH2)2]2+93.2–2.0–1.5+12.5–10.6–0.2–4.0
[Zn(SH2)3]2+115.0–2.6–2.0+7.7–9.8–0.5–4.0
[Zn(SH2)4]2+131.1–3.0–2.2+6.0–9.1–1.0–2.1
[Zn(PH3)1]2+66.9+3.2–0.7+21.8-f–1.5+0.3
[Zn(PH3)2]2+115.6–0.2–0.3–5.7-f–0.9+0.8
[Zn(PH3)3]2+140.8–1.0–0.1–30.0-f–2.7+0.7
[Zn(PH3)4]2+158.0–1.6+0.1+14.3–18.1–3.2+1.0
[Zn(NH=CH2)1]2+89.1+6.1–0.9+2.3–16.2–2.8+1.1
[Zn(NH=CH2)2]2+140.9–0.4–0.2–12.6–14.1–2.1+0.4
[Zn(NH=CH2)3]2+168.6–1.5+0.1–18.1–14.2–2.2+0.0
[Zn(NH=CH2)4]2+188.1–2.1–0.1–14.3–14.8–1.6+2.4
[Zn(SH–CH3)1]2+74.2+0.3–2.5+12.1–4.6–8.3–2.5
[Zn(SH–CH3)2]2+114.9–3.0–2.3+5.6–10.0–4.2–2.1
[Zn(SH–CH3)3]2+136.7–4.1–2.4+1.1–10.5–4.8–1.7
[Zn(SH–CH3)4]2+152.6–6.0–2.2+0.3–2.6–5.9–1.4
MAD 2.31.010.110.24.21.4
MAX 6.12.530.018.510.84.0

Energies except PM6 are calculated at 0 K excluding zero-point energy and thermal corrections. All numbers are given in kcal/mol.

The protonated form is listed.

Basis set aug-cc-pVTZ.

Basis set 6-31+G(d,p).

B3LYP/aug-cc-pVTZ single-point calculations on top of DFTB3/3OB geometries.

Molecule does not converge and is excluded from the statistics.

Energies except papan class="Chemical">n class="Chemical">PM6 are calculated at 0 K excluding zero-poinpan>t energy and thermal n class="Chemical">corrections. All numbers are given in kcal/mol. The nondissociated is listed. Basis set aug-cc-pVTZ. Basis set 6-31+G(d,p). B3LYP/aug-cc-pVTZ single-point calculations on top of n class="Chemical">DFTB3/3OB geometries. Molecule does not n class="Chemical">converge and is excluded from the statistics. Energies except papan class="Chemical">n class="Chemical">PM6 are calculated at 0 K excluding zero-poinpan>t energy and thermal n class="Chemical">corrections. All numbers are given in kcal/mol. The protonated form is listed. Basis set aug-cc-pVTZ. Basis set 6-31+G(d,p). B3LYP/aug-cc-pVTZ single-point calculations on top of n class="Chemical">DFTB3/3OB geometries. Molecule does not n class="Chemical">converge and is excluded from the statistics.

Mixed Ligands

One goal to develop papan class="Chemical">n class="Chemical">DFTB3/3OB is to describe enpan>zyme active sites. n class="Chemical">Zn-n class="Chemical">containing structures are mainly tetrahedrally bound in metalloenzymes. The most abundant primary ones are CCCC, zinc ion with four coordinating cysteines, followed by CCCH, CCHH, CHHH, HHHH, HHHO, HHOO, HOOO, HHHD, and HHDD, where O stands for water and the remaining are 1-letter amino acid codes.[12] In this subsection, we compare geometries and energetics for 43 zinc complexes with mixed types of simple ligands (including the deprotonated form) to model representative active sites in proteins. Here, H2O is used to represent the Asp/Glu side chain, NH3 to represent the His side chain, and H2S to represent the Cys side chain. More complicated/realistic ligands are discussed in the next subsection on “Large Test Set”. Table 11 summarizes the geometrical properties. The MADs in comparison to B3LYP/aug-cc-pVTZ for 161 metal–ligand lengths are 0.005, 0.019, 0.053, and 0.118 Å for B3LYP/6-31+G(d,p), DFTB3/3OB, DFTB3/MIO, and PM6, respectively. Metal–ligand angles on average deviate by 0.9, 2.6, 3.0, and 6.7°. Again, the 3OB set is a major improvement over the MIO set in metal–ligand lengths. PM6 is not able to give reliable geometries due to a large number of unreasonable metal–ligand distances between zinc and other main group elements.
Table 11

Mean and Maximum Absolute Deviation of Metal-Ligand Lengths and Angles in Comparison to B3LYP/aug-cc-pVTZ for Zinc Mixed Ligand Test Set

propertyaNbB3LYPcPM6DFTB3/MIOdDFTB3/3OB
r (Å)1610.0050.1180.0530.019
rmax (Å) 0.0270.5140.1790.070
a (deg)2310.96.73.02.6
amax (deg) 22.257.221.624.7

max stands for maximum absolute deviation.

number of comparisons.

basis set 6-31+G(d,p).

Molecules containing PH2– do not converge and are excluded from statistics.

max stands for maximum absolute deviation. number of n class="Chemical">comparisons. basis set 6-31+G(d,p). Molecules n class="Chemical">containpan>inpan>g PH2– do not pan> class="Chemical">converge and are excluded from statistics. n class="Chemical">DFTB3pan>/3OB energetics are predicted with similar accuracy as for complexes with the same type of ligand as shown in Table 12 (for detailed data, see Supporting Information). Once again, the reactions are collected somewhat arbitrarily so the error anan class="Chemical">lysis needs to be treated with care. B3LYP/6-31+G(d,p) gives small error compared to B3LYP with large basis set, demonstrating basis set superposition error is not severe here. DFTB3 gives rather good results in comparison with the reference (MAD of ∼3 kcal/mol), especially after single-point energies using B3LYP/aug-cc-pVTZ (MAD of 0.9 kcal/mol), demonstrating that the 3OB set is transferable from one type ligand to complicated coordinating environment with multiple types of ligands.
Table 12

Error Statistics for the Ligand Dissociation Energies and Proton Affinities of Zinc Containing Mixed Ligand Molecules Compared to B3LYP/aug-cc-pVTZa

 B3LYPbPM6DFTB3/MIODFTB3/3OBB3LYP//DFTB3c
Ligand DissociationEnergies    
MAD1.98.23.13.20.9
MAX2.831.66.86.52.6
Proton Affinities     
MAD0.910.45.53.80.9
MAX1.921.711.37.12.4

For detailed data, see Supporting Information.

Basis set 6-31+G(d,p).

B3LYP/aug-cc-PVTZ single-point calculations on top of DFTB3/3OB geometries.

For detailed data, see Supporting Information. Basis set 6-31+G(d,p). B3LYP/aug-cc-PVTZ single-point calculations on top of n class="Chemical">DFTB3/3OB geometries. To further test the 3OB parameter set, we have chosen lpapan class="Chemical">n class="Chemical">arger molecules to model the biologically relevanpan>t zinc n class="Chemical">coordinating environment. In this “ln class="Chemical">arge test set”, CH3COO–/HCOO– is used to model the Asp/Glu acid side chain, imidazole/NCH2CH3 to represent His, and SHCH3 to model the Cys side chain. Cys can adopt both protonated and deprotonated form in the active site so that a random mixture of SHCH3 and SCH3– is tested in modeling the Cys containing active site. In total we have compiled 51 zinc molecules. Geometries are compared to B3LYP/aug-cc-pVTZ optimized structures. B3LYP with the small basis set deviates on average by only 0.005 Å, and the maximal deviation is only 0.063 Å (Table 13). The MAD for metal–ligand length from DFTB3/3OB optimized structures is 0.024 Å. MAD of metal–ligand lengths Zn–O is 0.033 Å and Zn–S is 0.027 Å; they appear to be worse than the small test set but are good enough for modeling enzyme active sites. The largest deviations for metal–ligand lengths are found in Zn–O distances in [Zn(Im)(H2O)(SCH3)2] and [Zn(H2O)(SHCH3)(SCH3)2], which are underestimated by 0.141 and 0.103 Å, respectively. All others errors are smaller than 0.1 Å. The largest angle deviation 25.8° occurs in angle ZnN–C in [Zn(C3N2H3)]+ (i.e., Zn2+ bound to a single deprotonated imidazole), which is not encountered often in biological systems.
Table 13

Mean and Maximum Absolute Deviation of Metal–Ligand Lengths and Angles in Comparison to B3LYP/aug-cc-pVTZ for the Zinc Large Test Set

propertyaNbB3LYPcPM6DFTB3/MIODFTB3/3OB
r (Å)1730.0050.1280.0530.024
rmax (Å) 0.0630.9550.1770.141
a (deg)2360.65.34.03.3
amax (deg) 5.163.325.825.8

Max stands for maximum absolute deviation.

Number of comparisons.

Basis set 6-31+G(d,p).

Max stands for maximum absolute deviation. n class="Chemical">Number of n class="Chemical">comparisons. Basis set 6-31+G(d,p). Ligand dissociatiopan class="Chemical">n energies are compiled in Table 14 (for detailed data, see Supporting Inclass="Chemical">pan>formation). n class="Chemical">DFTB3 shows that removal of neutral imidazole may cause an error of more than 10 kcal/mol, indicating interactions between n class="Chemical">Zn–sp2 hybridized N are problematic. Nevertheless, when calculating single-point energies using B3LYP/aug-cc-pVTZ the errors turn out to be small with the MAD of merely 0.5 kcal/mol. Proton affinities are also compared in Table 14. DFTB3/3OB shows excellent results with a MAD of 3.3 kcal/mol, which is fairly remarkable in this set of rather complicated molecules; single-point B3LYP calculations lead to an even smaller MAD of 0.6 kcal/mol.
Table 14

Error Statistics for the Ligand Dissociation Energies and Proton Affinities of Zinc Containing Large Molecules Compared to B3LYP/aug-cc-pVTZa

 B3LYPbPM6DFTB3/MIODFTB3/3OBB3LYP//DFTB3c
Ligand DissociationEnergies    
MAD1.35.93.63.40.5
MAX2.620.214.413.81.3
Proton Affinities     
MAD1.110.14.93.30.6
MAX2.119.312.76.92.6

For detailed data, see Supporting Information.

Basis set 6-31+G(d,p).

B3LYP/aug-cc-PVTZ single-point calculations on top of DFTB3/3OB geometries.

For detailed data, see Supporting Information. Basis set 6-31+G(d,p). B3LYP/aug-cc-PVTZ single-point calculations on top of n class="Chemical">DFTB3/3OB geometries.

Dinuclear Mg/Zn Active Sites

An importapan class="Chemical">nt catalytic motif in metalloenpan>zymes is the bimetallic (or dinuclear) Mg/Zn site.[135,136] In this bimetallic motif, metal ions are commonly coordinated through a monatomic bridging ligand like water/hydroxide or a polyatomic group like carboxylate in a bidentate fashion. The two-metal-ion mechanism[5] was proposed for this type of motif, where one metal ion actively participates in facilitating the dissociation of the leaving group and stabilizing the pentavalent intermediate while the second metal ion holds an important structural role during catalysis. The advantage of bimetallic centers is the charge delocalization so as to lower the barrier of binding large substrates and generating a nucleophile.[137] To assess DFTB3/3OB for these important catalytic motifs, we have constructed four model compounds based on bimetallic Mg/Zn centers in enzymes (Figure 2). In our models, certain ligands are protonated to prevent a high net charge in the gas phase.
Figure 2

Optimized structures of dinuclear Mg2+/Zn2+ complexes of the simplified protein active site (a) PDB ID 1A49; (b) PDB ID 4ECQ; (c) PDB ID 3L4K; (d) PDB ID 1AMP. Distance are given in angstroms. The numbers without parentheses are obtained at the B3LYP 6-31+G(d,p) level; those with parentheses are DFTB3 values.

Figure 2(a) is a simplified active site structure taken from papan class="Chemical">n class="Chemical">pyruvate kinpan>ase crystal structure (n class="Disease">PDB ID 1A49(138)). n class="Chemical">Pyruvate kinase is an enzyme of the glycolytic pathway and phosphorylates ADP to generate ATP and pyruvate. This enzyme can also utilize other divalent metal ions like Mn2+, Ni2+, and Zn2+. Figure 2(b) is simplified from the polymerase η·DNA·dATP complex crystal structure (PDB ID 4ECQ(139)) which is involved in DNA synthesis. The formation of a phosphodiester bond is recently captured by time-resolved X-ray crystallography.[139] In the (a)(b) active sites, one Mg2+ is tridentated to α-, β-, and γ-oxygens in ATP. Figure 2(c) is modified from the type II topoisomerase (topoII)–DNA cleavage complex crystal structure (PDB ID 3L4K(140)). This enzyme cleaves and ligates DNA using tyrosine as the nucleophilic agent. Following the nucleophilic attack toward DNA backbones, a covalent phosphotyrosyl bond that links topoII and partial DNA is formed. It was long believed that a catalytic reaction could be aided via a single Mg2+. More recently, it was found that the canonical two-metal-ion mechanism may be possible.[140,141] Therefore, two Mg2+ ions are modeled in the reactant state. Figure 2(d) is the active site in Aeromonas Protelytica Aminopeptidase (AAP) (PDB ID 1AMP(142)), which is a prototypical member of the bizinc enzyme family. As shown ipan class="Chemical">n Figure 2, n class="Chemical">DFTB3 is tested against B3LYP/6-31+G(d,p) optimized structures. The overall structures from npan> class="Chemical">DFTB3 reproduce B3LYP structures very well, particularly the metal ion–ligand distances. MgMg/ZnZn distances are also excellent with the largest deviation of 0.13 Å. The monodentate or bridging binding modes of carboxylate are correctly described by the 3OB parameter set in comparison to B3LYP. These calculations demonstrate that DFTB3/3OB is useful for modeling bimetallic centers in enzymes with good accuracy. Optimized structures of dinuclear papan class="Chemical">n class="Chemical">Mg2+/npan> class="Chemical">Zn2+ complexes of the simplified protein active site (a) PDB ID 1A49; (b) PDB ID 4ECQ; (c) PDB ID 3L4K; (d) PDB ID 1AMP. Distance are given in angstroms. The numbers without parentheses are obtained at the B3LYP 6-31+G(d,p) level; those with parentheses are DFTB3 values. To further benchmark papan class="Chemical">n class="Chemical">DFTB3/3OB inpan> describing the transition state and reaction enpan>ergy barrier, we study an active site model of AP in the gas phase. AP catalyzes the hydrolytic reactions of various n class="Chemical">phosphates via a two-step mechanism: an oxygen neocleophile first attacks the phosphorus, then a water (hydroxide) replaces the leaving group in the subsequent step that is essentially the reverse of the first step.[143] In our previous work,[48] we studied the first step for the hydrolysis of a well-studied phosphate diester MpNPP– in R166S AP and showed that the transition states are synchronous in nature. To help understand the intrinsic errors in DFTB3/3OB for the reactions of interest, we investigate the active site model in the gas phase with QM calculations. Specifically we study the same reaction for two additional n class="Chemical">phosphate diesters methyl 3-nitrophenyl phosphate (MmNPP–) and methyl phenyl phosphate (MPP–) and compare the energetics and nature of the transition state to MpNPP–. For simplicity, we only study the α orientation of different substrates. As shown ipan class="Chemical">n Figure 3, there is generally good agreement between n class="Chemical">DFTB3/3OB and B3LYP/6-31+G(d,p) results in terms of geometries; the P–O distances, n class="Chemical">Zn2+–Zn2+ distances, and coordination structures are similar at the two QM computational levels for both reactant and transition states. The Zn–O distance error is within 0.03 Å and is a major improvement over MIO parameters which systematically overestimate Zn–O distance by 0.1 Å (MIO data not shown here). The Zn2+Zn2+ distance is underestimated in DFTB3/3OB by ∼0.1 Å in the reactant and is shrunk ∼0.1 Å more in the transition state in comparison to B3LYP distances. In terms of the nature of the transition state, both DFTB3 and B3LYP predict a synchronous type of structure with the tightness coordinate (i.e., P–Oligand + P–Onucleophile) of 4.0–4.2 Å even though the individual P–O bond is overestimated or underestimated by 0.1 Å in DFTB3, indicating DFTB3 provides a good description for the potential energy surface around the transition state for the three substrates.
Figure 3

AP active site gas-phase model with MpNPP–, MmNPP–, and MPP–. Distances are given in angstroms. The numbers without parentheses are obtained at the B3LYP 6-31+G(d,p) level; those in the parentheses are DFTB3 values. (a) MpNPP– reactant state; (b) MpNPP– transition state; (c) MmNPP– reactant state; (d) MmNPP– transition state; (e) MPP– reactant state; and (f) MPP– transition state.

Regarding the reactiopan class="Chemical">n energy barrier, Table 15 shows that the qualitative trend is n class="Chemical">consistenpan>t among various npan> class="Chemical">computational levels although there are substantial differences in the quantitative values. Previous calculations[48] of proton affinities for the leaving group indicate that MP2 gives results close to the experimental values, so MP2/6-311++G(d,p) single-point energies are used as the reference here. Numbers span a broad range with different DFT methods without dispersion, and the results become much closer after adding the D3 dispersion; an explicit consideration of dispersion is important for a large active site discussed here. Compared to MP2, DFTB3 gives a reasonable reaction energy for MpNPP– but underestimates the barrier of MmNPP– by ∼4 kcal/mol and overestimates the barrier of MPP– by ∼9 kcal/mol. However, MP2 single-point energies on top of DFTB3 and B3LYP/6-31+G(d,p) geometries differ by ∼1 kcal/mol for all the substrates, indicating DFTB3 produces satisfactory geometries and also highlighting the importance of single-point energy correction with a reliable ab initio method at DFTB3 geometries. DFTB3 reaction barriers are close to B3LYP results without D3 correction; this is not surprising because DFTB3 is parametrized based on B3LYP (see Methods) and inherits some errors. M06 behaves the best among three DFT functionals as compared to MP2, especially with the D3 dispersion included. PBE systematically underestimates the energy barrier. Further optimization of the D3 dispersion model developed for DFTB3[99] will further improve DFTB3/3OB energies.
Table 15

Calculated Activation Barrier (in kcal/mol) for Diester Hydrolysis in the Active Site Model of AP in the Gas Phasea

  single points at DFTB3 geometries
single points at B3LYP geometries
substrateDFTB3B3LYPM06PBEMP2B3LYPM06PBEMP2
MpNPP8.89.2 (4.2b)5.7 (5.1)5.7 (3.1)7.612.0 (6.7)6.8 (6.1)7.2 (4.3)8.8
MmNPP9.79.2 (9.4)12.1 (12.4)6.8 (7.5)13.615.3 (12.0)13.2 (12.8)11.0 (9.3)13.2
MPP20.016.2 (9.1)12.4 (10.5)11.6 (7.1)9.620.1 (11.6)13.0 (10.8)14.0 (8.6)10.4

The geometries are optimized at either the DFTB3/3OB or B3LYP/6-31+G(d,p) level. Single-point energy calculations are carried out at the B3LYP, M06, PBE, and MP2 method at two levels of geometries using a large basis set of 6-311++G(d,p).

The numbers with parentheses are obtained with corresponding DFT functional plus D3 dispersion.

AP active site gas-phase model with papan class="Chemical">n class="Chemical">MpNPP–, n class="Chemical">MmNPP–, and MPP–. Distances are given in angstroms. The numbers without parentheses are obtained at the B3LYP 6-31+G(d,p) level; those in the parentheses are DFTB3 values. (a) MpNPP– reactant state; (b) MpNPP– transition state; (c) MmNPP– reactant state; (d) MmNPP– transition state; (e) MPP– reactant state; and (f) MPP– transition state. The geometries are optimized at either the n class="Chemical">DFTB3pan>/3OB or B3LYP/6-31+G(d,p) level. Single-point enpan>ergy calculations are carried out at the B3LYP, M06, n class="Chemical">PBE, and MP2 method at two levels of geometries using a large basis set of 6-311++G(d,p). The numbers with parentheses are obtained with n class="Chemical">correspondinpan>g DFT functional plus D3 dispersion.

Mg2+ and Zn2+ Ions Solvation Structure and Relative pKa

The n class="Chemical">metalpan> ion-O radial distribution function (g(r)) and n class="Chemical">coordination numbers are shown in Figure 4 for the two cations in solution. A well-defined first solvation shell is observed in both cases. The first peak in DFTB3/MM hydrated Mg2+ occurs at 2.07 Å, in good agreement with X-ray diffraction data of 2.09 Å ± 0.04 Å[144] and with previous AIMD simulations in the range of 2.08–2.13 Å.[145−149] The MM force field predicts the Mg2+–O distance to be shorter than the DFTB3/MM model with a sharper peak at 2.0 Å. The inner n class="Chemical">coordination sphere remains similar to the gas-phase structure. For the cluster [Mg(H2O)6]2+ in the gas phase, the Mg2+–O distance is 2.10 Å in both DFTB3 and B3LYP/aug-cc-pVTZ optimized structures. For Zn2+, the first maximum from DFTB3/MM is at 2.11 Å, consistent with X-ray absorption measurements 2.06 ± 0.02 Å[150] and earlier AIMD results 2.07–2.13 Å.[145,146,151] The classical MM model gives the first peak at 2.08 Å, although the peak is higher and narrower than in the DFTB3/MM model. The small difference in the location of the first peak in DFTB3/MM may be traced back to the gas-phase model: for the cluster [Zn(H2O)6]2+ in the gas phase, Zn2+–O is 2.16 Å in DFTB3 versus 2.12 Å in B3LYP/aug-cc-pVTZ.
Figure 4

Radial distribution function of water oxygen around Mg2+/Zn2+ in 20 Å water droplet. The pure MM model is based on the CHARMM force field.[105]

The integral of the first solvatiopan class="Chemical">n shell for both cations gives the coordination npan>umber of 6. The ln class="Chemical">arge QM region results predict very similar first solvation shell as with the small QM region, though they differ slightly for the sen class="Chemical">cond solvation shell; the large QM region leads to a broader peak for the second solvation shell than the small QM region and MM model. It is worth noting that in the large QM region simulation of Zn2+ water exchanges between the first and second solvation shells (coordination number remains the same), whereas no exchange of water is seen in Mg2+ simulations. These observations are in agreement with experimental findings; for Mg2+, experimental results suggest that water molecules could reside in the first shell up to a few microseconds,[152,153] while the water residence time for Zn2+ is only hundreds of picoseconds.[154] To further benchmark the epan class="Chemical">nergetics of n class="Chemical">DFTB3/3OB, we use the n class="Chemical">DTSC-TI approach to calculate the relative pKa of Mg2+/Zn2+-bound water with the DFTB3/MM model. Altogether six intermediate states are used to obtain the ∂F/∂λ values. In each window, the data collected during the first 250 ps are discarded as part of the equilibration process, and the remaining 500 ps data are then subject to statistical analysis to obtain the proper average of free energy derivative. The statistical errors associated with the free energy are on the order of 0.2–0.5 kcal/mol. In all cases, the linear dependence of the free energy derivatives on λ holds well with a correlation coefficient (R2) higher than 0.99 (see Figure S1 in Supporting Information). Integrating ∂F/∂λ over λ gives a value of 132.9 kcal/mol for Mg2+ with the small QM region, 124.1 kcal/mol for Zn2+ with the small QM region, 130.0 kcal/mol for Mg2+ with the large QM region, and 125.5 kcal/mol for Zn2+ with the large QM region, leading to a ΔpKa of 6.4 pKa units with the small QM region and 3.3 pKa units with the large QM region. The latter is in good agreement with the experimental value of 3.2 pKa units (12.4 for a Mg2+ solution and 9.2 for hydrated Zn2+).[145,155] The better performance of the large QM region results highlights the importance of using a QM description for the second solvation shell, which allows coordination number fluctuation in the Zn2+ solution, especially at the deprotonated state. We have also calculated the absolute pKa of Mg2+/Zn2+-bound water as a further benchmark. Compared to the experimental values, the closest computational estimate from current work is 14.0 for Mg2+ and 10.4 for Zn2+ (see Supporting Information). Radial distribution function of n class="Chemical">water pan> class="Chemical">oxygen around Mg2+/Zn2+ in 20 Å water droplet. The pure MM model is based on the CHARMM force field.[105]

Structural Flexibility of the Active Site in Myosin

Myosin-II (thereafter referred to as myosipan class="Chemical">n) is one of the best characterized molecular motors, which n class="Chemical">couple ln class="Chemical">arge-scale conformation transitions with ATP binding and hydrolysis. Despite many previous efforts from both experimental and theoretical studies,[156−160] the complex nature of “mechanochemical coupling” still remains to be well understood. As shown ipan class="Chemical">n Figure 5(a), binding modes obtained from the n class="Chemical">DFTB3/MM simulation reproduce the key features of the crystal structure. In the discussion of our earlier work,[115] the weakenpan>ed inpan>teraction betweenpan> n class="Chemical">Mg2+-nucleotide and Ser237 as reflected by the larger distance fluctuation was proposed to indicate the destabilization of Switch I following n class="Chemical">ATP hydrolysis. As a test of the DFTB3/3OB magnesium parameters in the enzyme, DFTB3/MM equilibrium simulations are carried out with two nucleotide chemical states (ATP or ADP·Pi), and the distances between Mg2+ and four nonwater ligands are monitored. As shown in Figures 5(b) and (c), the distances between Mg2+ and ligands are reasonably stable over 1 ns; the most obvious difference observed between the two nucleotide states is the fluctuation of Mg2+OSer237 distance, for which the average and fluctuation are 2.16 and 0.08 Å in the ATP state and 2.24 and 0.11 Å in the ADP·Pi state. The magnitude of fluctuation observed in current DFTB3/MM simulations is notably less compared with previous work (MM simulations and SCC-DFTBPR/MM simulation, in which the MgOSer237 distance reached ∼3.5 Å), which led to coordination number shifting from 6 to 5 after ATP hydrolysis.[115] To further evaluate the situation, PMFs are calculated for the distance between Mg2+ and OSer237 in the ATP and ADP·Pi states. As seen in Figure 5(d), dissociation of Ser237 is indeed energetically more favorable in the ADP·Pi state than the ATP state by a few kcal/mol. Nevertheless, there is a significant energetic cost to fully dissociate the Ser237 side chain from the Mg2+, even in the ADP·Pi state. Further experimental analysis based on infrared spectroscopy can potentially generate additional insights into this issue.[161] Structural properties of the Myosin active site duripan class="Chemical">ng equilibrium QM/MM MD simulations with different nucleotide chemical states (n class="Chemical">ATP or n class="Chemical">ADP·Pi). (a) Overlay of average DFTB3/MM structure and crystal structure (gray) of Myosin-II motor domain active site in the ATP state. The VO4 moiety in the crystal structure (PDB code 1VOM) is replaced by a PO3 phosphate group and a presumptive lytic water molecule. Instantaneous distances between Mg2+ and oxygen atoms in its four nonwater ligands at (b) ATP state and (c) ADP·Pi state. Ser237 refers to OSer237 and Thr186 refers to OThr186. (d) PMF calculations for the ATP/ADP·Pi states. The reaction coordinate is the distance between Mg2+ and OSer237.

Force Fields Comparison in AP Active Site

We have benchmarked papan class="Chemical">n class="Chemical">DFTB3/3OB for the AP active site with a gas-phase model inpan> earlier discussion. As an additional test, n class="Chemical">DFTB3/MM simulations for the R166S AP enzyme are carried out, and the results are compared to two popular MM force fields for zinc. The comparison of equilibrated reactant structures by n class="Chemical">Coulomb,[14] SLEF1,[25] and DFTB3/MM is shown in Figure 6, with the crystal structure with bound phosphate as a reference. Binding modes obtained from the DFTB3/MM simulation reproduce the key features of the crystal structure and previous SCC-DFTBPR/MM simulations[48] very well (shown in Figure 6(c) and (d)). The conventional Coulomb scheme (Figure 6(a)) leads to a much shortened Zn2+Zn2+ distance and significant changes in the active site geometry: Ser102 and Asp327 are stabilized by two zinc ions simultaneously; Asp369 (not shown in Figure 6 for clarity) is bidentate with the Zn12+ in the Coulomb model but monodentate in the crystal and DFTB3/MM structures. By comparison, coordination geometry is notably improved by the SLEF1 force field (Figure 6(b)), even though the Zn2+Zn2+ (3.7 Å) distance is still substantially shorter than that in the crystal structure (4.3 Å). It is perhaps not surprising that SLEF1 does not provide a correct distance between zinc ions as SLEF1 was not parametrized for the binuclear zinc motif and designed to be compatible with the Amber99SB force field rather than with the CHARMM force field. Nevertheless, SLEF1 correctly describes the binding modes of Ser102 and Asp369 to the zinc ions; yet Asp327 still has a strong preference to bridge the two zinc cations due to the short distance between them. Moreover, an extra His372 is pulled closer to bind to Zn22+; this might have been caused by the stronger attractive interaction between the zinc ion and nitrogen in the short-range part of the SLEF1 force field. In short, DFTB3/MM produces encouraging results for modeling a challenging bimetallic zinc active site, demonstrating the greater applicability of DFTB3/MM to metalloenzymes even when only structural properties are considered. Structural properties of the AP active site duripan class="Chemical">ng equilibrium QM/MM MD Simulations. A snapshot for the reactant state with (a) n class="Chemical">Coulomb MM force field, (b) SLEF1 zinc MM force field, (c) n class="Chemical">DFTB3/3OB, and (d) overlay of crystal and DFTB3/MM structure (blue), with average key distances in angstroms labeled. Asp369, His370, and His412 are omitted for clarity. Substrate inorganic phosphate in the crystal structure (3CMR) is not shown.

Concluding Remarks

n class="Chemical">Magnesiumpan> and zinc play many important roles in n class="Chemical">metalloenzymes and various chemical systems. An efficient and accurate QM description for Mg/Zn is critical for the use of theoretical methods to study the reaction mechanism of these systems. In this work, we extend the parametrization of the approximate density functional tight binding theory, DFTB3, to magnesium and zinc. The parametrization is performed in a framework n class="Chemical">consistent with the DFTB3/3OB set for CHNOPS; hence this parameter set can be used to study the chemistry of Mg/Zn moieties in solution and biological systems. Benchmark calculatiopan class="Chemical">ns in the gas phase and in the n class="Chemical">condenpan>sed phase with a QM/MM framework demonstrate that the currenpan>t parametrizationpan> generally leads to reliable structures and semiquantitative enpan>ergetics (with typically a MAD of ∼3–5 kcal/mol compared to high level ab initio calculations). In the gas phase, DFTB3/3OB are tested with a diverse range of molecules that reflect the typical coordination environment of magnesium and zinc in biomolecules, including dinuclear metal sites. We focus on the geometries, ligand dissociation energies, and ligand proton affinities. The results calculated at the n class="Chemical">DFTB3/3OB level are compared to B3LYP, ab initio (G3B3, MP2), and a popular semiempirical method PM6 as well as the older parametrization, MIO. In general, DFTB3/3OB shows substantial improvement over DFTB3/MIO and outperforms PM6 in many aspects within our test sets, particularly in terms of structural properties and ligand proton affinities. Compared to high-level calculations, DFTB3/3OB is very successful at predicting geometries. Large errors can still be found in certain cases, especially with charged ligands. Even for these cases, single-point energy calculations with high level QM methods generally give very reliable energetics, suggesting that changes of structures during typical processes of interest (e.g., ligation dissociation or ligand deprotonation) are well captured with DFTB3/3OB. Due to n class="Chemical">copan>mputational efficienpan>cy and genpan>eral robustness for geometry predictions, n class="Chemical">DFTB3/MM calculations are expected to be effective for studies in condensed-phase systems. For example, our solution benchmark study for the magnesium and zinc ion has reproduced experimental solvation structures and relative pKa of metal bound water, especially when both the first and second solvation shells are treated as DFTB3/3OB. For the enzyme benchmark, n class="Chemical">DFTB3/3OB is successful at producing active site structures in myosin and AP relative to available crystal structures; for AP, DFTB3/3OB also leads to semiquantitative energetics for an active site model, although our calculations using several DFT functionals and MP2 highlight the importance of careful calibration of energetics for large metal sites in enzymes, including consideration of dispersion interactions. These studies have laid the groundwork for using DFTB3/MM simulations with state-of-the-art sampling techniques such as metadynamics[162] and finite-temperature string methods[163−165] to study the chemical processes in myosin and AP; high-level single-point QM/MM calculations are still required for improving the quantitative nature of the free energy surfaces. Finally, we emphasize that there are still remaining limitations in DFTB3, such as the treatment of interaction between Mg/Zn and highly charged/polarizable ligands (e.g., hydroxide and SH–), thus it is essential to continue the formal development of DFTB3, such as adding multipole terms for the charge fluctuations and formulating a better description of polarization and short-range Pauli repulsions.
  114 in total

1.  Generalized Gradient Approximation Made Simple.

Authors: 
Journal:  Phys Rev Lett       Date:  1996-10-28       Impact factor: 9.161

2.  The conformational states of Mg.ATP in water.

Authors:  Jung-Chi Liao; Sean Sun; David Chandler; George Oster
Journal:  Eur Biophys J       Date:  2003-08-07       Impact factor: 1.733

3.  pKa's of ionizable groups in proteins: atomic detail from a continuum electrostatic model.

Authors:  D Bashford; M Karplus
Journal:  Biochemistry       Date:  1990-11-06       Impact factor: 3.162

4.  Application of the computationally efficient self-consistent-charge density-functional tight-binding method to magnesium-containing molecules.

Authors:  Zheng-Li Cai; Philip Lopez; Jeffrey R Reimers; Qiang Cui; Marcus Elstner
Journal:  J Phys Chem A       Date:  2007-06-08       Impact factor: 2.781

5.  Density functional tight binding: values of semi-empirical methods in an ab initio era.

Authors:  Qiang Cui; Marcus Elstner
Journal:  Phys Chem Chem Phys       Date:  2014-07-28       Impact factor: 3.676

6.  Paradynamics: an effective and reliable model for ab initio QM/MM free-energy calculations and related tasks.

Authors:  Nikolay V Plotnikov; Shina C L Kamerlin; Arieh Warshel
Journal:  J Phys Chem B       Date:  2011-05-27       Impact factor: 2.991

7.  Long-range solvent effects on the orbital interaction mechanism of water acidity enhancement in metal ion solutions: a comparative study of the electronic structure of aqueous Mg and Zn dications.

Authors:  Leonardo Bernasconi; Evert Jan Baerends; Michiel Sprik
Journal:  J Phys Chem B       Date:  2006-06-15       Impact factor: 2.991

8.  Crystal structure of Aeromonas proteolytica aminopeptidase: a prototypical member of the co-catalytic zinc enzyme family.

Authors:  B Chevrier; C Schalk; H D'Orchymont; J M Rondeau; D Moras; C Tarnus
Journal:  Structure       Date:  1994-04-15       Impact factor: 5.006

Review 9.  Proton transfer function of carbonic anhydrase: Insights from QM/MM simulations.

Authors:  Demian Riccardi; Shuo Yang; Qiang Cui
Journal:  Biochim Biophys Acta       Date:  2009-08-11

10.  Energies, Geometries, and Charge Distributions of Zn Molecules, Clusters, and Biocenters from Coupled Cluster, Density Functional, and Neglect of Diatomic Differential Overlap Models.

Authors:  Anastassia Sorkin; Donald G Truhlar; Elizabeth A Amin
Journal:  J Chem Theory Comput       Date:  2009-04-02       Impact factor: 6.006

View more
  39 in total

Review 1.  Metal Ion Modeling Using Classical Mechanics.

Authors:  Pengfei Li; Kenneth M Merz
Journal:  Chem Rev       Date:  2017-01-03       Impact factor: 60.622

2.  Benchmarking density functional tight binding models for barrier heights and reaction energetics of organic molecules.

Authors:  Maja Gruden; Ljubica Andjeklović; Akkarapattiakal Kuriappan Jissy; Stepan Stepanović; Matija Zlatar; Qiang Cui; Marcus Elstner
Journal:  J Comput Chem       Date:  2017-07-24       Impact factor: 3.376

3.  Perspective: Quantum mechanical methods in biochemistry and biophysics.

Authors:  Qiang Cui
Journal:  J Chem Phys       Date:  2016-10-14       Impact factor: 3.488

4.  Exploring the applicability of density functional tight binding to transition metal ions. Parameterization for nickel with the spin-polarized DFTB3 model.

Authors:  Milena Vujović; Mioy Huynh; Sebastian Steiner; Pablo Garcia-Fernandez; Marcus Elstner; Qiang Cui; Maja Gruden
Journal:  J Comput Chem       Date:  2018-10-09       Impact factor: 3.376

5.  Catalytic Mechanism of Amyloid-β Peptide Degradation by Insulin Degrading Enzyme: Insights from Quantum Mechanics and Molecular Mechanics Style Møller-Plesset Second Order Perturbation Theory Calculation.

Authors:  Rui Lai; Wei-Jen Tang; Hui Li
Journal:  J Chem Inf Model       Date:  2018-09-06       Impact factor: 4.956

Review 6.  Semiempirical Quantum Mechanical Methods for Noncovalent Interactions for Chemical and Biochemical Applications.

Authors:  Anders S Christensen; Tomáš Kubař; Qiang Cui; Marcus Elstner
Journal:  Chem Rev       Date:  2016-04-13       Impact factor: 60.622

7.  Single-step Replacement of an Unreactive C-H Bond by a C-S Bond Using Polysulfide as the Direct Sulfur Source in Anaerobic Ergothioneine Biosynthesis.

Authors:  Ronghai Cheng; Lian Wu; Rui Lai; Chao Peng; Nathchar Naowarojna; Weiyao Hu; Xinhao Li; Stephen A Whelan; Norman Lee; Juan Lopez; Changming Zhao; Youhua Yong; Jiahui Xue; Xuefeng Jiang; Mark W Grinstaff; Zixin Deng; Jiesheng Chen; Qiang Cui; Jiahai Zhou; Pinghua Liu
Journal:  ACS Catal       Date:  2020-07-16       Impact factor: 13.084

8.  Analysis of Density Functional Tight Binding with Natural Bonding Orbitals.

Authors:  Xiya Lu; Juan Duchimaza-Heredia; Qiang Cui
Journal:  J Phys Chem A       Date:  2019-08-15       Impact factor: 2.781

9.  QM/MM free energy simulations: recent progress and challenges.

Authors:  Xiya Lu; Dong Fang; Shingo Ito; Yuko Okamoto; Victor Ovchinnikov; Qiang Cui
Journal:  Mol Simul       Date:  2016-07-05       Impact factor: 2.178

10.  QM/MM Analysis of Transition States and Transition State Analogues in Metalloenzymes.

Authors:  D Roston; Q Cui
Journal:  Methods Enzymol       Date:  2016-07-01       Impact factor: 1.600

View more

北京卡尤迪生物科技股份有限公司 © 2022-2023.