Xiya Lu1, Michael Gaus, Marcus Elstner, Qiang Cui. 1. Department of Chemistry and Theoretical Chemistry Institute, University of Wisconsin-Madison , 1101 University Avenue, Madison, Wisconsin 53706, United States.
Abstract
We report the parametrization of the approximate density functional theory, DFTB3, for magnesium and zinc for chemical and biological applications. The parametrization strategy follows that established in previous work that parametrized several key main group elements (O, N, C, H, P, and S). This 3OB set of parameters can thus be used to study many chemical and biochemical systems. The parameters are benchmarked using both gas-phase and condensed-phase systems. The gas-phase results are compared to DFT (mostly B3LYP), ab initio (MP2 and G3B3), and PM6, as well as to a previous DFTB parametrization (MIO). The results indicate that DFTB3/3OB is particularly successful at predicting structures, including rather complex dinuclear metalloenzyme active sites, while being semiquantitative (with a typical mean absolute deviation (MAD) of ∼3-5 kcal/mol) for energetics. Single-point calculations with high-level quantum mechanics (QM) methods generally lead to very satisfying (a typical MAD of ∼1 kcal/mol) energetic properties. DFTB3/MM simulations for solution and two enzyme systems also lead to encouraging structural and energetic properties in comparison to available experimental data. The remaining limitations of DFTB3, such as the treatment of interaction between metal ions and highly charged/polarizable ligands, are also discussed.
We report the parametrization of the approximate depan class="Chemical">nsity functional theory, n class="Chemical">DFTB3, for n class="Chemical">magnesium and zinc for chemical and biological applications. The parametrization strategy follows that established in previous work that parametrized several key main group elements (O, N, C, H, P, and S). This 3OB set of parameters can thus be used to study many chemical and biochemical systems. The parameters are benchmarked using both gas-phase and condensed-phase systems. The gas-phase results are compared to DFT (mostly B3LYP), ab initio (MP2 and G3B3), and PM6, as well as to a previous DFTB parametrization (MIO). The results indicate that DFTB3/3OB is particularly successful at predicting structures, including rather complex dinuclear metalloenzyme active sites, while being semiquantitative (with a typical mean absolute deviation (MAD) of ∼3-5 kcal/mol) for energetics. Single-point calculations with high-level quantum mechanics (QM) methods generally lead to very satisfying (a typical MAD of ∼1 kcal/mol) energetic properties. DFTB3/MM simulations for solution and two enzyme systems also lead to encouraging structural and energetic properties in comparison to available experimental data. The remaining limitations of DFTB3, such as the treatment of interaction between metal ions and highly charged/polarizable ligands, are also discussed.
n class="Chemical">Magnesiumpan> and zinc are two of the most
n class="Chemical">common metalcofactors of
numerous n class="Chemical">metalloenzymes and other biomolecules.[1−3] The binding
of Mg2+ to DNA and RNA helps to maintain their structures
by stabilizing highly negatively charged phosphates. ATP, the main
source of energy in cells, is bound to Mg2+ in its biologically
active form.[4] In addition, a large number
of enzymes involved in the biochemistry of nucleic acids bind Mg2+ for catalytic activity; for example, Mg2+ is
essential for DNA replication and phosphoryl transfers.[5] Zinc is the second most abundant cation in the
human body, so Zn2+ has important biological functions
in numerous biomolecular systems[6,7] such as zinc-finger
class of transcription factors, signaling proteins, transport/storage
proteins, as well as enzymes. The zinc ion contributes to catalysis
in more than 300 enzymes and serves a structural role in many proteins
and enzymes.[8−10]
To describe the structural properties of n class="Chemical">Mg2+pan> and n class="Chemical">Zn2+ in solution and biomolecules, molecular
mechanical (MM)
force field parameters have been developed.[11−15] Some of the advanced force field models such as SIBFA,[16] AMOEBA,[17,18] and Drude oscillator
models[19] treat electronic polarization
and charge transfer[20] terms explicitly
and have been demonstrated to describe potential energy surfaces in
close agreement with quantum mechanics (QM) calculations.[21,22] For chemical reactions that involve these n class="Chemical">metal ions, however, MM
models are no longer appropriate. Even for structural properties,
when the binding mode of ligands is flexible,[23−25] it is challenging
to describe the metal site with a MM model.
An appropriate modelipan class="Chemical">ng
approach in tn class="Chemical">his n class="Chemical">context is the QM/MM framework,
where the metal site of interest is treated with QM while the rest
of the enzyme/solution is described by an empirical force field. For
relatively rigid metal sites, ab initio QM and QM/MM
methods based on either reaction path or free energy simulations have
been demonstrated to be powerful for mechanistic analysis.[26−30] The cost of these calculations for flexible systems, however, remains
very high. For instance, one key feature of zinc coordination is its
flexibility:[23−25] a zinc ion can adopt multiple binding modes with
the coordination number ranging from 4 to 6, underlining the importance
of developing efficient QM(/MM) methods that complement ab
initio based methods. The dynamical nature of the metal binding
mode is particularly important to enzymes that feature solvent accessible
active sites.[31−33] For these applications, semiempirical QM-based QM/MM
simulations can potentially strike the balance between computational
accuracy and conformational sampling.
The traditional semiempirical
QM methods based opan class="Chemical">n the n class="Chemical">Neglect-of-Diatomic-Differenpan>tial-Overlap
(npan> class="Chemical">NDDO) approximation, such as MNDO,[34,35] MNDO/d,[36] AM1,[37] and PM3,[38] however, give in general rather poor geometries
for Mg/Zn-containing molecules. For example, metal–ligand lengths
computed from AM1,[39,40] PM3, MNDO, and MNDO/d[40] deviate by roughly 0.1 Å on average in
comparison to B3LYP/6-311++G** for small Mgcomplexes.[41] Recently developed AM1/d parameters[42] for magnesium provide a clear improvement in
accuracy compared to AM1 by incorporating d orbitals. According to
a recent test,[43] the mean absolute deviations
(MADs) of metal–ligand distances for AM1,[44] PM3, PM3(ZnB),[45] MNDO/d,[46] and PM6[47] are more
than 0.05 Å compared to CCSD(T) values for small zinc-containing
compounds. The importance of a reliable treatment for zinc sites to
the description of enzyme catalysis is illustrated by our recent study
of alkaline phosphatase (AP),[48,59] where we showed that
previous AM1/d-PhoT/MM calculations[50,51] likely led
to incorrect transition states due to distortions of the bimetallic
zinc motif.
The Self-n class="Chemical">Copan>nsistenpan>t-Charge Density-Functional Tight-Binding
(SCC-DFTB)
method provides a promising alternative to NDDO approaches.[52−54] It is derived by expanding the DFT total energy up to sen class="Chemical">cond order
around a reference density, which is usually taken to be the sum of
atomic charge densities. The computational speed of SCC-DFTB is comparable
to NDDO, and it has been rather extensively benchmarked for reaction
energies, geometries, and vibrational frequencies for a large number
of small molecules.[55−61] The fact that SCC-DFTB is formulated in a DFT framework suggests
that it is potentially more reliable for metal ions than NDDO-based
methods, which are based on Hartree–Fock. Specifically for
Mg[41] and Zn,[62,63] SCC-DFTB(/MM)[64−66] has been successfully applied to a rather diverse set of biological
and chemical applications that include these ions.[48,49,67−74] More recently, DFTB has been extended to include third-order contributions,
termed as DFTB3.[75] Compared to the second-order
formulation, DFTB3considers the charge dependence of the atomic Hubbard
parameter and therefore provides an improved description of chemical
properties such as proton affinity,[53,75,76] which is important in many biological applications.
With the most recent parametrization referred to as 3OB,[77,78] the electronic properties and therefore atomization energies have
been substantially improved; as a result, the DFTB3 approach is shown
to have comparable performance as DFT with double-ζ-plus-polarization
basis sets for many systems of biological and chemical interest, although
certain limitations remain.[78,79]
n class="Chemical">Copan>nsidering the
recenpan>t success of n class="Chemical">DFTB3/3OB, here we extend the
3OB parametrization of DFTB3 to magnesium and zinc, again focusing
on chemical and biological applications. In the next section, we briefly
summarize the DFTB3 methodology, the procedures for parametrization,
and simulation details for benchmark calculations. We then present
data from multiple levels of test sets in the gas phase, which are
followed by additional benchmark calculations in both solution and
enzymes. Finally, we summarize tn class="Chemical">his work with several conclusions.
Computational
Methods
Theory
The n class="Chemical">DFTB3pan> parametrization of n class="Chemical">magnesium and zinc
follows the protocol outlined in the previous publications[77,80] and extends the 3OB parameter set that so far includes elements
C, H, n class="Chemical">N, O,[77] S, and P.[78] We first briefly review the formulation of DFTB3.[54,75]
The total energy of n class="Chemical">DFTB3 is given by
Here H0 is the effective Kohn–Sham
Hamiltopan class="Chemical">nian evaluated at the atomic reference density ρ0, which is determined by solving the Kohn–Sham equations
for the atom in the presence of an additional harmonic n class="Chemical">confining potenpan>tial
with a n class="Chemical">confining radius rwf for the atomic
basis set and a different radius rdens for the density. Off-diagonal H0 integrals
are computed in a two-center approximation together with the PBE[81] exchange-correlation functional and an atomic
basis set consisting of Slater functions. The Slater functions are
defined by lmax as the highest orbital
angular momentum included, nmax, and α0, α1, ..., α4, the exponents
of the Slater functions. In principle, basis functions for each orbital
angular momentum can be compressed differently; in this work, however,
s and p orbitals are confined with the same compression radius for
Mg, and s, p, and d orbitals (compression radius for d-orbitals is
defined by the parameter rwfd) are confined
with the same compression radius for Zn. The diagonal elements H0 are equal to the calculated atomic eigenvalues
ε (x = s, p, or
d orbital). No confinement of the orbitals is applied here in order
to yield the correct dissociation limit. We optimize εp as discussed in the following section.
The atomic Hubbard
parameter Ua in
the sepapan class="Chemical">n class="Chemical">conpan>d-order term of the energy Eγ describes the electron–electron interaction within one atom
and enpan>ters the γ-function, which interpolates the on-site and
the long-range electronic interaction of atom pairs. The Hubbard parameters
are calculated from DFT as the first derivative of the highest occupied
orbital with respect to its occupation number. For the third-order
term EΓ, an additional parameter
per elemenpan>t Ud, the derivative of the
Hubbard parameter with respect to chn class="Chemical">arge has to be determined. For
CHNO it was calculated as the second derivative of the highest occupied
orbital with respect to its occupation number;[77] for Mg and Zn we find a significant advantage to optimize
it.
Reference Systems and Parameter Fitting
As emphasized
earlier,[54] most parameters in the papan class="Chemical">n class="Chemical">DFTB3
approach are directly npan> class="Chemical">computed based on atoms or atom pairs at the
DFT (PBE) level, while some parameters are optimized based on specific
reference systems. In this way, accuracy appropriate for chemical
applications can be obtained without significantly reducing the transferability
of the DFTB3 model. The parametrization for Mg/Zn used herein follows
that specified previously for the parametrization of CHNOPS;[77,78] two distinct groups of parameters have to be optimized, the electronic
parameters appearing in the first three terms of the DFTB3 total energy
(eq 1) and the repulsive potential.
The
electronic parameters are summarized in Table 1. The following parameters are optimized to improve the overall performance:
Table 1
Overview of Electronic Parameters
(in Atomic Units if Not Unitless)a
parameter
Mg
Zn
lmax
1
2
nmax
2
2
α0
0.50
0.50
α1
1.11
1.39
α2
2.45
3.87
α3
5.42
10.78
α4
12.00
30.00
rwf
5.5
4.2
rwfd
0.0
4.2
rdens
14.0
9.0
εs
–0.17267741
–0.21408495
εp
–0.02
0.02
εd
–
–0.38831893
Espin
0.0
0.0
U
0.2246
0.2663
Ud
–0.02
–0.03
As described in
the text, U, Espin, εs,d, nmax, and α are consistent with the standard choices of DFTB
parametrization
and not subject to optimizations. By contrast, rwf, rwfd, rdens, εp, and Ud are adjusted based on properties of molecules in the fitting set.
•rwf and rdepan class="Chemical">ns: The general procedure was to start from the electronic
parameters used in the n class="Chemical">MIO set[75,77] (which are rn class="Chemical">Mgwf = 5.0, rMgdens = 12.0, rZnwf = 4.9, rZndens = 10.0). rwf and rdens are fitted in a “brute force” manner to
achieve good performance on bond (metal–ligand), angle, vibrational
frequencies, as well as reaction energies.
•εp: The eigenvalues are ipan class="Chemical">n principle n class="Chemical">computed
directly from atomic DFT calculations and enpan>ter as diagonal elemenpan>ts
into the Hamiltonian matrix without the additional n class="Chemical">confining potential.
Exceptions are made here to magnesium and zinc since the p orbitals
are often not heavily involved in the coordination of Mg2+ and Zn2+. εp is calculated to be −0.04914
au for Mg and −0.04261 au for Zn. εp are fitted
in the way that p orbitals are less involved in molecular bonding,
so as to improve sequential ligation energies and proton affinities.
•Ud: The Hubbard derivative
is generally papan class="Chemical">n class="Chemical">computed from atomic DFT calculationpan>s. In the case of
n class="Chemical">Mg and Zn, they are −0.0496 au and −0.0544 au, respectively.
We found a significant improvement of proton affinities after optimizing Ud for n class="Chemical">Mg and Zn. It is important to note that
fitting of the Hubbard derivatives does not significantly alter most
properties of neutral molecules like equilibrium geometries, so with
all the other parameters held fixed, Ud is tuned as the last step in the parametrization procedure.
Although the repulsive potential, which papan class="Chemical">n class="Chemical">conpan>tains nuclear–nuclear
repulsion and double-n class="Chemical">counting terms that depend on the reference density,
can be directly computed, practical application to molecular systems
with reasonable accuracy requires fitting the repulsive potential.
The repulsive potential between atom type A and type B (Vabrep in eq 1) is described by a spline function in the covalent
binding region. A cutoff is introduced at which the function and first
three derivatives reach zero. The remaining degrees of freedom for
the spline function are fitted to high-level ab initio values of geometries, reaction energies, and vibrational stretching
frequencies. The overall quality of the repulsive potential is sensitive
to the chosen division points. Therefore, establishing reliable repulsive
potential parameters remains a time-consuming step despite the progress
of partially automatized fitting procedures.[80,82]
As described in
the text, U, Espipan class="Chemical">n, εs,d, nmax, and α are n class="Chemical">consistenpan>t with the standard choices of npan> class="Chemical">DFTB
parametrization
and not subject to optimizations. By contrast, rwf, rwfd, rdens, εp, and Ud are adjusted based on properties of molecules in the fitting set.
The repulsive potentials are
fitted to several properties as summarized
ipan class="Chemical">n Table 2. Reference values are calculated
from ab initio methods that are known to produce
results close to experimental data. For n class="Chemical">magnesium, optimized geometries
of small closed-shell molecules are calculated from B3LYP[83−85]/aug-cc-pVTZ, and reaction enpan>ergies are calculated from the G3B3[86−88] method. For zinc, both optimized geometries and reaction enpan>ergies
are calculated from B3LYP/aug-cc-pVTZ. Additional equations for the
fitting process are prepared using a few vibrational stretching frequenpan>cies
determined from BLYP/aug-cc-pVTZ (unscaled) using the harmonic approximation.
B3LYP with good quality basis set has beenpan> shown to describe structures
of n class="Chemical">Zn complexes in good agreement with X-ray[89] and CCSD(T)[43,90] data. However, heats of formation
(HOF) predicted from B3LYP/aug-cc-pVTZ deviate ∼10 kcal/mol
on average from experimental data of eight small complexes even though
CCSD(T)/aug-cc-pVTZ does not do much better with a mean unsigned error
of ∼7 kcal/mol.[91] B97-1,[92] B97-2,[93] and B98[94,95] have also been tested because they have been reported to be reliable
for reproducing experimental data for small transition metalcomplexes;[96] geometries from the B97 family functionals and
B3LYP are very similar, and reaction energies differ by ∼2
kcal/mol (data shown in the Supporting Information). Despite shortcomings in HOF, we have chosen to use the B3LYP functional
because it is computationally less intensive so that large sets of
consistent reference data are accessible, and it systematically gives
reliable results for binding energies and proton affinities, which
are of major importance to our work. Since the systems studied here
are relatively small in size, dispersion interactions are not expected
to impact the structural and energetic properties of interest. For
a fairly large active site model for a zinc enzyme, we do discuss
briefly the impact of dispersion. For even larger cases, we note that
empirical dispersion models have been developed and tested for both
SCC-DFTB[97,98] and DFTB3.[99]
reference
potential energy differences in kcal/mol
[Mg(H2O)2]2+
→
[Mg(H2O)]2+ + H2O
73.3
[Mg(H2O)3]2+
→
[Mg(H2O)2]2+ + H2O
59.2
[Mg(CO)4]2+
→
[Mg(CO)3]2+ + CO
30.9
[Mg(NH3)]2+
→
Mg2+ + NH3
97.7
[Mg(NH3)2]2+
→
[Mg(NH3)]2+ + NH3
83.9
[Mg(NH3)3]2+
→
[Mg(NH3)2]2+ + NH3
63.1
[Mg(NH3)4]2+
→
[Mg(NH3)3]2+ + NH3
50.1
[Mg(NH3)5]2+
→
[Mg(NH3)4]2+ + NH3
28.6
[Mg(NH3)6]2+
→
[Mg(NH3)5]2+ + NH3
26.3
[Mg(SH2)2]2+
→
[Mg(SH2)]2+ + SH2
62.4
[Mg(SH2)3]2+
→
[Mg(SH2)2]2+ + SH2
43.9
[Mg(SH2)4]2+
→
[Mg(SH2)3]2+ + SH2
34.5
[Mg(SH2)5]2+
→
[Mg(SH2)4]2+ + SH2
20.0
[Mg(PH3)2]2+
→
[Mg(PH3)]2+ + PH3
66.9
[Mg(PH3)4]2+
→
[Mg(PH3)3]2+ + PH3
34.8
[Zn(H2O)]2+
→
Zn2+ + H2O
104.3
[Zn(H2O)4]2+
→
[Zn(H2O)3]2+ + H2O
43.0
[Zn(CO)]2+
→
Zn2+ + CO
81.6
[Zn(CO)2]2+
→
[Zn(CO)]2+ + 2CO
149.8
[Zn(CO)3]2+
→
[Zn(CO)]2+ + 3CO
190.1
[Zn(SH2)]2+
→
Zn2+ + SH2
126.1
[Zn(SH2)2]2+
→
Zn2+ + 2SH2
210.5
[Zn(PH3)4]2+
→
[Zn(PH3)3]2+ + PH3
30.5
See text for the methods used as
references for different properties.
Condensed-Phase Benchmark Calculations
To supplement
the gas-phase calculatiopan class="Chemical">ns, simulations are carried out with n class="Chemical">DFTB3/MM
to investigate if the parametrized model works in anpan> explicit n class="Chemical">condensed-phase
environment. They are important tests because the ultimate goal is
to use n class="Chemical">DFTB3/3OB in enzyme simulations.
Mg2+ and Zn2+ Ion Solvation and pKa Calculations
Both MM and QM/MM simulatiopan class="Chemical">ns
are performed using CHARMM.[100] The generalized
solvent boundary potential (n class="Chemical">GSBP)[101,102] is used to
treat long-range electrostatic interactions in MD simulations. The
system is set up with an inpan>ner region of 20 Å. All molecules
are located in the inner region, and the outer region is described
with a n class="Chemical">constant dielectric continuum with εw = 80.
To be consistent with the GSBP protocol, electrostatic interactions
among inner region atoms are treated with the extended electrostatic
model[103] in which interactions beyond 12
Å are modeled with multipolar expansions, including the dipolar
and quadrupolar terms. The reaction field matrix M is
evaluated using 400 spherical harmonics for GSBP simulations. The
ion is fixed at the center of the sphere. Water molecules are described
with the modified version of TIP3P.[104] In
the classical MM simulations, standard CHARMM force field[105] parameters are used for Mg2+ and
Zn2+. The QM/MM calculations employ the same n class="Chemical">water sphere
as in MM simulations. The QM region includes the metal ion together
with the nearest 6 water molecules (referred to as small QM region)
or the nearest 25 water molecules (referred to as large QM region)
which include the first and the second solvation shells. All the remaining
water molecules are represented by the TIP3P water model. In the large
QM region simulations, to ensure the QM water molecules are always
closest to the metal ion, Flexible Inner Region Ensemble Separator
(FIRES[106]) restraining potential is imposed
to any outer sphere MM water molecules that are closer to the ion
than the most distant inner sphere QM water molecule. The FIRES potential
takes the form EFIRES = (1/2)∑kFIRES(r – Rinner)2 where Rinner is defined as the distance between the
metal ion and the most distant inner QM water molecule from it. kFIRES is set to be 100 kcal/(mol·Å2). FIRES restraints are not necessary for the small QM region
simulations because water molecules in the first solvation shell are
strongly bound to the metal ion. All bonds involving hydrogen are
constrained with the SHAKE algorithm,[107] allowing a 1 fs time step for MD propagation. Simulations are equilibrated
for 500 ps followed by a production run of 500 ps.
See text for the methods used as
references for differepan class="Chemical">nt properties.
To further evaluate the performance of papan class="Chemical">n class="Chemical">DFTB3/3OB for
npan> class="Chemical">Mg and Zn
from the energetics perspective, we perform calculations of relative
pKa of Mg2+/Zn2+-bound water in a water sphere using the thermodynamic integration
(TI)[108,109] approach within the dual topology single
coordinate (DTSC) scheme; absolute pKa results are discussed in the Supporting Information.[66,110] In this approach, the dominant contribution
to the total free energy of deprotonation is from the electrostatic
free energy change (ΔFM) associated with converting the
acidic proton to a dummy atom (D), i.e., transforming M2+·(H2O)6 to M2+·(H2O)5(HOD)−, where M represents
the metal ion, Mg2+ or Zn2+. The corresponding
free energy derivative is given bywhich represents the QM/MM
energy difference averaged for a specific coupling parameter λ
using the same set of coordinates for both protonation states. The
total electrostatic energy contribution (ΔFM) is determined by integrating the free energy derivatives (∂F/∂λ) over λ from 0 to 1. Estimation
of the absolute pKa is in principle possible
with the DTSC-TI protocol, but the solvation free energy of a proton
has to be taken into account. To avoid the difficulty of getting accurate
experimental/calculated values for this term, we compute the relative
pKa between Mg2+·(H2O)6 and Zn2+·(H2O)6 in water so that other contributions are canceled out to
a great extent, such as the zero-point energy difference between the
protonated and deprotonated states and van der Waals interactions
involving the acidic proton.[66,110] Since ΔFM is electrostatic in nature, we expect that the free energy derivative
depends largely linearly on the coupling parameter λ in this
simple system; this is supported by actual calculations (see below).
We carry out pKa calculations with both
small QM region and large QM region as defined above. To avoid dissociation
of hydroxide from the metal ion in the fully deprotonated state (λ
= 1 window M2+·(H2O)5(HOD)−), a NOE potential is added to the distance between
M2+ and OH–. The NOE potential takes
the formin which rmin and rmax set the interval between
which the restraining
potential is zero; they are taken to be 1.8 and 2.5 Å, respectively. kmax is set to be 1000 kcal/(mol·Å2). To prevent the dummy atom D from being too close to the
metal site in the fully deprotonated state, a weak angular constraint
is applied to the M–O–D angle to keep it no smaller
than 120°.
Myosin-II Enzyme Setup: Test for a Mg Site
The enzyme
model is built from a high-resolutiopan class="Chemical">n X-ray structure for the n class="Chemical">Mg·n class="Chemical">ADP·vanadata
bound myosin[111] (PDB code 1VOM), which corresponds
to the hydrolyzing state of the D. discoideum myosin-II
motor domain.[112,113] As described in our earlier
work,[68,114,115] starting
from the PDB structure, the ADP·vanadata is replaced by ATP or
ADP·Pi. The QM region includes the triphosphate and part of the
ribose group of ATP, three water molecules (include the lytic water)
in the active site, Ser 181, Ser 236, the Mg2+ ion, and
all its ligands (Thr 186 and Ser 237 and two water molecules). Only
side chains of protein residues are included in the QM region, and
link atoms are introduced to saturate the valence of the QM boundary
atoms. ATP is assumed to be fully deprotonated, and the titratable
groups are kept in their normal protonation states (i.e., Lys, Arg
are protonated, Asp, Glu are deprotonated) which are consistent with
a simple estimate of pKa using the Poisson–Boltzmann
(PB) approach.[116] GSBP is used as the simulation
protocol. The system is divided into a 24 Å spherical inner region
centered at the Mg ion. Newtonian equations-of-motion are solved for
the MD region (within 20 Å), and Langevin equations-of-motion
are solved for the buffer region (20–24 Å) with a temperature
bath of 300 K. Protein atoms in the buffer region are harmonically
constrained with force constants determined from the crystallographic
B-factors.[117] Protein atoms in the MM region
are described by the all-atom CHARMM22 force field,[105] and water molecules are described with the TIP3P model.
All bonds involving hydrogen are constrained using SHAKE, and the
time step is set to 1 fs. All the water molecules in the inner region
are subject to a weak GEO type of restraining potential to keep them
inside the inner sphere with the MMFP module in CHARMM. The static
field due to outer-region atoms, ϕs(io), is evaluated with the linear PB
equations. The reaction field matrix M is evaluated with
625 spherical harmonics. In the PB calculations, the protein dielectric
constant is set as 1; the water dielectric constant is set as 80;
and 0.0 M saltconcentration is used. Our previous[118] and recent[119] analyses indicated
that the GSBP protocol is reliable for a site well shielded from the
bulk solvent.
To study the structural flexibility in the active
site of myosipan class="Chemical">n before and after n class="Chemical">ATP hydrolysis, potential of mean
force (PMF) simulations have been carried out. The reaction coordinate
is defined as the distance between Mg2+ and atom O in Ser
237 (see Figure 5). The umbrella sampling approach[120] is used to restrain the system along the reaction
coordinate with a force constant of 150 kcal/mol·Å–2. Thirty-two windows are used for PMF, and a 500 ps simulation is
performed for each window. Convergence of PMF is monitored by examining
the overlap of reaction coordinate distributions sampled in different
windows. Weighted histogram analysis (WHAM)[121] is used to postanalyze the probability distribution to obtain the
PMF along the reaction coordinate. Errors are calculated from block
average.
Figure 5
Structural
properties of the Myosin active site during equilibrium
QM/MM MD simulations with different nucleotide chemical states (ATP
or ADP·Pi). (a) Overlay of average DFTB3/MM structure and crystal
structure (gray) of Myosin-II motor domain active site in the ATP
state. The VO4 moiety in the crystal structure (PDB code 1VOM) is replaced by
a PO3 phosphate group and a presumptive lytic water molecule.
Instantaneous distances between Mg2+ and oxygen atoms in
its four nonwater ligands at (b) ATP state and (c) ADP·Pi state.
Ser237 refers to OSer237 and Thr186 refers to OThr186. (d) PMF calculations for the ATP/ADP·Pi states. The reaction
coordinate is the distance between Mg2+ and OSer237.
Alkaline Phosphatase Enzyme Setup: Test for
a Bimetallic Zinc
Site
The enzyme model is papan class="Chemical">n class="Chemical">conpan>structed based on the X-ray structures
for the n class="Species">E. coli Alkaline Phosphatase (AP) mutant
R166S with bound n class="Chemical">inorganic phosphate at 2.05 Å resolution (PDB
code 3CMR(122)). The substrate methyl p-nitrophenyl
phosphate (MpNPP–) is first “mutated”
to the orientation with the −OMe group oriented toward the
magnesium ion (denoted as α orientation) starting from the PDB
structure. All basic and acidic amino acids are kept in their physiological
protonation states except for Ser102 in AP, which is accepted to be
the nucleophile and deprotonated in the reactive complex. Water molecules
are added following the standard protocol of superimposing the system
with a water droplet of 25 Å radius centered at Zn12+ (see Figure 6 for atomic labels) and removing
water molecules within 2.8 Å from any heavy atoms resolved in
the crystal structure. GSBP is used to treat long-range electrostatic
interactions in MD simulations. In the QM/MM simulations, as described
in details in our previous work,[48,49] the QM region
includes the two zinc ions and their six ligands (Asp51, Asp369, His370,
Asp327, His412, His331), Ser102, and the substrate MpNPP–. Only side chains of protein residues are included in the QM region,
and link atoms are added between Cα and Cβ atoms. A NOE potential is added to the C–O bonds in Asp51
in AP to prevent artificial polarization.[48] The integration time step is 1 fs, and all bonds involving hydrogens
are constrained using SHAKE. The DFTB3/MM results are also compared
to MM simulations using a conventional nonbonded zinc model[14] (referred to as a Coulomb scheme) or short–long
effective functions (SLEF1)[25] model developed
by Zhang and co-workers. Protein atoms in the MM region are described
by the all-atom CHARMM22 force field, and water molecules are described
with the TIP3P model.
Figure 6
Structural properties of the AP active site
during equilibrium
QM/MM MD Simulations. A snapshot for the reactant state with (a) Coulomb
MM force field, (b) SLEF1 zinc MM force field, (c) DFTB3/3OB, and
(d) overlay of crystal and DFTB3/MM structure (blue), with average
key distances in angstroms labeled. Asp369, His370, and His412 are
omitted for clarity. Substrate inorganic phosphate in the crystal
structure (3CMR) is not shown.
To further benchmark the applicability
of papan class="Chemical">n class="Chemical">DFTB3/3OB to the reactionpan> of interest, we also study an active
site model that includes all QM atoms in the QM/MM enpan>zyme model. The
Cβ n class="Chemical">carbons are fixed at their positions in the crystal
structure during geometry optimization; the positions of the link
atoms used to saturate the valence of the Cβ atoms
in the active site model are also fixed. The reactant (Michaelis)
complex and transition state are located for MpNPP– (methyl p-nitrophenyl phosphate), MmNPP– (methyl m-nitrophenyl phosphate), and n class="Chemical">MPP– (methyl phenyl phosphate) using DFTB3 and B3LYP with the 6-31+G(d,p)
basis set. The minimum energy path (MEP) calculations are carried
out by one-dimensional adiabatic mapping at the DFTB3 level; the reaction
coordinate is the antisymmetric stretch involving the breaking and
forming P–O bonds. Subsequently, the saddle point is further
refined by conjugated peak refinement (CPR[123]) to obtain precise transition state structure. Single-point energy
calculations at DFTB3 and B3LYP geometries are then carried out using
B3LYP, M06,[124] PBE, and MP2 methods using
a larger basis set of 6-311++G(d,p). The D3[125] dispersion corrections are added for B3LYP, M06, and PBE methods.
Results and Discussions
In the followipan class="Chemical">ng subsections,
we present benchmark calculations
for n class="Chemical">DFTB3/3OB regarding geometries, n class="Chemical">metal–ligand dissociation
energies, and proton affinities for Mg/Zncontaining molecules and
compare the results with commonly used ab initio methods
in the gas phase. We also report DFTB3 results when the old set of
MIO parameters is used:[75] DFTB3/MIO use
parameters as defined in ref (75), Ud = −0.23, −0.16,
−0.13, −0.19, −0.14, and −0.09 au for
C, H, N, O, P, and S and ζ = 4.2; Ud for Mg/Zn is fitted to −0.1 to achieve the best sequential
ligand dissociation energies and proton affinities. Unless noted otherwise,
energies are given for geometries that are optimized at the respective
level of theory. For PM6, we note that the choice of heat of formation
for protons is essential to proton affinities; we take −54.0
kcal/mol for protons as suggested by the MOPAC program. To compare
the potential energy surfaces between different computation levels,
zero-point energies (ZPEs) are not included. For the calculations
of noncovalent interactions, dispersion interactions are not included
because most test molecules discussed here are small, although we
are aware of the importance of dispersion in general. We explore explicitly
the effects of dispersion to a nontrivial active site model of AP
that involves zinc. For DFTB3 calculations, we have used our in-house
DFTBcode, and all PM6, DFT, and MP2 calculations are performed using
the Gaussian09 software package.[126] Following
the gas-phase benchmarks, we show results on Mg2+/Zn2+-water structural properties and relative pKa in solution. Finally we apply DFTB3/3OB to enzyme systems
to further test the new parameters.
Magnesium
Geometries
The “small” n class="Chemical">magnesiumpan> test
set is n class="Chemical">composed of 55 molecules, which are simplified biologically
relevant ligated n class="Chemical">magnesium species of the form Mg2+[X] with X = H2O, NH3,
SH2, PH3, CO for n = 1–4/6
and X = OH–, SH–, NH2–, and
PH2– for n = 1–2. Table 3 summarizes the results, and further details are included in the Supporting Information. Good agreement is seen
between MP2 and B3LYP with a small or large basis set. Of the semiempirical
methods, geometries are described very well by DFTB3 models and particularly
well by DFTB3/3OB. The MADs in metal–ligand distances are 0.013,
0.017, and 0.050 Å for DFTB3/3OB, DFTB/MIO, and PM6, respectively.
3OB improves over MIO, although the magnitude of improvement is modest.
The largest metal–ligand distance error of 0.072 Å from
DFTB3/3OB is found in Mg–S distance in [Mg(SH2)]2+, which is likely due to the minimal basis nature of DFTB3
for a polarizable ligand. Other deviations in metal–ligand
distances are substantially smaller in DFTB3/3OB. Molecules containing
S or P are described poorly by PM6 within our test case;[78] PM6 systematically underestimates Mg–S
distance by 0.2 Å and Mg–P distance by 0.1 Å. For
angles, DFTB3/3OB differs from the reference data by a MAD of 2.9°,
which is of the same order with other semiempirical methods. The largest
angle derivation of 45.3° is for angle Mg–O–H in
[Mg(OH)]+, for which DFTB3/3OB predicts a linear rather
than bent structure; interestingly, MP2 also predicts a structure
almost linear, while PM6 is close to the B3LYP result (see Supporting Information). Future development of
DFTB3 to include multipoles may improve the geometry in this case.
It is worth pointing out that the Mg–S–H angle is generally
overestimated by ∼13° in DFTB3/3OB. More statistics including
MAD for different metal–ligand types are given in the Supporting Information.
Table 3
Mean and
Maximum Absolute Deviation
of Metal–Ligand Lengths and Angles in Comparison to B3LYP/aug-cc-pVTZ
for the Magnesium Test Set
propertya
Nb
MP2c
B3LYPd
PM6
DFTB3/MIO
DFTB3/3OB
r (Å)
120
0.009
0.009
0.050
0.017
0.013
rmax (Å)
0.051
0.024
0.284
0.141
0.072
a (deg)
117
0.9
0.5
3.4
3.6
2.9
amax (deg)
23.7
10.2
38.7
41.3
45.3
Max stands for maximum absolute
deviation.
Number of comparisons.
Basis set aug-cc-pVTZ.
Basis set 6-31+G(d,p).
Max stands for maximum absolute
deviation.n class="Chemical">Number of n class="Chemical">comparisons.
Basis set aug-cc-pVTZ.Basis set 6-31+G(d,p).
Sequential Metal–Ligand
Dissociation Energies
We define sequepan class="Chemical">ntial ligand dissociation
energies (n class="Chemical">sBDEs) n class="Chemical">corresponding
to the reactionwhere M is Mg2+/Zn2+ and L is a neutral or
charged ligand.
Table 4 shows the sequential
ligapan class="Chemical">nd dissociation energies for compounds
n class="Chemical">containing Mg2+. Values obtained for DFTB3/3OB and other
methods are compared to those obtained using G3B3. The differences
between B3LYP, MP2, and G3B3 are relatively small; however, complexes
with charged ligands give several deviations greater than 5 kcal/mol.
DFTB3/3OB shows a MAD of 2.4 kcal/mol for complexes with neutral ligands
and 13.8 kcal/mol for complexes with charged ligands. Compared to
DFTB3/MIO, DFTB3/3OB improves substantially for complexes involving
P, S elements and charged ligands. For the other semiempirical method
PM6, the deviation is 9.9 kcal/mol compared to G3B3. The large error
is anticipated because geometries containing S and P are often poor.
Note that we compare potential energy differences for all the methods
except PM6 which intrinsically considers heats of formation. We have
also carried out single-point G3B3 calculations at DFTB3/3OB geometries,
and the MAD for Mg2+compounds with neutral and charged
ligands is less than 1 kcal/mol. These results highlight the similarities
between DFTB3 and B3LYP optimized structures, suggesting that DFTB3/3OB
is of particular value for the study of biologically relevant magnesium-containing
molecules.
Table 4
Sequential Ligand Dissociation Energies
of Magnesium -Containing Molecules Compared to G3B3a
moleculeb
G3B3
MP2c
B3LYPc
PM6
DFTB3/MIO
DFTB3/3OB
G3B3//DFTB3d
[Mg(H2O)2]2+
73.3
–3.5
–0.1
–10.9
–3.4
+0.5
+0.6
[Mg(H2O)3]2+
59.2
–2.0
–1.1
–11.7
–1.5
+1.4
–0.3
[Mg(H2O)4]2+
49.1
–2.3
–3.0
–11.0
–3.3
–1.9
+1.2
[Mg(H2O)5]2+
34.7
–2.2
–5.0
–7.4
–6.7
–6.4
–0.2
[Mg(H2O)6]2+
32.9
–2.5
–5.9
–11.0
–6.8
–6.4
+0.4
[Mg(CO)2]2+
46.3
–0.9
–0.3
+10.0
–2.9
–3.8
–0.6
[Mg(CO)3]2+
36.1
+0.5
–0.9
+1.3
–0.3
–2.3
–0.7
[Mg(CO)4]2+
30.9
–0.1
–2.8
–1.5
–0.1
–2.6
–0.9
[Mg(NH3)2]2+
83.9
–3.4
–0.3
–4.4
–13.8
–6.4
+0.5
[Mg(NH3)3]2+
63.1
–1.8
–2.0
–9.2
–7.5
–1.4
+0.2
[Mg(NH3)4]2+
50.1
–1.8
–3.7
–8.8
–3.9
+1.2
+0.2
[Mg(NH3)5]2+
28.6
–1.0
–5.3
–2.9
–5.8
–3.5
–0.2
[Mg(NH3)6]2+
26.3
–1.5
–6.2
–5.5
–5.6
–2.5
–0.2
[Mg(SH2)2]2+
62.4
–3.0
–0.3
–7.0
–7.9
–2.9
–1.2
[Mg(SH2)3]2+
43.9
–1.1
–2.6
–12.1
–4.5
+0.1
–0.6
[Mg(SH2)4]2+
34.5
–0.8
–4.4
–13.3
–2.8
+1.2
+1.7
[Mg(SH2)5]2+
20.0
+0.2
–6.1
–12.6
+0.0
+1.1
–0.8
[Mg(SH2)6]2+
20.7
–0.5
–7.6
–14.0
–1.8
–0.1
+0.5
[Mg(PH3)2]2+
66.9
–2.3
–0.5
–2.4
–8.6
–0.1
+0.7
[Mg(PH3)3]2+
44.5
–0.8
–2.9
–18.6
–5.4
+2.3
+0.7
[Mg(PH3)4]2+
34.8
–0.6
–4.5
–17.9
–3.8
+3.2
+0.6
[Mg(OH)2]
243.1
–7.2
–4.6
–7.1
–21.0
–14.0
+0.1
[Mg(NH2)2]
237.9
–7.3
–6.6
–4.6
–9.4
–12.0
+0.9
[Mg(SH)2]
201.2
–3.1
–5.0
–30.3
–16.1
–15.8
+0.1
[Mg(PH2)2]
188.8
–1.8
–5.2
+11.0
–20.9
–13.3
+0.6
MAD
2.1
3.5
9.9
6.6
4.3
0.6
MAX
7.3
7.6
30.3
21.0
15.8
1.7
Energies except PM6 are calculated
at 0 K excluding zero-point energy and thermal corrections. All numbers
are given in kcal/mol.
The
nondissociated is listed.
Basis set aug-cc-pVTZ.
G3B3 single-point calculations on
top of DFTB3/3OB geometries.
Proton Affinities
Table 5 lists
the proton affipan class="Chemical">nities for 22 biological relevant molecules n class="Chemical">Mg2+(XH) where X = O, n class="Chemical">N, S, and P obtained from G3B3, DFTB3/3OB, and other
methods. MADs for MP2 and B3LYP are generally small (∼1 kcal/mol).
The discrepancy between DFTB3/3OB and G3B3 is 5 kcal/mol, whereas
DFTB3/MIO and PM6 yield slightly larger MADs of 7.8 and 6.6 kcal/mol,
respectively. We note that G3B3 single points at DFTB3/3OB optimized
structures lead to small MAD of 1.5 kcal/mol and MAX of 3.9 kcal/mol
(due to the unsatisfactory deprotonated [Mg(SH2)]2+ structures) compared to the original
G3B3, again highlighting the good quality of DFTB3/3OB structures.
Future improvements should focus on the nitrogen-containing species,
an issue we noted in previous studies.[75,77]
Table 5
Gas-Phase Proton Affinities of Magnesium-Containing
Molecules Compared to G3B3a
moleculeb
G3B3
MP2c
B3LYPc
PM6
DFTB3/MIO
DFTB/3OB
G3B3//DFTB3d
[Mg(H2O)1]2+
95.6
+0.6
–3.5
+0.1
–5.5
–13.9
–0.8
[Mg(H2O)2]2+
115.6
+0.0
–1.4
–5.4
–2.2
–9.0
–0.9
[Mg(H2O)3]2+
133.1
+0.6
+0.2
–7.1
+3.0
–1.2
–0.4
[Mg(H2O)4]2+
149.2
–0.6
–0.1
–11.4
+4.8
+2.3
+0.8
[Mg(H2O)5]2+
160.3
–2.2
–1.9
–15.7
+3.2
+2.0
+1.1
[Mg(H2O)6]2+
170.5
–3.0
–2.2
–24.3
+0.2
–0.2
+0.8
[Mg(NH3)1]2+
120.2
+1.5
–3.5
+2.6
–10.5
–15.7
+0.0
[Mg(NH3)2]2+
144.1
+0.8
–0.4
+1.5
–10.2
–14.1
+0.2
[Mg(NH3)3]2+
163.9
+0.5
+0.3
+1.9
–10.6
–13.6
+0.5
[Mg(NH3)4]2+
181.3
–0.1
+0.6
+1.0
–6.4
–7.5
+1.0
[Mg(NH3)5]2+
190.5
–0.4
+0.9
+4.9
–5.2
–6.1
+1.4
[Mg(NH3)6]2+
200.7
+1.8
–0.5
+5.0
–6.2
–5.4
+2.3
[Mg(SH2)1]2+
83.4
+2.1
–1.8
+2.4
–6.7
–0.7
–3.6
[Mg(SH2)2]2+
107.0
+0.8
+0.5
+4.6
–6.2
+0.8
–2.9
[Mg(SH2)3]2+
123.1
+0.4
+1.5
+3.3
–6.2
+2.1
–3.5
[Mg(SH2)4]2+
135.4
+0.0
+2.0
+1.0
–5.7
+4.2
–1.7
[Mg(SH2)5]2+
143.1
–0.2
+2.0
–5.7
–6.9
+3.9
–3.9
[Mg(SH2)6]2+
151.1
–0.6
+1.8
–12.4
–8.7
+3.5
–3.0
[Mg(PH3)1]2+
96.8
+4.1
–1.8
+12.6
–12.8
–1.1
+0.6
[Mg(PH3)2]2+
129.4
+1.7
+0.4
+10.2
–15.3
–0.4
+1.4
[Mg(PH3)3]2+
150.1
+1.0
+1.2
+8.2
–17.6
–1.7
+1.4
[Mg(PH3)4]2+
164.2
+0.7
+1.7
+3.0
–18.1
–0.7
+1.2
MAD
1.1
1.4
6.6
7.8
5.0
1.5
MAX
4.1
3.5
24.3
18.1
15.7
3.9
Energies except PM6 are calculated
at 0 K excluding zero-point energy and thermal corrections. All numbers
are given in kcal/mol.
The
protonated form is listed.
Basis set aug-cc-pVTZ.
G3B3 single-point calculations on
top of DFTB3/3OB geometries.
Energies except papan class="Chemical">n class="Chemical">PM6 are calculated
at 0 K excluding zero-poinpan>t energy and thermal n class="Chemical">corrections. All numbers
are given in kcal/mol.
The
nondissociated is listed.Basis set aug-cc-pVTZ.G3B3 single-point calculations on
top of n class="Chemical">DFTB3/3OB geometries.
Energies except papan class="Chemical">n class="Chemical">PM6 are calculated
at 0 K excluding zero-poinpan>t energy and thermal n class="Chemical">corrections. All numbers
are given in kcal/mol.
The
protonated form is listed.Basis set aug-cc-pVTZ.G3B3 single-point calculations on
top of n class="Chemical">DFTB3/3OB geometries.
Large Test Set
To further test n class="Chemical">DFTB3pan>/3OB parameters
we have n class="Chemical">compiled a “ln class="Chemical">arge test set” that models biologically
relevant ligands of magnesium with larger molecules than those discussed
above. The set mainly consists of model magnesiumcompounds that are
simplified active sites of crystal structures collected in a survey.[127] Although Mg2+ is generally 6-coordinated,
complexes with coordination number of 4 and 5 are also investigated.
All complexes are constructed in the way that all ligands are coordinated
to the central magnesium ion rather than via hydrogen bonds. CH3COO–/HCOO– is used to
model the Asp/Glu side chain, CH3OH to represent the Ser/Thr
side chain, and OCHNH2 to represent carbonyls. In total,
the set contains 58 magnesiumcompounds (see Supporting
Information).
n class="Chemical">Metalpan>–ligand distances calculated
with n class="Chemical">DFTB3/3OB deviate only by 0.018 Å on average and 0.125 Å
at most (Table 6). The case with the largest
error is the Mg–S distance in [n class="Chemical">Mg(SH2)4(SH)]−. Besides this outlier, all other metal–ligand
length errors are substantially smaller within 0.1 Å. DFTB3/3OB
outperforms DFTB3/MIO in the large test set. B3LYP with the small
basis set 6-31+G(d,p) deviates on average by 0.011 Å, which is
comparable with DFTB3/3OB, and the maximum deviation is 0.033 Å,
which is much better than 3OB. PM6 strongly underestimates Mg–O
distance by 0.04 Å. More statistics including mean absolute deviations
for different metal–ligand types are given in the Supporting Information. For the angle, the MADs
are 3.1, 4.5, 0.7, and 11.2°, respectively, for DFTB3/3OB, DFTB3/MIO,
B3LYP/6-31+G(d,p), and PM6. In the O–Mg–O angles calculated
with DFTB3/3OB, about 6% out of 505 angles deviate more than 10°;
the large errors stem mainly from erroneous orientation of hydroxide.
Table 6
Mean and Maximum Absolute Deviation
of Metal–Ligand Lengths and Angles in Comparison to B3LYP/aug-cc-pVTZ
for the Magnesium Large Test Set
propertya
Nb
B3LYPc
PM6
DFTB3/MIO
DFTB3/3OB
r (Å)
329
0.011
0.063
0.041
0.018
rmax (Å)
0.033
1.414
0.481
0.125
a (deg)
658
0.7
11.2
4.5
3.1
amax (deg)
17.4
64.0
37.5
30.8
Max stands for maximum absolute
deviation.
Number of comparisons.
Basis set 6-31+G(d,p).
Due to the relatively high affinity of papan class="Chemical">n class="Chemical">ATP/n class="Chemical">ADP to Mg2+ and the importance of thismetal chelate in the ATP-binding enzyme,
we have constructed model molecules involving Mg and phosphate/model
AT(D)P as an additional test (Figure 1). The
precise binding mode of Mg2+ to ATP remains unclear despite
extensive spectroscopic studies. NMR studies proposed several models
for the Mg·ATPcomplex: β-monodentate,[128] β,γ-bidentate,[129] α,β,γ-tridentate,[130,131] and a mixture
of α,β-, β,γ-, and α,γ-bidentate.[132] Raman and infrared spectra suggested a mixture
of β,γ-bidentate and α,β,γ-tridentate.[133] Here two stable binding modes are tested, one
with Mgcoordinating β- and γ-phosphates and the other
with Mgcoordinating α-, β-, and γ-phosphates.[134] Geometries are described very well by DFTB3/3OB.
The Mg–O distance agrees on average within 0.04 Å with
B3LYP/aug-cc-pVTZ calculations. The O–Mg–O, O–P–O,
and P–O–P angles differ on average 5, 1, and 7°,
respectively.
Figure 1
Optimized structures of Mg·AT(D)P·water
model molecules
at different QM levels. Distances are given in angstroms. The numbers
without parentheses were obtained at the B3LYP aug-cc-pVTZ level;
those in the parentheses are DFTB3 values. The B3LYP optimized structures
are depicted in colors, whereas DFTB3 optimized structures are colored
in blue.
Max stands for maximum absolute
deviation.n class="Chemical">Number of n class="Chemical">comparisons.
Basis set 6-31+G(d,p).Optimized structures of n class="Chemical">Mgpan>·AT(D)P·n class="Chemical">water
model molecules
at different QM levels. Distances are given in angstroms. The numbers
without parentheses were obtained at the B3LYP aug-cc-pVTZ level;
those in the parentheses are n class="Chemical">DFTB3 values. The B3LYP optimized structures
are depicted in colors, whereas DFTB3 optimized structures are colored
in blue.
Ligand dissociatiopan class="Chemical">n energies are
compared in Table 7 (for detailed data, see Supporting Inclass="Chemical">pan>formation). One neutral n class="Chemical">oxygen ligand
is removed from the magnesiumcomplexes
selected from the large test set. Because the reactions are chosen
somewhat arbitrarily, the statistics given in Table 7 have to be n class="Chemical">considered with care. MADs are reasonably small
for all semiempirical methods; the MAD for DFTB3/3OB is 4.6 kcal/mol,
even smaller than B3LYP/aug-cc-pVTZ with 5.3 kcal/mol. Interestingly,
DFTB3/3OB and B3LYP turn out to systematically underestimate the dissociation
energy of the oxygen ligand; the errors seem to be inherited from
the sBDE reactions where one water molecule is taken away from [Mg(H2O)5]2+ or [Mg(H2O)6]2+. Further fitting Ud or
εp cannot improve these reaction energies significantly.
To demonstrate the good performance of DFTB3 for structural properties,
we have carried out single-point G3B3 energies at DFTB3 geometries,
and the MAD is only 0.8 kca/mol.
Table 7
Error Statistics for the Ligand Dissociation
Energies and Proton Affinities of Magnesium Containing Large Molecules
Compared to G3B3a
B3LYPb
PM6
DFTB3/MIO
DFTB3/3OB
G3B3//DFTB3c
Ligand Dissociation
Energies
MAD
5.3
5.1
3.5
4.6
0.8
MAX
7.9
13.0
6.6
7.7
2.2
Proton Affinities
MAD
1.9
17.2
2.7
2.6
0.9
MAX
4.8
25.0
8.0
6.1
2.3
For detailed data, see Supporting Information.
Basis set aug-cc-pVTZ.
G3B3 single-point calculations
on
top of DFTB3/3OB geometries.
Protopan class="Chemical">n affinities are also
n class="Chemical">compiled in Table 7. The MAD for n class="Chemical">DFTB3/3OB
is 2.6 kcal/mol, similar to B3LYP/aug-cc-pVTZ
with 1.9 kcal/mol. Despite the structural difference for the deprotonated
structures mentioned above, the G3B3 single-point calculations at
DFTB3/3OB geometries show excellent statistics for proton affinities,
demonstrating the strength of DFTB3/3OB in reproducing B3LYP/aug-cc-pVTZ
geometries or at least reproducing relative energies for multiple
points on the potential energy surface. Similar to findings from our
previous studies,[78] PM6 appears to be less
reliable for proton affinities with large errors.
For detailed data, see Supporting Information.Basis set aug-cc-pVTZ.G3B3 single-point calculations
on
top of n class="Chemical">DFTB3/3OB geometries.
Zinc
Typical biological ligands for papan class="Chemical">n class="Chemical">Zn2+ inpan>clude n class="Chemical">water, n class="Chemical">cysteine, histidine, and glutamic/aspartic acids.
Therefore, we performed calculations for a series of complexes that
involve the zinc ion and small molecule models for these ligands.
Similar to what has been done for magnesium, we have compiled a test
set in the form of Zn2+[X] with X = H2O, NH3, SH2, PH3, CO, NH=CH2, and SH–CH3 for n = 1–4/6 and X = OH–, SH–, NH2–, PH2–, NCH2–, and SCH3– for n = 1–2. Here two extra ligands NH=CH2 and SH–CH3 (and their deprotonated form)
are added to model histidine and cysteine side chains. The structural
comparisons are given in Table 8.
Table 8
Mean and
Maximum Absolute Deviation
of Metal–Ligand Lengths and Angles in Comparison to B3LYP/aug-cc-pVTZ
for the Zinc Test Set
propertya
Nb
MP2c
B3LYPd
PM6
DFTB3/MIOe
DFTB3/3OB
r (Å)
163
0.017
0.004
0.065
0.027
0.012
rmax (Å)
0.077
0.011
0.553
0.139
0.091
a (deg)
180
0.6
0.2
4.2
3.0
2.3
amax (deg)
4.3
4.2
64.0
16.2
14.6
Max stands for maximum absolute
deviation.
Number of comparisons.
Basis set aug-cc-pVTZ.
Basis set 6-31+G(d,p).
Molecules containing PH2– or
H– do not converge and are excluded from statistics.
The
n class="Chemical">DFTB3pan>/3OB results are tested against B3LYP/aug-cc-pVTZ for geometries
and enpan>ergetics. Overall good geometries are obtained from n class="Chemical">DFTB3/3OB
with metal–ligand distance MAD of 0.012 Å and angle MAD
of 2.3°, which are notable improvements over DFTB3/MIO. The largest
deviation 0.091 Å of n class="Chemical">metal–ligand length in DFTB3/3OB
is for [Zn(N=CH2)]+. The distance between
Zn and N is underestimated by ∼0.1 Å if r(ZnN) is shorter than 1.95 Å due to the imperfect Zn–N
repulsive potential at the short-range. Similar to cases for magnesium,
angles Zn–S–H are uniformly shifted up by ∼10°
in complexes containing ligand SH2 or SH–CH3 in DFTB3/3OB optimized structures. B3LYP/6-31+G(d,p) is structurally
very similar to B3LYP/aug-cc-pVTZ with a distance MAD of merely 0.004
Å. Nevertheless, MP2/aug-cc-pVTZ is rather different from B3LYP/aug-cc-pVTZ
with a MAD of 0.017 Å; metal–ligand distances between
zinc and C, N, O, P, S are uniformly underestimated by 0.02–0.03
Å. Molecular geometries are described rather poorly by PM6 in
our test set; for example, the metal–ligand lengths between
zinc and phosphorus deviate on average 0.3 Å. More statistics
for different metal–ligand types are given in the Supporting Information.
Max stands for maximum absolute
deviation.n class="Chemical">Number of n class="Chemical">comparisons.
Basis set aug-cc-pVTZ.Basis set 6-31+G(d,p).Molecules n class="Chemical">containpan>inpan>g PH2– or
H– do not pan> class="Chemical">converge and are excluded from statistics.
Sequential Ligand Dissociation
Energies
Sequential
ligapan class="Chemical">nd dissociation energies for zinc species with neutral and negatively
charged ligands are summarized in Table 9.
n class="Chemical">DFTB3/3OB shows a MAD of 3.6 kcal/mol compared to B3LYP/aug-cc-pVTZ.
Again, deviations close to or greater than 10 kcal/mol are found for
charged ligands. The difference between the small and large basis
set in B3LYP is rather small, and even charged ligands produce errors
no larger than 2.7 kcal/mol, indicating B3LYP with small basis set
may be useful in ab initio QM/MM calculations. There
is a slightly large discrepancy between MP2 and B3LYP; it may be attributed
to the differences in structures between these two methods as discussed
above. PM6 shows large deviations for sBDE with a MAD of 16.8 kcal/mol.
With the unphysical Zn–P geometries removed from statistics,
the MAD for PM6 drops to 9.1 kcal/mol, still substantially larger
than that for DFTB3. We have also performed single-point calculations
of B3LYP/aug-cc-pVTZ on top of DFTB3 geometries. The results are remarkably
good with a MAD of only 0.5 kcal/mol.
Table 9
Sequential
Ligand Dissociation Energies
of Zinc Containing Molecules Compared to B3LYP/aug-cc-pVTZa
moleculeb
B3LYPc
MP2c
B3LYPd
PM6
DFTB3/MIO
DFTB3/3OB
B3LYP//DFTB3e
[Zn(H2O)2]2+
89.7
–2.9
+2.0
–15.7
–5.2
+4.3
+0.2
[Zn(H2O)3]2+
56.4
+1.2
+1.5
+1.4
–1.1
–2.2
–0.1
[Zn(H2O)4]2+
43.0
+2.2
+1.9
+3.2
–2.0
–0.9
+0.8
[Zn(H2O)5]2+
25.3
+3.2
+2.6
+1.8
–3.8
–1.7
–0.4
[Zn(H2O)6]2+
23.4
+3.7
+2.3
–3.2
–3.9
–1.5
+0.3
[Zn(CO)2]2+
68.2
+0.4
–0.4
+28.0
–5.8
–3.7
+0.4
[Zn(CO)3]2+
40.3
+3.6
+0.1
+15.3
+1.2
–7.3
+0.3
[Zn(CO)4]2+
30.6
+4.5
+0.3
+16.9
+3.9
–4.6
–0.2
[Zn(NH3)2]2+
108.3
–0.6
+1.9
–8.1
–7.8
–4.3
+0.4
[Zn(NH3)3]2+
59.7
+2.9
+2.4
+8.0
+0.2
–2.6
+0.3
[Zn(NH3)4]2+
43.8
+3.6
+2.0
+9.4
+4.1
+2.8
+0.1
[Zn(NH3)5]2+
15.3
+3.5
+2.1
+1.7
–2.7
+3.2
–0.2
[Zn(NH3)6]2+
13.1
+4.3
+1.8
+1.4
–2.3
+4.1
–0.4
[Zn(SH2)2]2+
84.4
+0.0
+0.5
–5.8
–2.0
+1.0
–1.4
[Zn(SH2)3]2+
42.9
+3.9
+0.5
–1.6
+3.5
+1.8
–0.1
[Zn(SH2)4]2+
29.1
+5.2
+1.2
+0.4
+4.8
+4.9
+2.0
[Zn(PH3)2]2+
89.4
+1.9
+1.1
+55.7
–7.1
+1.2
+0.7
[Zn(PH3)3]2+
43.0
+4.9
+0.8
+51.9
+0.6
+3.7
+0.6
[Zn(PH3)4]2+
30.5
+7.0
+0.9
+65.9
+3.0
+5.5
+0.5
[Zn(NH=CH2)2]2+
111.9
+0.5
+0.9
–2.2
–1.7
+0.1
+0.1
[Zn(NH=CH2)3]2+
60.1
+4.4
+1.2
+10.5
+1.3
–1.5
–0.4
[Zn(NH=CH2)4]2+
43.5
+5.9
+1.1
+9.8
+3.7
+2.8
+0.4
[Zn(SH–CH3)2]2+
92.2
+2.1
+0.1
–8.2
–5.8
–2.8
–0.3
[Zn(SH–CH3)3]2+
46.0
+6.4
+0.5
–1.2
+0.5
–0.4
+0.4
[Zn(SH–CH3)4]2+
31.1
+8.1
+1.1
–0.1
+2.3
+2.9
+0.3
[Zn(OH)2]
253.8
+1.9
+2.7
+12.0
–15.1
–3.4
+0.5
[Zn(NH2)2]
248.6
+4.9
+2.6
+15.9
+11.5
–4.5
+3.1
[Zn(SH)2]
213.2
+6.1
+0.2
–22.1
–6.7
–9.1
–0.2
[Zn(PH2)2]
199.3
+7.1
+0.8
+108.8
-f
–11.7
+0.9
[Zn(OCH3)2]
235.4
+7.6
+2.2
+6.9
–10.0
–0.5
+0.2
[Zn(NCH2)2]
220.6
+10.9
+0.6
+25.7
+13.8
+8.0
+1.1
[Zn(SCH3)2]
207.4
+8.2
+0.5
–18.7
–7.8
–7.5
+0.0
MAD
4.2
1.3
16.8
4.7
3.6
0.5
MAX
10.9
2.7
108.8
15.1
11.7
3.1
Energies except PM6 are calculated
at 0 K excluding zero-point energy and thermal corrections. All numbers
are given in kcal/mol.
The
nondissociated is listed.
Basis set aug-cc-pVTZ.
Basis set 6-31+G(d,p).
B3LYP/aug-cc-pVTZ single-point calculations
on top of DFTB3/3OB geometries.
Molecule does not converge and is
excluded from the statistics.
Table 10 shows
the protonatiopan class="Chemical">n energies calculated for 24 biologically relevant ligand
model molecules obtained using B3LYP, n class="Species">MP2, and semiempirical methods.
As expected, B3LYP/6-31+G(d,p) performs very well n class="Chemical">compared to B3LYP
with a large basis set. The MAD for DFTB3is 4.2 kcal/mol, which is
substantially better than DFTB3/MIO and PM6; the errors are small
enough to render this approach useful. High-level single-point correction
at DFTB3/3OB geometries once again delivers excellent proton affinities
except the ones involving the SH– ligand whose geometries
are less well described as discussed above.
Table 10
Gas-Phase Proton Affinities of Zinc
Containing Molecules Compared to B3LYP/aug-cc-pVTZa
moleculeb
B3LYPc
MP2c
B3LYPd
PM6
DFTB3/MIO
DFTB/3OB
B3LYP//DFTB3e
[Zn(H2O)1]2+
61.7
+5.4
–0.6
+7.8
–1.2
–7.7
+0.8
[Zn(H2O)2]2+
94.2
+0.6
–0.3
–2.5
–1.1
–2.8
+0.5
[Zn(H2O)3]2+
121.3
–0.7
–0.5
–9.6
+1.9
–1.2
+0.1
[Zn(H2O)4]2+
141.5
–1.6
–0.7
–16.9
+2.3
–1.3
+0.7
[Zn(NH3)1]2+
86.9
+4.1
+0.1
+6.5
–13.6
–10.8
+1.8
[Zn(NH3)2]2+
127.3
–0.8
+0.7
–2.9
–18.5
–10.7
+0.6
[Zn(NH3)3]2+
155.0
–1.7
+1.0
–9.7
–18.0
–10.6
+0.2
[Zn(NH3)4]2+
178.1
–2.2
+1.0
–4.8
–10.3
–8.4
+0.7
[Zn(SH2)1]2+
56.6
+1.6
–1.7
+16.1
–13.0
–5.4
–4.0
[Zn(SH2)2]2+
93.2
–2.0
–1.5
+12.5
–10.6
–0.2
–4.0
[Zn(SH2)3]2+
115.0
–2.6
–2.0
+7.7
–9.8
–0.5
–4.0
[Zn(SH2)4]2+
131.1
–3.0
–2.2
+6.0
–9.1
–1.0
–2.1
[Zn(PH3)1]2+
66.9
+3.2
–0.7
+21.8
-f
–1.5
+0.3
[Zn(PH3)2]2+
115.6
–0.2
–0.3
–5.7
-f
–0.9
+0.8
[Zn(PH3)3]2+
140.8
–1.0
–0.1
–30.0
-f
–2.7
+0.7
[Zn(PH3)4]2+
158.0
–1.6
+0.1
+14.3
–18.1
–3.2
+1.0
[Zn(NH=CH2)1]2+
89.1
+6.1
–0.9
+2.3
–16.2
–2.8
+1.1
[Zn(NH=CH2)2]2+
140.9
–0.4
–0.2
–12.6
–14.1
–2.1
+0.4
[Zn(NH=CH2)3]2+
168.6
–1.5
+0.1
–18.1
–14.2
–2.2
+0.0
[Zn(NH=CH2)4]2+
188.1
–2.1
–0.1
–14.3
–14.8
–1.6
+2.4
[Zn(SH–CH3)1]2+
74.2
+0.3
–2.5
+12.1
–4.6
–8.3
–2.5
[Zn(SH–CH3)2]2+
114.9
–3.0
–2.3
+5.6
–10.0
–4.2
–2.1
[Zn(SH–CH3)3]2+
136.7
–4.1
–2.4
+1.1
–10.5
–4.8
–1.7
[Zn(SH–CH3)4]2+
152.6
–6.0
–2.2
+0.3
–2.6
–5.9
–1.4
MAD
2.3
1.0
10.1
10.2
4.2
1.4
MAX
6.1
2.5
30.0
18.5
10.8
4.0
Energies except PM6 are calculated
at 0 K excluding zero-point energy and thermal corrections. All numbers
are given in kcal/mol.
The protonated form is listed.
Basis set aug-cc-pVTZ.
Basis set 6-31+G(d,p).
B3LYP/aug-cc-pVTZ single-point
calculations on top of DFTB3/3OB geometries.
Molecule does not converge and
is excluded from the statistics.
Energies except papan class="Chemical">n class="Chemical">PM6 are calculated
at 0 K excluding zero-poinpan>t energy and thermal n class="Chemical">corrections. All numbers
are given in kcal/mol.
The
nondissociated is listed.Basis set aug-cc-pVTZ.Basis set 6-31+G(d,p).B3LYP/aug-cc-pVTZ single-point calculations
on top of n class="Chemical">DFTB3/3OB geometries.
Molecule does not n class="Chemical">converge and is
excluded from the statistics.
Energies except papan class="Chemical">n class="Chemical">PM6 are calculated
at 0 K excluding zero-poinpan>t energy and thermal n class="Chemical">corrections. All numbers
are given in kcal/mol.
The protonated form is listed.Basis set aug-cc-pVTZ.Basis set 6-31+G(d,p).B3LYP/aug-cc-pVTZ single-point
calculations on top of n class="Chemical">DFTB3/3OB geometries.
Molecule does not n class="Chemical">converge and
is excluded from the statistics.
Mixed Ligands
One goal to developpapan class="Chemical">n class="Chemical">DFTB3/3OB is to describe
enpan>zyme active sites. n class="Chemical">Zn-n class="Chemical">containing structures are mainly tetrahedrally
bound in metalloenzymes. The most abundant primary ones are CCCC,
zinc ion with four coordinating cysteines, followed by CCCH, CCHH,
CHHH, HHHH, HHHO, HHOO, HOOO, HHHD, and HHDD, where O stands for water
and the remaining are 1-letter amino acid codes.[12] In this subsection, we compare geometries and energetics
for 43 zinc complexes with mixed types of simple ligands (including
the deprotonated form) to model representative active sites in proteins.
Here, H2O is used to represent the Asp/Glu side chain,
NH3 to represent the His side chain, and H2S
to represent the Cys side chain. More complicated/realistic ligands
are discussed in the next subsection on “Large Test Set”. Table 11 summarizes
the geometrical properties. The MADs in comparison to B3LYP/aug-cc-pVTZ
for 161 metal–ligand lengths are 0.005, 0.019, 0.053, and 0.118
Å for B3LYP/6-31+G(d,p), DFTB3/3OB, DFTB3/MIO, and PM6, respectively.
Metal–ligand angles on average deviate by 0.9, 2.6, 3.0, and
6.7°. Again, the 3OB set is a major improvement over the MIO
set in metal–ligand lengths. PM6 is not able to give reliable
geometries due to a large number of unreasonable metal–ligand
distances between zinc and other main group elements.
Table 11
Mean and Maximum Absolute Deviation
of Metal-Ligand Lengths and Angles in Comparison to B3LYP/aug-cc-pVTZ
for Zinc Mixed Ligand Test Set
propertya
Nb
B3LYPc
PM6
DFTB3/MIOd
DFTB3/3OB
r (Å)
161
0.005
0.118
0.053
0.019
rmax (Å)
0.027
0.514
0.179
0.070
a (deg)
231
0.9
6.7
3.0
2.6
amax (deg)
22.2
57.2
21.6
24.7
max stands for maximum absolute
deviation.
number of comparisons.
basis set 6-31+G(d,p).
Molecules containing PH2– do not
converge and are excluded from statistics.
max stands for maximum absolute
deviation.number of n class="Chemical">comparisons.
basis set 6-31+G(d,p).Molecules n class="Chemical">containpan>inpan>g PH2– do not
pan> class="Chemical">converge and are excluded from statistics.
n class="Chemical">DFTB3pan>/3OB energetics are predicted with similar accuracy
as for
complexes with the same type of ligand as shown in Table 12 (for detailed data, see Supporting
Information). Once again, the reactions are collected somewhat
arbitrarily so the error anan class="Chemical">lysis needs to be treated with care. B3LYP/6-31+G(d,p)
gives small error compared to B3LYP with large basis set, demonstrating
basis set superposition error is not severe here. DFTB3 gives rather
good results in comparison with the reference (MAD of ∼3 kcal/mol),
especially after single-point energies using B3LYP/aug-cc-pVTZ (MAD
of 0.9 kcal/mol), demonstrating that the 3OB set is transferable from
one type ligand to complicated coordinating environment with multiple
types of ligands.
Table 12
Error Statistics for the Ligand Dissociation
Energies and Proton Affinities of Zinc Containing Mixed Ligand Molecules
Compared to B3LYP/aug-cc-pVTZa
B3LYPb
PM6
DFTB3/MIO
DFTB3/3OB
B3LYP//DFTB3c
Ligand Dissociation
Energies
MAD
1.9
8.2
3.1
3.2
0.9
MAX
2.8
31.6
6.8
6.5
2.6
Proton Affinities
MAD
0.9
10.4
5.5
3.8
0.9
MAX
1.9
21.7
11.3
7.1
2.4
For detailed data, see Supporting Information.
Basis set 6-31+G(d,p).
B3LYP/aug-cc-PVTZ single-point
calculations on top of DFTB3/3OB geometries.
For detailed data, see Supporting Information.Basis set 6-31+G(d,p).B3LYP/aug-cc-PVTZ single-point
calculations on top of n class="Chemical">DFTB3/3OB geometries.
To further test the
3OB parameter set,
we have chosen lpapan class="Chemical">n class="Chemical">arger molecules to model the biologically relevanpan>t
zincn class="Chemical">coordinating environment. In this “ln class="Chemical">arge test set”,
CH3COO–/HCOO– is used to model
the Asp/Glu acid side chain, imidazole/NCH2CH3 to represent His, and SHCH3 to model the Cys side chain.
Cys can adopt both protonated and deprotonated form in the active
site so that a random mixture of SHCH3 and SCH3– is
tested in modeling the Cyscontaining active site. In total we have
compiled 51 zinc molecules. Geometries are compared to B3LYP/aug-cc-pVTZ
optimized structures. B3LYP with the small basis set deviates on average
by only 0.005 Å, and the maximal deviation is only 0.063 Å
(Table 13). The MAD for metal–ligand
length from DFTB3/3OB optimized structures is 0.024 Å. MAD of
metal–ligand lengths Zn–O is 0.033 Å and Zn–S
is 0.027 Å; they appear to be worse than the small test set but
are good enough for modeling enzyme active sites. The largest deviations
for metal–ligand lengths are found in Zn–O distances
in [Zn(Im)(H2O)(SCH3)2] and [Zn(H2O)(SHCH3)(SCH3)2], which
are underestimated by 0.141 and 0.103 Å, respectively. All others
errors are smaller than 0.1 Å. The largest angle deviation 25.8°
occurs in angle Zn–N–C in [Zn(C3N2H3)]+ (i.e., Zn2+ bound to a single
deprotonated imidazole), which is not encountered often in biological
systems.
Table 13
Mean and Maximum Absolute Deviation
of Metal–Ligand Lengths and Angles in Comparison to B3LYP/aug-cc-pVTZ
for the Zinc Large Test Set
propertya
Nb
B3LYPc
PM6
DFTB3/MIO
DFTB3/3OB
r (Å)
173
0.005
0.128
0.053
0.024
rmax (Å)
0.063
0.955
0.177
0.141
a (deg)
236
0.6
5.3
4.0
3.3
amax (deg)
5.1
63.3
25.8
25.8
Max stands for maximum absolute
deviation.
Number of comparisons.
Basis set 6-31+G(d,p).
Max stands for maximum absolute
deviation.n class="Chemical">Number of n class="Chemical">comparisons.
Basis set 6-31+G(d,p).Ligand dissociatiopan class="Chemical">n energies are
compiled in Table 14 (for detailed data, see Supporting Inclass="Chemical">pan>formation). n class="Chemical">DFTB3 shows that removal
of neutral imidazole may cause an error
of more than 10 kcal/mol, indicating interactions betweenn class="Chemical">Zn–sp2 hybridized N are problematic. Nevertheless, when calculating
single-point energies using B3LYP/aug-cc-pVTZ the errors turn out
to be small with the MAD of merely 0.5 kcal/mol. Proton affinities
are also compared in Table 14. DFTB3/3OB shows
excellent results with a MAD of 3.3 kcal/mol, which is fairly remarkable
in this set of rather complicated molecules; single-point B3LYP calculations
lead to an even smaller MAD of 0.6 kcal/mol.
Table 14
Error
Statistics for the Ligand Dissociation
Energies and Proton Affinities of Zinc Containing Large Molecules
Compared to B3LYP/aug-cc-pVTZa
B3LYPb
PM6
DFTB3/MIO
DFTB3/3OB
B3LYP//DFTB3c
Ligand Dissociation
Energies
MAD
1.3
5.9
3.6
3.4
0.5
MAX
2.6
20.2
14.4
13.8
1.3
Proton Affinities
MAD
1.1
10.1
4.9
3.3
0.6
MAX
2.1
19.3
12.7
6.9
2.6
For detailed data, see Supporting Information.
Basis set 6-31+G(d,p).
B3LYP/aug-cc-PVTZ single-point
calculations on top of DFTB3/3OB geometries.
For detailed data, see Supporting Information.Basis set 6-31+G(d,p).B3LYP/aug-cc-PVTZ single-point
calculations on top of n class="Chemical">DFTB3/3OB geometries.
Dinuclear Mg/Zn Active Sites
An
importapan class="Chemical">nt catalytic
motif in metalloenpan>zymes is the bimetallic (or dinuclear) Mg/Zn site.[135,136] In this bimetallic motif, metal ions are commonly coordinated through
a monatomic bridging ligand like water/hydroxide or a polyatomic group
like carboxylate in a bidentate fashion. The two-metal-ion mechanism[5] was proposed for this type of motif, where one
metal ion actively participates in facilitating the dissociation of
the leaving group and stabilizing the pentavalent intermediate while
the second metal ion holds an important structural role during catalysis.
The advantage of bimetallic centers is the charge delocalization so
as to lower the barrier of binding large substrates and generating
a nucleophile.[137] To assess DFTB3/3OB for
these important catalytic motifs, we have constructed four model compounds
based on bimetallic Mg/Zn centers in enzymes (Figure 2). In our models, certain
ligands are protonated to prevent a high net charge in the gas phase.
Figure 2
Optimized structures of dinuclear Mg2+/Zn2+ complexes of the simplified protein active site (a) PDB
ID 1A49; (b)
PDB ID 4ECQ;
(c) PDB ID 3L4K; (d) PDB ID 1AMP. Distance are given
in angstroms. The numbers without parentheses are obtained at the
B3LYP 6-31+G(d,p) level; those with parentheses are DFTB3 values.
Figure 2(a) is a simplified active site
structure taken from papan class="Chemical">n class="Chemical">pyruvate kinpan>ase crystal structure (n class="Disease">PDB ID 1A49(138)). n class="Chemical">Pyruvate kinase is an enzyme of the glycolytic pathway
and phosphorylates ADP to generate ATP and pyruvate. This enzyme can
also utilize other divalent metal ions like Mn2+, Ni2+, and Zn2+. Figure 2(b)
is simplified from the polymerase η·DNA·dATPcomplex
crystal structure (PDB ID 4ECQ(139)) which is involved in
DNA synthesis. The formation of a phosphodiester bond is recently
captured by time-resolved X-ray crystallography.[139] In the (a)(b) active sites, one Mg2+ is tridentated
to α-, β-, and γ-oxygens in ATP. Figure 2(c) is modified from the type II topoisomerase (topoII)–DNA
cleavage complex crystal structure (PDB ID 3L4K(140)). This
enzyme cleaves and ligates DNA using tyrosine as the nucleophilic
agent. Following the nucleophilic attack toward DNA backbones, a covalent
phosphotyrosyl bond that links topoII and partial DNA is formed. It
was long believed that a catalytic reaction could be aided via a single
Mg2+. More recently, it was found that the canonical two-metal-ion
mechanism may be possible.[140,141] Therefore, two Mg2+ ions are modeled in the reactant state. Figure 2(d) is the active site in Aeromonas Protelytica
Aminopeptidase (AAP) (PDB ID 1AMP(142)), which is a prototypical
member of the bizinc enzyme family.
As shown ipan class="Chemical">n Figure 2, n class="Chemical">DFTB3 is tested against
B3LYP/6-31+G(d,p) optimized structures. The overall structures from
npan> class="Chemical">DFTB3 reproduce B3LYP structures very well, particularly the metal
ion–ligand distances. Mg–Mg/Zn–Zn distances are
also excellent with the largest deviation of 0.13 Å. The monodentate
or bridging binding modes of carboxylate are correctly described by
the 3OB parameter set in comparison to B3LYP. These calculations demonstrate
that DFTB3/3OB is useful for modeling bimetallic centers in enzymes
with good accuracy.
Optimized structures of dinuclear papan class="Chemical">n class="Chemical">Mg2+/npan> class="Chemical">Zn2+ complexes of the simplified protein active site (a) PDB
ID 1A49; (b)
PDB ID 4ECQ;
(c) PDB ID 3L4K; (d) PDB ID 1AMP. Distance are given
in angstroms. The numbers without parentheses are obtained at the
B3LYP 6-31+G(d,p) level; those with parentheses are DFTB3 values.
To further benchmark papan class="Chemical">n class="Chemical">DFTB3/3OB
inpan> describing the transition state
and reaction enpan>ergy barrier, we study an active site model of AP in
the gas phase. AP catalyzes the hydrolytic reactions of various n class="Chemical">phosphates
via a two-step mechanism: an oxygen neocleophile first attacks the
phosphorus, then a water (hydroxide) replaces the leaving group in
the subsequent step that is essentially the reverse of the first step.[143] In our previous work,[48] we studied the first step for the hydrolysis of a well-studied phosphate
diester MpNPP– in R166S AP and showed that the transition
states are synchronous in nature. To help understand the intrinsic
errors in DFTB3/3OB for the reactions of interest, we investigate
the active site model in the gas phase with QM calculations. Specifically
we study the same reaction for two additional n class="Chemical">phosphate diesters methyl
3-nitrophenyl phosphate (MmNPP–) and methyl phenyl
phosphate (MPP–) and compare the energetics and
nature of the transition state to MpNPP–. For simplicity,
we only study the α orientation of different substrates.
As shown ipan class="Chemical">n Figure 3, there is generally good agreement between n class="Chemical">DFTB3/3OB and
B3LYP/6-31+G(d,p) results in terms of geometries; the P–O distances,
n class="Chemical">Zn2+–Zn2+ distances, and coordination
structures are similar at the two QM computational levels for both
reactant and transition states. The Zn–O distance error is
within 0.03 Å and is a major improvement over MIO parameters
which systematically overestimate Zn–O distance by 0.1 Å
(MIO data not shown here). The Zn2+–Zn2+ distance is underestimated in DFTB3/3OB by ∼0.1 Å in
the reactant and is shrunk ∼0.1 Å more in the transition
state in comparison to B3LYP distances. In terms of the nature of
the transition state, both DFTB3 and B3LYP predict a synchronous type
of structure with the tightness coordinate (i.e., P–Oligand + P–Onucleophile) of 4.0–4.2 Å even
though the individual P–O bond is overestimated or underestimated
by 0.1 Å in DFTB3, indicating DFTB3 provides a good description
for the potential energy surface around the transition state for the
three substrates.
Figure 3
AP active site gas-phase model with MpNPP–, MmNPP–, and MPP–. Distances are given in
angstroms. The numbers without parentheses are obtained at the B3LYP
6-31+G(d,p) level; those in the parentheses are DFTB3 values. (a)
MpNPP– reactant state; (b) MpNPP– transition state; (c) MmNPP– reactant state; (d)
MmNPP– transition state; (e) MPP– reactant state; and (f) MPP– transition state.
Regarding the reactiopan class="Chemical">n energy barrier, Table 15 shows that the qualitative trend is n class="Chemical">consistenpan>t
among various
npan> class="Chemical">computational levels although there are substantial differences in
the quantitative values. Previous calculations[48] of proton affinities for the leaving group indicate that
MP2 gives results close to the experimental values, so MP2/6-311++G(d,p)
single-point energies are used as the reference here. Numbers span
a broad range with different DFT methods without dispersion, and the
results become much closer after adding the D3 dispersion; an explicit
consideration of dispersion is important for a large active site discussed
here. Compared to MP2, DFTB3 gives a reasonable reaction energy for
MpNPP– but underestimates the barrier of MmNPP– by ∼4 kcal/mol and overestimates the barrier
of MPP– by ∼9 kcal/mol. However, MP2 single-point
energies on top of DFTB3 and B3LYP/6-31+G(d,p) geometries differ by
∼1 kcal/mol for all the substrates, indicating DFTB3 produces
satisfactory geometries and also highlighting the importance of single-point
energy correction with a reliable ab initio method
at DFTB3 geometries. DFTB3 reaction barriers are close to B3LYP results
without D3 correction; this is not surprising because DFTB3 is parametrized
based on B3LYP (see Methods) and inherits
some errors. M06 behaves the best among three DFT functionals as compared
to MP2, especially with the D3 dispersion included. PBE systematically
underestimates the energy barrier. Further optimization of the D3
dispersion model developed for DFTB3[99] will
further improve DFTB3/3OB energies.
Table 15
Calculated Activation
Barrier (in
kcal/mol) for Diester Hydrolysis in the Active Site Model of AP in
the Gas Phasea
single
points at DFTB3 geometries
single
points at B3LYP geometries
substrate
DFTB3
B3LYP
M06
PBE
MP2
B3LYP
M06
PBE
MP2
MpNPP–
8.8
9.2 (4.2b)
5.7 (5.1)
5.7 (3.1)
7.6
12.0 (6.7)
6.8 (6.1)
7.2 (4.3)
8.8
MmNPP–
9.7
9.2 (9.4)
12.1 (12.4)
6.8 (7.5)
13.6
15.3 (12.0)
13.2 (12.8)
11.0 (9.3)
13.2
MPP–
20.0
16.2 (9.1)
12.4 (10.5)
11.6 (7.1)
9.6
20.1 (11.6)
13.0 (10.8)
14.0 (8.6)
10.4
The geometries are optimized at
either the DFTB3/3OB or B3LYP/6-31+G(d,p) level. Single-point energy
calculations are carried out at the B3LYP, M06, PBE, and MP2 method
at two levels of geometries using a large basis set of 6-311++G(d,p).
The numbers with parentheses
are
obtained with corresponding DFT functional plus D3 dispersion.
AP active site gas-phase model with papan class="Chemical">n class="Chemical">MpNPP–, n class="Chemical">MmNPP–, and MPP–. Distances are given in
angstroms. The numbers without parentheses are obtained at the B3LYP
6-31+G(d,p) level; those in the parentheses are DFTB3 values. (a)
MpNPP– reactant state; (b) MpNPP– transition state; (c) MmNPP– reactant state; (d)
MmNPP– transition state; (e) MPP– reactant state; and (f) MPP– transition state.
The geometries are optimized at
either the n class="Chemical">DFTB3pan>/3OB or B3LYP/6-31+G(d,p) level. Single-point enpan>ergy
calculations are carried out at the B3LYP, M06, n class="Chemical">PBE, and MP2 method
at two levels of geometries using a large basis set of 6-311++G(d,p).
The numbers with parentheses
are
obtained with n class="Chemical">correspondinpan>g DFT functional plus D3 dispersion.
Mg2+ and Zn2+ Ions
Solvation Structure
and Relative pKa
The n class="Chemical">metalpan> ion-O
radial distribution function (g(r)) and n class="Chemical">coordination numbers are shown in Figure 4 for the two cations
in solution. A well-defined first solvation shell is observed in both
cases. The first peak in DFTB3/MM hydrated Mg2+ occurs
at 2.07 Å, in good agreement with X-ray diffraction data of 2.09
Å ± 0.04 Å[144] and with previous
AIMD simulations in the range of 2.08–2.13 Å.[145−149] The MM force field predicts the Mg2+–O distance
to be shorter than the DFTB3/MM model with a sharper peak at 2.0 Å.
The inner n class="Chemical">coordination sphere remains similar to the gas-phase structure.
For the cluster [Mg(H2O)6]2+ in the
gas phase, the Mg2+–O distance is 2.10 Å in
both DFTB3 and B3LYP/aug-cc-pVTZ optimized structures. For Zn2+, the first maximum from DFTB3/MM is at 2.11 Å, consistent
with X-ray absorption measurements 2.06 ± 0.02 Å[150] and earlier AIMD results 2.07–2.13 Å.[145,146,151] The classical MM model gives
the first peak at 2.08 Å, although the peak is higher and narrower
than in the DFTB3/MM model. The small difference in the location of
the first peak in DFTB3/MM may be traced back to the gas-phase model:
for the cluster [Zn(H2O)6]2+ in the
gas phase, Zn2+–O is 2.16 Å in DFTB3 versus
2.12 Å in B3LYP/aug-cc-pVTZ.
Figure 4
Radial
distribution function of water oxygen around Mg2+/Zn2+ in 20 Å water droplet. The pure MM model is
based on the CHARMM force field.[105]
The integral of the first
solvatiopan class="Chemical">n shell for both cations gives the coordination npan>umber of
6. The ln class="Chemical">arge QM region results predict very similar first solvation
shell as with the small QM region, though they differ slightly for
the sen class="Chemical">cond solvation shell; the large QM region leads to a broader
peak for the second solvation shell than the small QM region and MM
model. It is worth noting that in the large QM region simulation of
Zn2+water exchanges between the first and second solvation
shells (coordination number remains the same), whereas no exchange
of water is seen in Mg2+ simulations. These observations
are in agreement with experimental findings; for Mg2+,
experimental results suggest that water molecules could reside in
the first shell up to a few microseconds,[152,153] while the water residence time for Zn2+ is only hundreds
of picoseconds.[154]
To further benchmark
the epan class="Chemical">nergetics of n class="Chemical">DFTB3/3OB, we use the n class="Chemical">DTSC-TI
approach to calculate the relative pKa of Mg2+/Zn2+-bound water with the DFTB3/MM
model. Altogether six intermediate states are used to obtain the ∂F/∂λ values. In each window, the data collected
during the first 250 ps are discarded as part of the equilibration
process, and the remaining 500 ps data are then subject to statistical
analysis to obtain the proper average of free energy derivative. The
statistical errors associated with the free energy are on the order
of 0.2–0.5 kcal/mol. In all cases, the linear dependence of
the free energy derivatives on λ holds well with a correlation
coefficient (R2) higher than 0.99 (see
Figure S1 in Supporting Information). Integrating
∂F/∂λ over λ gives a value
of 132.9 kcal/mol for Mg2+ with the small QM region, 124.1
kcal/mol for Zn2+ with the small QM region, 130.0 kcal/mol
for Mg2+ with the large QM region, and 125.5 kcal/mol for
Zn2+ with the large QM region, leading to a ΔpKa of 6.4 pKa units
with the small QM region and 3.3 pKa units
with the large QM region. The latter is in good agreement with the
experimental value of 3.2 pKa units (12.4
for a Mg2+ solution and 9.2 for hydrated Zn2+).[145,155] The better performance of the large QM region
results highlights the importance of using a QM description for the
second solvation shell, which allows coordination number fluctuation
in the Zn2+ solution, especially at the deprotonated state.
We have also calculated the absolute pKa of Mg2+/Zn2+-bound water as a further benchmark.
Compared to the experimental values, the closest computational estimate
from current work is 14.0 for Mg2+ and 10.4 for Zn2+ (see Supporting Information).
Radial
distribution function of n class="Chemical">water pan> class="Chemical">oxygen around Mg2+/Zn2+ in 20 Å water droplet. The pure MM model is
based on the CHARMM force field.[105]
Structural Flexibility
of the Active Site in Myosin
Myosin-II (thereafter referred
to as myosipan class="Chemical">n) is one of the best characterized
molecular motors, which n class="Chemical">couple ln class="Chemical">arge-scale conformation transitions
with ATP binding and hydrolysis. Despite many previous efforts from
both experimental and theoretical studies,[156−160] the complex nature of “mechanochemical coupling” still
remains to be well understood.
As shown ipan class="Chemical">n Figure 5(a), binding modes obtained
from the n class="Chemical">DFTB3/MM simulation reproduce the key features of the crystal
structure. In the discussion of our earlier work,[115] the weakenpan>ed inpan>teraction betweenpan> n class="Chemical">Mg2+-nucleotide
and Ser237 as reflected by the larger distance fluctuation was proposed
to indicate the destabilization of Switch I following n class="Chemical">ATP hydrolysis.
As a test of the DFTB3/3OB magnesium parameters in the enzyme, DFTB3/MM
equilibrium simulations are carried out with two nucleotide chemical
states (ATP or ADP·Pi), and the distances between Mg2+ and four nonwater ligands are monitored. As shown in Figures 5(b) and (c), the distances between Mg2+ and ligands are reasonably stable over 1 ns; the most obvious difference
observed between the two nucleotide states is the fluctuation of Mg2+–OSer237 distance, for which the average
and fluctuation are 2.16 and 0.08 Å in the ATP state and 2.24
and 0.11 Å in the ADP·Pi state. The magnitude of fluctuation
observed in current DFTB3/MM simulations is notably less compared
with previous work (MM simulations and SCC-DFTBPR/MM simulation, in
which the Mg–OSer237 distance reached ∼3.5
Å), which led to coordination number shifting from 6 to 5 after
ATP hydrolysis.[115] To further evaluate
the situation, PMFs are calculated for the distance between Mg2+ and OSer237 in the ATP and ADP·Pi states.
As seen in Figure 5(d), dissociation of Ser237
is indeed energetically more favorable in the ADP·Pi state than
the ATP state by a few kcal/mol. Nevertheless, there is a significant
energetic cost to fully dissociate the Ser237 side chain from the
Mg2+, even in the ADP·Pi state. Further experimental
analysis based on infrared spectroscopy can potentially generate additional
insights into this issue.[161]
Structural
properties of the Myosin active site duripan class="Chemical">ng equilibrium
QM/MM MD simulations with different nucleotide chemical states (n class="Chemical">ATP
or n class="Chemical">ADP·Pi). (a) Overlay of average DFTB3/MM structure and crystal
structure (gray) of Myosin-II motor domain active site in the ATP
state. The VO4 moiety in the crystal structure (PDB code 1VOM) is replaced by
a PO3 phosphate group and a presumptive lytic water molecule.
Instantaneous distances between Mg2+ and oxygen atoms in
its four nonwater ligands at (b) ATP state and (c) ADP·Pi state.
Ser237 refers to OSer237 and Thr186 refers to OThr186. (d) PMF calculations for the ATP/ADP·Pi states. The reaction
coordinate is the distance between Mg2+ and OSer237.
Force Fields Comparison
in AP Active Site
We have benchmarked
papan class="Chemical">n class="Chemical">DFTB3/3OB for the AP active site with a gas-phase model inpan> earlier
discussion. As an additional test, n class="Chemical">DFTB3/MM simulations for the R166S
AP enzyme are carried out, and the results are compared to two popular
MM force fields for zinc. The comparison of equilibrated reactant
structures by n class="Chemical">Coulomb,[14] SLEF1,[25] and DFTB3/MM is shown in Figure 6, with the crystal structure
with bound phosphate as a reference. Binding modes obtained from the
DFTB3/MM simulation reproduce the key features of the crystal structure
and previous SCC-DFTBPR/MM simulations[48] very well (shown in Figure 6(c) and (d)).
The conventional Coulomb scheme (Figure 6(a))
leads to a much shortened Zn2+–Zn2+ distance
and significant changes in the active site geometry: Ser102 and Asp327
are stabilized by two zinc ions simultaneously; Asp369 (not shown
in Figure 6 for clarity) is bidentate with
the Zn12+ in the Coulomb model but monodentate in the crystal
and DFTB3/MM structures. By comparison, coordination geometry is notably
improved by the SLEF1 force field (Figure 6(b)), even though the Zn2+–Zn2+ (3.7
Å) distance is still substantially shorter than that in the crystal
structure (4.3 Å). It is perhaps not surprising that SLEF1 does
not provide a correct distance between zinc ions as SLEF1 was not
parametrized for the binuclear zinc motif and designed to be compatible
with the Amber99SB force field rather than with the CHARMM force field.
Nevertheless, SLEF1 correctly describes the binding modes of Ser102
and Asp369 to the zinc ions; yet Asp327 still has a strong preference
to bridge the two zinc cations due to the short distance between them.
Moreover, an extra His372 is pulled closer to bind to Zn22+; this might have been caused by the stronger attractive interaction
between the zinc ion and nitrogen in the short-range part of the SLEF1
force field. In short, DFTB3/MM produces encouraging results for modeling
a challenging bimetallic zinc active site, demonstrating the greater
applicability of DFTB3/MM to metalloenzymes even when only structural
properties are considered.
Structural properties of the AP active site
duripan class="Chemical">ng equilibrium
QM/MM MD Simulations. A snapshot for the reactant state with (a) n class="Chemical">Coulomb
MM force field, (b) SLEF1 zinc MM force field, (c) n class="Chemical">DFTB3/3OB, and
(d) overlay of crystal and DFTB3/MM structure (blue), with average
key distances in angstroms labeled. Asp369, His370, and His412 are
omitted for clarity. Substrate inorganic phosphate in the crystal
structure (3CMR) is not shown.
Concluding Remarks
n class="Chemical">Magnesiumpan> and zinc play many important
roles in n class="Chemical">metalloenzymes
and various chemical systems. An efficient and accurate QM description
for Mg/Zn is critical for the use of theoretical methods to study
the reaction mechanism of these systems. In this work, we extend the
parametrization of the approximate density functional tight binding
theory, DFTB3, to magnesium and zinc. The parametrization is performed
in a framework n class="Chemical">consistent with the DFTB3/3OB set for CHNOPS; hence
this parameter set can be used to study the chemistry of Mg/Zn moieties
in solution and biological systems.
Benchmark calculatiopan class="Chemical">ns in
the gas phase and in the n class="Chemical">condenpan>sed phase
with a QM/MM framework demonstrate that the currenpan>t parametrizationpan>
generally leads to reliable structures and semiquantitative enpan>ergetics
(with typically a MAD of ∼3–5 kcal/mol compared to high
level ab initio calculations). In the gas phase,
DFTB3/3OB are tested with a diverse range of molecules that reflect
the typical coordination environment of magnesium and zinc in biomolecules,
including dinuclear metal sites. We focus on the geometries, ligand
dissociation energies, and ligand proton affinities. The results calculated
at the n class="Chemical">DFTB3/3OB level are compared to B3LYP, ab initio (G3B3, MP2), and a popular semiempirical method PM6 as well as the
older parametrization, MIO. In general, DFTB3/3OB shows substantial
improvement over DFTB3/MIO and outperforms PM6 in many aspects within
our test sets, particularly in terms of structural properties and
ligand proton affinities. Compared to high-level calculations, DFTB3/3OB
is very successful at predicting geometries. Large errors can still
be found in certain cases, especially with charged ligands. Even for
these cases, single-point energy calculations with high level QM methods
generally give very reliable energetics, suggesting that changes of
structures during typical processes of interest (e.g., ligation dissociation
or ligand deprotonation) are well captured with DFTB3/3OB.
Due
to n class="Chemical">copan>mputational efficienpan>cy and genpan>eral robustness for geometry
predictions, n class="Chemical">DFTB3/MM calculations are expected to be effective for
studies in condensed-phase systems. For example, our solution benchmark
study for the magnesium and zinc ion has reproduced experimental solvation
structures and relative pKa of metal bound
water, especially when both the first and second solvation shells
are treated as DFTB3/3OB. For the enzyme benchmark, n class="Chemical">DFTB3/3OB is successful
at producing active site structures in myosin and AP relative to available
crystal structures; for AP, DFTB3/3OB also leads to semiquantitative
energetics for an active site model, although our calculations using
several DFT functionals and MP2 highlight the importance of careful
calibration of energetics for large metal sites in enzymes, including
consideration of dispersion interactions. These studies have laid
the groundwork for using DFTB3/MM simulations with state-of-the-art
sampling techniques such as metadynamics[162] and finite-temperature string methods[163−165] to study the chemical processes in myosin and AP; high-level single-point
QM/MM calculations are still required for improving the quantitative
nature of the free energy surfaces. Finally, we emphasize that there
are still remaining limitations in DFTB3, such as the treatment of
interaction between Mg/Zn and highly charged/polarizable ligands (e.g.,
hydroxide and SH–), thus it is essential to continue
the formal development of DFTB3, such as adding multipole terms for
the charge fluctuations and formulating a better description of polarization
and short-range Pauli repulsions.