In protein-ligand binding, the electrostatic environments of the two binding partners may vary significantly in bound and unbound states, which may lead to protonation changes upon binding. In cases where ligand binding results in a net uptake or release of protons, the free energy of binding is pH-dependent. Nevertheless, conventional free energy calculations and molecular docking protocols typically do not rigorously account for changes in protonation that may occur upon ligand binding. To address these shortcomings, we present a simple methodology based on Wyman's binding polynomial formalism to account for the pH dependence of binding free energies and demonstrate its use on cucurbit[7]uril (CB[7]) host-guest systems. Using constant pH molecular dynamics and a reference binding free energy that is taken either from experiment or from thermodynamic integration computations, the pH-dependent binding free energy is determined. This computational protocol accurately captures the large pKa shifts observed experimentally upon CB[7]:guest association and reproduces experimental binding free energies at different levels of pH. We show that incorrect assignment of fixed protonation states in free energy computations can give errors of >2 kcal/mol in these host-guest systems. Use of the methods presented here avoids such errors, thus suggesting their utility in computing proton-linked binding free energies for protein-ligand complexes.
In protein-ligand binding, the electrostatic environments of the two binding partners may vary significantly in bound and unbound states, which may lead to protonation changes upon binding. In cases where ligand binding results in a net uptake or release of protons, the free energy of binding is pH-dependent. Nevertheless, conventional free energy calculations and molecular docking protocols typically do not rigorously account for changes in protonation that may occur upon ligand binding. To address these shortcomings, we present a simple methodology based on Wyman's binding polynomial formalism to account for the pH dependence of binding free energies and demonstrate its use on cucurbit[7]uril (CB[7]) host-guest systems. Using constant pH molecular dynamics and a reference binding free energy that is taken either from experiment or from thermodynamic integration computations, the pH-dependent binding free energy is determined. This computational protocol accurately captures the large pKa shifts observed experimentally upon CB[7]:guest association and reproduces experimental binding free energies at different levels of pH. We show that incorrect assignment of fixed protonation states in free energy computations can give errors of >2 kcal/mol in these host-guest systems. Use of the methods presented here avoids such errors, thus suggesting their utility in computing proton-linked binding free energies for protein-ligand complexes.
The
changes in the electrostatic environment that accompany binding
of small molecules, nucleic acids, or other proteins may thus induce
changes in the protonation states of titratable groups in the protein.[1−8] Recently, Aguilar et al. conducted a computational survey of various
protein–protein, protein–small molecule, and protein–nucleic
acid complexes to ascertain the prevalence of protonation change in
the protein receptor upon biomolecular association. Notably, in 60%
of the protein–small molecule complexes considered, at least
one titratable residue in the protein was found to assume different
protonation states in its free and bound states.[9] Furthermore, protonation changes that accompany small molecule
binding to proteins are not limited to the protein partner: an estimated
60–80% of orally administered drugs are weak acids or bases,
whose protonation states can also be tuned by the cellular pH and
electrostatic environment of their protein binding partners.[10−13] In cases where protein–ligand binding accompanies a net transfer
of protons to either binding partner, the binding process is pH-dependent;
i.e., the observed binding free energy is a function of pH.Conventionally, both computational docking and more rigorous free
energy computations, such as the thermodynamic integration (TI) and
free energy perturbation (FEP) methods, employ fixed protonation states
that are identical for free and bound states in the computation of
binding affinities. Clearly, in cases where ligand binding is linked
to the (un)binding of protons, such approximations will lead to error.
Improper assignment of protonation states in binding free energy computations
may result in significant errors, making correct assignment of pKa and protonation state essential to obtaining
accurate free energies.As stated, simulations of protein–ligand
systems are typically
preceded by the assignment of fixed protonation states to titratable
groups on the two binding partners, often using programs such as H++[14−16] and PROPKA[17−20] to do so. Further, docking studies often employ empirical prediction
algorithms, which sometimes use Hammett and Taft relations, to assign
fixed protonation states to the free ligands being docked.[21,22] These approaches, however, fail to account for changes in protonation
that may follow from the altered electrostatic environment surrounding
the two binding partners upon complex formation. Several computational
methods, however, have been developed that permit the protonation
of titratable residues to respond to changes in the electrostatic
environment.[2,23−28] For instance, various flavors of constant pH molecular dynamics
(CpHMD) methodologies have emerged to incorporate pH as an added external
thermodynamic parameter to conventional molecular dynamics (MD) simulations,
allowing fluctuations in the protonation of titratable residues to
accompany conformational sampling.[29−34] To date, CpHMD simulations have been used to successfully predict
pKa values of titratable groups in proteins[29−37] and nucleic acids,[38−40] as well as to explain the mechanism behind the pH-dependent
conformational changes critical to the function of proteins such as
nitrophorin[41] and rhodopsin.[42]The CpHMD method provides a framework
through which the pH dependence
of binding processes can be examined. To the best of our knowledge,
there is currently no standard protocol available to rigorously account
for proton-linked ligand binding. Multiple experimental and computational
groups, however, have utilized the binding polynomial formalism devised
by Wyman[43] to calculate the changes in
binding free energy that accompany binding-induced protonation changes
for both protein–protein[3,4,44] and protein–nucleic acid binding.[39,45,46] Motivated by Mason and Jensen’s usage
of this binding polynomial formalism to estimate the free energies
of binding for protein–protein complexes using the PROPKA web
server,[4] we adopt a similar approach in
conjunction with the CpHMD method by Mongan et al.[33] to obtain pH-dependent free energy profiles in
silico for the binding of small molecules to the cucurbit[7]uril
(CB[7]) host.CB[7] is a synthetic molecule with seven repeating
glycoluril units
bridged by methylene groups (Figure 1).[47,48] This 7-fold symmetric host has gained much attention due to its
ability to encapsulate drug-like small molecules with high affinity
as a stable host–guest complex.[49−55] Benzimidazole (BZ) and a series of its derivatives (Figure 2) comprise a class of widely used fungicides and
anthelmintic drugs[56−58] that have been shown to bind to the CB[7] host and
undergo the pKa shifts as large as 4 pK units upon complex formation (Table 1).[59] At neutral pH, these weakly acidic
guests are predominantly deprotonated when free in solution, but each
binds a single proton upon encapsulation by CB[7]. Both the acid/base
behaviors of BZ-derived guests and the small size and relative rigidity
of CB[7] compared to a typical biomolecule make the CB[7]:BZ complexes
ideal model systems to test theoretical methods for computing pH-dependent
binding free energies.
Figure 1
Structure of cucurbit[7]uril (CB[7]) host: (A) glycoluril
unit;
(B) top view of CB[7].
Figure 2
Chemical structures of benzimidazole and its derivatives.
Table 1
Experimental
pKa Shifts of Benzimidazole Guests upon
Binding to CB[7][59] a
guest
pKaF
pKaC,exp
ΔpK
BZ
5.5
9.0
3.5
TBZ
4.6
8.6
4.0
FBZ
4.8
8.6
3.8
ABZ
3.5
6.1
2.6
CBZ
4.5
7.0
2.5
pKaF denotes the pKa of the free
guest, and pKaC represents
the pKa of the guest in complex with CB[7].
Structure of cucurbit[7]uril (CB[7]) host: (A) glycoluril
unit;
(B) top view of CB[7].Chemical structures of benzimidazole and its derivatives.In this work, we accurately reproduce the pKa shifts of the various BZ derivatives upon
binding to CB[7],
using CpHMD simulations. Coupling these pKa data with reference binding free energies taken either from experiment
or from TI computations allows us to obtain a full description of
CB[7]:guest binding free energies as functions of pH. Additionally,
we show that improper assignment of guest protonation states in binding
free energy computations can produce errors in excess of 2 kcal/mol
at neutral pH, highlighting the importance of accurately accounting
for the pH effects in free energy calculations or docking.
Theory
Binding
Polynomial Formalism for Computing the pH Dependence
of Binding Free Energies
Mason and Jensen recently examined
the pH dependence of protein–protein binding[4] through an application of the binding polynomial formalism
developed by Wyman[43] and used by Tanford
to describe protein folding/unfolding.[60] Following the theoretical foundations of these groups, the binding
of a titratable ligand (L) to a general macromolecular receptor (R)
can be considered through a general equation for ligand association
governed by the apparent equilibrium constant, Kapp:where the
brackets indicate that the ligand
(L) and complex (RL) ensembles may contain different protonated forms
of the titratable ligand species. In the case of a ligand with a single
titratable site binding to CB[7], which itself does not titrate in
the biological range of pH levels, Kapp can be written aswhere
the concentrations, rather than activities,
of the given species are reported assuming ideal dilute solutions.
Building from the thermodynamic cycle used to describe the proton-linked
ligand binding to CB[7] (Scheme 1), Kapp can be rewritten according to eq 3, in which the concentrations of all species are
presented in binding polynomials with respect to the concentrations
of the deprotonated complex and ligand species:
Scheme 1
Thermodynamic Cycle for Complex Formation
between a Receptor (R)
and a Titratable Ligand (L)
Using the acid dissociation constants for the free ligand
(KaF) and ligand–receptor
complex
(KaC), as illustrated by the
vertical reactions in Scheme 1 (eqs 4 and 5),where the proton activity is denoted by aH, eq 3 can
be rewritten in terms of the overall free energy of binding for the
ligand L to the receptor R (ΔG°):where
the proton activity and acid dissociation
constants have been converted to their respective logarithmic constants,
pH and pKa. The pH dependence of the binding
free energy can thus be obtained having only the pKa values of the ligand molecule free in solution (pKaF) and in complex with the receptor
(pKaC), as well as the free
energy of binding for a reference reaction shown in eq 6 (the top reaction in Scheme 1), ΔGref°, in which there is no net uptake or release of protons. This formalism
for obtaining ΔG° as a function of pH
can further be applied to cases where multiple ligand and receptor
groups titrate in the pH range considered, assuming that proton binding
occurs independently. In other words, eq 6 can
only be applied when all titratable groups are uncoupled from each
other.As protein active sites often contain multiple titratable
groups
whose protonation states are coupled to perform a given function,
it will sometimes be wrong to assume that all titratable groups remain
uncoupled upon ligand binding. In such cases, Wyman[43,61] derived a relation between Kapp and
pH such thatwhere, using
the notation used by Tanford,[60] ΔvH is the change in the number of
bound protons in the receptor–ligand
complex, relative to the number of protons bound to the ligand and
receptor individually. Utilizing the unit charge of a proton, this
relation is equivalent to the difference in total charge, Z, between reactants and product in eq 1. With ΔZ = ZLR – (ZL + ZR), integration of eq 7 provides a thermodynamic
relation that holds for proton-linked ligand binding in cases where
titratable sites may interact (eq 8):where ZR is omitted,
since the CB[7] receptor under consideration does not titrate in the
pH range considered in this study. Since the integration is performed
with respect to pH in the second term in eq 8, the reference binding free energy corresponds to the binding free
energy at a specific pH.Both eqs 6 and 8 thus
provide frameworks for computing the pH-dependent binding free energy
by adding a correction term to the reference free energy of binding.
In the case of eq 6, the reference free energy,
ΔGref°, is obtained for receptor–ligand
binding with protonation states fixed, such that no net change of
protonation occurs. Analogously, the reference free energy in eq 8 is required to be the free energy of binding at
a given value of pH. These two reference free energies are not necessarily
equivalent; however, the reference reaction can be chosen such that
they have the same value.
Constant pH Molecular Dynamics
Baptista
and co-workers
developed constant pH molecular dynamics (CpHMD) with stochastic titration
to enable concurrent sampling of both conformational and protonation
spaces according to the semigrand canonical ensemble.[32] Here, we use the simplified CpHMD formulation implemented
in the standard release of AMBER 12[62] that
is similar to Baptista’s formulation except that the simulation
is performed in implicit solvent with generalized Born electrostatics.[33] In this method, an MD simulation is propagated
from initial sets of coordinates and protonation states. After a chosen
number of MD steps, the simulation is halted, at which point a Monte
Carlo (MC) step evaluates whether a random titratable residue in the
system should change protonation states. The acceptance of this new
protonation state is contingent on the application of the Metropolis
criterion to the computed transition free energy, ΔGtrans, obtained using eq 9, where
pH enters as an external thermodynamic parameter and kBT is the Boltzmann constant multiplied
by the temperature of the system.For the value of pH at which the simulation
is conducted, the difference in electrostatic free energy that accompanies
the change in protonation being considered, ΔGelec, is computed with respect to the difference in electrostatic
free energy that accompanies the analogous change in protonation for
a model compound, ΔGelec,ref, which
has a known pKa value (pKa,ref). In this manner, any nonclassical contributions
to the transition free energy cancel. For a given CB[7]:guest system,
the model compound that enters eq 9 is the guest
molecule free in solution, its pKa,ref is the experimentally obtained pKa value
of the free guest (pKaF, Table 1), and ΔGelec,ref is defined to be the electrostatic free energy that equally populates
the protonated and deprotonated forms of the free guest when the solution
pH is equal to the experimental pKa of
the free guest. If the transition is accepted, then MD is continued
with the new protonation state for the titratable residue; otherwise,
MD continues without change in the protonation state. Repeated application
of these steps builds an ensemble of protonation states along the
MD trajectory.
Use of Constant pH Molecular Dynamics in
the Binding Polynomial
Scheme
The CpHMD method is applied to obtain values for pKaC in eq 6 and
ΔZ in eq 8 to provide
pH-dependent correction terms to the reference binding free energies.
In the case of eq 6, values of pKaC are obtained from simulating the CB[7]:guest
system at a range of pH values. Each CpHMD simulation obtains a fractional
protonation for the titratable guest being considered. By tabulating
the fraction of deprotonated guest species (s) computed
at each value of pH, application of the Hill equation can be used
to predict pKaC as the midpoint
of the titration, as well as the Hill coefficient, n (eq 10):This
method can reliably extract the pKa when
the titratable residue exhibits typical
titration behavior.[63] In all fits, the
Hill coefficient obtained is approximately 1, which is anticipated
in the absence of cooperativity.In the case of eq 8, the partial charges for the guest free in solution, ZL, and the partial charges for the guest in
complex with CB[7], ZLR, can similarly
be obtained from CpHMD simulations. BZ and its derivatives have charges
of +1 when protonated and 0 when deprotonated. Consequently, ZLR and ZL are equivalent
to the fraction of protonated species (1 – s) obtained from CpHMD simulations performed on the CB[7]:guest complex
and the free guest, respectively.
Methods
Parameterization
of CB[7] and Benzimidazole Ligands for Molecular
Dynamics Simulations
Partial charges for the CB[7] host have
previously been derived[64] using the restrained
electrostatic potential (RESP) procedure,[65−67] conventionally
used to parameterize nonstandard residues for molecular simulations
performed with AMBER force fields. Analogously, the geometries of
benzimidazole (BZ), albendazole (ABZ), carbendazim (CBZ), fuberidizole
(FBZ), and thiabendazole (TBZ, Figure 2) are
optimized at the B3LYP/6-31G(d) level of theory[68−71] using the Gaussian 09 suite of
programs.[72] Subsequently, the electrostatic
potentials (ESPs) associated with the optimized geometries of these
guests are computed using MK radii[73] at
the HF/6-31G(d) level of theory. The ESPs of the different guest molecules
are submitted to the antechamber module[67] in the AmberTools 12 suite of programs,[62] which applies the RESP procedure to extract atomic point charges
for use in molecular dynamics (MD) simulations. All other CB[7] and
guest ligand force field terms, including Lennard-Jones parameters,
are taken from the general AMBER force field (GAFF).[74]
Docking of Guest Molecules to CB[7]
To generate starting
coordinates for MD simulations of different CB[7]:guest complexes,
the various BZ derivatives are docked rigidly into the CB[7] cavity
using the extra precision mode (XP) in Schrodinger’s Glide
program.[75−77] Each CB[7]:guest docking experiment yields a single
pose for the CB[7]:guest complex, and all guest molecules bind CB[7]
similarly. For illustrative purposes, the resulting CB[7]:FBZ complex
obtained from Glide is shown in Figure 3. It
is also worth noting that the docked poses of deprotonated and protonated
guests in complex with CB[7] are similar. The hydrophobic core of
the BZ guests is encapsulated by the CB[7] cavity, orienting the ligands
similarly regardless of protonation, while additional furanyl, thiazole,
amido, or thioether R-groups seen in the BZ derivatives protrude outside
of the entrance to CB[7]. All poses show good agreement with experiments.[59]
Figure 3
Structure of CB[7]:fuberidazole complex generated by docking.
Structure of CB[7]:fuberidazole complex generated by docking.
Constant pH Molecular Dynamics
Simulation Details
CpHMD
simulations are performed using the AMBER 12 suite of programs for
the range of pH values between 2 and 12 at increments of 0.5.[33,62] All simulations employ the OBC generalized Born (GB) implicit solvent
model (igb = 5)[78] with a salt concentration
of 0.1 M. Starting from the docked CB[7]:guest structures, all systems
are minimized for 5000 steps while applying positional constraints
to all heavy atoms with a force constant of 20 kcal/mol Å2. Following minimization, the system is heated to 300 K over
the course of 500 ps using a Langevin thermostat[79] while maintaining the positional constraints applied to
all heavy atoms with a force constant of 5 kcal/mol Å2. After heating, a 1 ns equilibration simulation is performed at
300 K. Production simulations are then performed for 5 ns, with MC
steps taken every 10 fs. In all equilibration and production steps,
the bonds involving hydrogen are constrained using the SHAKE algorithm[80] and a cutoff of 30 Å for the computation
of nonbonded interactions is enforced.
Computing Absolute Binding
Free Energy with Thermodynamic Integration
The calculation
of the pH-dependent binding free energy requires
a reference binding energy obtained either in the absence of protonation
change (eq 6) or at a specified pH value (eq 8). TI computations are performed to obtain the absolute
binding free energy between CB[7] and guest molecules that are deprotonated
both free in solution and in complex. In TI, the free energy change
is evaluated aswhere U is the total potential
energy of the system coupled to λ, which varies smoothly between
the initial state of λ = 0 and the final state of λ =
1.[81] The reference binding free energy
is obtained from the thermodynamic cycle shown in Scheme 2 and is calculated usingwhere ΔG1 is the free energy for gradually turning on restraints (see
below),
ΔG2 is the free energy for decoupling
the guest while bound to the host in the presence of the restraints,
ΔG3° is the free energy for turning off the
restraint and correcting for the standard state, and ΔG4 is the solvation free energy for the decoupled
guest (Scheme 2).
Scheme 2
Thermodynamic Cycle
for an Absolute Binding Free Energy Calculation
The outer circle represents
a CB[7] host, and the inner blue circle shows a guest molecule in
the reference deprotonated state. ΔG1 is the free energy for gradually turning on the restraints; ΔG2 is for decoupling the guest from the host
in the presence of the restraints; ΔG3° is the analytical
correction for removing the restraints; and ΔG4 is the solvation free energy for the guest.
Thermodynamic Cycle
for an Absolute Binding Free Energy Calculation
The outer circle represents
a CB[7] host, and the inner blue circle shows a guest molecule in
the reference deprotonated state. ΔG1 is the free energy for gradually turning on the restraints; ΔG2 is for decoupling the guest from the host
in the presence of the restraints; ΔG3° is the analytical
correction for removing the restraints; and ΔG4 is the solvation free energy for the guest.The electrostatic and van der Waals (vdW) contributions
to ΔG2 and ΔG4 are
computed separately, the latter using the softcore potential algorithm.[82−84] To improve the convergence for these computations, the virtual bond
algorithm developed by Karplus and co-workers is applied, where a
set of restraints are used to fix the position and orientation of
the guest relative to CB[7]. The free energy for turning on the restraint,
ΔG1, is computed using TI. The free
energy for turning off the restraint, ΔG3°, is calculated
using an analytical expression, which corrects for the presence of
restraints and also accounts for the standard state:[85]Here V° is
the standard
state volume of 1661 Å3 for ideal gas, raA, sin θA, and sin θB are the distance and angle values used for each restraint, having
corresponding harmonic force constants (K’s
in eq 13), which are 5 kcal/mol Å2 for the distance restraint and 20 kcal/mol rad2 for the
angle and dihedral restraints. The second term in eq 13 accounts for the symmetry in the system, where σR···L, σR, and σL are the symmetry numbers for the host–guest complex,
CB[7], and the guest molecule, respectively. For our system, σR···L and σL are 1 and σR is 14.The pmemd implementation of
TI in AMBER 14 is
used to calculate the reference binding free energy ΔGref,TI° for each guest to CB[7].[86,87] The reference ionization
state is chosen to be deprotonated because all experimental values
of ΔGref° were measured with the guests deprotonated.[59] For the calculation of the electrostatic contribution
to ΔG2 and ΔG4, 11 equally spaced λ values are used (0.0, 0.1,
0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0). For the calculation
of the vdW contribution to ΔG2 and
ΔG4, 21 λ values are used
(0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.725, 0.75, 0.775, 0.8,
0.825, 0.85, 0.875, 0.9, 0.925, 0.95, 0.975, 1.0). For the computation
of the free energy for turning on the restraints, ΔG1, 16 λ values are used (0.0, 0.01, 0.02, 0.03,
0.05, 0.075, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0). The
unequal spacing of λ windows is needed to capture a smoother
transition of ∂U(λ)/∂λ
along the λ parameter and reduce errors in integration. Integration
is performed numerically using the trapezoidal rule, and uncertainties
in the free energies are propagated as standard deviations.Each CB[7]:guest complex was solvated with TIP3P water[88] with a region of 12 Å in any direction
using the tleap program.[89] The system was
minimized for 5000 steps and heated to 300 K over 500 ps in the NVT
ensemble using a Langevin thermostat,[79] followed by an equilibration for 500 ps in the NPT ensemble using
a Berendsen barostat[90] with isotropic position
scaling to bring the system to a stable density. All production simulations
are performed in the NVT ensemble and are extended until the cumulative
free energy computed for each individual transformation converges
(changes in ΔG < 0.01 kcal/mol).
Results
Review
of Experimental Results
Previously, Koner et
al. observed enhancements in stabilities and solubilities of benzimidazole
(BZ) derivatives upon encapsulation by the cucurbit[7]uril (CB[7])
host.[59] The authors obtained values of
pKaF and pKaC by fitting the data from UV titrations and 1H NMR spectroscopy.[91] Henceforth,
pKaF,exp and pKaC,exp will differentiate experimental pKa values from their respective computed values,
pKaF,calc and pKaC,calc. The experimental data showed large
shifts in pKa ranging between 2.5 and
4 pK units upon complex formation with CB[7] (Table 1). Additionally, association constants of the complexes
were obtained at basic pH where guests were presumably deprotonated
in both bound and unbound states; association constants were also
obtained for the binding of protonated guests through application
of the thermodynamic cycle (see Scheme 2 in ref (59)). In all cases, measurements
of the binding free energies for different CB[7]:guest complexes indicated
that the protonated guests are favored in the CB[7] cavity (discussed
later; see Table 3).
Table 3
Binding
Free Energies of the Guests
upon Complex Formation with CB[7], Computed Using the Hybrid Approach
with eq 6a
guest
ΔGref,exp°
ΔGexp°+
ΔGhybrid°+
ΔGref,exp° (pH 7)
ΔGhybrid° (pH 7)
BZ
–4.4
–9.2
–8.8
–7.1
–6.7
TBZ
–3.0
–8.6
–7.9
–5.2
–4.7
FBZ
–2.3
–7.6
–7.6
–4.5
–4.6
ABZ
–6.6
–10.2
–11.5
–6.7
–7.1
CBZ
–6.0
–9.5
–10.0
–6.4
–6.8
All energies are reported in
kcal/mol. ΔGref,exp° is the experimental[59] binding free energy for the reference deprotonated
guest; ΔGexp°+ is the binding free energy for the protonated
guest derived from ΔGref,exp°; and ΔGhybrid°+ is
the free energy obtained by using pKaC,calc with ΔGref,exp° in eq 6.
pKaF denotes the pKa of the free
guest, and pKaC represents
the pKa of the guest in complex with CB[7].
pKa Shifts upon CB[7]:Guest Complex
Formation
To compute the pKa values
of various BZ derivatives in complex with CB[7], we perform CpHMD
simulations on five CB[7]:guest complexes. In Figure 4, representative titration curves are shown for benzimidazole
(BZ) and albendazole (ABZ), both in complex with CB[7] and free in
solution. Similar curves corresponding to the other guests can be
found in Figure S1 (Supporting Information). Titration of the free guest in solution offers preliminary examination
of the ability of the CpHMD to reproduce the experimental pKa; for example, in the case of BZ, the pKaF,calc value matches the pKaF,exp value of 5.5, indicating proper
calibration of the CpHMD method. From the titration curve of BZ free
in solution (Figure 4A, green curve), it is
apparent that free BZ is protonated at values of pH less than 4.5
and deprotonated at pH levels above 6.5. Between these pH levels,
an ensemble of protonated and deprotonated states exists. Relative
to the titration curve for free BZ, the titration curve for the CB[7]:BZ
complex is shifted toward more basic values of pH (Figure 4A, purple curve). Indeed, the value of pKaC,calc for BZ is found to be 8.7—a
shift of more than 3 pK units above its pKaF (Table 2); consequently, complexed BZ is protonated at pH below 7.5, indicating
the preferred protonation state of BZ at neutral (typical physiological)
pH differs depending on its bound state. The observed preference for
the protonated guest in the cavity of CB[7] is due to the additional
hydrogen bond between the titratable proton on BZ and one of the carbonyl
oxygens at the entrance to the CB[7] cavity (Figure 5). It is worth noting that the Hill equation provides a reasonable
estimate of the pKa values for BZ both
free and in complex with CB[7], with fitting errors of ∼0.01
pK units. Furthermore, the pKaC,calc value for BZ underestimates its pKaC,exp by only 0.3 pK units (Table 2).
Figure 4
Titration curves from
constant pH MD simulations of the guests
free in solution (green) and in complex with CB[7] (purple): (A) benzimidazole;
(B) albendazole.
Table 2
Comparison
of pKaC Values Obtained from
CpHMD Simulations (pKaC,calc) with Experimental Data (pKaC,exp)[59] a
guest
pKaC,exp
pKaC,calc
n
BZ
9.0
8.71 ± 0.01
0.99
TBZ
8.6
8.19 ± 0.01
1.01
FBZ
8.6
8.61 ± 0.01
1.01
ABZ
6.1
7.10 ± 0.01
0.99
CBZ
7.0
7.40 ± 0.01
1.03
The corresponding Hill coefficients
(n) are also shown.
Figure 5
Hydrogen bonds formed between the protonated benzimidazole with
the carbonyl oxygens of CB[7].
Titration curves from
constant pH MD simulations of the guests
free in solution (green) and in complex with CB[7] (purple): (A) benzimidazole;
(B) albendazole.The corresponding Hill coefficients
(n) are also shown.Hydrogen bonds formed between the protonated benzimidazole with
the carbonyl oxygens of CB[7].The chemical structure of ABZ differs from that of BZ by
the presence
of amido and thioether R-groups attached to the BZ core. Additionally,
the experimentally determined pKa for
the CB[7]:ABZ complex remains acidic (pKaC,exp = 6.1). The titration curves obtained from CpHMD
simulations of free and complexed ABZ are shown in Figure 4B. Qualitatively, the titration behavior of ABZ
appears similar to that of BZ, as its pKaC,calc of 7.1 is shifted toward a more basic value from
its pKaF,exp of 3.5. At neutral
pH, these data suggest that ABZ is fully deprotonated when free in
solution, whereas both its protonated and deprotonated forms are significantly
populated when in complex with CB[7]. While the errors obtained for
fitting the Hill equation to the titration data are minimal with errors
observed for BZ and all other guests of less than 0.01 pK units, the value of pKaC,calc for the CB[7]:ABZ complex has the greatest deviation from the experiment
(ΔpK = 1.0, Table 2).The titration curves for the other guests follow a similar trend,
where formation of the CB[7]:guest complex increases the pKa of the guest (Figure S1, Supporting Information). These shifts are in line with the
experimentally determined pKa values (Table 2), with the largest deviation seen for ABZ as stated
above. Overall, the CpHMD method provides accurate predictions of
pKaC,calc values, with a mean
average error (MAE) of 0.42 pK units with respect
to experiment (pKaC,exp, Table 2).
pH Dependence of the Binding Free Energy
As discussed
above, the pKa values of the BZ-derived
guests differ when bound to CB[7] and when free in solution (Figure 4, Table 2). Since there are
no other titratable groups in the CB[7]:guest complexes in the pH
ranges studied here, the binding of the guests to CB[7] can have a
net uptake of protons, which makes their binding free energies depend
on the solution pH. In this section, we compute binding free energies
as functions of pH using eqs 6 and 8. Both of these equations can be used to obtain the pH-dependent
binding free energy by adding a pH-dependent correction term to a
reference binding free energy. In eq 6, this
reference free energy corresponds to the free energy of binding in
the absence of proton binding. In contrast, eq 8 requires that the reference free energy be obtained at a specific
pH. The reference binding free energies in these two equations can
be identical if obtained at a specific value of pH where the protonation
states do not change. Since experimental association constants were
obtained at pH levels where both the free and bound guests are deprotonated,[59] we use reference binding free energies for the
association of deprotonated guests with CB[7] in this work.As a simple illustration of how pH-dependent binding free energies
may be obtained, we first use the binding free energy measured experimentally
for each of the different CB[7]:guest systems (ΔGref,exp°) as the reference free energy term in eqs 6 and 8. We refer to this as a “hybrid”
approach, as it obtains a pH-dependent binding free energy (ΔGhybrid°) from the experimental reference binding free energy (ΔGref,exp°) and CpHMD-derived terms. These terms are either the
pKa for the complex, pKaC,calc, used in eq 6 or charge differences between the binding partners in complex and
free in solution, ΔZ, in eq 8. While all results described here have been obtained using
eq 6 with pH-dependent corrections requiring
values of pKaC,calc, identical
results have also been obtained using eq 8 (Figure
S3 and Table S1, Supporting Information).Plots of binding free energies as functions of pH for CB[7]
complexes
with BZ, FBZ, and ABZ are shown in Figure 6A–C. While these binding free energies are referenced to ΔGref,exp° (Figure 6, red line), the use
of pKaC,calc in eq 6 and ΔZ in eq 8 to generate the full curve as a function of pH can be assessed
by how well the computed binding free energy at acidic pH (ΔGhybrid°+ in Table 3) matches
the analogous value derived from experiment (ΔGexp°+,
blue line in Figure 6). For all CB[7]:guest
complexes, the values of ΔGhybrid°+ deviate less than
1.35 kcal/mol from the respective experimental values (Table 3), with the greatest error observed for ABZ. These
errors are entirely due to the errors in computing values of pKaC,calc, as the value for ΔGexp°+ was derived using experimentally obtained values of pKaC, pKaF, and ΔGref°.[59]
Figure 6
Binding free
energies as functions of pH (black line). The top
row is computed by the hybrid approach using the experimental reference
binding energies (ΔGref,exp°, red line), and the bottom row
uses the full computational approach with the reference binding energies
computed by thermodynamic integration (ΔGref,TI°, green
line). Experimentally derived binding free energies for the protonated
guests are shown in blue. (A, D) Benzimidazole; (B, E) fuberidazole;
(C, F) albendazole.
Binding free
energies as functions of pH (black line). The top
row is computed by the hybrid approach using the experimental reference
binding energies (ΔGref,exp°, red line), and the bottom row
uses the full computational approach with the reference binding energies
computed by thermodynamic integration (ΔGref,TI°, green
line). Experimentally derived binding free energies for the protonated
guests are shown in blue. (A, D) Benzimidazole; (B, E) fuberidazole;
(C, F) albendazole.All energies are reported in
kcal/mol. ΔGref,exp° is the experimental[59] binding free energy for the reference deprotonated
guest; ΔGexp°+ is the binding free energy for the protonated
guest derived from ΔGref,exp°; and ΔGhybrid°+ is
the free energy obtained by using pKaC,calc with ΔGref,exp° in eq 6.From Figure 6A–C, it is evident that
all guests bind more favorably when protonated. Indeed, the binding
free energies observed for deprotonated guests (at extremely basic
pH) are 3.4–5.6 kcal/mol more positive (less favorable) than
those obtained at acidic pH when the guests are protonated. This tendency
is most pronounced in the CB[7]:FBZ complex (Figure 6B), for which the binding free energy obtained when FBZ is
predominantly protonated (−7.56 kcal/mol) is over 5 kcal/mol
more favorable than its respective value when FBZ is deprotonated
(−2.33 kcal/mol). This observation is consistent with experiment[59] and stems from the favorable hydrogen bond between
the guest and CB[7] (Figure 5).Taking
a closer look at the pH-dependent binding free energies
of the CB[7]:BZ complex, it is apparent that the binding free energy
spans 4.8 kcal/mol between pH levels 4.5 and 10, a range that essentially
encompasses the pH levels of most biological reactions (Figure 6A).[92] At physiological
pH (∼7), free BZ is predominantly deprotonated, whereas BZ
in complex with CB[7] is protonated (Figure 4A). In conventional free energy computations, ligand protonation
states are typically assigned as the preferred protonation state for
the free ligand. Consistent with this convention, BZ would be considered
deprotonated in free energy computations performed at pH 7. In making
this assumption, the binding free energy deviates from the pH-dependent
binding free energy obtained here by ∼2.3 kcal/mol for the
CB[7]:BZ complex. Similar deviations are noted for the binding of
other guests as well, with the magnitudes ranging between 0.4 and
2.3 kcal/mol (Table 3 and Figure S2, Supporting Information).
Full Prediction of the
pH-Dependent Free Energy Profile
To demonstrate the utility
of our method when experimental binding
free energies are unavailable, we perform thermodynamic integration
(TI) computations based on the thermodynamic cycle shown in Scheme 2 to obtain ΔGref,TI°, the
reference binding free energies of the CB[7]:guest complexes with
the guests deprotonated. The pH-dependent correction terms obtained
either with eq 6 or 8 are
then referenced to ΔGref,TI° to obtain a full computational
prediction (CpHMD/TI) of the pH-dependent free energy profiles.The free energy profiles of CB[7]:guest complexes using ΔGref,TI° are shown in Figure 6D–F, while the
computed values of ΔGref,TI° are reported in Table 4 for comparison with experiment (ΔGref,exp°, Table 4). All values of ΔGref,TI° agree well with experiment, showing absolute errors that are less
than 1.3 kcal/mol. Further, the error with respect to experiment (ΔGexp°+) in the predicted values for the binding free energy of protonated
guests (ΔGTI°+) are similarly low (<1.4 kcal/mol).
Errors in ΔGTI°+ arise from both the computation of pKaC,calc (or ΔZ, when using eq 8; Table S2, Supporting Information) using the CpHMD method and the binding
free energy computation with TI. These errors are not always of the
same sign; for example, the deviation from ΔGexp°+ obtained
for ABZ decreased from 1.4 kcal/mol when using ΔGref,exp° in the hybrid approach to 0.7 kcal/mol when using the ΔGref,TI° reference, indicating some cancellation of error in the full computational
approach. In contrast, the deviation for TBZ increased from 0.7 kcal/mol
when using the experimental ΔGref,exp° reference
to 1.4 kcal/mol when using ΔGref,TI°. Regardless, results
obtained using the full computational approach with ΔGref,TI° show errors that are similar in magnitude to those observed using
ΔGref,exp° in the hybrid experimental/computational
approach. Furthermore, the errors associated with the full computational
protocol can be lower than the errors that arise from performing binding
free energy computations with fixed protonation states assigned to
the unbound CB[7] and guest molecules (Table 4).
Table 4
Binding Free Energies of the Guests,
Computed Using the Full Computational Approach (CpHMD/TI) and Compared
to Experiment[59] a
guest
ΔGref,exp°
ΔGref,TI°
ΔGexp°+
ΔGTI°+
ΔGref,exp° (pH 7)
ΔGref,TI° (pH 7)
BZ
–4.4
–4.1 ± 2.0
–9.2
–8.5
–7.1
–6.4
TBZ
–3.0
–2.3 ± 2.6
–8.6
–7.3
–5.2
–4.0
FBZ
–2.3
–2.1 ± 2.6
–7.6
–7.3
–4.5
–4.3
ABZ
–6.6
–6.0 ± 3.0
–10.2
–11.0
–6.7
–6.5
CBZ
–6.0
–4.8 ± 2.7
–9.5
–8.7
–6.4
–5.5
All energies are reported in
kcal/mol. ΔGref,exp° is the experimental[59] binding free energy for the reference deprotonated
guest; ΔGref,TI° is the absolute binding free energy obtained
from TI computations for the reference state; ΔGexp°+ is
the binding free energy for the protonated guest derived from ΔGref,exp°; and ΔGTI°+ is the binding free energy obtained
by using pKaC,calc with ΔGref,TI° in eq 6.
All energies are reported in
kcal/mol. ΔGref,exp° is the experimental[59] binding free energy for the reference deprotonated
guest; ΔGref,TI° is the absolute binding free energy obtained
from TI computations for the reference state; ΔGexp°+ is
the binding free energy for the protonated guest derived from ΔGref,exp°; and ΔGTI°+ is the binding free energy obtained
by using pKaC,calc with ΔGref,TI° in eq 6.
Discussion
Changes in the pKa values and, consequently,
the protonation states of ionizable species participating in biomolecular
association processes are well documented. To address this phenomenon,
we present a simple methodology for obtaining the pH dependence of
binding free energies for a series of cucurbit[7]uril (CB[7]):guest
complexes. On the basis of Wyman’s binding polynomial formalism,[43] binding free energies are computed as pH-dependent
corrections to a reference binding free energy. Combining this formalism
with constant pH molecular dynamics (CpHMD) simulations and free energy
computations yields a reasonable protocol for evaluating the pH-dependent
binding free energies of biomolecular systems.Focusing on the
application of CpHMD to eq 6 in order to obtain
pH-dependent relative binding free energies,
we assess how well CpHMD simulations can capture the pKa of BZ guests in complex with CB[7] (Table 2). With the exception of albendazole (ABZ), the values of
pKaC,calc obtained for the
different CB[7]:guest complexes deviate from experiment by less than
0.41 pK units. The pKaC,calc value obtained for the CB[7]:ABZ complex, however,
exhibits an error of 1.0 pK unit. Since the CpHMD
simulations conducted in this study are only 5 ns long, we extended
the simulation of CB[7]:ABZ to 25 ns to ascertain whether the value
of pKaC,calc had converged;
however, the resulting pKaC,calc remains unchanged. Further, the process of fitting titration data
obtained from CpHMD simulations of CB[7]:guest complexes to the Hill
equation is achieved with very little statistical error (<0.01
pK units, Table 2). Both of
these findings indicate that the error in the pKaC,calc values is not due to convergence problems
in the CpHMD simulations. Instead, it is possible, though not explicitly
demonstrated in this work, that inaccuracies in the computed pKaC values stem from problems with
the force field due to the accuracies of similar magnitude to those
seen in the previous CpHMD runs.[93,94]Since
CpHMD simulations can reliably compute the values of pKaC,calc (and, similarly, ΔZ in the case of eq 8) for the CB[7]:guest
systems, we proceed to incorporate these pKaC,calc values in eq 6 along with
a reference experimental binding free energy (ΔGref,exp°) to obtain binding free energies as functions of pH. This hybrid
experimental/computational approach is followed and shown for CB[7]:guest
systems (Figure 6A–C). To evaluate the
accuracy of this approach, we compare the computed value of ΔGhybrid°+ to experiment (ΔGexp°+) and observe good agreement, with
errors of <1.4 kcal/mol arising from the computation of pKaC,calc.Having established
that eq 6 can successfully
recapitulate pH-dependent binding free energies with an experimentally
determined reference binding free energy, we consider the use of thermodynamic
integration (TI) computations to remove this dependence on experiment.
TI computations effectively reproduce the reference binding free energies
observed from experiment (ΔGref,exp°) with absolute errors
less than 1.3 kcal/mol (ΔGref,TI°, Table 4). The resulting pH-dependent free energy profiles using ΔGref,TI° are similar to those computed with ΔGref,exp°, as
shown in Figure 6. Furthermore, the absolute
errors in predicting the free energies of the protonated guests, ΔGTI°+, using the ΔGref,TI° reference are less than 1.3 kcal/mol.
These errors arise from both ΔGref,TI° and
the use of CpHMD simulations to obtain pKaC,calc values. In regard to the computation of ΔGref,TI°, we do observe large statistical uncertainties for all CB[7]:guest
complexes considered, which stem largely from the van der Waals decoupling
simulations (Table S3, Supporting Information); however, the free energies computed for every transformation in
the thermodynamic cycle shown in Scheme 2 have
all converged, with the cumulative computed ΔG < 0.01 kcal/mol.The use of our CpHMD/TI approach to provide
a full computational
prediction of pH-dependent binding free energies is particularly advantageous
when experimental association constants are not available, as most
experimental measurements face the limitations at extreme pH levels
due to the highly possible destabilization or denaturation of the
proteins under such severe conditions. Therefore, when combined with
computational free energy calculations, our method is free from such
concerns, eliminating the reliance on the availability of experimental
data. While the CpHMD/TI computation of pH-dependent binding free
energies is prone to greater error than the hybrid experimental/computational
approach described previously, we find the absolute errors in the
CpHMD-derived pKaC values and
reference binding free energies obtained from TI computations are
not necessarily of the same sign for the CB[7]:guest systems considered
and may cancel out. Further, the observed errors with respect to the
ΔGexp°+ are relatively low (<1.4 kcal/mol).
In contrast, the error in assigning incorrect protonation states in
free energy computations without correcting for the pH dependence
of the binding free energy can give errors in excess of 2 kcal/mol
(Table 3). This observation underscores the
importance of accounting for the linkage of proton binding or release
to ligand binding in free energy computations and demonstrates the
high utility of the CpHMD/TI approach.Our results highlight
the significant changes in pKa and free
energy of binding upon complex formation that
accompanies a net proton uptake. Noting that the guests used here
have a single titratable site, corresponding changes in free energy
may sometimes be larger in protein–ligand binding where multiple
titratable groups exist. Therefore, we believe that our method will
have great utility in computer-aided drug discovery, where early stages
of the structure-based drug design often focus on finding a high-affinity
binder to a target protein. Extensions of our methodology to such
more complex protein systems may require improvements to the computational
protocols employed. The simple framework developed here allows for
trivial incorporation of CpHMD methods that incorporate explicit solvent
models[35] and/or enhanced sampling techniques,
such as accelerated molecular dynamics[37] or replica exchange[36] to improve pKa computations in systems where convergence
is difficult.[94] Similarly, our protocol
accommodates the use of alternative methods for obtaining the reference
binding free energy required by eqs 6 and 8. Thus, the computational methodology for performing
CpHMD/TI computations can be chosen to best address the system under
consideration.While we have focused on the results obtained
using eq 6, which assumes all titratable groups
are decoupled,
the CpHMD/TI method is also compatible with the expression for obtaining
the pH-dependent binding free energy given in eq 8, and these two expressions yield identical results in the case of
the CB[7]:guest systems considered here. We intend to build on the
computational protocol developed here, with a natural extension being
the application of the CpHMD/TI method to obtain pH-dependent binding
free energies of protein–ligand complexes. As protein–ligand
systems are more complicated than the CB[7]:guest systems considered
in this work, we believe the use of eq 8 will
have high utility to address potential interactions between titratable
groups.Given the magnitude of errors in computed binding free
energies
obtained with fixed protonation states in the CB[7]:guest systems,
our computational protocol represents a promising approach to remove
these errors, thus implicating its utility in drug discovery workflows.[95] Though not specifically addressed in this work,
similar philosophies may also be applicable to the scoring functions
in docking protocols.
Conclusions
In this work, we determined
the pH-dependent changes in binding
free energies for complex formation between cucurbit[7]uril (CB[7])
and a series of benzimidazole guests. Using constant pH molecular
dynamics simulations combined with experimental data, we developed
a hybrid protocol that could capture the significant changes in the
CB[7]:guest binding free energies with high accuracy. Subsequently,
we combined our method with thermodynamic integration (TI) to enable
a full computational prediction of the pH-dependent free energy profiles.
This protocol successfully accounted for the pH-dependent changes
in the binding free energies during complex formation. Future work
will include examination of pH-dependent binding free energies for
protein–ligand complexes.
Authors: Richard A Friesner; Jay L Banks; Robert B Murphy; Thomas A Halgren; Jasna J Klicic; Daniel T Mainz; Matthew P Repasky; Eric H Knoll; Mee Shelley; Jason K Perry; David E Shaw; Perry Francis; Peter S Shenkin Journal: J Med Chem Date: 2004-03-25 Impact factor: 7.446
Authors: Richard A Friesner; Robert B Murphy; Matthew P Repasky; Leah L Frye; Jeremy R Greenwood; Thomas A Halgren; Paul C Sanschagrin; Daniel T Mainz Journal: J Med Chem Date: 2006-10-19 Impact factor: 7.446
Authors: Tai-Sung Lee; Bryce K Allen; Timothy J Giese; Zhenyu Guo; Pengfei Li; Charles Lin; T Dwight McGee; David A Pearlman; Brian K Radak; Yujun Tao; Hsu-Chun Tsai; Huafeng Xu; Woody Sherman; Darrin M York Journal: J Chem Inf Model Date: 2020-09-16 Impact factor: 4.956
Authors: Thakshila Dissanayake; Jason M Swails; Michael E Harris; Adrian E Roitberg; Darrin M York Journal: Biochemistry Date: 2015-02-06 Impact factor: 3.162
Authors: Pablo F Cañete; Hao Wang; Grant J Brown; Arti Medhavy; Josiah Bones; Jonathan A Roco; Yuke He; Yuting Qin; Jean Cappello; Julia I Ellyard; Katharine Bassett; Qian Shen; Gaetan Burgio; Yaoyuan Zhang; Cynthia Turnbull; Xiangpeng Meng; Phil Wu; Eun Cho; Lisa A Miosge; T Daniel Andrews; Matt A Field; Denis Tvorogov; Angel F Lopez; Jeffrey J Babon; Cristina Aparicio López; África Gónzalez-Murillo; Daniel Clemente Garulo; Virginia Pascual; Tess Levy; Eric J Mallack; Daniel G Calame; Timothy Lotze; James R Lupski; Huihua Ding; Tomalika R Ullah; Giles D Walters; Mark E Koina; Matthew C Cook; Nan Shen; Carmen de Lucas Collantes; Ben Corry; Michael P Gantier; Vicki Athanasopoulos; Carola G Vinuesa Journal: Nature Date: 2022-04-27 Impact factor: 69.504