Literature DB >> 25134690

Protocols utilizing constant pH molecular dynamics to compute pH-dependent binding free energies.

M Olivia Kim¹, Patrick G Blachly, Joseph W Kaus, J Andrew McCammon.

Abstract

In protein-ligand binding, the electrostatic environments of the two binding partners may vary significantly in bound and unbound states, which may lead to protonation changes upon binding. In cases where ligand binding results in a net uptake or release of protons, the free energy of binding is pH-dependent. Nevertheless, conventional free energy calculations and molecular docking protocols typically do not rigorously account for changes in protonation that may occur upon ligand binding. To address these shortcomings, we present a simple methodology based on Wyman's binding polynomial formalism to account for the pH dependence of binding free energies and demonstrate its use on cucurbit[7]uril (CB[7]) host-guest systems. Using constant pH molecular dynamics and a reference binding free energy that is taken either from experiment or from thermodynamic integration computations, the pH-dependent binding free energy is determined. This computational protocol accurately captures the large pKa shifts observed experimentally upon CB[7]:guest association and reproduces experimental binding free energies at different levels of pH. We show that incorrect assignment of fixed protonation states in free energy computations can give errors of >2 kcal/mol in these host-guest systems. Use of the methods presented here avoids such errors, thus suggesting their utility in computing proton-linked binding free energies for protein-ligand complexes.

Entities: Chemical Disease Gene Mutation Species

Mesh：

Substances：

Year: 2014 PMID： 25134690 PMCID： PMC4306499 DOI： 10.1021/jp505777n

Source DB: PubMed Journal: J Phys Chem B ISSN： 1520-5207 Impact factor: 2.991

Introduction

The changes in the electrostatic environment that accompany binding of small molecules, nucleic acids, or other proteins may thus induce changes in the protonation states of titratable groups in the protein.[1−8] Recently, Aguilar et al. conducted a computational survey of various protein–protein, protein–small molecule, and protein–nucleic acid complexes to ascertain the prevalence of protonation change in the protein receptor upon biomolecular association. Notably, in 60% of the protein–small molecule complexes considered, at least one titratable residue in the protein was found to assume different protonation states in its free and bound states.[9] Furthermore, protonation changes that accompany small molecule binding to proteins are not limited to the protein partner: an estimated 60–80% of orally administered drugs are weak acids or bases, whose protonation states can also be tuned by the cellular pH and electrostatic environment of their protein binding partners.[10−13] In cases where protein–ligand binding accompanies a net transfer of protons to either binding partner, the binding process is pH-dependent; i.e., the observed binding free energy is a function of pH. Conventionally, both computational docking and more rigorous free energy computations, such as the thermodynamic integration (TI) and free energy perturbation (FEP) methods, employ fixed protonation states that are identical for free and bound states in the computation of binding affinities. Clearly, in cases where ligand binding is linked to the (un)binding of protons, such approximations will lead to error. Improper assignment of protonation states in binding free energy computations may result in significant errors, making correct assignment of pKa and protonation state essential to obtaining accurate free energies. As stated, simulations of protein–ligand systems are typically preceded by the assignment of fixed protonation states to titratable groups on the two binding partners, often using programs such as H++[14−16] and PROPKA[17−20] to do so. Further, docking studies often employ empirical prediction algorithms, which sometimes use Hammett and Taft relations, to assign fixed protonation states to the free ligands being docked.[21,22] These approaches, however, fail to account for changes in protonation that may follow from the altered electrostatic environment surrounding the two binding partners upon complex formation. Several computational methods, however, have been developed that permit the protonation of titratable residues to respond to changes in the electrostatic environment.[2,23−28] For instance, various flavors of constant pH molecular dynamics (CpHMD) methodologies have emerged to incorporate pH as an added external thermodynamic parameter to conventional molecular dynamics (MD) simulations, allowing fluctuations in the protonation of titratable residues to accompany conformational sampling.[29−34] To date, CpHMD simulations have been used to successfully predict pKa values of titratable groups in proteins[29−37] and nucleic acids,[38−40] as well as to explain the mechanism behind the pH-dependent conformational changes critical to the function of proteins such as nitrophorin[41] and rhodopsin.[42] The CpHMD method provides a framework through which the pH dependence of binding processes can be examined. To the best of our knowledge, there is currently no standard protocol available to rigorously account for proton-linked ligand binding. Multiple experimental and computational groups, however, have utilized the binding polynomial formalism devised by Wyman[43] to calculate the changes in binding free energy that accompany binding-induced protonation changes for both protein–protein[3,4,44] and protein–nucleic acid binding.[39,45,46] Motivated by Mason and Jensen’s usage of this binding polynomial formalism to estimate the free energies of binding for protein–protein complexes using the PROPKA web server,[4] we adopt a similar approach in conjunction with the CpHMD method by Mongan et al.[33] to obtain pH-dependent free energy profiles in silico for the binding of small molecules to the cucurbit[7]uril (CB[7]) host. CB[7] is a synthetic molecule with seven repeating glycoluril units bridged by methylene groups (Figure 1).[47,48] This 7-fold symmetric host has gained much attention due to its ability to encapsulate drug-like small molecules with high affinity as a stable host–guest complex.[49−55] Benzimidazole (BZ) and a series of its derivatives (Figure 2) comprise a class of widely used fungicides and anthelmintic drugs[56−58] that have been shown to bind to the CB[7] host and undergo the pKa shifts as large as 4 pK units upon complex formation (Table 1).[59] At neutral pH, these weakly acidic guests are predominantly deprotonated when free in solution, but each binds a single proton upon encapsulation by CB[7]. Both the acid/base behaviors of BZ-derived guests and the small size and relative rigidity of CB[7] compared to a typical biomolecule make the CB[7]:BZ complexes ideal model systems to test theoretical methods for computing pH-dependent binding free energies.

Figure 1

Structure of cucurbit[7]uril (CB[7]) host: (A) glycoluril unit; (B) top view of CB[7].

Figure 2

Chemical structures of benzimidazole and its derivatives.

Table 1

Experimental pKa Shifts of Benzimidazole Guests upon Binding to CB[7][59] a

guest	pK_a^F	pK_a^C,exp	ΔpK
BZ	5.5	9.0	3.5
TBZ	4.6	8.6	4.0
FBZ	4.8	8.6	3.8
ABZ	3.5	6.1	2.6
CBZ	4.5	7.0	2.5

pKaF denotes the pKa of the free guest, and pKaC represents the pKa of the guest in complex with CB[7].

Structure of cucurbit[7]uril (CB[7]) host: (A) glycoluril unit; (B) top view of CB[7]. Chemical structures of benzimidazole and its derivatives. In this work, we accurately reproduce the pKa shifts of the various BZ derivatives upon binding to CB[7], using CpHMD simulations. Coupling these pKa data with reference binding free energies taken either from experiment or from TI computations allows us to obtain a full description of CB[7]:guest binding free energies as functions of pH. Additionally, we show that improper assignment of guest protonation states in binding free energy computations can produce errors in excess of 2 kcal/mol at neutral pH, highlighting the importance of accurately accounting for the pH effects in free energy calculations or docking.

Theory

Binding Polynomial Formalism for Computing the pH Dependence of Binding Free Energies

Mason and Jensen recently examined the pH dependence of protein–protein binding[4] through an application of the binding polynomial formalism developed by Wyman[43] and used by Tanford to describe protein folding/unfolding.[60] Following the theoretical foundations of these groups, the binding of a titratable ligand (L) to a general macromolecular receptor (R) can be considered through a general equation for ligand association governed by the apparent equilibrium constant, Kapp:where the brackets indicate that the ligand (L) and complex (RL) ensembles may contain different protonated forms of the titratable ligand species. In the case of a ligand with a single titratable site binding to CB[7], which itself does not titrate in the biological range of pH levels, Kapp can be written aswhere the concentrations, rather than activities, of the given species are reported assuming ideal dilute solutions. Building from the thermodynamic cycle used to describe the proton-linked ligand binding to CB[7] (Scheme 1), Kapp can be rewritten according to eq 3, in which the concentrations of all species are presented in binding polynomials with respect to the concentrations of the deprotonated complex and ligand species:

Scheme 1

Thermodynamic Cycle for Complex Formation between a Receptor (R) and a Titratable Ligand (L)

Using the acid dissociation constants for the free ligand (KaF) and ligand–receptor complex (KaC), as illustrated by the vertical reactions in Scheme 1 (eqs 4 and 5),where the proton activity is denoted by aH, eq 3 can be rewritten in terms of the overall free energy of binding for the ligand L to the receptor R (ΔG°):where the proton activity and acid dissociation constants have been converted to their respective logarithmic constants, pH and pKa. The pH dependence of the binding free energy can thus be obtained having only the pKa values of the ligand molecule free in solution (pKaF) and in complex with the receptor (pKaC), as well as the free energy of binding for a reference reaction shown in eq 6 (the top reaction in Scheme 1), ΔGref°, in which there is no net uptake or release of protons. This formalism for obtaining ΔG° as a function of pH can further be applied to cases where multiple ligand and receptor groups titrate in the pH range considered, assuming that proton binding occurs independently. In other words, eq 6 can only be applied when all titratable groups are uncoupled from each other. As protein active sites often contain multiple titratable groups whose protonation states are coupled to perform a given function, it will sometimes be wrong to assume that all titratable groups remain uncoupled upon ligand binding. In such cases, Wyman[43,61] derived a relation between Kapp and pH such thatwhere, using the notation used by Tanford,[60] ΔvH is the change in the number of bound protons in the receptor–ligand complex, relative to the number of protons bound to the ligand and receptor individually. Utilizing the unit charge of a proton, this relation is equivalent to the difference in total charge, Z, between reactants and product in eq 1. With ΔZ = ZLR – (ZL + ZR), integration of eq 7 provides a thermodynamic relation that holds for proton-linked ligand binding in cases where titratable sites may interact (eq 8):where ZR is omitted, since the CB[7] receptor under consideration does not titrate in the pH range considered in this study. Since the integration is performed with respect to pH in the second term in eq 8, the reference binding free energy corresponds to the binding free energy at a specific pH. Both eqs 6 and 8 thus provide frameworks for computing the pH-dependent binding free energy by adding a correction term to the reference free energy of binding. In the case of eq 6, the reference free energy, ΔGref°, is obtained for receptor–ligand binding with protonation states fixed, such that no net change of protonation occurs. Analogously, the reference free energy in eq 8 is required to be the free energy of binding at a given value of pH. These two reference free energies are not necessarily equivalent; however, the reference reaction can be chosen such that they have the same value.

Constant pH Molecular Dynamics

Baptista and co-workers developed constant pH molecular dynamics (CpHMD) with stochastic titration to enable concurrent sampling of both conformational and protonation spaces according to the semigrand canonical ensemble.[32] Here, we use the simplified CpHMD formulation implemented in the standard release of AMBER 12[62] that is similar to Baptista’s formulation except that the simulation is performed in implicit solvent with generalized Born electrostatics.[33] In this method, an MD simulation is propagated from initial sets of coordinates and protonation states. After a chosen number of MD steps, the simulation is halted, at which point a Monte Carlo (MC) step evaluates whether a random titratable residue in the system should change protonation states. The acceptance of this new protonation state is contingent on the application of the Metropolis criterion to the computed transition free energy, ΔGtrans, obtained using eq 9, where pH enters as an external thermodynamic parameter and kBT is the Boltzmann constant multiplied by the temperature of the system.For the value of pH at which the simulation is conducted, the difference in electrostatic free energy that accompanies the change in protonation being considered, ΔGelec, is computed with respect to the difference in electrostatic free energy that accompanies the analogous change in protonation for a model compound, ΔGelec,ref, which has a known pKa value (pKa,ref). In this manner, any nonclassical contributions to the transition free energy cancel. For a given CB[7]:guest system, the model compound that enters eq 9 is the guest molecule free in solution, its pKa,ref is the experimentally obtained pKa value of the free guest (pKaF, Table 1), and ΔGelec,ref is defined to be the electrostatic free energy that equally populates the protonated and deprotonated forms of the free guest when the solution pH is equal to the experimental pKa of the free guest. If the transition is accepted, then MD is continued with the new protonation state for the titratable residue; otherwise, MD continues without change in the protonation state. Repeated application of these steps builds an ensemble of protonation states along the MD trajectory.

Use of Constant pH Molecular Dynamics in the Binding Polynomial Scheme

The CpHMD method is applied to obtain values for pKaC in eq 6 and ΔZ in eq 8 to provide pH-dependent correction terms to the reference binding free energies. In the case of eq 6, values of pKaC are obtained from simulating the CB[7]:guest system at a range of pH values. Each CpHMD simulation obtains a fractional protonation for the titratable guest being considered. By tabulating the fraction of deprotonated guest species (s) computed at each value of pH, application of the Hill equation can be used to predict pKaC as the midpoint of the titration, as well as the Hill coefficient, n (eq 10):This method can reliably extract the pKa when the titratable residue exhibits typical titration behavior.[63] In all fits, the Hill coefficient obtained is approximately 1, which is anticipated in the absence of cooperativity. In the case of eq 8, the partial charges for the guest free in solution, ZL, and the partial charges for the guest in complex with CB[7], ZLR, can similarly be obtained from CpHMD simulations. BZ and its derivatives have charges of +1 when protonated and 0 when deprotonated. Consequently, ZLR and ZL are equivalent to the fraction of protonated species (1 – s) obtained from CpHMD simulations performed on the CB[7]:guest complex and the free guest, respectively.

Methods

Parameterization of CB[7] and Benzimidazole Ligands for Molecular Dynamics Simulations

Partial charges for the CB[7] host have previously been derived[64] using the restrained electrostatic potential (RESP) procedure,[65−67] conventionally used to parameterize nonstandard residues for molecular simulations performed with AMBER force fields. Analogously, the geometries of benzimidazole (BZ), albendazole (ABZ), carbendazim (CBZ), fuberidizole (FBZ), and thiabendazole (TBZ, Figure 2) are optimized at the B3LYP/6-31G(d) level of theory[68−71] using the Gaussian 09 suite of programs.[72] Subsequently, the electrostatic potentials (ESPs) associated with the optimized geometries of these guests are computed using MK radii[73] at the HF/6-31G(d) level of theory. The ESPs of the different guest molecules are submitted to the antechamber module[67] in the AmberTools 12 suite of programs,[62] which applies the RESP procedure to extract atomic point charges for use in molecular dynamics (MD) simulations. All other CB[7] and guest ligand force field terms, including Lennard-Jones parameters, are taken from the general AMBER force field (GAFF).[74]

Docking of Guest Molecules to CB[7]

To generate starting coordinates for MD simulations of different CB[7]:guest complexes, the various BZ derivatives are docked rigidly into the CB[7] cavity using the extra precision mode (XP) in Schrodinger’s Glide program.[75−77] Each CB[7]:guest docking experiment yields a single pose for the CB[7]:guest complex, and all guest molecules bind CB[7] similarly. For illustrative purposes, the resulting CB[7]:FBZ complex obtained from Glide is shown in Figure 3. It is also worth noting that the docked poses of deprotonated and protonated guests in complex with CB[7] are similar. The hydrophobic core of the BZ guests is encapsulated by the CB[7] cavity, orienting the ligands similarly regardless of protonation, while additional furanyl, thiazole, amido, or thioether R-groups seen in the BZ derivatives protrude outside of the entrance to CB[7]. All poses show good agreement with experiments.[59]

Figure 3

Structure of CB[7]:fuberidazole complex generated by docking.

Constant pH Molecular Dynamics Simulation Details

CpHMD simulations are performed using the AMBER 12 suite of programs for the range of pH values between 2 and 12 at increments of 0.5.[33,62] All simulations employ the OBC generalized Born (GB) implicit solvent model (igb = 5)[78] with a salt concentration of 0.1 M. Starting from the docked CB[7]:guest structures, all systems are minimized for 5000 steps while applying positional constraints to all heavy atoms with a force constant of 20 kcal/mol Å2. Following minimization, the system is heated to 300 K over the course of 500 ps using a Langevin thermostat[79] while maintaining the positional constraints applied to all heavy atoms with a force constant of 5 kcal/mol Å2. After heating, a 1 ns equilibration simulation is performed at 300 K. Production simulations are then performed for 5 ns, with MC steps taken every 10 fs. In all equilibration and production steps, the bonds involving hydrogen are constrained using the SHAKE algorithm[80] and a cutoff of 30 Å for the computation of nonbonded interactions is enforced.

Computing Absolute Binding Free Energy with Thermodynamic Integration

The calculation of the pH-dependent binding free energy requires a reference binding energy obtained either in the absence of protonation change (eq 6) or at a specified pH value (eq 8). TI computations are performed to obtain the absolute binding free energy between CB[7] and guest molecules that are deprotonated both free in solution and in complex. In TI, the free energy change is evaluated aswhere U is the total potential energy of the system coupled to λ, which varies smoothly between the initial state of λ = 0 and the final state of λ = 1.[81] The reference binding free energy is obtained from the thermodynamic cycle shown in Scheme 2 and is calculated usingwhere ΔG1 is the free energy for gradually turning on restraints (see below), ΔG2 is the free energy for decoupling the guest while bound to the host in the presence of the restraints, ΔG3° is the free energy for turning off the restraint and correcting for the standard state, and ΔG4 is the solvation free energy for the decoupled guest (Scheme 2).

Scheme 2

Thermodynamic Cycle for an Absolute Binding Free Energy Calculation

Thermodynamic Cycle for an Absolute Binding Free Energy Calculation

The outer circle represents a CB[7] host, and the inner blue circle shows a guest molecule in the reference deprotonated state. ΔG1 is the free energy for gradually turning on the restraints; ΔG2 is for decoupling the guest from the host in the presence of the restraints; ΔG3° is the analytical correction for removing the restraints; and ΔG4 is the solvation free energy for the guest. The electrostatic and van der Waals (vdW) contributions to ΔG2 and ΔG4 are computed separately, the latter using the softcore potential algorithm.[82−84] To improve the convergence for these computations, the virtual bond algorithm developed by Karplus and co-workers is applied, where a set of restraints are used to fix the position and orientation of the guest relative to CB[7]. The free energy for turning on the restraint, ΔG1, is computed using TI. The free energy for turning off the restraint, ΔG3°, is calculated using an analytical expression, which corrects for the presence of restraints and also accounts for the standard state:[85]Here V° is the standard state volume of 1661 Å3 for ideal gas, raA, sin θA, and sin θB are the distance and angle values used for each restraint, having corresponding harmonic force constants (K’s in eq 13), which are 5 kcal/mol Å2 for the distance restraint and 20 kcal/mol rad2 for the angle and dihedral restraints. The second term in eq 13 accounts for the symmetry in the system, where σR···L, σR, and σL are the symmetry numbers for the host–guest complex, CB[7], and the guest molecule, respectively. For our system, σR···L and σL are 1 and σR is 14. The pmemd implementation of TI in AMBER 14 is used to calculate the reference binding free energy ΔGref,TI° for each guest to CB[7].[86,87] The reference ionization state is chosen to be deprotonated because all experimental values of ΔGref° were measured with the guests deprotonated.[59] For the calculation of the electrostatic contribution to ΔG2 and ΔG4, 11 equally spaced λ values are used (0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0). For the calculation of the vdW contribution to ΔG2 and ΔG4, 21 λ values are used (0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.725, 0.75, 0.775, 0.8, 0.825, 0.85, 0.875, 0.9, 0.925, 0.95, 0.975, 1.0). For the computation of the free energy for turning on the restraints, ΔG1, 16 λ values are used (0.0, 0.01, 0.02, 0.03, 0.05, 0.075, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0). The unequal spacing of λ windows is needed to capture a smoother transition of ∂U(λ)/∂λ along the λ parameter and reduce errors in integration. Integration is performed numerically using the trapezoidal rule, and uncertainties in the free energies are propagated as standard deviations. Each CB[7]:guest complex was solvated with TIP3P water[88] with a region of 12 Å in any direction using the tleap program.[89] The system was minimized for 5000 steps and heated to 300 K over 500 ps in the NVT ensemble using a Langevin thermostat,[79] followed by an equilibration for 500 ps in the NPT ensemble using a Berendsen barostat[90] with isotropic position scaling to bring the system to a stable density. All production simulations are performed in the NVT ensemble and are extended until the cumulative free energy computed for each individual transformation converges (changes in ΔG < 0.01 kcal/mol).

Results

Review of Experimental Results

Previously, Koner et al. observed enhancements in stabilities and solubilities of benzimidazole (BZ) derivatives upon encapsulation by the cucurbit[7]uril (CB[7]) host.[59] The authors obtained values of pKaF and pKaC by fitting the data from UV titrations and 1H NMR spectroscopy.[91] Henceforth, pKaF,exp and pKaC,exp will differentiate experimental pKa values from their respective computed values, pKaF,calc and pKaC,calc. The experimental data showed large shifts in pKa ranging between 2.5 and 4 pK units upon complex formation with CB[7] (Table 1). Additionally, association constants of the complexes were obtained at basic pH where guests were presumably deprotonated in both bound and unbound states; association constants were also obtained for the binding of protonated guests through application of the thermodynamic cycle (see Scheme 2 in ref (59)). In all cases, measurements of the binding free energies for different CB[7]:guest complexes indicated that the protonated guests are favored in the CB[7] cavity (discussed later; see Table 3).

Table 3

Binding Free Energies of the Guests upon Complex Formation with CB[7], Computed Using the Hybrid Approach with eq 6a

guest	ΔG_ref,exp^°	ΔG_exp^°+	ΔG_hybrid^°+	ΔG_ref,exp^° (pH 7)	ΔG_hybrid^° (pH 7)
BZ	–4.4	–9.2	–8.8	–7.1	–6.7
TBZ	–3.0	–8.6	–7.9	–5.2	–4.7
FBZ	–2.3	–7.6	–7.6	–4.5	–4.6
ABZ	–6.6	–10.2	–11.5	–6.7	–7.1
CBZ	–6.0	–9.5	–10.0	–6.4	–6.8

All energies are reported in kcal/mol. ΔGref,exp° is the experimental[59] binding free energy for the reference deprotonated guest; ΔGexp°+ is the binding free energy for the protonated guest derived from ΔGref,exp°; and ΔGhybrid°+ is the free energy obtained by using pKaC,calc with ΔGref,exp° in eq 6.

pKaF denotes the pKa of the free guest, and pKaC represents the pKa of the guest in complex with CB[7].

pKa Shifts upon CB[7]:Guest Complex Formation

To compute the pKa values of various BZ derivatives in complex with CB[7], we perform CpHMD simulations on five CB[7]:guest complexes. In Figure 4, representative titration curves are shown for benzimidazole (BZ) and albendazole (ABZ), both in complex with CB[7] and free in solution. Similar curves corresponding to the other guests can be found in Figure S1 (Supporting Information). Titration of the free guest in solution offers preliminary examination of the ability of the CpHMD to reproduce the experimental pKa; for example, in the case of BZ, the pKaF,calc value matches the pKaF,exp value of 5.5, indicating proper calibration of the CpHMD method. From the titration curve of BZ free in solution (Figure 4A, green curve), it is apparent that free BZ is protonated at values of pH less than 4.5 and deprotonated at pH levels above 6.5. Between these pH levels, an ensemble of protonated and deprotonated states exists. Relative to the titration curve for free BZ, the titration curve for the CB[7]:BZ complex is shifted toward more basic values of pH (Figure 4A, purple curve). Indeed, the value of pKaC,calc for BZ is found to be 8.7—a shift of more than 3 pK units above its pKaF (Table 2); consequently, complexed BZ is protonated at pH below 7.5, indicating the preferred protonation state of BZ at neutral (typical physiological) pH differs depending on its bound state. The observed preference for the protonated guest in the cavity of CB[7] is due to the additional hydrogen bond between the titratable proton on BZ and one of the carbonyl oxygens at the entrance to the CB[7] cavity (Figure 5). It is worth noting that the Hill equation provides a reasonable estimate of the pKa values for BZ both free and in complex with CB[7], with fitting errors of ∼0.01 pK units. Furthermore, the pKaC,calc value for BZ underestimates its pKaC,exp by only 0.3 pK units (Table 2).

Figure 4

Titration curves from constant pH MD simulations of the guests free in solution (green) and in complex with CB[7] (purple): (A) benzimidazole; (B) albendazole.

Table 2

Comparison of pKaC Values Obtained from CpHMD Simulations (pKaC,calc) with Experimental Data (pKaC,exp)[59] a

guest	pK_a^C,exp	pK_a^C,calc	n
BZ	9.0	8.71 ± 0.01	0.99
TBZ	8.6	8.19 ± 0.01	1.01
FBZ	8.6	8.61 ± 0.01	1.01
ABZ	6.1	7.10 ± 0.01	0.99
CBZ	7.0	7.40 ± 0.01	1.03

The corresponding Hill coefficients (n) are also shown.

Figure 5

Hydrogen bonds formed between the protonated benzimidazole with the carbonyl oxygens of CB[7].

Titration curves from constant pH MD simulations of the guests free in solution (green) and in complex with CB[7] (purple): (A) benzimidazole; (B) albendazole. The corresponding Hill coefficients (n) are also shown. Hydrogen bonds formed between the protonated benzimidazole with the carbonyl oxygens of CB[7]. The chemical structure of ABZ differs from that of BZ by the presence of amido and thioether R-groups attached to the BZ core. Additionally, the experimentally determined pKa for the CB[7]:ABZ complex remains acidic (pKaC,exp = 6.1). The titration curves obtained from CpHMD simulations of free and complexed ABZ are shown in Figure 4B. Qualitatively, the titration behavior of ABZ appears similar to that of BZ, as its pKaC,calc of 7.1 is shifted toward a more basic value from its pKaF,exp of 3.5. At neutral pH, these data suggest that ABZ is fully deprotonated when free in solution, whereas both its protonated and deprotonated forms are significantly populated when in complex with CB[7]. While the errors obtained for fitting the Hill equation to the titration data are minimal with errors observed for BZ and all other guests of less than 0.01 pK units, the value of pKaC,calc for the CB[7]:ABZ complex has the greatest deviation from the experiment (ΔpK = 1.0, Table 2). The titration curves for the other guests follow a similar trend, where formation of the CB[7]:guest complex increases the pKa of the guest (Figure S1, Supporting Information). These shifts are in line with the experimentally determined pKa values (Table 2), with the largest deviation seen for ABZ as stated above. Overall, the CpHMD method provides accurate predictions of pKaC,calc values, with a mean average error (MAE) of 0.42 pK units with respect to experiment (pKaC,exp, Table 2).

pH Dependence of the Binding Free Energy

As discussed above, the pKa values of the BZ-derived guests differ when bound to CB[7] and when free in solution (Figure 4, Table 2). Since there are no other titratable groups in the CB[7]:guest complexes in the pH ranges studied here, the binding of the guests to CB[7] can have a net uptake of protons, which makes their binding free energies depend on the solution pH. In this section, we compute binding free energies as functions of pH using eqs 6 and 8. Both of these equations can be used to obtain the pH-dependent binding free energy by adding a pH-dependent correction term to a reference binding free energy. In eq 6, this reference free energy corresponds to the free energy of binding in the absence of proton binding. In contrast, eq 8 requires that the reference free energy be obtained at a specific pH. The reference binding free energies in these two equations can be identical if obtained at a specific value of pH where the protonation states do not change. Since experimental association constants were obtained at pH levels where both the free and bound guests are deprotonated,[59] we use reference binding free energies for the association of deprotonated guests with CB[7] in this work. As a simple illustration of how pH-dependent binding free energies may be obtained, we first use the binding free energy measured experimentally for each of the different CB[7]:guest systems (ΔGref,exp°) as the reference free energy term in eqs 6 and 8. We refer to this as a “hybrid” approach, as it obtains a pH-dependent binding free energy (ΔGhybrid°) from the experimental reference binding free energy (ΔGref,exp°) and CpHMD-derived terms. These terms are either the pKa for the complex, pKaC,calc, used in eq 6 or charge differences between the binding partners in complex and free in solution, ΔZ, in eq 8. While all results described here have been obtained using eq 6 with pH-dependent corrections requiring values of pKaC,calc, identical results have also been obtained using eq 8 (Figure S3 and Table S1, Supporting Information). Plots of binding free energies as functions of pH for CB[7] complexes with BZ, FBZ, and ABZ are shown in Figure 6A–C. While these binding free energies are referenced to ΔGref,exp° (Figure 6, red line), the use of pKaC,calc in eq 6 and ΔZ in eq 8 to generate the full curve as a function of pH can be assessed by how well the computed binding free energy at acidic pH (ΔGhybrid°+ in Table 3) matches the analogous value derived from experiment (ΔGexp°+, blue line in Figure 6). For all CB[7]:guest complexes, the values of ΔGhybrid°+ deviate less than 1.35 kcal/mol from the respective experimental values (Table 3), with the greatest error observed for ABZ. These errors are entirely due to the errors in computing values of pKaC,calc, as the value for ΔGexp°+ was derived using experimentally obtained values of pKaC, pKaF, and ΔGref°.[59]

Figure 6

Binding free energies as functions of pH (black line). The top row is computed by the hybrid approach using the experimental reference binding energies (ΔGref,exp°, red line), and the bottom row uses the full computational approach with the reference binding energies computed by thermodynamic integration (ΔGref,TI°, green line). Experimentally derived binding free energies for the protonated guests are shown in blue. (A, D) Benzimidazole; (B, E) fuberidazole; (C, F) albendazole. All energies are reported in kcal/mol. ΔGref,exp° is the experimental[59] binding free energy for the reference deprotonated guest; ΔGexp°+ is the binding free energy for the protonated guest derived from ΔGref,exp°; and ΔGhybrid°+ is the free energy obtained by using pKaC,calc with ΔGref,exp° in eq 6. From Figure 6A–C, it is evident that all guests bind more favorably when protonated. Indeed, the binding free energies observed for deprotonated guests (at extremely basic pH) are 3.4–5.6 kcal/mol more positive (less favorable) than those obtained at acidic pH when the guests are protonated. This tendency is most pronounced in the CB[7]:FBZ complex (Figure 6B), for which the binding free energy obtained when FBZ is predominantly protonated (−7.56 kcal/mol) is over 5 kcal/mol more favorable than its respective value when FBZ is deprotonated (−2.33 kcal/mol). This observation is consistent with experiment[59] and stems from the favorable hydrogen bond between the guest and CB[7] (Figure 5). Taking a closer look at the pH-dependent binding free energies of the CB[7]:BZ complex, it is apparent that the binding free energy spans 4.8 kcal/mol between pH levels 4.5 and 10, a range that essentially encompasses the pH levels of most biological reactions (Figure 6A).[92] At physiological pH (∼7), free BZ is predominantly deprotonated, whereas BZ in complex with CB[7] is protonated (Figure 4A). In conventional free energy computations, ligand protonation states are typically assigned as the preferred protonation state for the free ligand. Consistent with this convention, BZ would be considered deprotonated in free energy computations performed at pH 7. In making this assumption, the binding free energy deviates from the pH-dependent binding free energy obtained here by ∼2.3 kcal/mol for the CB[7]:BZ complex. Similar deviations are noted for the binding of other guests as well, with the magnitudes ranging between 0.4 and 2.3 kcal/mol (Table 3 and Figure S2, Supporting Information).

Full Prediction of the pH-Dependent Free Energy Profile

To demonstrate the utility of our method when experimental binding free energies are unavailable, we perform thermodynamic integration (TI) computations based on the thermodynamic cycle shown in Scheme 2 to obtain ΔGref,TI°, the reference binding free energies of the CB[7]:guest complexes with the guests deprotonated. The pH-dependent correction terms obtained either with eq 6 or 8 are then referenced to ΔGref,TI° to obtain a full computational prediction (CpHMD/TI) of the pH-dependent free energy profiles. The free energy profiles of CB[7]:guest complexes using ΔGref,TI° are shown in Figure 6D–F, while the computed values of ΔGref,TI° are reported in Table 4 for comparison with experiment (ΔGref,exp°, Table 4). All values of ΔGref,TI° agree well with experiment, showing absolute errors that are less than 1.3 kcal/mol. Further, the error with respect to experiment (ΔGexp°+) in the predicted values for the binding free energy of protonated guests (ΔGTI°+) are similarly low (<1.4 kcal/mol). Errors in ΔGTI°+ arise from both the computation of pKaC,calc (or ΔZ, when using eq 8; Table S2, Supporting Information) using the CpHMD method and the binding free energy computation with TI. These errors are not always of the same sign; for example, the deviation from ΔGexp°+ obtained for ABZ decreased from 1.4 kcal/mol when using ΔGref,exp° in the hybrid approach to 0.7 kcal/mol when using the ΔGref,TI° reference, indicating some cancellation of error in the full computational approach. In contrast, the deviation for TBZ increased from 0.7 kcal/mol when using the experimental ΔGref,exp° reference to 1.4 kcal/mol when using ΔGref,TI°. Regardless, results obtained using the full computational approach with ΔGref,TI° show errors that are similar in magnitude to those observed using ΔGref,exp° in the hybrid experimental/computational approach. Furthermore, the errors associated with the full computational protocol can be lower than the errors that arise from performing binding free energy computations with fixed protonation states assigned to the unbound CB[7] and guest molecules (Table 4).

Table 4

Binding Free Energies of the Guests, Computed Using the Full Computational Approach (CpHMD/TI) and Compared to Experiment[59] a

guest	ΔG_ref,exp^°	ΔG_ref,TI^°	ΔG_exp^°+	ΔG_TI^°+	ΔG_ref,exp^° (pH 7)	ΔG_ref,TI^° (pH 7)
BZ	–4.4	–4.1 ± 2.0	–9.2	–8.5	–7.1	–6.4
TBZ	–3.0	–2.3 ± 2.6	–8.6	–7.3	–5.2	–4.0
FBZ	–2.3	–2.1 ± 2.6	–7.6	–7.3	–4.5	–4.3
ABZ	–6.6	–6.0 ± 3.0	–10.2	–11.0	–6.7	–6.5
CBZ	–6.0	–4.8 ± 2.7	–9.5	–8.7	–6.4	–5.5

All energies are reported in kcal/mol. ΔGref,exp° is the experimental[59] binding free energy for the reference deprotonated guest; ΔGref,TI° is the absolute binding free energy obtained from TI computations for the reference state; ΔGexp°+ is the binding free energy for the protonated guest derived from ΔGref,exp°; and ΔGTI°+ is the binding free energy obtained by using pKaC,calc with ΔGref,TI° in eq 6.

Discussion

Changes in the pKa values and, consequently, the protonation states of ionizable species participating in biomolecular association processes are well documented. To address this phenomenon, we present a simple methodology for obtaining the pH dependence of binding free energies for a series of cucurbit[7]uril (CB[7]):guest complexes. On the basis of Wyman’s binding polynomial formalism,[43] binding free energies are computed as pH-dependent corrections to a reference binding free energy. Combining this formalism with constant pH molecular dynamics (CpHMD) simulations and free energy computations yields a reasonable protocol for evaluating the pH-dependent binding free energies of biomolecular systems. Focusing on the application of CpHMD to eq 6 in order to obtain pH-dependent relative binding free energies, we assess how well CpHMD simulations can capture the pKa of BZ guests in complex with CB[7] (Table 2). With the exception of albendazole (ABZ), the values of pKaC,calc obtained for the different CB[7]:guest complexes deviate from experiment by less than 0.41 pK units. The pKaC,calc value obtained for the CB[7]:ABZ complex, however, exhibits an error of 1.0 pK unit. Since the CpHMD simulations conducted in this study are only 5 ns long, we extended the simulation of CB[7]:ABZ to 25 ns to ascertain whether the value of pKaC,calc had converged; however, the resulting pKaC,calc remains unchanged. Further, the process of fitting titration data obtained from CpHMD simulations of CB[7]:guest complexes to the Hill equation is achieved with very little statistical error (<0.01 pK units, Table 2). Both of these findings indicate that the error in the pKaC,calc values is not due to convergence problems in the CpHMD simulations. Instead, it is possible, though not explicitly demonstrated in this work, that inaccuracies in the computed pKaC values stem from problems with the force field due to the accuracies of similar magnitude to those seen in the previous CpHMD runs.[93,94] Since CpHMD simulations can reliably compute the values of pKaC,calc (and, similarly, ΔZ in the case of eq 8) for the CB[7]:guest systems, we proceed to incorporate these pKaC,calc values in eq 6 along with a reference experimental binding free energy (ΔGref,exp°) to obtain binding free energies as functions of pH. This hybrid experimental/computational approach is followed and shown for CB[7]:guest systems (Figure 6A–C). To evaluate the accuracy of this approach, we compare the computed value of ΔGhybrid°+ to experiment (ΔGexp°+) and observe good agreement, with errors of <1.4 kcal/mol arising from the computation of pKaC,calc. Having established that eq 6 can successfully recapitulate pH-dependent binding free energies with an experimentally determined reference binding free energy, we consider the use of thermodynamic integration (TI) computations to remove this dependence on experiment. TI computations effectively reproduce the reference binding free energies observed from experiment (ΔGref,exp°) with absolute errors less than 1.3 kcal/mol (ΔGref,TI°, Table 4). The resulting pH-dependent free energy profiles using ΔGref,TI° are similar to those computed with ΔGref,exp°, as shown in Figure 6. Furthermore, the absolute errors in predicting the free energies of the protonated guests, ΔGTI°+, using the ΔGref,TI° reference are less than 1.3 kcal/mol. These errors arise from both ΔGref,TI° and the use of CpHMD simulations to obtain pKaC,calc values. In regard to the computation of ΔGref,TI°, we do observe large statistical uncertainties for all CB[7]:guest complexes considered, which stem largely from the van der Waals decoupling simulations (Table S3, Supporting Information); however, the free energies computed for every transformation in the thermodynamic cycle shown in Scheme 2 have all converged, with the cumulative computed ΔG < 0.01 kcal/mol. The use of our CpHMD/TI approach to provide a full computational prediction of pH-dependent binding free energies is particularly advantageous when experimental association constants are not available, as most experimental measurements face the limitations at extreme pH levels due to the highly possible destabilization or denaturation of the proteins under such severe conditions. Therefore, when combined with computational free energy calculations, our method is free from such concerns, eliminating the reliance on the availability of experimental data. While the CpHMD/TI computation of pH-dependent binding free energies is prone to greater error than the hybrid experimental/computational approach described previously, we find the absolute errors in the CpHMD-derived pKaC values and reference binding free energies obtained from TI computations are not necessarily of the same sign for the CB[7]:guest systems considered and may cancel out. Further, the observed errors with respect to the ΔGexp°+ are relatively low (<1.4 kcal/mol). In contrast, the error in assigning incorrect protonation states in free energy computations without correcting for the pH dependence of the binding free energy can give errors in excess of 2 kcal/mol (Table 3). This observation underscores the importance of accounting for the linkage of proton binding or release to ligand binding in free energy computations and demonstrates the high utility of the CpHMD/TI approach. Our results highlight the significant changes in pKa and free energy of binding upon complex formation that accompanies a net proton uptake. Noting that the guests used here have a single titratable site, corresponding changes in free energy may sometimes be larger in protein–ligand binding where multiple titratable groups exist. Therefore, we believe that our method will have great utility in computer-aided drug discovery, where early stages of the structure-based drug design often focus on finding a high-affinity binder to a target protein. Extensions of our methodology to such more complex protein systems may require improvements to the computational protocols employed. The simple framework developed here allows for trivial incorporation of CpHMD methods that incorporate explicit solvent models[35] and/or enhanced sampling techniques, such as accelerated molecular dynamics[37] or replica exchange[36] to improve pKa computations in systems where convergence is difficult.[94] Similarly, our protocol accommodates the use of alternative methods for obtaining the reference binding free energy required by eqs 6 and 8. Thus, the computational methodology for performing CpHMD/TI computations can be chosen to best address the system under consideration. While we have focused on the results obtained using eq 6, which assumes all titratable groups are decoupled, the CpHMD/TI method is also compatible with the expression for obtaining the pH-dependent binding free energy given in eq 8, and these two expressions yield identical results in the case of the CB[7]:guest systems considered here. We intend to build on the computational protocol developed here, with a natural extension being the application of the CpHMD/TI method to obtain pH-dependent binding free energies of protein–ligand complexes. As protein–ligand systems are more complicated than the CB[7]:guest systems considered in this work, we believe the use of eq 8 will have high utility to address potential interactions between titratable groups. Given the magnitude of errors in computed binding free energies obtained with fixed protonation states in the CB[7]:guest systems, our computational protocol represents a promising approach to remove these errors, thus implicating its utility in drug discovery workflows.[95] Though not specifically addressed in this work, similar philosophies may also be applicable to the scoring functions in docking protocols.

Conclusions

In this work, we determined the pH-dependent changes in binding free energies for complex formation between cucurbit[7]uril (CB[7]) and a series of benzimidazole guests. Using constant pH molecular dynamics simulations combined with experimental data, we developed a hybrid protocol that could capture the significant changes in the CB[7]:guest binding free energies with high accuracy. Subsequently, we combined our method with thermodynamic integration (TI) to enable a full computational prediction of the pH-dependent free energy profiles. This protocol successfully accounted for the pH-dependent changes in the binding free energies during complex formation. Future work will include examination of pH-dependent binding free energies for protein–ligand complexes.

67 in total

1. Glide: a new approach for rapid, accurate docking and scoring. 1. Method and assessment of docking accuracy.

Authors: Richard A Friesner; Jay L Banks; Robert B Murphy; Thomas A Halgren; Jasna J Klicic; Daniel T Mainz; Matthew P Repasky; Eric H Knoll; Mee Shelley; Jason K Perry; David E Shaw; Perry Francis; Peter S Shenkin
Journal: J Med Chem Date: 2004-03-25 Impact factor: 7.446

2. Soft-core potentials in thermodynamic integration: comparing one- and two-step transformations.

Authors: Thomas Steinbrecher; InSuk Joung; David A Case
Journal: J Comput Chem Date: 2011-08-27 Impact factor: 3.376

3. Automatic atom type and bond type perception in molecular mechanical calculations.

Authors: Junmei Wang; Wei Wang; Peter A Kollman; David A Case
Journal: J Mol Graph Model Date: 2006-02-03 Impact factor: 2.518

4. Extra precision glide: docking and scoring incorporating a model of hydrophobic enclosure for protein-ligand complexes.

Authors: Richard A Friesner; Robert B Murphy; Matthew P Repasky; Leah L Frye; Jeremy R Greenwood; Thomas A Halgren; Paul C Sanschagrin; Daniel T Mainz
Journal: J Med Chem Date: 2006-10-19 Impact factor: 7.446

Review 5. Improving drug candidates by design: a focus on physicochemical properties as a means of improving compound disposition and safety.

Authors: Nicholas A Meanwell
Journal: Chem Res Toxicol Date: 2011-07-26 Impact factor: 3.739

Review 6. Sensors and regulators of intracellular pH.

Authors: Joseph R Casey; Sergio Grinstein; John Orlowski
Journal: Nat Rev Mol Cell Biol Date: 2009-12-09 Impact factor: 94.444

7. A joint structural, kinetic, and thermodynamic investigation of substituent effects on host-guest complexation of bicyclic azoalkanes by beta-cyclodextrin.

Authors: Xiangyang Zhang; Gabriela Gramlich; Xiaojuan Wang; Werner M Nau
Journal: J Am Chem Soc Date: 2002-01-16 Impact factor: 15.419

8. Deconstructing activation events in rhodopsin.

Authors: Elena N Laricheva; Karunesh Arora; Jennifer L Knight; Charles L Brooks
Journal: J Am Chem Soc Date: 2013-07-22 Impact factor: 15.419

9. H++ 3.0: automating pK prediction and the preparation of biomolecular structures for atomistic molecular modeling and simulations.

Authors: Ramu Anandakrishnan; Boris Aguilar; Alexey V Onufriev
Journal: Nucleic Acids Res Date: 2012-05-08 Impact factor: 16.971

10. Constant pH Replica Exchange Molecular Dynamics in Explicit Solvent Using Discrete Protonation States: Implementation, Testing, and Validation.

Authors: Jason M Swails; Darrin M York; Adrian E Roitberg
Journal: J Chem Theory Comput Date: 2014-02-05 Impact factor: 6.006

10 in total

1. Exploring pH Dependent Host/Guest Binding Affinities.

Authors: Thomas J Paul; Jonah Z Vilseck; Ryan L Hayes; Charles L Brooks
Journal: J Phys Chem B Date: 2020-07-22 Impact factor: 2.991

2. Alchemical Binding Free Energy Calculations in AMBER20: Advances and Best Practices for Drug Discovery.

Authors: Tai-Sung Lee; Bryce K Allen; Timothy J Giese; Zhenyu Guo; Pengfei Li; Charles Lin; T Dwight McGee; David A Pearlman; Brian K Radak; Yujun Tao; Hsu-Chun Tsai; Huafeng Xu; Woody Sherman; Darrin M York
Journal: J Chem Inf Model Date: 2020-09-16 Impact factor: 4.956

3. Origin of pK_a Shifts of Internal Lysine Residues in SNase Studied Via Equal-Molar VMMS Simulations in Explicit Water.

Authors: Xiongwu Wu; Juyong Lee; Bernard R Brooks
Journal: J Phys Chem B Date: 2016-10-18 Impact factor: 2.991

Review 4. Computation of pH-dependent binding free energies.

Authors: M Olivia Kim; J Andrew McCammon
Journal: Biopolymers Date: 2016-01 Impact factor: 2.505

5. Interpretation of pH-activity profiles for acid-base catalysis from molecular simulations.

Authors: Thakshila Dissanayake; Jason M Swails; Michael E Harris; Adrian E Roitberg; Darrin M York
Journal: Biochemistry Date: 2015-02-06 Impact factor: 3.162

6. TLR7 gain-of-function genetic variation causes human lupus.

Authors: Pablo F Cañete; Hao Wang; Grant J Brown; Arti Medhavy; Josiah Bones; Jonathan A Roco; Yuke He; Yuting Qin; Jean Cappello; Julia I Ellyard; Katharine Bassett; Qian Shen; Gaetan Burgio; Yaoyuan Zhang; Cynthia Turnbull; Xiangpeng Meng; Phil Wu; Eun Cho; Lisa A Miosge; T Daniel Andrews; Matt A Field; Denis Tvorogov; Angel F Lopez; Jeffrey J Babon; Cristina Aparicio López; África Gónzalez-Murillo; Daniel Clemente Garulo; Virginia Pascual; Tess Levy; Eric J Mallack; Daniel G Calame; Timothy Lotze; James R Lupski; Huihua Ding; Tomalika R Ullah; Giles D Walters; Mark E Koina; Matthew C Cook; Nan Shen; Carmen de Lucas Collantes; Ben Corry; Michael P Gantier; Vicki Athanasopoulos; Carola G Vinuesa
Journal: Nature Date: 2022-04-27 Impact factor: 69.504

10. Conformational Dynamics and Binding Free Energies of Inhibitors of BACE-1: From the Perspective of Protonation Equilibria.

Authors: M Olivia Kim; Patrick G Blachly; J Andrew McCammon
Journal: PLoS Comput Biol Date: 2015-10-27 Impact factor: 4.475