Literature DB >> 26300709

Constant-pH Hybrid Nonequilibrium Molecular Dynamics-Monte Carlo Simulation Method.

Abstract

A computational method is developed to carry out explicit solvent simulations of complex molecular systems under conditions of constant pH. In constant-pH simulations, preidentified ionizable sites are allowed to spontaneously protonate and deprotonate as a function of time in response to the environment and the imposed pH. The method, based on a hybrid scheme originally proposed by H. A. Stern (J. Chem. Phys. 2007, 126, 164112), consists of carrying out short nonequilibrium molecular dynamics (neMD) switching trajectories to generate physically plausible configurations with changed protonation states that are subsequently accepted or rejected according to a Metropolis Monte Carlo (MC) criterion. To ensure microscopic detailed balance arising from such nonequilibrium switches, the atomic momenta are altered according to the symmetric two-ends momentum reversal prescription. To achieve higher efficiency, the original neMD-MC scheme is separated into two steps, reducing the need for generating a large number of unproductive and costly nonequilibrium trajectories. In the first step, the protonation state of a site is randomly attributed via a Metropolis MC process on the basis of an intrinsic pKa; an attempted nonequilibrium switch is generated only if this change in protonation state is accepted. This hybrid two-step inherent pKa neMD-MC simulation method is tested with single amino acids in solution (Asp, Glu, and His) and then applied to turkey ovomucoid third domain and hen egg-white lysozyme. Because of the simple linear increase in the computational cost relative to the number of titratable sites, the present method is naturally able to treat extremely large systems.

Entities: Chemical Disease Gene Mutation Species

Mesh：

Substances：

Year: 2015 PMID： 26300709 PMCID： PMC4535364 DOI： 10.1021/acs.jctc.5b00261

Source DB: PubMed Journal: J Chem Theory Comput ISSN： 1549-9618 Impact factor: 6.006

Introduction

In conventional MD simulations based on classical force fields, chemical bonds are commonly never broken and the protonation state of each ionizable residue is fixed and ascribed externally by the user. This is a considerable limitation when trying to study pH-dependent biological processes. These include, to name a few, membrane insertion,[1] fibril formation,[2] protein denaturation,[3] proton gradient-driven ATP synthesis,[4] and membrane transporters.[5,6] Typically, in the absence of specific information, the protonation state of the ionizable residues is assigned on the basis of their reference pKa from the isolated amino acid in solution. However, the pKa of specific residues in a protein is very sensitive to the environment, and actual pKa values can deviate considerably from standard references. The issue is particularly important in the case of histidine because its pKa is close to physiological pH. Furthermore, the dynamical movement of a protein can be coupled with an alternation of protonation state. At the most fundamental level, protonation and deprotonation involve breaking or forming chemical bonds. For this reason, a calculation of absolute proton affinity can be carried out only from a quantum mechanical (QM) treatment. However, a pure QM approach is impractical for biomolecular systems, where there can be a very large number of ionizable sites. Slightly more tractable approaches are based on mixed quantum mechanical and molecular mechanical (QM/MM) treatments. While QM/MM methods can address issues of absolute proton affinity in the context of enzyme reactions,[7,8] they remain too computationally expensive to treat multiple ionizable sites in large bimolecular systems. In the context of large biomolecular systems, an effective computational method to carry out simulations under conditions of constant pH must necessarily be built on the basis of empirical molecular mechanical (MM) models. In constant-pH simulations, preidentified titratable sites are allowed to spontaneously change their protonation state as a function of time in response to the environment and the imposed pH. Monte Carlo (MC) is a natural method for treating the abrupt transitions between discrete protonation states. For example, in a hybrid MD–MC simulation algorithm, one can imagine a cycle that consists of carrying out an equilibrium trajectory for some time, followed by attempts to make a transition to a different protonation state that is either accepted or rejected according to a Metropolis criterion. The first hybrid MD–MC methodology with discrete protonation states was introduced by Bürgi et al.[9] The method, which relied on a thermodynamic integration (TI) calculation carried out with explicit solvent to evaluate the free energy difference and the acceptance probability associated with the change in protonation state, was computationally expensive and suffered from convergence issues due to the short TI calculations. Soon afterward, Mongan, McCammon, and colleagues[10,11] proposed a hybrid MD–MC methodology with sudden jumps between discrete protonation states in conjunction with an implicit solvent model based on the generalized Born (GB) approximation. The algorithm effectively achieves a correct Boltzmann sampling and avoids the issues arising from short TI simulations. However, it cannot be readily transposed to explicit solvent simulations. Sudden jumps between discrete protonation states unavoidably lead to large differences in electrostatic energy and to a rejection of nearly all proposed change. The issue stems from the lack of overlap between the explicit solvent configurations of the protonated and unprotonated states. Similar difficulties are encountered in trying to design an effective grand canonical MC algorithms for ionic solution in which ions are created and destroyed in explicit solvent.[12] To circumvent the configurational overlap problem in constant-pH simulations, some treatments have resorted to an empirical combination of explicit/implicit representations: the equilibrium MD is carried out with explicit solvent, but the protonation/deprotonation MC processes are carried out with an implicit solvent model.[13−16] While such constant-pH simulations have the ability to explore the accessible ionization states of a system in the presence of explicit solvent, it is important to keep in mind that the actual pKa values are derived from a CE or GB approximation. Although the usage of continuum solvent models in the treatment of ionizable sites in proteins has a long history,[17−19] they are necessarily approximate and there is a need for a simulation methodology that is based entirely on an explicit solvent representation. There are many complex situations where a titratable site may be located in a narrow aqueous crevice for which the usage of an implicit solvent approximation may be limited. Fundamentally, the statistical properties arising from such combined MD–MC methods do not stem from a single consistent Hamiltonian, which is problematical. A different strategy to address the configurational overlap problem consists of considering an extended Lagrangian with a continuous λ(t) that is dynamically propagated to control the protonation state of the site. This leads to the so-called λ-dynamics algorithm developed by Brooks and co-workers.[20−27] The original concept of expanding the degrees of freedom of a system in order to include a continuous thermodynamic coupling parameter λ as a dynamical variable in the sampling process goes back to early work by Jorgensen,[28] Tidor,[29] and Hünenberger[30] (see also ref (31)). In this method, because λ is allowed to evolve as a continuous dynamical degree of freedom, there are no discrete protonation and deprotonation events and the configurational overlap problem is not encountered. However, carefully calibrated λ-dependent potentials must be incorporated into the Hamiltonian to increase computational efficiency. For example, an unfavorable energy barrier is often introduced between the physical end states to decrease the probability of the unphysical intermediate region. This barrier must not be too large in order to enable rapid transitions between end states λ = 0 and λ = 1. For this reason, there remains unavoidably a nonzero probability q of visiting the unphysical states (δ < λ < 1 – δ, with δ ≈ 0.1). Strictly speaking, only end states λ = 0 or λ = 1 are physical. Assuming independence of the sites for the sake of simplicity, the probability of a physically meaningful configuration of the entire system with n sites scales like (1 – q), with (1 – q) < 1. For example, even if q is 0.1, the probability of generating a physical state is not larger than about 0.12 with 20 sites (in practice, much larger values of q are tolerated to increase computational efficiency). As a result, an algorithm based on sampling a continuous λ does not scale very well for a bimolecular system with a large number n of ionizable sites. This remains the major limitation of an algorithm sampling the dynamics of a continuous coupling parameter λ(t). A different solution to the configurational overlap problem to construct an explicit solvent constant-pH simulation method is based on the enveloping distribution sampling (EDS) method.[32,33] EDS is a method used to generate a hybrid energy surface from a finite number of states such that the statistical mechanical distribution of the end states will overlap;[34] a smoothing parameter s controls the magnitude of the energy barrier and the extent of configurational overlap between the end states. In this constant-pH simulation method, EDS is used to mix the energy surface of the protonated and unprotonated states for a small set of N preselected ionizable sites. Systems with different values of the EDS smoothing parameter are then connected via a Hamiltonian replica-exchange to facilitate interconversion between the different protonation states during a hybrid MD–MC simulation. The EDS constant-pH simulation method can handle explicit solvent. However, it requires that all of the accessible protonated and unprotonated states be enumerated explicitly into the smoothed EDS Hamiltonian. If there are N sites, then 2 energy surfaces need to be mixed into the smoothed EDS Hamiltonian. From a computational point of view, a simulation requires at least as many processors as ionization states. While this limitation can be mitigated to some extent by including only the relevant states and leaving out those anticipated to have small probabilities, the method remains fundamentally designed to treat systems comprising only a small number of sites. Scalability to a large number of ionizable sites remains the most essential feature of a truly useful constant-pH simulation method for biomolecular systems. It is worth mentioning that several of the previously proposed constant-pH simulation methods have introduced replica-exchange between systems at different pH.[14−16,32,33] While usage of H-REMD may lead to enhanced sampling for specific systems, it does not alter the fundamental nature (and inherent limitations) of the currently available algorithms regarding the scaling properties. Constant-pH simulation methods that are inherently limited to a small number of sites are, ultimately, not able to address this fundamental issue. In this regard, it is important to recall that a constant-pH simulation is not really necessary for investigating the effect of pH when only a small number of ionizable sites groups are involved (e.g., less than 3 or 4) and that all of the possible states can be enumerated explicitly. All of the effects from changing the pH can be assessed rigorously by computing the relative free energies between all of the relevant protonation states. As long as all the relevant states can be enumerated and treated via free energy perturbation (FEP), such an exhaustive approach is viable. Constant-pH algorithms that are applicable to only a small number of ionizable sites offer no specific advantage over direct FEP calculations. The true challenge facing a constant-pH simulation method for bimolecular systems appears when the number of ionizable sites becomes very large, for example, when simulating the pH dependence of a viral particle,[35] where a complete enumeration of all possible states is completely infeasible. The only approach that could be potentially scaled to a very large system is some type of hybrid MD–MC with discrete states.[10,36,37] However, the algorithm must be significantly enhanced to circumvent the configurational overlap problem in the presence of explicit solvent. A naive MC algorithm relying on discrete (instantaneous) protonation/deprotonation attempts in the presence of explicit solvent fails because the coordinates of the surrounding solvent atoms are kept fixed while the state of the ionizable site is changed. The large energy variations essentially result in complete rejection of all attempts to change protonation states. What is needed is a scheme that would allow the solvent configuration of the system to dynamically adapt to the new state, at least partially. This occurs naturally if the coupling parameter λ is progressively switched between the end states during the MD trajectory. This is essentially the MC scheme with dynamical state switching that was previously formulated by Harry Stern.[38,39] It consists of carrying out short nonequilibrium molecular dynamics (neMD) switching trajectories to generate a physically plausible configuration with an altered protonation state, which is then subsequently accepted or rejected according to a Metropolis Monte Carlo (MC) criterion. While a continuous λ(t) evolving according to a pre-established time-dependent schedule is also involved in the neMD trajectory, the outcome is a random walk in terms of the discrete end state only, thus avoiding the poor scaling properties of λ-dynamics because of the simple linear increase in the computational cost relative to the number of titratable sites. Consequently, the present hybrid neMD–MC method is naturally able to treat systems of arbitrary size. In this article, we formulate and implement a hybrid neMD–MC method to carry out simulations under conditions of constant pH. To ensure microscopic detailed balance arising from the nonequilibrium switches, the atomic momenta are altered according to the symmetric two-ends momentum reversal prescription.[39−41] To achieve higher efficiency, the original neMD–MC scheme is separated into two steps, reducing the need for generating a large number of unproductive and costly nonequilibrium trajectories. In the first step, the protonation state of a site is randomly attributed via a Metropolis MC process on the basis of an intrinsic pKa; an attempted nonequilibrium switch is generated only if this change in protonation state is accepted. The simulation method is tested with single amino acids in solution (Asp, Glu, and His) and is then applied to turkey ovomucoid third domain and hen egg-white lysozyme. The article is concluded with an outlook to further expand the present algorithm.

Theoretical Background

For the sake of simplicity, we consider here a solvated molecule comprising a single preidentified ionizable site, as illustrated in Figure 1. The formalism can be easily generalized to multiple ionizable sites. It is assumed that, when the molecule is unprotonated, the proton is transferred to a reference ionizable group located far away in solution (i.e., the number of particles in the system is the same for the two states). Let r represent all of the coordinates within the system and Up(r), Uu(r), correspond to the potential energy function for the protonated (p) and unprotonated (u) systems, respectively. In principle, the free energy difference Δought to be directly related to the experimental pKa of the site in the molecule. However, to match the experimental pKa value with a MM force field, it is necessary to introduce an adjustable offset constant Cgiven byThe use of an empirical offset constant C is necessary because the MM potential energy function is not designed to account for a number of critical factors, including the QM energy for binding a proton to an ionizable group, the standard state of the solvated proton, and effects associated with zero-point vibrations. A force field-dependent offset constant must be ascribed to each type of titratable site in a protein to calibrate the method. The value of these offset constants depends on the details of the MM force field. For each type of titratable site, an offset constant can be determined via an explicit solvent alchemical FEP calculation on a corresponding reference molecule. An amino acid dipeptide typically provides a reasonably good reference system for ionizable side chains (more details are provided in Section II.3). Once this calibration has been carried out, the resulting empirical offset constants are then assumed to be fully transferable. In other words, it is assumed that they do not vary with the sequence and conformation of the protein. The assumption of transferable empirical offset constants is a feature that is common to all MM treatments of ionizable groups in proteins.[9−11,13−17,20−27,32,33] This is also the reason why all MM treatments of ionizable groups in proteins, rather than absolute pKa values, always focus on the issue of pKa shifts relative to their corresponding reference compounds.

Figure 1

An aspartic dipeptide with blocked ends solvated in bulk water is used as an example of a molecule comprising an ionizable site. Within the framework of constant-pH simulations, a preidentified ionizable site is allowed to spontaneously protonate and deprotonate as a function of time in response to the environment and the imposed pH. In practice, the labile H atom (yellow sphere) is not annihilated in the unprotonated state. It is a fully interacting particle in the protonated state, and it is converted to a dummy noninteracting particle in the deprotonated state. For the subsequent developments we construct an alchemical potential energy function U(r,λ) using the continuous coupling parameter λ, which takes a value of 1 when the system is protonated and a value of 0 when the system is unprotonatedThe function U(r,λ), defined in terms of a continuous coupling parameter λ, is equivalent to the alchemical potential function commonly defined for FEP/MD calculations[42] and λ-dynamics.[20−27] The probability of finding the ionizable group protonated (P1) or unprotonated (P0) can then be expressed by the Henderson–Hasselbalch equationsandrespectively. To be consistent with the probability ratio P1/P0 = 10(p from the Henderson–Hasselbalch equations, we construct an extended pH-dependent potential energy Ũ(r,λ), accounting for the protonated and unprotonated state asand the corresponding Hamiltonianwhere p represent all of the momenta within the system. For the sake of clarity, we will write x as a short-hand notation for (r,p) in the following.

Constant-pH Simulation Algorithm

Our ultimate goal is to design a constant-pH simulation propagation algorithm (x,λ → x′,λ′) such that it generates the proper Boltzmann equilibrium distribution for the systemOne possible avenue consists of generating a random walk in the space of x and λ that satisfies microscopic detailed balancewhere λ → λ′ represents any transition between the two end points (i.e., 0 → 1 or 1 → 0) and λ′ → λ represents the reverse transition (i.e., 1 → 0 or 0 → 1). In practice, however, this is difficult to implement because of the strong coupling due to the large variations in electrostatic energy accompanying changes in the protonation state (λ). To circumvent this problem, we adopt a hybrid simulation method that combines the advantages of Monte Carlo (MC) with the strengths of classical molecular dynamics (MD). It consists of carrying out short nonequilibrium MD (neMD) trajectories upon a change in the protonation state of the site in order to generate a new configuration x′ that is subsequently accepted or rejected via a Metropololis criterion. During the neMD trajectory, the proposed change in the discrete state variable λ is controlled externally while the MD propagation allows the continuous dynamical degrees of freedom x to dynamically adapt to the new state. This hybrid neMD–MC scheme is essentially the constant-pH simulation method that was initially elaborated by Harry Stern.[38] The scheme is designed to allow the system to evolve between the two states to generate the proposed move (x,λ) → (x′,λ′). The hybrid neMD–MC scheme is composed of a time-dependent nonequilibrium MD switching process, followed by an MC step based on an acceptance criteria. Formally, the transition probability T is expressed as the product of the probability of a proposed move Tp and the probability to accept or reject the proposed move TaDuring the dynamical switching process, the parameter λ follows a fixed time-dependent schedule over a time interval τ, whereas the remaining degrees of freedom are propagated dynamically according to the mixed-state potential energy function, Ũ(r,λsw(t)) defined in eq 7. Here, the function λsw(t) given by the linear formrepresents the time-dependent continuous coupling parameter, which is externally controlled to drive the system between the two end points, λ and λ′, during a neMD switching trajectory (note that λ can take the value 0 or 1 and λ′ can take the value 1 or 0). For example, see the first switch in the time interval τ between t1 and t2 in Figure 2. The coupling parameter starts in the initial state λ at time t1 and is gradually altered in a time-dependent manner to reach the final state λ′ at time t2 = t1 + τ, where τ is the length of the switch.

Figure 2

Schematic representation of the constant-pH hybrid neMD–MC simulation scheme. The red wiggly line represents the first stage of the simulation. The blue wiggly dashed line represents the second stage of the simulation. A Metropolis criterion is applied after the second stage (open black circle). If it is accepted, then the next MD simulation will start with the last configuration from the switching simulation; otherwise, the switching simulation will be discarded, and the next cycle of simulation will start with the last configuration from the MD stage. The time-dependent λsw(t) for the forward and backward switching, λ → λ′ and λ′ → λ, must be consistent to guarantee detailed balance, as previously noted by Stern.[38] This is automatically satisfied if λsw(t), for the sake of simplicity, is chosen to be symmetric with respect to the two end states over the interval τ. The linear form of eq 12 fulfills this condition. When a deterministic (reversible symplectic) propagator and a correct momentum reversal prescription are used,[40,41,43] the transition probability of the proposed move is inherently symmetricand the acceptance or rejection probability is given by the Metropolis criterionBy substitution, it can be shown readily thatTherefore, the Metropolis construct satisfies eq 10. It is also possible to use a stochastic propagator for the neMD trajectory and satisfy detailed balance by using a Metropolis acceptance criterion based on the external work.[41] The switching trajectory when attempting a change in protonation state has some analogies with the short TI calculations carried out to evaluate the free energy difference associated with the change in protonation state in the method proposed early on by Bürgi et al.[9] However, the Metropolis criterion applied to the discrete end state in the present hybrid neMD–MC algorithm formally resolve the issues arising from the poorly converged thermodynamic integration and guarantees that an equilibrium Boltzmann distribution will be achieved. As illustrated in Figure 2, the constant-pH simulation scheme with dynamical state switching comprises multiple rounds, each of which is composed of two stages of MD simulation. The first stage is simply a standard MD simulation where the system is kept in a given protonation state λ (red line). The second stage is a neMD switching simulation where the state of the system is gradually switched from one to another (blue dashed line). At the end of the second stage, a Metropolis MC criterion is applied to determine if the switching will be accepted (open black circle). If it is accepted, then the next MD simulation will start with the last configuration from the switching simulation (red line again); otherwise, the switching simulation will be discarded, and the next round simulation will start with the last configuration from the MD stage.

Two-Step Inherent pKa Algorithm

If the pH differs markedly from the inherent pKa of an ionizable group, then the probability of the protonated state is expected to be either very close to 0 or 1. For example, if an aspartic residue is in an environment at a high pH, then the side chain should be essentially always unprotonated. As a consequence, almost all attempts to protonate the side chain ought to be rejected. Similarly, if an arginine residue is in an environment at a pH of 1, then the side chain should be predominantly protonated. Almost all attempts to deprotonate the side chain will be rejected. In such situations, generating a large number of computationally expensive neMD switching trajectories can become wasteful and inefficient. It is possible, however, to redesign the algorithm to take advantage of the information about the inherent pKa of the site. The idea is to separate the transition probability T into two distinct stepswhere T(i) represents the transition probability for λ describing a protonation/deprotonation process according to some inherent pKa(i) of the site and T(s) is the transition probability of an attempted x → x′ neMD switch, conditional on the transition λ → λ′ taking place. The transition probability of the first step, T(i), is constructed such that it obeys the following microscopic detailed balance relationwhere π(i)(λ) is some arbitrarily assigned inherent equilibrium probability. Assuming an inherent pKa(i) for the site, the equilibrium probability ratio for the first step isThe first step does not involve any changes in the atomic coordinates of the system; the process affects only the value of the ionization state variable λ. Complex changes in the atomic coordinates of the system are absorbed into the second step. To make progress, it is useful to rewrite the transition probability T(s) as the product of the probability of a proposed move Tp(s) and the probability to accept or reject the proposed move Ta(s)Imposing that the two-step scheme satisfies the global detailed balance relationand assuming that the proposed move in x are inherently symmetric and deterministicwe obtain the relationwhere the modified Hamiltonian H̃*(x,λ), defined asis shifted by pKa(i). In the two-step neMD–MC scheme, the functional form of the shifted Metropolis acceptance probability Ta(s), isWith respect to the first step, a number of methods could be used to generate the random walk in λ. As this is a simple two-state system, one might even generate the transition λ → λ′ directly from the equilibrium probabilities. Interestingly, it can be shown that a simple Metropolis MC acceptance probabilityprovides the most effective method to simulate the first step (see Appendix I). The overall efficiency of the two-step scheme reaches its highest level when the acceptance probability is equivalent for the protonation and deprotonation processeswhere ⟨...⟩λ denotes an average over all possible transition paths, given the initial configuration is sampled from Boltzmann distribution eq 9 with λ. Using the actual pKa value of the site (if it were known) would always result in the highest efficiency from an importance sampling point of view. This is demonstrated in the following derivation.where the third equation results from the definition of C, eq 3. In practice, the actual pKa value of the site is not known, but it should be possible to choose some reasonable value capturing the dominant effects. For example, one could use the pKa of the reference system pKa(ref). In this case, the overall efficiency will be close to optimal if the pKa of the site is shifted only by a small amount. Alternatively, it is possible to adaptively adjust the value of pKa(i) during the simulation based on an earlier estimate (see Discussion). Ultimately, the two-step scheme is rigorously valid, although a well-chosen value for pKa(i) can greatly increase the overall efficiency of the simulation. For systems with multiple ionizable sites, the total cost of the constant-pH neMD–MC simulation scales linearly with the total number of sites if one attempts to change the ionization state of one site at a time. However, it might be sometimes advantageous to attempt changing the ionization state of multiple sites simultaneously. For example, when several titratable sites are in close proximity, attempts to change the ionization state of a single site at a time may be inefficient. In this case, attempts to protonate one site while deprotonating another site may be more advantageous. It is possible to include such two-sites processes in the MC move set to reach optimal efficiency.

Empirical Offset Constants

In practice, there are two stages for implementing a constant-pH hybrid simulation method. In the first stage, the free energy Δaa of the protonated state relative to the unprotonated state, defined in eq 1, must be calculated for all the ionizable amino acids (aa) using alchemical FEP simulations with explicit solvent. Then, in a second stage, the empirical offset constant Caamust be used to shift the extended pH-dependent MM potential energyThe latter MM potential energy Ũ(r,λ) must then be used to carry out the constant-pH hybrid neMD–MC simulations of the protein with explicit solvent. In practice, the proton is treated as a dummy noninteracting particle for the unprotonated state (denoted H0 in the following). That is, the dummy H atom is never annihilated, but its interaction with the environment is turned off (Figure 1). In this alchemical process, a number of internal MM covalent terms (bonds, angles, dihedrals) are kept to avoid the problem of a wandering noninteracting free proton in the simulation;[44−46] their influence cancels out in the treatment of the standard state. The actual value of the empirical offset constant Caa depends on the simulation protocol used to carry out the alchemical FEP calculation yielding the Δaa. One possibility is to treat only the ionizable entity in the explicit solvent simulation box. This has the consequence that charge neutrality is not maintained during the protonation/deprotonation process. In this case, the empirical offset constant Caa must be based on Δaa, also calculated from an alchemical FEP simulation with just the amino acid in the explicit solvent box, with no constraint on charge neutrality. Alternatively, charge neutrality could be strictly enforced during all of the protonation/deprotonation processes by including some charge-canceling counterpart in the simulation. Maintaining charge neutrality is a useful way to avoid some important simulation artifacts associated with the free energy to insert or remove net charges in the system. With Ewald lattice summation and tinfoil conducting boundary conditions, such charging free energies necessarily include a spurious shift due to the Galvani potential of the bulk water phase in the finite simulation system.[47] The magnitude of such artifact depends on the details of the simulation system, such as the volume fraction occupied by the solvent.[47] While the issue may be safely ignored in the case of a solvated protein surrounded by a large number of water molecules, it can become much more severe when the volume fraction occupied by the solvent is much less than 85% of the entire system, a condition often encountered with simulation of membrane proteins. In the present study, the alchemical protocol that we use couples the following reactionsfor the protonation of acidic residues (Asp, Glu, Tyr, COO-terminal) andfor the protonation of basic residues (His, Lys, Arg, N-terminal). Here, H0 represents the dummy noninteracting hydrogen. In these alchemical transformations, the oxygen atom of the water molecule is converted into K+ or a Cl− ion while the two hydrogen atoms are transformed into dummy particles. It is worth noting that charge neutrality could also be maintained the simultaneous creation or destruction of a pair of opposite charges viafor the protonation of acidic residues andfor the protonation of basic residues. However, this second protocol is less advantageous because it requires the creation of opposite charges during the alchemical switch process, yielding large energy differences that affect the overall Metropolis acceptance probability. The first protocol based on eqs 30 and 31 yields more efficient hybrid MD–MC simulations because the two charging free energies appear in opposite directions of the alchemical processes.

Computational Details

Three dipeptides were simulated with the hybrid algorithm: aspartic acid, glutamic acid, and histidine. In addition, two protein systems were simulated with the hybrid algorithm: the turkey ovomucoid third domain (OMTKY3, PDB ID 1OMU) and hen egg-white lysozyme (HEWL, PDB ID 1AKI). The amino acid dipeptides have acetylated N-terminus and N-methylamide C-terminus, whereas OMTKY3 and HEWL have a standard charged N- and C-terminus. All systems were simulated with explicit solvent with periodic boundary conditions. The CHARMM27 force field[48] with the TIP3 water potential[49] was used to model the microscopic interactions. The amino acid dipeptides were solvated in a 24 Å cubic box. OMTKY3 and HEWL were solvated in a 46 and 60 Å cubic boxes, respectively. To prevent any drift, the center-of-mass of the protein (or dipeptide) was weakly restrained to the center of the periodic box with a harmonic potential. Particle-mesh Ewald (PME) summation[50] was used to treat the electrostatic interactions, with a real-space cutoff set to 14 Å and grid spacing smaller than 0.5 Å. The LJ interactions were smoothly truncated with a switching function from 10 to 12 Å. The equations of motion were integrated with a 2 fs time step, and SHAKE[51] was used to constrain covalent bonds involving hydrogen atoms. The program CHARMM, version c36b1, was used to carry out all of the MD simulations.[52] Special patches were created to handle the switch between the protonated and unprotonated states of the amino acids. Those allow alterations to the protein-structure-file (PSF) that represent the atomic model underlying a given simulated system.[52] In the case of aspartic and glutamic acids, two dummy hydrogen are introduced to allow the two carboxylate oxygen atoms to be protonated. In the unprotonated state, the acidic group has two dummy hydrogens, one on each carboxylate oxygen. When the protonated state is generated, one of the two dummy hydrogen particles is alchemically transformed into a fully interacting hydrogen atom. In the case of histidine, both nitrogen atoms δ and ϵ can be protonated. To determine the offset constant Δaa of the ionizable sites, FEP/MD simulations of 10 ns were carried out for the dipeptide systems. WHAM was used to calculate the free energy difference between the two ionizable states.[53] The FEP simulation provides the reference values Caa for aspartic acid, glutamic acid, and histidine that are then used for the OMTKY3 and HEWL simulations. To set charge neutrality and maintain this neutrality during changes in ionization states, a number of special counterions able to alchemically interconvert into a water molecule were included in the simulation system. Similar procedures have been used previously in free energy perturbation (FEP) calculations involving charged species.[47,54] The number of these counterion–water “molecules” is adjusted to correspond to the number of ionizable sites. These counterion–water molecules were harmonically restrained in the bulk solvent away from the protein to ensure that this alchemical transformation occurs in the bulk region and never in close contact to the protein. Their restrained positions were generated by a script that sought to maximize these distances as much as possible. In the dipeptide systems, the distance of the counterion–water molecule to the ionizable site is 13 Å. In the OMTKY3 system, five counterion–water molecules were included; the shortest distance among them, and with ionizable protein sites, is 16.0 Å. In the HEWL system, 16 counterion–water molecules were included; the shortest distance among them, and with ionizable protein sites, is 17.5 Å. In-house Python scripts were used for controlling the hybrid neMD–MC simulation scheme. In the hybrid algorithm, the equilibrium MD simulations are carried out in the NPT ensemble at a temperature of 300 K and 1 atm pressure, whereas the neMD switching simulations are carried out in the NVE ensemble with no constant temperature or pressure control. The PERT command of CHARMM is used for switching the Hamiltonian.[52] For peptides and proteins with multiple ionizable sites, a maximum of one protonation site is perturbed per cycle. For each cycle, one site is randomly picked. One may also assign weight for each site so that important sites can be perturbed more often. The expression T(i) in eq 25 is used to decide if a switch will be performed for this site. If it is not, then another site will be picked, until one acceptance is obtained or all sites have been visited. The neMD–MC switch is skipped for this cycle if none of the transition trials are accepted. After the neMD switching trajectory, the energy difference is calculated. The expression T(s) in eq 24 is used to accept or reject the new proposed state. If the state is accepted, then the new conformation and velocities are kept; otherwise, the configuration from the neMD switching simulation is discarded, and the simulation is continued from the position and momenta at the end of the MD stage. After the neMD–MC step, the configuration and state of the system is recorded. For the neMD switches, the momenta of all particles in the system are treated according to a symmetric two-ends momentum reversal prescription.[39,40] Accordingly, momenta have a probability of 0.5 to be kept unchanged and a probability of 0.5 to be reversed both before and after the switch. This prescription can greatly reduce the probability of different regions of configurational space being isolated from one another.[40] Protonation states were randomly picked for each residue to minimize the impact of the initial state to initiate the system. A first round of MD and neMD was then carried out, and the first generated configuration was accepted unconditionally. During the hybrid neMD–MC simulation, a counterion–water molecule is randomly picked together with the ionizable state of a protein site for subsequent perturbation into K+ or Cl– ions according to eq 30 and eq 31. During the protonation process of the histidine, a K–H2O is perturbed into neutral H2O according to eq 31. In case the system contains no K–H2O, a neutral H2O is perturbed into Cl–H2O according to eq 33. Similarly, during the deprotonation process of aspartic acid or glutamate, a Cl–H2O is perturbed into neutral H2O according to eq 30. In case the system contains no Cl–H2O, a neutral H2O is perturbed into K–H2O according to eq 32. For the three dipeptide systems, the counterion–water molecule and the ionizable site are coupled via eqs 32 and 33. In OMTKY3, the ionization of the five acidic residues (Asp7, Asp27, Glu10, Glu19, Glu43) is coupled to a counterion–water molecule via eq 30 (there are no basic sites). In HEWL, the ionization of nine acidic residues (Glu7, Asp18, Glu35, Asp48, Asp52, Asp66, Asp87, Asp101, and Asp119) is coupled to a counterion–water molecule via eq 30, and the ionization of His15 is coupled to a counterion–water molecule via eq 31. In the following analysis, we report the net computational cost of the hybrid simulation by adding the number of time steps used for both the equilibrium MD and neMD simulations. For the three dipeptide systems, a total of 15 hybrid neMD–MC simulations was generated to calculate the statistical uncertainty of the results. The neMD switching trajectories were set to 10 ps, whereas the equilibrium MD trajectories were set to a very short length of only 0.2 ps. Such short equilibrium trajectories were used to reveal any possible sampling issues arising from the neMD switches. The total length of the hybrid simulations, calculated by accounting for the equilibrium MD and the neMD switching trajectory of each cycle, is 40 ns. In the case of the aspartic acid dipeptide, a scan over nine pH values was also carried out (with 0.2 ps of equilibrium MD and 10 ps of neMD for the switching per cycle). Again, a total of 15 trajectories was generated to calculate the statistical uncertainty of the results. For the OMTKY3 system, a total of seven sets of hybrid simulations with different parameters was carried out (each was carried out as 30 separate simulations with randomized initial conditions to reduce the computational time). Simulations 1, 5, and 6 comprised 15 000 cycles, and simulations 2, 3, 4, and 7 comprised 10 000 cycles. Each cycle of hybrid simulation included 10 ps of equilibrium MD and 20 ps of neMD for the switching. For simulation 7, the length of the neMD trajectory was 50 ps. Simulation 7 is the longest trajectory, for a total length of 600 ns (including equilibrium and switching contributions). Simulations 1, 2, 5, 6, and 7 were carried out at pH of 4, and simulations 3 and 4 were carried out at pH of 3. For simulation 1, 3, 5, and 7, the value of pKa( was set to the experimental pKa. For simulations 2, 4, and 6, the value of pKa(i) was set to pKa(ref). To test if this affects convergence, the proton could be added to a single oxygen of the carboxylate group only in simulations 5 and 6. A single set of hybrid neMD–MC simulation was generated for the HEWL system (separated into 30 simulations with randomized initial conditions to reduce the computational time). Each cycle of hybrid simulation included 10 ps of equilibrium MD and 50 ps of neMD for the switching. The total computational cost of the hybrid simulation is 1.1 μs, including the equilibrium and switching contributions, corresponding to 18 838 cycles of neMD–MC.

Results and Discussion

Free Energy Difference Calculations

All of the calibration free energy difference Δaa and constant Caa values for the different reactions used in the present hybrid simulations are given in Table 1 (the dummy H0 are not explicitly included for the sake of clarity). The upper half presents reactions for which charge neutrality are not maintained, whereas the lower half presents those for which charge neutrality is maintained. The difference comes from the ΔG of 81.2 kcal/mol for perturbing a K+ ion into a neutral H2O molecule, and 69.2 kcal/mol for perturbing a Cl− ion into a neutral H2O molecule. The quantities Δaa and Caa for the second protocol based on eqs 32 and 33 can also be calculated in a similar manner (values not shown in Table 1).

Table 1

Free Energy Difference Δaa and Constant Caa for Different Reactionsa

reaction	ΔG_aa (kcal/mol)	C_aa (kcal/mol)
Asp^– → Asp^–(H⁺)	45.5	40.0
Glu^– → Glu^–(H⁺)	49.0	43.0
His → His(H_δN⁺)	−4.2	−12.4
His → His(H_ϵN⁺)	−18.8	−27.0
Asp^–+ H₂O → Asp^–(H⁺) + Cl^–	−23.7	−29.2
Glu^–+ H₂O → Glu^–(H⁺) + Cl^–	−20.2	−26.2
His + K⁺ → His(H_δN⁺) + H₂O	77.0	68.8
His + K⁺ → His(H_ϵN⁺) + H₂O	62.4	54.2

Dummy H0 particles are left out for the sake of clarity.

Validation of the Hybrid neMD–MC Scheme

For each test case, 15 independent hybrid simulations were generated to estimate the average and error. One configuration and state are recorded per round after the neMD–MC switch. The pKa of each protonation site is calculated aswhere p1 and p0 are the probability of the protonated and deprotonated states being extracted from the simulation, respectively. To further validate the neMD–MC algorithm, the conditional density for a given λ was examined to see if it is internally consistent based on the Crooks identity.[55] The latter depicts the relationship of the work evaluated from forward and backward transitions.[55] Here, the forward (f) transition is defined as the protonation process, λ = 0 → 1, and the backward (b) transition is defined as the deprotonation process, λ = 1 → 0. When the initial configuration is sampled from an equilibrated ensemble, the Crooks equation states thatwhere Pf is the forward work probability distribution, Pb is the backward work probability distribution, and W is the external input work for the switch. W is equivalent to Δ if a deterministic switch is used.[41] Equation 35 can be transformed intoLet us defineThen, eq 36 becomesWe fitted Q(W) and W observed from simulations to eq 38 and calculated the coefficient of determination (R2). If the distribution of starting configuration is the equilibrated ensemble, then R2 should be close to 1. Therefore, this verifies that the conditional density for a given λ is correct by examining R2.

Isolated Dipeptides

The hybrid neMD–MC scheme was applied to isolated dipeptides in explicit water. The pKa is calculated using eq 34. For all three dipeptides, the error between the experimental and calculated pKa is less than 0.3. The evolution of the calculated pKa and its standard deviation for an aspartic acid residue is plotted in Figure 3. The evolution for glutamate acid and histidine is similar (not shown). The results show that the calculated pKa converges toward the experimental pKa. A scan over nine pH values was also carried out for an aspartic acid dipeptide in water. The results are shown in Figure 4. The percentage of protonated states calculated for pH varying from 2 to 6 agrees well with the theoretical curve.

Figure 3

Figure 4

pH scan for the aspartic dipeptide in water. Constant-pH simulations are carried out at pH values of 2, 3, 3.5, 3.75, 4, 4.25, 4.5, 5, and 6. Each point is calculated from the time average of 15 trajectories to present the percentage of protonated states for that pH value. The solid curve presents the theoretical values.

Evolution of the calculated pKa for the aspartic dipeptide in water. The flat dashed line is the experimental pKa of the system. The black line is calculated from the time average of all 15 trajectories, and the error bars show the standard deviation. Each round of hybrid neMD–MC simulation uses 0.2 ps for the equilibrium MD and 10 ps for the neMD switch. The time axis reflects the total computational cost of the hybrid simulation by including the number of time steps used for the equilibrium MD and for the neMD simulation. pH scan for the aspartic dipeptide in water. Constant-pH simulations are carried out at pH values of 2, 3, 3.5, 3.75, 4, 4.25, 4.5, 5, and 6. Each point is calculated from the time average of 15 trajectories to present the percentage of protonated states for that pH value. The solid curve presents the theoretical values. The distribution of the energy difference of all switches for the aspartic acid dipeptide is plotted in Figure 5a. The relation between Q(W) and W is plotted in Figure 5b. The coefficient of determination when fitting Q(W) and W to eq 38 is 0.99. On the basis of this result, it is safe to conclude that the Crooks equation is satisfied here and that the initial configuration is drawn from an equilibrated ensemble. The average acceptance ratio using different switching time for aspartic acid dipeptide is plotted in Figure 6. As expected, it is observed that the acceptance ratio increases steadily when using longer switching time. Longer switching times increase the computational cost, but this is mitigated by a larger acceptance probability. In principle, it should be possible to optimize the switching time to maximize the efficiency of the algorithm.

Figure 5

(a) Distribution of the energy difference for the protonation process (black) and deprotonation process (red) for the aspartic acid dipeptide. The sign is flipped for the deprotonation process. The distribution is plotted with dots using 60 000 data points. The solid curve presents the fitted distribution using the Nadaraya–Watson kernel regression estimate with bandwidth of 0.5. (b) The relationship between of Q(W) and W is displayed. Q(W) is defined as eq 38. The dots represent the simulation results, and the solid line presents the theoretical line. The coefficient of determination is 0.99 when fitting the data to the theoretical line. The dots represent all observable Q(W); for other W, either Pf(+W) or Pb(−W) is zero.

Figure 6

Acceptance ratio vs switching time for isolated aspartic acid residue. Switching time varies from 0.5 to 50 ps. Each point is calculated from the average of five trajectories, and the error bars show the standard deviation.

Multisite Systems

The present constant-pH hybrid neMD–MC scheme was applied to multisites protein systems with explicit solvent. The calculated pKa’s for OMTKY3 are shown in Table 2. The calculated pKa’s for HEWL are shown in Table 3. For OMTKY3, the RMSD error of calculated pKa from experimental value can get to as low as 0.67 pH units (simulation 7 in Table 2) with a proper choice of the parameters of the algorithm. The error is significantly smaller than previous results from λ-dynamics[20] and is also smaller than the error assuming the reference pKa. For HEWL, the RMSD error of calculated pKa’s compared to experiment is 0.93 (Table 3). These results match the best performance from other existing methods.[9−11,20,21,56]

Table 2

Simulation Results for OMTKY3a

simulationb	Asp7	Asp27	Glu10	Glu19	Glu43	errorc
1	3.5 (0.8)d	3.6 (1.3)	4.2 (0.1)	3.6 (0.4)	4.4 (−0.4)	0.73
2	3.5 (0.8)	4.3 (2.0)	4.2 (0.1)	3.5 (0.3)	4.2 (−0.6)	1.01
3	3.1 (0.4)	4.0 (1.7)	3.8 (−0.3)	3.4 (0.2)	4.7 (−0.1)	0.80
4	3.2 (0.5)	5.5 (3.2)	4.0 (−0.1)	3.4 (0.2)	4.1 (−0.7)	1.49
5	3.7 (1.0)	4.4 (2.1)	4.0 (−0.1)	3.5 (0.3)	4.4 (−0.4)	1.06
6	3.6 (0.9)	4.6 (2.3)	4.1 (0.0)	3.7 (0.5)	4.5 (−0.3)	1.13
7	3.4(0.7)	3.5 (1.2)	4.0 (−0.1)	3.6 (0.4)	4.4 (−0.4)	0.67
pK_a^(exp)	2.7	2.3	4.1	3.2	4.8
pK_a^(ref)	4.0 (1.3)	4.0 (1.7)	4.4 (0.3)	4.4 (1.2)	4.4 (−0.4)	1.12

Experimental values from Schaller and Robertson.[57] The solution contains 10 mM KCl.

Seven sets of simulations with different parameters: simulations 1, 5, and 6 comprised 15 000 cycles, and simulations 2, 3, 4, and 7 comprised 10 000 cycles; each cycle consisted of 10 ps MD simulation with 20 ps neMD switching simulation (except for simulation 7, for which the switching length is 50 ps); accordingly, the longest trajectory, simulation 7, has a total length of 600 ns. Simulations 1, 2, 5, 6, and 7 were carried out at pH 4, and simulations 3 and 4 were carried out at pH 3; for simulations 1, 3, 5, and 7, pKa(i) was set to the experimental pKa; for simulations 2, 4, and 6, pKa(i) was set to pKa(ref). The proton could be added to a single oxygen only in simulations 5 and 6.

The number in the parentheses shows the deviation from experimental value.

Root-mean-square deviation (RMSD) error relative to the experimental values.

Table 3

Simulation Results for HEWL

	experimenta	simulationb	error	trial	accepted	%
Glu7	2.85	2.86	0.01	2215	273	12.3
His15	5.36	3.85	1.51	4024	160	4.0
Asp18	2.66	2.74	0.08	1140	45	3.9
Glu35	6.20	3.98	2.22	641	29	4.5
Asp48	1.60	1.41	0.19	135	8	5.9
Asp52	3.68	3.99	0.31	3312	39	1.2
Asp66	0.90	0.83	0.07	39	2	5.1
Asp87	2.07	3.03	0.96	1276	126	9.9
Asp101	4.09	3.57	0.52	3468	111	3.2
Asp119	3.20	3.06	0.14	2588	134	5.2
Average			0.93c	1884d	93	5.5

Experimental results from Bartik et al.[58]

Experimental pKa is used as pKa(i).

RMSD error from experimental value.

Each cycle of simulation consisted of 10 ps MD simulation with 50 ps neMD switching simulation. The trajectory has a total of 18 838 rounds of simulation. The total length is 1.1 μs.

Experimental values from Schaller and Robertson.[57] The solution contains 10 mM KCl. Seven sets of simulations with different parameters: simulations 1, 5, and 6 comprised 15 000 cycles, and simulations 2, 3, 4, and 7 comprised 10 000 cycles; each cycle consisted of 10 ps MD simulation with 20 ps neMD switching simulation (except for simulation 7, for which the switching length is 50 ps); accordingly, the longest trajectory, simulation 7, has a total length of 600 ns. Simulations 1, 2, 5, 6, and 7 were carried out at pH 4, and simulations 3 and 4 were carried out at pH 3; for simulations 1, 3, 5, and 7, pKa(i) was set to the experimental pKa; for simulations 2, 4, and 6, pKa(i) was set to pKa(ref). The proton could be added to a single oxygen only in simulations 5 and 6. The number in the parentheses shows the deviation from experimental value. Root-mean-square deviation (RMSD) error relative to the experimental values. Experimental results from Bartik et al.[58] Experimental pKa is used as pKa(i). RMSD error from experimental value. Each cycle of simulation consisted of 10 ps MD simulation with 50 ps neMD switching simulation. The trajectory has a total of 18 838 rounds of simulation. The total length is 1.1 μs. The average acceptance ratio for each residue in the hybrid simulations of OMTKY3 is shown in Table 4. For simulation 6 in Table 2, the average acceptance ratio of Glu19 for either protonation or deprotonation process is calculated. Of around 2500 trials to protonate the Glu19, only 2.1% are accepted, whereas 10% are accepted of around 500 trials to unprotonated it. This reveals the imbalance of the acceptance ratio between protonation and deprotonation processes. Simulation 5 in Table 2 is carried out in the attempt to reduce such imbalance. It uses the exact same settings as those in simulation 6 except that pKa(i) is set as the experimental value, 3.2. Roughly 8% of 1000 protonation trials and 4% of 2000 deprotonation trials are accepted. The total number of successful transitions increases 1.6-fold compared with those for simulation 6. This provides a good example of how the overall efficiency of the algorithm can be increased by the choice of the inherent pKa(i). The average acceptance ratio for each protonation site is calculated for simulation 1 in Table 2; the acceptance ratio can get as high as 10% for Glu10 or as low as 0.8% for Asp27. Especially for Asp27, of over 3000 attempted switches, less than 25 are accepted. Such a low acceptance probability is clearly a source of inefficiency. The calculated pKa values converge much faster for Glu10 than for Asp27. The low acceptance ratio for Asp27 is also part of the reason that the calculated pKa for it is not accurate.

Table 4

Acceptance Ratio for OMTKY3

simulation	Asp7	Asp27	Glu10	Glu19	Glu43	average
1	5.0	0.8	10.6	3.8	7.2	5.4
2	6.4	1.1	11.2	2.7	9.5	5.6
3	5.5	0.5	7.4	3.9	8.7	4.0
4	4.4	0.5	10.5	2.7	7.6	4.6
5	10.2	1.6	10.9	5.3	10.3	6.5
6	8.7	1.3	8.9	3.3	13.3	6.5
7	8.7	3.2	20.3	11.1	17.1	12.0

In general, the total length needed to converge the hybrid simulation depends on the acceptance ratio of the residue with highest energy barrier opposing a change in the ionization state. Increasing the length of the neMD switching trajectory could help to increase the average acceptance ratio. However, it is not always desirable to use longer neMD switches for all sites. The data from the hybrid simulation 7 of OMTKY3 in Table 2 was carried out to demonstrate this observation. The simulation uses exactly the same settings as those for hybrid simulation 1 except that neMD switching trajectories of 50 ps were used. As a result, the average acceptance ratio is increased by a factor of 3 for Asp27, a factor less than 2 for Glu10, and a factor less than 2.5 overall. Considering that the switching length is increased by a factor of 2.5, the overall efficiency is actually lower. However, this does not mean that the length of switch is an irrelevant parameter. Empirically, we find that it is preferable to choose a customized length for each site so that the average acceptance probability is close to about 10%. In a follow-up simulation, neMD switching trajectories of 50 ps were used for Asp7 and Glu19, 30 ps for Asp27, and 20 ps for Glu10 and Glu43. The average acceptance ratio versus switching time increased compared to both simulations 1 and 7 in Table 2 (results not shown). In practice, it may be difficult to pick the best inherent value pKa(i) or switching time for each residue. In the absence of additional information, the best choice is to use the reference pKa and a fixed switching length in the early stage of the simulation. The results can then be used to adjust the parameters before pursuing the simulation further. This general idea could be developed into an adaptive version of the algorithm that could, on-the-fly, automatically adjust the pKa(i) and the switching length. This adaptive program would adjust parameters periodically, but at a low frequency, to keep the average acceptance ratio around 10% for each site and for either process. Finally, we would like to emphasize that these simulation parameters affect only the efficiency of the algorithm, but they do not alter the expected outcome of the constant-pH simulations.

Summary

A new computational method was developed to carry out explicit solvent simulations of complex molecular systems under conditions of constant pH. Preidentified ionizable sites are allowed to spontaneously protonate and deprotonate as a function of time in response to the environment and the imposed pH. To generate physically plausible configurations with altered protonation states that are subsequently accepted or rejected according to a Metropolis Monte Carlo (MC) criterion, the method consists of carrying out short nonequilibrium molecular dynamics (neMD) switching trajectories. To achieve higher efficiency, the random protonation/deprotonation processes are separated into two steps, reducing the need for generating a large number of unproductive and costly nonequilibrium trajectories. In the first step, the protonation state of a site is randomly attributed via a Metropolis MC process on the basis of an intrinsic pKa value; a nonequilibrium switch is generated only when this change in protonation state is accepted. This hybrid two-step inherent pKa neMD–MC simulation method was illustrated with application to turkey ovomucoid third domain and hen egg-white lysozyme. The illustrative results demonstrate that the present method is practical and able to treat multisites of proteins with explicit solvent. While biological systems were used here as a primary motivation, the present method offers a general framework to simulate the effect of pH in a wide range of nonbiological molecular systems and materials. Charge neutrality of the simulated system is a feature of constant-pH simulation that deserves some special attention. In practice, it is possible to carry out constant-pH simulations without compensating for the change in total charge in the system during the protonation/deprotonation events. With Ewald lattice sum and tinfoil boundary conditions, the total net charge is always accompanied by a uniform canceling background charge, which makes the simulation cell of the periodic system neutral.[47] However, because the spatial average over the entire simulation box is constrained to be exactly zero by the tinfoil boundary conditions, the Galvani potential of the bulk water phase floats with respect to the standard reference vacuum potential.[47] This phenomenon can give rise to a spurious shift in the free energy of charged moieties that is proportional to the volume fraction of the solvent in the simulation box, which would affect the apparent pKa’s of the sites attached to a solvated protein. This issue can be avoided by keeping the overall charge of the system neutral during the protonation/deprotonation events by introducing counter-reactions involving the transformation of an ion into a water molecules. However, the fluctuations in the energy become larger due to the counter-reactions, which decreases the acceptance probability and the efficiency of the algorithm. It is possible to compensate by adjusting the length of the neMD switching trajectories to restore the efficiency of the method (Figure 6). Because the sampling time is expected to grow linearly with the number of titratable sites, a fundamental advantage of the present method is its natural ability to scale to extremely large systems. A number of enhancements of the method will be considered in the near future. For example, while the algorithm considered only independent protonation and deprotonation attempts in the applications presented here, it is possible that including correlated attempts would be useful, particularly when ionizable sites are strongly coupled. For example, transferring the proton from one site to another when several ionizable sites are in close proximity from one another may yield a higher acceptance probability and increase the overall efficiency of the algorithm. Such correlated attempts could be easily included in the MC move set on the basis of distance with no extra computational cost. Another feature of interest will be to enable a Hamiltonian replica-exchange treatment with multiple pH values to expand the scope of the method.

37 in total

1. Structural changes linked to proton translocation by subunit c of the ATP synthase.

Authors: V K Rastogi; M E Girvin
Journal: Nature Date: 1999-11-18 Impact factor: 49.962

2. The ionization state and the conformation of Glu-71 in the KcsA K(+) channel.

Authors: Simon Bernèche; Benoît Roux
Journal: Biophys J Date: 2002-02 Impact factor: 4.033

3. Constant-pH molecular dynamics using continuous titration coordinates.

Authors: Michael S Lee; Freddie R Salsbury; Charles L Brooks
Journal: Proteins Date: 2004-09-01

4. pKa's of ionizable groups in proteins: atomic detail from a continuum electrostatic model.

Authors: D Bashford; M Karplus
Journal: Biochemistry Date: 1990-11-06 Impact factor: 3.162

5. Molecular simulation with variable protonation states at constant pH.

Authors: Harry A Stern
Journal: J Chem Phys Date: 2007-04-28 Impact factor: 3.488

Review 6. CHARMM: the biomolecular simulation program.

Authors: B R Brooks; C L Brooks; A D Mackerell; L Nilsson; R J Petrella; B Roux; Y Won; G Archontis; C Bartels; S Boresch; A Caflisch; L Caves; Q Cui; A R Dinner; M Feig; S Fischer; J Gao; M Hodoscek; W Im; K Kuczera; T Lazaridis; J Ma; V Ovchinnikov; E Paci; R W Pastor; C B Post; J Z Pu; M Schaefer; B Tidor; R M Venable; H L Woodcock; X Wu; W Yang; D M York; M Karplus
Journal: J Comput Chem Date: 2009-07-30 Impact factor: 3.376

7. pH-dependent insertion of proteins into membranes: B-chain mutation of diphtheria toxin that inhibits membrane translocation, Glu-349----Lys.

Authors: D O O'Keefe; V Cabiaux; S Choe; D Eisenberg; R J Collier
Journal: Proc Natl Acad Sci U S A Date: 1992-07-01 Impact factor: 11.205

Review 8. The amyloid-beta peptide and its role in Alzheimer's disease.

Authors: A B Clippingdale; J D Wade; C J Barrow
Journal: J Pept Sci Date: 2001-05 Impact factor: 1.905

9. Structure of a eukaryotic CLC transporter defines an intermediate state in the transport cycle.

Authors: Liang Feng; Ernest B Campbell; Yichun Hsiung; Roderick MacKinnon
Journal: Science Date: 2010-09-30 Impact factor: 47.728

10. Constant pH Replica Exchange Molecular Dynamics in Explicit Solvent Using Discrete Protonation States: Implementation, Testing, and Validation.

Authors: Jason M Swails; Darrin M York; Adrian E Roitberg
Journal: J Chem Theory Comput Date: 2014-02-05 Impact factor: 6.006

30 in total

1. Multiple Time-Step Dual-Hamiltonian Hybrid Molecular Dynamics - Monte Carlo Canonical Propagation Algorithm.

Authors: Yunjie Chen; Seyit Kale; Jonathan Weare; Aaron R Dinner; Benoît Roux
Journal: J Chem Theory Comput Date: 2016-03-25 Impact factor: 6.006

Review 2. Molecular Dynamics Simulations of Membrane Permeability.

Authors: Richard M Venable; Andreas Krämer; Richard W Pastor
Journal: Chem Rev Date: 2019-02-12 Impact factor: 60.622

3. Coarse-grained dynamic RNA titration simulations.

Authors: S Pasquali; E Frezza; F L Barroso da Silva
Journal: Interface Focus Date: 2019-04-19 Impact factor: 3.906

4. Efficiency in nonequilibrium molecular dynamics Monte Carlo simulations.

Authors: Brian K Radak; Benoît Roux
Journal: J Chem Phys Date: 2016-10-07 Impact factor: 3.488

5. Reservoir pH replica exchange.

Authors: Ana Damjanovic; Benjamin T Miller; Asim Okur; Bernard R Brooks
Journal: J Chem Phys Date: 2018-08-21 Impact factor: 3.488

6. Scalable molecular dynamics on CPU and GPU architectures with NAMD.

Authors: James C Phillips; David J Hardy; Julio D C Maia; John E Stone; João V Ribeiro; Rafael C Bernardi; Ronak Buch; Giacomo Fiorin; Jérôme Hénin; Wei Jiang; Ryan McGreevy; Marcelo C R Melo; Brian K Radak; Robert D Skeel; Abhishek Singharoy; Yi Wang; Benoît Roux; Aleksei Aksimentiev; Zaida Luthey-Schulten; Laxmikant V Kalé; Klaus Schulten; Christophe Chipot; Emad Tajkhorshid
Journal: J Chem Phys Date: 2020-07-28 Impact factor: 3.488

Review 7. Development of constant-pH simulation methods in implicit solvent and applications in biomolecular systems.

Authors: Fernando Luís Barroso daSilva; Luis Gustavo Dias
Journal: Biophys Rev Date: 2017-09-18

8. Origin of pK_a Shifts of Internal Lysine Residues in SNase Studied Via Equal-Molar VMMS Simulations in Explicit Water.

Authors: Xiongwu Wu; Juyong Lee; Bernard R Brooks
Journal: J Phys Chem B Date: 2016-10-18 Impact factor: 2.991

9. Standard state free energies, not pK_as, are ideal for describing small molecule protonation and tautomeric states.

Authors: M R Gunner; Taichi Murakami; Ariën S Rustenburg; Mehtap Işık; John D Chodera
Journal: J Comput Aided Mol Des Date: 2020-02-12 Impact factor: 3.686

10. Constant-pH Molecular Dynamics Simulations for Large Biomolecular Systems.

Authors: Brian K Radak; Christophe Chipot; Donghyuk Suh; Sunhwan Jo; Wei Jiang; James C Phillips; Klaus Schulten; Benoît Roux
Journal: J Chem Theory Comput Date: 2017-11-22 Impact factor: 6.006