Literature DB >> 32492343

Toward Automated Free Energy Calculation with Accelerated Enveloping Distribution Sampling (A-EDS).

Jan Walther Perthold¹, Dražen Petrov¹, Chris Oostenbrink¹.

Abstract

Free-energy perturbation (FEP) methods are commonly used in drug design to calculate relative binding free energies of different ligands to a common host protein. Alchemical ligand transformations are usually performed in multiple steps which need to be chosen carefully to ensure sufficient phase-space overlap between neighboring states. With one-step or single-step FEP techniques, a single reference state is designed that samples phase-space not only representative of a full transformation but also ideally resembles multiple ligand end states and hence allows for efficient multistate perturbations. Enveloping distribution sampling (EDS) is one example for such a method in which the reference state is created by a mathematical combination of the different ligand end states based on solid statistical mechanics. We have recently proposed a novel approach to EDS which enables efficient barrier crossing between the different end states, termed accelerated EDS (A-EDS). In this work, we further simplify the parametrization of the A-EDS reference state and demonstrate the automated calculation of multiple free-energy differences between different ligands from a single simulation in three different well-described drug design model systems.

Entities: Chemical Disease Gene

Year: 2020 PMID： 32492343 PMCID： PMC7686955 DOI： 10.1021/acs.jcim.0c00456

Source DB: PubMed Journal: J Chem Inf Model ISSN： 1549-9596 Impact factor: 4.956

Introduction

With the paradigm shift from the rigid lock-and-key hypothesis of receptor–ligand recognition to more dynamic conceptions of molecular interactions involving structural ensembles rather than single conformations and recent advances in computer technology, molecular dynamics (MD) simulations are being used more and more in the field of drug design. Specifically, MD-based methods to efficiently calculate relative drug–target binding free energies are commonly used.[1] Calculations involving transformations of one ligand into other ligands via unphysical pathways are often referred to as “alchemical” changes. If such transformations are conducted when the ligand is bound to the receptor and when free in solution, free-energy differences obtained from these transformations can be translated into relative binding free energies using thermodynamic cycles.[2] A plethora of methods is available to perform alchemical transformations;[3] however, free-energy perturbation (FEP) approaches involving many unphysical intermediate states are most commonly applied.[1,4] Recently, FEP-based methods have been shown to allow for the calculation of relative ligand binding free energies with remarkable accuracy on large scales.[5−7] FEP approaches are based on Zwanzig’s equation described more than six decades ago,[8] which relates the free-energy differences between two states, ΔF2,1, to the exponential average of the energy difference given by their Hamiltonians and , obtained from an ensemble of the positions of one of the states (R is the ideal gas constant and T is the temperature): Zwanzig’s equation only gives unbiased free-energy estimates if there is sufficient overlap in the phase-space that is sampled by both states, which is the reason for the introduction of unphysical intermediate states in practice. However, it is intriguing to think about possibilities to design a single reference state which does not only allow for single-step perturbations, but also for the accurate representation of multiple end states simultaneously, thereby greatly enhancing the efficiency of the FEP approach. One-step perturbation (OSP) methods make use of empirically tuned reference-state Hamiltonians which often involve “soft” atoms to accurately represent multiple end states.[9−14] However, OSP approaches suffer from limited transferability of the designed Hamiltonians, i.e. a different perturbation might require empirical reparametrization of the reference-state Hamiltonian.[15] Moreover, the soft-core approach does not work very well for larger changes in the charge distributions[16,17] and additional end state simulations are often needed to calculate correct contributions of the electrostatic potential energy to the free-energy differences.[18−20] A different approach to elegantly calculate a reference Hamiltonian is enveloping distribution sampling (EDS).[21] Here, n different Boltzmann-weighted end-state Hamiltonians are combined in the reference Hamiltonian , the partition function of which is now the sum of the partition functions of the end states. Additionally, to ensure equal sampling of the end states, constant free-energy offset parameters ΔF are added:[21−24] Here, we call the offset parameters ΔFfree-energy offset parameters, because they correspond to the relative free-energies of the end states if all states are sampled with equal probability. However, eq was found to result in high energy-barriers, preventing efficient sampling of the minima of all end states. An additional smoothening parameter s was proposed to smoothen the energy landscape of the reference Hamiltonian.[22,25−29] However, for the small values of s that are often used, the energy minima of the reference Hamiltonian no longer correspond to the energy minima of the original end states, reducing the phase-space overlap and further complicating the search for the optimal values of s and ΔF.[28,30−34] A computationally expensive approach to bypass this problem uses replica-exchange simulations, in which multiple replicas are performed at different values of s.[35,36] We have very recently proposed an alternative EDS formulation in which the EDS reference-state Hamiltonian given in eq is smoothed with a harmonic potential energy function, similar to the accelerated MD approach proposed by McCammon and co-workers.[37,38] Our accelerated EDS (A-EDS) approach notably preserves local energy minima and the energy landscape around them, while simultaneously decreasing the energy barriers between the end states. Moreover, it allows for intuitive and potentially automated tuning of the EDS reference-state Hamiltonian parameters by setting an anticipated value for the maximum energy barrier between the ends-states as the only search parameter.[39] In this work, we propose an improved algorithm for the A-EDS Hamiltonian parameter search which further simplifies the construction of the reference-state Hamiltonian. With our updated search scheme, it is no longer necessary to give a certain value for the anticipated maximum energy barrier between the end states. The search for the optimal acceleration parameters is instead based on a given acceleration factor. This acceleration factor is translated to the probability of reaching the maximum transition-state energy E‡, estimated from the average energy and energy fluctuation of each end state. Hence, the user specifies the likelihood of a state transition as search parameter and does not need to have any knowledge about the structure and magnitudes of the energy landscape of the nonaccelerated EDS Hamiltonian before the simulation. Additionally, we improved the fully automated search for the free-energy offset parameters ΔF by considering the deformation (flattening) of the original end-state Hamiltonians upon acceleration and by using a memory relaxation time for their calculation, which allows for faster adjustment of the parameters. To demonstrate the applicability of our proposed method to automatically calculate multiple free-energy differences from a single simulation, three protein–ligand systems were chosen. All three systems are already established in the literature and the drug–target interactions were examined both using experimental and computational methods: glutamate receptor A2 (GRA2) allosteric modulators[14,39] trypsin (TRP) inhibitors[19,40−47] phenylethanolamine N-methyltransferase (PNMT) inhibitors[32,35,36,48] Notably, systems 2 and 3 both include net-charge changes in the ligand sets, which were previously considered too challenging for multistate EDS approaches.[35,36]

Theory

A-EDS Hamiltonian

In A-EDS the reference Hamiltonian given in eq is smoothened by an approach that is similar to Gaussian accelerated MD (GAMD),[38] which is a variant of the general accelerated MD approach.[37] A harmonic potential energy term is added to the EDS reference Hamiltonian, pulling the energy landscape down to a minimum energy value Emin. The continuous accelerated EDS (A-EDS) reference Hamiltonian is defined asWhere, Emin and Emax are tunable parameters, in addition to the ΔF in eq . Effectively, is only modified in the region between Emin and Emax. The A-EDS functional form preserves the energy landscape around local energy minima.

A-EDS Parameter Search

In our earlier work, we have proposed an intuitive parameter search,[39] which is simplified further in the current work. During a parameter-search simulation, the end state i which is currently being sampled in a simulation of the EDS reference state is defined as the state with the lowest energy . We monitor the average value of while the state is sampled, denoted as E̅, as well as the value of when a new state is sampled, indicative of the barrier height and denoted as . From the values of we compute E̅max‡ as the average of the maximum transition energy between states within a state round-trip (we define a round-trip as having visited all states at least once). With these properties we define the maximum energy barrier ΔEmax between the end states in the unmodified EDS reference Hamiltonian : is the value of when a new state is sampled. The unmodified EDS Hamiltonian can now be accelerated by setting Emin and Emax such that the maximum energy barrier between the states, ΔEmax, is reduced to the target value ΔEmax in . The target value ΔEmax is set to a user-defined multiple z of the standard deviation, , of the energy of the state with the lowest average energy E̅. Like for the average end state energies E̅, all are only calculated while the respective state is sampled. By setting the acceleration as sigma-level z, the minimum probability of reaching the maximum transition-energy, Pmin (E̅max‡), becomes for Gaussian end state energy distributions:where erf() is the error function. Hence, the probability of reaching the maximum transitions-energy is decreased exponentially with increasing sigma-level z. The value of ΔEmax that is obtained in this way, is subsequently used to determine Emin and Emax in eq . Emax simply corresponds to E̅max‡, and Emin is calculated asIf the resulting value of Emin is less than min(E̅), it is adjusted to The proposed optimization scheme for the acceleration parameters of the A-EDS Hamiltonian is used in a nonequilibrium parameter-search simulation in which the user only specifies the desired maximum barrier height ΔEmax as sigma-level z which gives a multiple of of the end state with the lowest average energy. No initial values for Emin and Emax have to be set, as they are adjusted while the reference state explores the energy landscape. To enable sampling of all states in the beginning of the search simulation, the value of Emax is set to the maximum transition energy Emax‡ of the unmodified Hamiltonian as the system explores the energy landscape until all states have been visited at least once. After all states have been visited at least once, Emax is calculated as described above. In addition to the A-EDS parameters Emin and Emax, the free-energy offset parameters ΔF are optimized simultaneously during the parameter-search simulation. The free-energy offset parameter of the first state is arbitrarily set to zero and the other free-energy offset parameters are calculated relative to the first state by simply using the perturbation formula, eq : Note that in contrast to the previously suggested approach,[39] we here consider the accelerated end-state Hamiltonians instead of the unmodified end-state Hamiltonians to account for deformation (flattening) of the original energy landscapes: To allow for faster adjustment of the free-energy offset parameters ΔF during the parameter-search simulation, we implemented a memory relaxation time τ for the exponential averages of the energy differences in analogy to time-averaged distance restraining functions.[49] Accordingly, the exponential energy differences with a characteristic memory decay time τ read (t is the simulation time and Δt the time-step): Additionally, we increase τ linearly over the course of the simulation from a value τ to a value τ (ttot is the total simulation time of the nonequilibrium parameter search): In systems in which the free energy offset parameters ΔF are very large, τ can be set to a small value in beginning of the parameter-search simulation to allow for rapid adjustment of all ΔF, while more statistics are used upon convergence toward the end of the parameter-search simulation. Using this approach, the free energy offset parameters ΔF fluctuate around their optimal values which ensure equal sampling of all end states. After the nonequilibrium parameter-search simulation, the values of Emin, Emax, and ΔF are fixed and an equilibrium simulation is performed to determine the final free energy differences between the states.

Methods

Simulation Settings

All MD simulations were performed using the GROMOS MD++ 1.5.0 simulation engine[50] to which the A-EDS functionality was added. All systems were simulated in NVT ensembles at 300 K using Nosé–Hoover chains[51] to ensure canonical energy distributions with 5 coupled thermostats and relaxation times of 0.1 ps. The solute and solvent degrees of freedom were coupled to different heat-baths. The equations of motion were solved using a leapfrog integration scheme[52] with a time-step of 2 fs. Covalent bond lengths of solute molecules were constrained using the SHAKE[53] algorithm, and covalent bond lengths and angles of solvent water molecules were constrained using the SETTLE[54] algorithm. The center-of-mass motions of the systems were removed every 10 000 steps. Neighbor searching was performed every 10 fs using a group-based cutoff scheme within a cutoff sphere of 1.4 nm. For the calculation of nonbonded interactions, a twin-range cutoff scheme was employed. Within a cutoff of 0.8 nm, nonbonded interactions were calculated every step. Between 0.8 and 1.4 nm, nonbonded interactions were evaluated every 10 fs and held constant between the updates. For the calculation of electrostatic interactions, a reaction-field contribution[55] with a relative dielectric permittivity of 61[56] beyond the cutoff sphere was added. For parametrization and preparation of the drug-target model systems, the reader is referred to previous literature.[14,19,36] For the PNMT ligands, small updates were made to the parameters. These involved the introduction of explicit aromatic hydrogens and the more precise definition of integer-charged charge groups. The updated molecular building blocks are available in the Supporting Information. It is noted that for PNMT ligands and TRP ligands additional weak distance restraints with force constants of 250 kJ·mol–1·nm–2 were introduced to prevent the reference-state ligands from leaving the binding site during the sampling of end states with unfavorable binding affinity. The additional protein–ligand distance restraint definitions for these systems is given in Tables S1 and S2 in the Supporting Information.

Perturbation Topology Setup

For each of the protein systems, multiple end states were considered in the multistate A-EDS calculations, as shown in Figure . For GRA2, 16 distinct states were used, which is more than the five end states used in our previous work.[39] For TRP, eight end states were used, including the protonated (state 2) and deprotonated (state 8) carboxylic acid. For PNMT, ten different states were considered, including state 2, which bears a different net-charge from all the other ligands. To combine the different ligand end states given in Figure into a single reference state, a single-topology approach was chosen. This was also shown to work best in ref (32), while a dual topology approach was used in refs (35) and (36).

Figure 1

Different ligand end states for which relative binding free energies were calculated. Different benzothiadiazine dioxide ligands of system GRA2 (A), different benzamidine ligands of system TRP (B), and the different tetrahydroisoquinoline ligands of system PNMT (C) are shown with their corresponding end state numbering. To simplify the complex setup of such molecules, a recently developed automated approach was used for systems TRP and PNMT, based on maximal common substructure (MCS) search for a pair of molecules. The search involves an iterative procedure in which in each step a pair of atoms, each belonging to one of the compounds, is added to the current common substructure (current solution). Upon adding a pair of atoms, common parts of molecular topologies are checked for mismatching parameters of bonded interactions. As the search was primarily aimed at maximizing the number of common atoms, while keeping all bonded interactions unperturbed, only added pairs without any mismatched parameters of bonded interactions were taken into account. To create a perturbation topology from the resulting MCS, atom pairs from the solution common substructure were merged into single atoms with interaction parameters assigned to their respective end-state parameters, while the remaining atoms from both topologies not belonging to the solution common substructure were assigned as noninteracting dummy atoms transforming into interacting atoms (end-state parameters) and the other way around. To generate a combined multistate single topology from a set of multiple compounds (n topologies), MCS search (as described above) was performed n – 1 times. An initial MCS search was performed on two arbitrarily chosen compounds from the set, producing a first perturbation topology (representative of the first two end states). Each of the following (n – 2) MCS searches were performed on the resulting perturbation topology from the previous MSC step and one of the remaining compounds (topologies) from the set, appending an additional end state in each of the iterations and finally producing the multistate single topology.

A-EDS Parameter-Search Simulations

For the calculation of the A-EDS acceleration parameters, nonequilibrium A-EDS parameter-search simulations were first performed for the unbound ligands in water with three different acceleration σ-levels, 1σ, 2σ, and 3σ, to calculate the anticipated maximum energy barrier between the end states, ΔEmax (i.e., z = 1, 2, and 3). In these simulations, A-EDS parameters Emax, Emin, and all free-energy offset parameters ΔF were determined simultaneously, with an increase in the memory relaxation time for the calculation of the free-energy offset parameters ΔF from τ = 1 ps to τ = 100 ps within the parameter-search simulation time of 10 ns. For the parameter-search simulations of the ligands bound to the proteins, A-EDS parameters Emax and Emin previously determined for the ligands in water for the three different acceleration levels were used and held fixed throughout the parameter search. Only the free-energy offset parameters ΔF were optimized during these parameter-search simulations, with the same relaxation time parameters as used for the ligands in water. For the PNMT system the parameter-search simulations were prolonged to 100 ns with a relaxation time τ of 1000 ps because only few state transitions were observed during the first 10 ns (see Table S3 in the Supporting Information) and free-energy offset parameters converged more slowly (see Figure ). Since the free-energy offset parameters fluctuate around their optimal values during the parameter-search simulations, the values used for the subsequent equilibrium simulations were calculated by averaging over the free-energy offset parameter trajectories of the parameter-search simulations. For systems GRA2 and TRP, the first 5 ns of the simulations were discarded for the averaging as nonequilibrated region. For system PNMT, the first 10 ns of the simulations were discarded for the averaging as nonequilibrated region.

Figure 2

Convergence of the A-EDS acceleration parameters Emax and Emin and the free-energy offset parameters ΔF relative to their final values for the A-EDS parameter-search simulation with an acceleration σ-level of 2σ, for the unbound ligands in water of system GRA2 (A), TRP (B), and PNMT (C), and for the ligands bound to the protein of system GRA2 (D), TRP (E), and PNMT (F). The convergence of the free-energy offset parameters ΔF is shown as a forward cumulative average over the free-energy offset trajectory with the first 5 ns discarded as the nonequilibrated region. Note that in panels B and C the A-EDS parameter Emin (red solid line) falls off the plot in the beginning of the trajectory.

A-EDS Equilibrium Sampling Simulations

A-EDS equilibrium simulations for the calculation of relative binding free energies were started from the last snapshots of the parameter-search simulations in water and for the ligands bound to the proteins, respectively. The A-EDS parameters determined during the parameter-search simulations were held fixed and sampling was performed for another 10 ns in water and for 100 ns (systems GRA2 and TRP) or 200 ns (system PNMT) for the protein systems.

Results and Discussion

A-EDS Parameter-Search Simulations

To determine the A-EDS acceleration parameters Emax and Emin, and the free-energy offset parameters ΔF, parameter-search simulations were performed for 10 ns for the ligands solvated in water. The resulting acceleration parameters are given in Table for all systems. Depending on the chosen acceleration level the resulting A-EDS parameter Emin, controlling the range over which the energy landscape is flattened, varies greatly. The parameters Emax, which correspond to the maximum energy barrier between the states, are comparably independent of the chosen acceleration level. The convergence of all parameters for all ligand systems in water for an acceleration σ-level of 2σ are given in Figure A–C, and for the other acceleration levels in the Supporting Information, Figure S1 and S2. Parameters mostly are converged or only change little with simulation time. The greatest fluctuation occurs for the acceleration parameter Emin, which can be explained by the dependence of this parameter on the slower-converging energy fluctuation estimate for each end state. However, this fluctuation does not have a big influence on the calculated free-energy offset parameters ΔF.

Table 1

A-EDS Acceleration Parameter-Search Results Obtained from the Simulations of the Unbound Ligands in Water for the Three Different Acceleration σ-Levelsa

system	parameter	1σ	2σ	3σ
GRA2	E_max	158.5	97.2	93.6
	ΔE_max^* (i.e., z·σ)	19.2	27.0	34.5
	E_max	119.2	94.7	81.4
	E_min	–535.0	–150.6	–44.1
TRP	ΔE_max	391.4	472.2	486.2
	ΔE_max^* (i.e., z·σ)	132.1	209.5	240.2
	E_max	33.7	36.1	36.5
	E_min	–546.1	–496.1	–455.6
PNMT	ΔE_max	300.8	506.9	538.4
	ΔE_max^* (i.e., z·σ)	35.6	216.5	261.6
	E_max	30.6	32.9	31.0
	E_min	–1240.8	–560.4	–523.2

ΔEmax and ΔEmax are determined from the simulation and define the values of Emax and Emin. Energy units are kilojoules per mole.

ΔEmax and ΔEmax are determined from the simulation and define the values of Emax and Emin. Energy units are kilojoules per mole. The data in Table nicely demonstrates how the selected acceleration level determines the values of Emin and Emax. Setting the acceleration level to 1σ implies requesting a large number of transitions between the states. Accordingly, the acceleration is strongest, leading to the largest range of acceleration (difference Emax – Emin is largest). In contrast, setting the acceleration level to 3σ allows for fewer transitions, higher energy barriers and hence to a reduced range of acceleration. As the fluctuations in the energy basins are system dependent, the resulting values of Emax – Emin differ considerably between the different cases. To calculate the free-energy offset parameters ΔF for the end states of the ligands bound to the proteins, A-EDS acceleration parameters Emax and Emin were held fixed during the parameter-search simulation in the protein systems, as convergence of these parameters was very slow in preliminary simulations, given the generally complicated and possibly non-Gaussian energy landscapes of host–guest systems, where the EDS region is not surrounded by a homogeneous and fast-relaxing medium, but by the protein environment. In our previous work, we determined that the free-energy differences are quite robust with respect to the exact choice of Emin and Emax.[39] Therefore, the acceleration parameters for each acceleration level were set to the values determined during the parameter-search simulations in water. The convergence of the free-energy offset parameters for all systems with an acceleration σ-level of 2σ is shown in Figure D–F and for the other acceleration levels in the Supporting Information, Figure S1 and S2. For systems GRA2 and PNMT, parameter-search simulations in the protein with an acceleration level of 1σ (very strong acceleration) became unstable. In particular, the SHAKE algorithm failed at maintaining the bond-length constraints, indicative of the occurrence of large forces due to too strong acceleration. Subsequent simulations in the protein using this acceleration level were not performed for these systems. Moreover, for system PNMT, the convergence of the free-energy offset parameters ΔF was slower than for the other systems, and the parameter search was extended to 100 ns. All final free-energy offset parameters ΔF are given in the Supporting Information, Tables S7–S9. The free-energy offset values can depend strongly on the chosen acceleration level, as the flattening of the energy landscape contributes to the free-energy differences between the accelerated end states (eq ) that are combined to the single reference state in the simulation.

A-EDS Equilibrium Simulations

To calculate relative binding free energies for the ligands, A-EDS equilibrium simulations were performed both in water and in the protein-bound state for all systems with the A-EDS parameters determined for each acceleration level in the parameter-search simulations. The free-energy differences between the ligands in each state were calculated from the free-energy difference of each state to the A-EDS reference state obtained from Zwanzig’s equation (eq ). The effects of any distance restraints added in the simulations (Tables S1, S2) on the free energy were removed in this stage, by adding the restraining energies to the reference state Hamiltonians. For system GRA2, all calculated free-energy differences for the unbound ligands were very similar for all three acceleration levels (Table ) and converged very rapidly (Figure S3A–C in the Supporting Information). However, for the bound ligands, different results were obtained for acceleration levels of 2σ and 3σ, despite apparent convergence of the calculated free energy values (Figure S3D, E in the Supporting Information). Simulations with an acceleration level of 1σ were not performed in the protein, as simulations became unstable during the parameter-search simulation with this acceleration level. Analysis of the sampled states and state transitions in these equilibrium simulations revealed that for an acceleration level of 3σ, many states were not sampled adequately and state transitions to these states occurred only rarely (Table and Figure S6 in the Supporting Information).

Table 2

Relative Free-Energy Differences ΔF≠1,1 between the Ligand End States for the Three Different Acceleration σ-Levels for System GRA2a

	ΔF_i_≠1,1 unbound ligand			ΔF_i_≠1,1 two bound ligands			calculated ΔΔF_i_≠1,1
	1σ	2σ	3σ	1σ	2σ	3σ	1σ	2σ	3σ
ΔF_2,1	–111.2 ± 0.3	–111.0 ± 0.2	–110.8 ± 0.1	n.a.	–224.4 ± 1.6	–231.4 ± 1.2	n.a.	–2.4 ± 1.7	–9.8 ± 1.0
ΔF_3,1	22.6 ± 0.2	22.5 ± 0.2	22.4 ± 0.1	n.a.	45.5 ± 1.5	39.6 ± 1.4	n.a.	0.5 ± 1.5	–5.3 ± 1.2
ΔF_4,1	8.9 ± 0.2	8.6 ± 0.1	8.7 ± 0.1	n.a.	7.3 ± 1.7	24.8 ± 2.3	n.a.	–9.9 ± 1.7	7.5 ± 2.1
ΔF_5,1	–53.6 ± 0.3	–54.3 ± 0.2	–54.4 ± 0.1	n.a.	–116.5 ± 1.5	–118.0 ± 0.6	n.a.	–7.8 ± 1.6	–9.3 ± 0.4
ΔF_6,1	–90.6 ± 0.3	–90.9 ± 0.2	–90.7 ± 0.1	n.a.	–187.5 ± 1.6	–197.0 ± 1.3	n.a.	–5.6 ± 1.6	–15.6 ± 1.1
ΔF_7,1	–104.3 ± 0.2	–104.8 ± 0.2	–104.5 ± 0.1	n.a.	–218.1 ± 1.7	–207.1 ± 1.8	n.a.	–8.6 ± 1.7	2.0 ± 1.6
ΔF_8,1	–135.0 ± 0.2	–135.7 ± 0.2	–135.8 ± 0.1	n.a.	–289.5 ± 1.6	–292.5 ± 1.3	n.a.	–18.1 ± 1.6	–20.9 ± 1.1
ΔF_9,1	60.3 ± 0.3	60.5 ± 0.2	60.5 ± 0.1	n.a.	108.8 ± 1.5	124.8 ± 1.9	n.a.	–12.2 ± 1.6	3.9 ± 1.7
ΔF_10,1	–73.0 ± 0.3	–73.4 ± 0.2	–73.3 ± 0.1	n.a.	–162.6 ± 1.5	–160.0 ± 1.4	n.a.	–15.8 ± 1.6	–13.3 ± 1.2
ΔF_11,1	–47.8 ± 0.3	–48.0 ± 0.2	–48.2 ± 0.1	n.a.	–117.2 ± 1.5	–94.3 ± 1.4	n.a.	–21.1 ± 1.6	2.1 ± 1.2
ΔF_12,1	–55.3 ± 0.4	–54.9 ± 0.2	–54.8 ± 0.2	n.a.	–123.3 ± 1.6	–113.1 ± 1.9	n.a.	–13.5 ± 1.6	–3.6 ± 1.5
ΔF_13,1	–156.0 ± 0.3	–157.0 ± 0.2	–156.9 ± 0.1	n.a.	–340.2 ± 1.6	–338.5 ± 1.5	n.a.	–26.2 ± 1.7	–24.8 ± 1.3
ΔF_14,1	–130.8 ± 0.3	–131.6 ± 0.2	–131.6 ± 0.1	n.a.	–286.9 ± 1.6	–270.5 ± 1.7	n.a.	–23.7 ± 1.6	–7.4 ± 1.5
ΔF_15,1	–38.0 ± 0.5	–37.7 ± 0.2	–37.7 ± 0.1	n.a.	–102.3 ± 1.7	–79.1 ± 2.5	n.a.	–27.0 ± 1.7	–3.7 ± 2.3
ΔF_16,1	–123.1 ± 0.4	–123.2 ± 0.2	–123.3 ± 0.2	n.a.	–277.7 ± 2.1	–260.3 ± 2.4	n.a.	–31.2 ± 2.1	–13.6 ± 2.0

Energy units are kilojoules per mole. n.a. = not available.

Table 5

Number of Unique Visits of All Ligand End States and of the Ligand End State with the Longest Average Residence Time in Water for the Three Different Acceleration σ-Levels in the Equilibrium A-EDS Simulations

	unbound ligands			bound ligands
system	1σ	2σ	3σ	1σ	2σ	3σ
GRA2-all	113081	121863	129663	n.a.a	78277	14396
GRA2-1	7084	6366	6802	n.a.	2636	21
TRP-all	97020	99912	83688	249119	198880	75610
TRP-8	1328	427	128	2821	1446	15
PNMT-all	31382	18133	17138	n.a.	18741	10678
PNMT-2	5004	135	48	n.a.	11	2

n.a. = not available.

Energy units are kilojoules per mole. n.a. = not available. For system TRP, the calculated free-energy differences for all ligands in the unbound state were again similar for all three acceleration levels. However, for the bound ligands more different values were obtained, especially for the acceleration level of 3σ compared to 1σ or 2σ (Table ). For 3σ, convergence of the calculated values was slower both in water and in the protein (Figure S4 in the Supporting Information) and much less state transition was observed (Table and Figure S7 in the Supporting Information).

Table 3

Relative Free-Energy Differences ΔF≠1,1 between the Ligand End States for the Three Different Acceleration σ-Levels for System TRPa

	ΔF_i_≠1,1 unbound ligand			ΔF_i_≠1,1 bound ligand			calculated ΔΔF_i_≠1,1
	1σ	2σ	3σ	1σ	2σ	3σ	1σ	2σ	2σ
ΔF_2,1	–160.2 ± 0.8	–163.4 ± 1.2	–161.4 ± 0.8	–150.6 ± 1.5	–148.5 ± 1.2	–160.6 ± 1.3	9.6 ± 1.7	14.9 ± 1.7	0.8 ± 1.5
ΔF_3,1	–143.8 ± 0.9	–142.5 ± 0.3	–146.2 ± 1.7	–139.1 ± 0.8	–136.6 ± 0.6	–136.0 ± 0.8	4.7 ± 1.2	5.9 ± 0.7	10.2 ± 1.9
ΔF_4,1	70.7 ± 0.3	70.3 ± 0.3	71.0 ± 0.3	68.6 ± 0.3	66.6 ± 0.5	63.3 ± 1.4	–2.1 ± 0.4	–3.7 ± 0.6	–7.7 ± 1.4
ΔF_5,1	–171.5 ± 2.5	–166.2 ± 1.0	–166.0 ± 0.9	–158.9 ± 1.2	–161.5 ± 2.1	–158.3 ± 2.5	12.6 ± 2.8	4.7 ± 2.3	7.7 ± 2.7
ΔF_6,1	–25.2 ± 0.1	–25.1 ± 0.1	–25.3 ± 0.3	–24.1 ± 0.9	–25.4 ± 1.3	–19.6 ± 0.9	1.1 ± 0.9	–0.3 ± 1.3	5.7 ± 0.9
ΔF_7,1	–97.1 ± 0.5	–97.7 ± 0.8	–97.1 ± 0.6	–89.8 ± 0.6	–90.7 ± 0.6	–84.7 ± 1.0	7.3 ± 0.8	7.0 ± 1.0	12.4 ± 1.2
ΔF_8,1	–427.4 ± 0.6	–425.8 ± 1.0	–430.6 ± 1.4	–402.6 ± 0.6	–401.1 ± 1.6	–395.6 ± 1.9	24.8 ± 0.8	24.7 ± 1.9	35.0 ± 2.4

Energy units are kilojoules per mole.

Energy units are kilojoules per mole. For system PNMT, the acceleration level of 1σ (very strong acceleration) lead to partially different calculated free-energy differences between the ligand end states in water compared to the other acceleration levels (Table ). The difference was especially pronounced for the end state involving a charge-change (ligand 2), but also present for ligands 7 and 8, which have a large sulfonamide group at the R1 position. Possibly, the strong acceleration parameters determined for 1σ lead to overflattening of the energy landscape, which prohibits proper sampling of the most important regions of phase-space for these end states. For the ligands bound to the protein, sometimes different free-energy differences between the states were obtained for acceleration levels of 2σ and 3σ.

Table 4

Relative Free-Energy Differences ΔF≠1,1 between the Ligand End States for the Three Different Acceleration σ-Levels for System PNMTa

	ΔF_i_≠1,1 unbound ligand			ΔF_i_≠1,1 bound ligand			calculated ΔΔF_i_≠1,1
	1σ	2σ	3σ	1σ	2σ	3σ	1σ	2σ	3σ
ΔF_2,1	–376.3 ± 1.7	–439.3 ± 1.4	–440.8 ± 2.3	n.a.	–216.9 ± 2.8	–217.6 ± 2.8	n.a.	222.4 ± 3.1	223.2 ± 3.6
ΔF_3,1	5.6 ± 0.2	5.8 ± 0.3	6.1 ± 0.3	n.a.	0.8 ± 2.6	4.2 ± 1.6	n.a.	–5.0 ± 2.6	–1.9 ± 1.6
ΔF_4,1	8.7 ± 0.3	7.6 ± 0.3	8.4 ± 0.3	n.a.	–6.4 ± 2.9	4.7 ± 1.6	n.a.	–14.0 ± 2.9	–3.7 ± 1.6
ΔF_5,1	7.8 ± 0.4	7.9 ± 0.3	8.1 ± 0.3	n.a.	–2.0 ± 2.5	–2.4 ± 1.4	n.a.	–9.9 ± 2.5	–10.5 ± 1.4
ΔF_6,1	–24.5 ± 0.9	–29.1 ± 0.8	–30.0 ± 0.9	n.a.	–50.8 ± 2.2	–35 ± 1.5	n.a.	–21.7 ± 2.3	–5.0 ± 1.7
ΔF_7,1	–448.4 ± 0.9	–463.0 ± 1.7	–461.3 ± 1.3	n.a.	–465.4 ± 1.9	–475.8 ± 1.5	n.a.	–2.4 ± 2.5	–14.5 ± 2.0
ΔF_8,1	–482.1 ± 1.0	–498.0 ± 1.1	–501.0 ± 1.2	n.a.	–526.7 ± 2.0	–510.6 ± 1.3	n.a.	–28.7 ± 2.3	–9.6 ± 1.8
ΔF_9,1	13.9 ± 0.4	13.9 ± 0.3	13.8 ± 0.5	n.a.	–0.5 ± 2.1	5.6 ± 1.5	n.a.	–14.4 ± 2.1	–8.2 ± 1.6
ΔF_10,1	–161.9 ± 1.3	–163.1 ± 0.6	–163.8 ± 0.5	n.a.	–173.2 ± 2.8	–167.3 ± 2.6	n.a.	–10.1 ± 2.9	–3.5 ± 2.6

Energy units are kilojoules per mole. n.a. = not available.

Energy units are kilojoules per mole. n.a. = not available. Except for the differences described above, the calculated free-energy differences between the end-states are independent of the chosen acceleration level. Notably, they do differ significantly from the corresponding free-energy offset parameters ΔF (Tables S7–S9 in the Supporting Information), as the offset parameters describe the free-energy differences between the accelerated end states in the reference state (eq ), while the free-energy differences of the end states, ΔF≠1,1, are calculated for the original (i.e., nonaccelerated) Hamiltonians. Table collects the total number of unique visits of the individual states for all simulations. The state with the longest average residence time (the fewest transitions) is also listed, to give a lower bound to these values. It is most relevant to focus on the compounds that are sampled least to judge if the sampling is sufficient. A large number of unique samples is necessary to ensure proper sampling of each of the states. For the compounds in water, the total number of unique visits of all compounds is relatively independent of the acceleration level. Only for PNMT, with the largest differences between states, we observe a significantly lower value at 2σ and 3σ. The differences are more pronounced for the states that are sampled least, with a clearer drop of the number of visits for TRP and PNMT. In system GRA2, the ligands are arguably the most similar and there is no large difference observed. In the protein-bound states, the number of unique visits is much lower throughout, especially considering that the simulations are a factor 10 (GRA2, TRP) or 20 (PNMT) longer than the ones of the ligands in water. This is representative of the fact that the sampling is much more complex in a protein environment. During the parameter search for GRA2 and PNMT, the simulations at 1σ became unstable, suggesting a too strong acceleration. On the other hand, the number of unique visits at 3σ drops down considerably, suggesting that this acceleration level, which is most conservative and therefore possibly preferred, should only be used in conjunction with longer simulation times. n.a. = not available.

Comparison to Experiment and Other Methods

The binding free energies in Tables –4 are reported relative to the arbitrarily chosen first compound of the set. To compare results for the computed relative binding free energies independent of an arbitrary selection, an estimate of the binding free energy of the A-EDS reference state (average difference of the relative binding free energies to the experimental binding free energies) was used to calculate ligand binding free energies which can directly be compared to experimentally determined values; i.e., the relative binding free energy values were shifted by the average deviation to the experimental binding free energies to account for the binding free energy of the reference state.[19,57] This corresponds to a single fitting parameter to compare relative binding free energies to the experimental values. Tables S3–S5 report the individual binding affinities. The overall performance is summarized by computing the root-mean-square error (RMSE) with respect to the experimental values (Table ).

Table 6

Root-Mean-Square Errors (kJ·mol–1) Relative to the Experimental Values with Error Estimates Calculated by Leave-One-Out Cross Validationa

	GRA2	TRP	PNMT
A-EDS 1σ	n.a.	6.1 ± 0.1	n.a.
A-EDS 2σ	6.3 ± 0.2	6.0 ± 0.1	8.8 ± 0.1
A-EDS 3σ	6.0 ± 0.3	8.5 ± 0.2	3.8 ± 0.1
EDS	2.2b	n.a.	5.2e/4.9f
TI	3.9c	6.6d	3.7
OSP	1.9c	2.8d	n.a.

The individual values are reported in Tables S3–S5 in the Supporting Information. n.a. = not available.

A-EDS with fixed ΔEmax; ref (39).

Reference (14).

Reference (19).

EDS with s-parameter; ref (32).

RE-EDS; ref (36).

The individual values are reported in Tables S3–S5 in the Supporting Information. n.a. = not available. A-EDS with fixed ΔEmax; ref (39). Reference (14). Reference (19). EDS with s-parameter; ref (32). RE-EDS; ref (36). For the GRA2 ligands, both simulations with acceleration levels of 2σ and 3σ performed similarly in terms of agreement with the experimental data, with RMSEs of about 6 kJ/mol. A-EDS simulations previously performed using a fixed set energy barrier[39] using only the five ligands for which experimental data was available showed a lower RMSE of 2.2 kJ/mol. The RMSE of the calculations in this work to the previously reported A-EDS calculations was 9.0 and 6.6 kJ/mol for acceleration levels of 2σ and 3σ, respectively. The acceleration level in the previous simulations was between the acceleration for 2σ and 3σ used here. Furthermore, the previous simulations were significantly shorter (11 ns vs 110 ns for parameter search and equilibrium simulations in the protein system). However, the total number of visits to the five first end states lies between 927 (state 2) and 6982 (state 3) and is slightly better comparable to what we observed previously (51 to 2548 unique visits). Obviously, the longer simulation time is largely offset by the larger amount of states that needs to be sampled. The discrepancies between the two sets of simulations can rather be explained due to larger drifts in the overall conformation that are sampled. In system GRA2, two benzothiadiazine dioxide ligands are binding simultaneously to the protein–protein interface of the GRA2 dimer. At the acceleration level of 2σ, the distance between the centers of mass of the aromatic rings of the two ligands in the active site moved from initially 0.55 nm to an average value of 0.62 nm. Furthermore, the side chain of Ser108 was seen to change conformation, leading to a different environment of fluorine substituents (Figure ).

Figure 3

Different conformations of the side chain of Ser108 hydrogen-bonding to the carbonyl oxygen of Phe105 in system GRA2 at the end of A-EDS equilibrium simulations of the ligand end states bound to the protein, for an acceleration level of 2σ (red, GRA2 ligands in magenta) and 3σ (green, GRA2 ligands in light green). Also note the shift in ligand positions at an acceleration level of 2σ. The positions of the ligands in the simulation at 3σ largely correspond to the ones observed in the initial X-ray structure. Previously reported calculations using the same force-field but using the thermodynamic integration (TI) or the OSP approach yielded better agreement with experiment, with RMSE values of 3.9 and 1.9 kJ/mol, respectively.[14] These simulations were likely also too short to observe the conformational changes in the active site described above. The calculations using the TI method amounted to a total simulation time of about 50 ns, while the previous simulations with the OSP method was based on 20 ns of simulation time. For the TRP systems, calculated binding free energies were in better agreement with experimental data for acceleration levels of 1σ and 2σ than with an acceleration level of 3σ (Table S4 in the Supporting Information), possibly caused by insufficient state transitions for the given simulation time with 3σ. Moreover, simulations with 1σ and 2σ performed slightly better than previous calculations employing the same force-field and TI; however, a previously reported method employing OSP and the third-power fitting (TPF) approach for electrostatic interactions performed better than the A-EDS simulations.[19] Except for compound 2, where the TI calculations overestimated the binding affinity, the values obtained for A-EDS with 2σ agree closely to the TI data with a root-mean-square difference of 2.9 kJ/mol. The total simulation of the A-EDS simulation for each acceleration level was 130 ns, compared to 245 ns for the previously reported TI method and 52 ns for the previously reported OSP/TPF method. For the PNMT ligands, the free-energy values computed with an acceleration of 3σ fit experimental values much better than those computed with an acceleration of 2σ (see also Table S5 in the Supporting Information). However, more state transitions occurred, as expected, for an acceleration level of 2σ (Table and Figure S8 in the Supporting Information). Visual inspection of the trajectories revealed that for the simulation with acceleration parameters determined for 2σ, the S-adenosyl-l-methionine (SAM) cofactor in the PNMT enzyme was in a different conformation, thereby changing the direct environment for the PNMT ligands bound to the protein (see Figure ), explaining differences in calculated affinities of the end state for this simulation. A-EDS simulations with an acceleration level of 3σ performed similar in terms of comparison to experiment as previously reported values obtained with the same force-field and TI[48] and better than other EDS-based methods.[32,36] It is, moreover, noted that in the A-EDS simulations performed in this work ligand 2, to which the perturbation includes a net-charge change of the molecule, was included in the reference state, while it was omitted in previously reported EDS calculations. While the number of visits is very limited for this state (Table ) and the binding affinities are quite different from the TI values (Table S5), the A-EDS approach is able to include this state and correctly predicts it to be a nonbinder. For each A-EDS acceleration level a total simulation time of 320 ns was used, while for the simulations employing TI sampling was done for ∼1.5 μs, and for the most recently reported approach employing EDS (RE-EDS) sampling was done for ∼670 ns.

Figure 4

Different conformations of the S-adenosyl-l-methionine (SAM) cofactor in system PNMT at the end of A-EDS equilibrium simulations of the ligand end states bound to the protein, for an acceleration level of 2σ (red, PNMT ligand in magenta) and 3σ (green, PNMT ligand in light green). For all systems, either simulations with acceleration parameters of 2σ or 3σ lead to the best agreement of the calculated binding free energies with experimental values. For system TRP, very little state transitions occurred with an acceleration level of 3σ, leading to undersampling and less good agreement with experimental values. For systems GRA2 and PNMT (2σ) conformations were sampled that differ from the experimentally determined structures, possibly explaining less good agreement with experimental values for these simulations. This highlights a so-far unexplored challenge for longer simulations sampling many different states. Due to the sampling of unphysical reference states and the inclusion of compounds with low affinity, the active site conformation may diverge to irrelevant conformations. Possibly adding further slight restraints on the active site conformations could already be sufficient to avoid this behavior.[58] Simulations with an acceleration level of 3σ might perform generally better if enough simulation time is generated, as the energy landscape is less flattened and important energy minima are more pronounced.

Conclusion

In this work, we presented a novel automated parameter search scheme for A-EDS. The search scheme only requires a single chosen input-parameter, the anticipated acceleration level for flattening of the energy barriers between the EDS end states (σ-level). All other required parameters for equilibrium A-EDS simulations (A-EDS acceleration parameters Emax and Emin, and the free-energy offset parameters ΔF) are determined using this search scheme in a nonequilibrium parameter-search simulation prior to the A-EDS equilibrium simulation. We demonstrated the applicability of this scheme with three well-established protein–ligand model systems and showed that using this method, free-energy differences between multiple ligands can be computed from a single simulation in an automated way. The full automation of the parameter-search simulations represents an important step forward compared to previously described multistate EDS simulations. All ligand end states were sampled, including ligands to which the perturbation involves net-charge changes. We tested three different acceleration levels for the parameter-search simulations and conclude that in general it is better to choose a more conservative acceleration level, as important energy minima might be lost otherwise. However, a more conservative choice of acceleration levels requires longer simulation times, and in practice, both the importance of energy minima and available simulation time have to be considered when choosing the acceleration level. A remaining challenge is the occurrence of conformational changes in the protein–ligand systems due to unfavorable ligand end states and the unphysical nature of the EDS reference state. EDS ligands that are strongly accelerated and incorporate unfavorable end-states need to be restrained to the protein to prevent them from leaving the binding site and to sample irrelevant binding modes.

41 in total

1. Simple, Efficient, and Reliable Computation of Multiple Free Energy Differences from a Single Simulation: A Reference Hamiltonian Parameter Update Scheme for Enveloping Distribution Sampling (EDS).

Authors: Clara D Christ; Wilfred F van Gunsteren
Journal: J Chem Theory Comput Date: 2009-02-10 Impact factor: 6.006

2. Role of Molecular Dynamics and Related Methods in Drug Discovery.

Authors: Marco De Vivo; Matteo Masetti; Giovanni Bottegoni; Andrea Cavalli
Journal: J Med Chem Date: 2016-02-08 Impact factor: 7.446

3. Phase-space overlap measures. II. Design and implementation of staging methods for free-energy calculations.

Authors: Di Wu; David A Kofke
Journal: J Chem Phys Date: 2005-08-22 Impact factor: 3.488

4. Enveloping distribution sampling: a method to calculate free energy differences from a single simulation.

Authors: Clara D Christ; Wilfred F van Gunsteren
Journal: J Chem Phys Date: 2007-05-14 Impact factor: 3.488

5. Comparison of three enveloping distribution sampling Hamiltonians for the estimation of multiple free energy differences from a single simulation.

Authors: Clara D Christ; Wilfred F Van Gunsteren
Journal: J Comput Chem Date: 2009-08 Impact factor: 3.376

6. Comparison of enveloping distribution sampling and thermodynamic integration to calculate binding free energies of phenylethanolamine N-methyltransferase inhibitors.

Authors: Sereina Riniker; Clara D Christ; Niels Hansen; Alan E Mark; Pramod C Nair; Wilfred F van Gunsteren
Journal: J Chem Phys Date: 2011-07-14 Impact factor: 3.488

7. Simulations of the estrogen receptor ligand-binding domain: affinity of natural ligands and xenoestrogens.

Authors: B C Oostenbrink; J W Pitera; J H Meerman
Journal: J Med Chem Date: 2000-11-30 Impact factor: 7.446

8. Computational analysis of binding of P1 variants to trypsin.

Authors: B O Brandsdal; J Aqvist; A O Smalås
Journal: Protein Sci Date: 2001-08 Impact factor: 6.725

9. Efficient and Accurate Free Energy Calculations on Trypsin Inhibitors.

Authors: Anita de Ruiter; Chris Oostenbrink
Journal: J Chem Theory Comput Date: 2012-02-03 Impact factor: 6.006

10. Saturation Mutagenesis by Efficient Free-Energy Calculation.

Authors: Zuzana Jandova; Daniel Fast; Martina Setz; Maria Pechlaner; Chris Oostenbrink
Journal: J Chem Theory Comput Date: 2018-01-08 Impact factor: 6.006

7 in total

1. Machine Learning for Electronically Excited States of Molecules.

Authors: Julia Westermayr; Philipp Marquetand
Journal: Chem Rev Date: 2020-11-19 Impact factor: 60.622

2. Replica-Exchange Enveloping Distribution Sampling Using Generalized AMBER Force-Field Topologies: Application to Relative Hydration Free-Energy Calculations for Large Sets of Molecules.

Authors: Salomé R Rieder; Benjamin Ries; Kay Schaller; Candide Champion; Emilia P Barros; Philippe H Hünenberger; Sereina Riniker
Journal: J Chem Inf Model Date: 2022-06-08 Impact factor: 6.162

3. Extension of the Variational Free Energy Profile and Multistate Bennett Acceptance Ratio Methods for High-Dimensional Potential of Mean Force Profile Analysis.

Authors: Timothy J Giese; Şölen Ekesan; Darrin M York
Journal: J Phys Chem A Date: 2021-03-30 Impact factor: 2.781

4. Challenges Encountered Applying Equilibrium and Nonequilibrium Binding Free Energy Calculations.

Authors: Hannah M Baumann; Vytautas Gapsys; Bert L de Groot; David L Mobley
Journal: J Phys Chem B Date: 2021-04-27 Impact factor: 2.991

5. Umbrella Sampling Simulations of Carbon Nanoparticles Crossing Immiscible Solvents.

Authors: Anastasios Gotzias
Journal: Molecules Date: 2022-01-31 Impact factor: 4.411

6. Relative free-energy calculations for scaffold hopping-type transformations with an automated RE-EDS sampling procedure.

Authors: Benjamin Ries; Karl Normak; R Gregor Weiß; Salomé Rieder; Emília P Barros; Candide Champion; Gerhard König; Sereina Riniker
Journal: J Comput Aided Mol Des Date: 2022-01-03 Impact factor: 3.686

7. Rapid and accurate estimation of protein-ligand relative binding affinities using site-identification by ligand competitive saturation.

Authors: Himanshu Goel; Anthony Hazel; Vincent D Ustach; Sunhwan Jo; Wenbo Yu; Alexander D MacKerell
Journal: Chem Sci Date: 2021-05-25 Impact factor: 9.825

7 in total