Alexander D Wade1, David J Huggins1,2,3. 1. TCM Group, Cavendish Laboratory , University of Cambridge , 19 J J Thomson Avenue , Cambridge CB3 0HE , U.K. 2. Tri-Institutional Therapeutics Discovery Institute , Belfer Research Building, 413 East 69th Street, 16th Floor, Box 300 , New York 10021 , United States. 3. Department of Physiology and Biophysics , Weill Cornell Medicine , 1300 York Avenue , New York , New York 10065 , United States.
Abstract
We present an explicit solvent alchemical free-energy method for optimizing the partial charges of a ligand to maximize the binding affinity with a receptor. This methodology can be applied to known ligand-protein complexes to determine an optimized set of ligand partial atomic changes. Three protein-ligand complexes have been optimized in this work: FXa, P38, and the androgen receptor. The sets of optimized charges can be used to identify design principles for chemical changes to the ligands which improve the binding affinity for all three systems. In this work, beneficial chemical mutations are generated from these principles and the resulting molecules tested using free-energy perturbation calculations. We show that three quarters of our chemical changes are predicted to improve the binding affinity, with an average improvement for the beneficial mutations of approximately 1 kcal/mol. In the cases where experimental data are available, the agreement between prediction and experiment is also good. The results demonstrate that charge optimization in explicit solvent is a useful tool for predicting beneficial chemical changes such as pyridinations, fluorinations, and oxygen to sulfur mutations.
We present an explicit solvent alchemical free-energy method for optimizing the partial charges of a ligand to maximize the binding affinity with a receptor. This methodology can be applied to known ligand-protein complexes to determine an optimized set of ligand partial atomic changes. Three protein-ligand complexes have been optimized in this work: FXa, P38, and the androgen receptor. The sets of optimized charges can be used to identify design principles for chemical changes to the ligands which improve the binding affinity for all three systems. In this work, beneficial chemical mutations are generated from these principles and the resulting molecules tested using free-energy perturbation calculations. We show that three quarters of our chemical changes are predicted to improve the binding affinity, with an average improvement for the beneficial mutations of approximately 1 kcal/mol. In the cases where experimental data are available, the agreement between prediction and experiment is also good. The results demonstrate that charge optimization in explicit solvent is a useful tool for predicting beneficial chemical changes such as pyridinations, fluorinations, and oxygen to sulfur mutations.
In recent years, alchemical methods have
garnered increasing attention
in drug design.[1−6] In particular, free-energy perturbation (FEP) is now commonly used
by pharmaceutical companies because of improvements in efficiency,[7] more accurate force fields,[8,9] and
increases in computational power. Based on the Zwanzig equation,[10,11] it is common to use FEP calculations in drug design to calculate
the relative binding affinity of two molecules.[11−14] This relative free energy, ΔGAB, can be defined using the Zwanzig equation,
as shown in eq , as
the free-energy difference between thermodynamic states A and B.with k as Boltzmann’s
constant, T as temperature, EA and EB as the potential energies
of the system calculated using the Hamiltonian of state A and B, and
⟨.⟩A as a state average over system A. Typically,
in free-energy calculations, eq is expanded upon to include sampling from both states A and
B[15] or sampling from intermediate states.[7] This is done to improve the sampling overlap
between end states. The drawback here is that for every relative binding
affinity calculation, lengthy molecular dynamics (MDs) simulations
must be performed for all the end and intermediate states. If, however,
the perturbation between end states remains small enough, such that
the overlap between end states is large, eq is sufficient without any intermediate states.
Applying eq with no
intermediate states is referred to as single-step perturbation (SSP),
and this is the primary free-energy method used in this work. Numerous
studies have used SSP,[16−19] demonstrating that it is applicable to relative free-energy calculations[20,21] and can be significantly faster than standard FEP.[22] Most recently, the authors of this article have used SSP
to perform computational fluorine scanning.[23] We now apply SSP with the goal of optimizing ligand partial charges.Charge optimization was developed by Tidor and co-workers using
an implicit water treatment of electrostatics.[24,25] Poisson–Boltzmann calculations are performed on the bound
and unbound states in order to find the optimal partial charges of
a given molecule.[26−29] This approach has since been used by other academic groups,[30,31] employed in industries,[32] and been extended
to consider induced fit effects.[33] However,
the approach suffers from the deficiencies of all implicit water approaches:
it is unable to deal effectively with interfacial water molecules.
These occur commonly and are very difficult to treat effectively with
implicit solvent approaches. The accuracy of implicit water models
is lacking for many types of free-energy calculations,[34−36] particularly relevant here are the implicit model’s shortcomings
relative to explicit models in terms of binding free energies.[37] Additionally, while previous work has considered
flexibility in the ligand, the receptor and complex were assumed to
be rigid. It is known that this may play a significant role in binding
free energies.[33] Because of advances in
available computing power, explicit water approaches to charge optimization
are now possible. We propose to exploit these computational advances
by applying SSP to the bound and unbound states of small molecule
inhibitors to develop a method for electrostatic charge optimization
in explicit solvent.Combining SSP with explicit water MD calculations
and flexible
receptors and complexes has the potential to develop a more accurate
charge optimizer. To carry out these charge optimizations, we developed
a tool to automate their execution. This tool is freely available
at https://github.com/adw62/Ligand_Charge_Optimiser. Our ligand charge optimizer uses OpenMM[38] as both an MD engine and a tool to create the modified alchemical
systems. The software will generate all of the required mutant ligands
from an input wild-type ligand. These mutants are automatically parameterized,
built into the complex systems, simulated, and optimized.
Methods
We optimize the ligand partial charges for three protein test cases:
FXa, P38 kinase, and the androgen receptor. The chemical structures
of the ligands studied are shown in Table . The ligands were built from highly related
molecules in the Protein Databank[39] (PDB): 2RA0(40) (FXa), 3S3I(41) (P38), and 2NW4(42) (androgen
receptor). These small modifications are made from the PDB to allow
comparison with experimental data.[40−42] Superpositions of the
molecules in the original PDB and the modified molecules are shown
in Figures S2–S4.
Table 1
Name of the Target for Each System
with Ligands′ 2D Chemical Structure and the PDB IDs of the
Target Ligand Complex
System Setup
To prepare the FXa, P38, and androgen
receptor systems, the nonstandard residues were converted to their
standard equivalents with pdbfixer.[43] Selenomethionines
were changed to methionines and missing sidechains were added using
Schrödinger’s Preparation Wizard,[44] which was also used to assign the protonation state of
all ionizable residues. All buffer solvents and ions were removed.
The hydrogen atom positions were then built using tleap and force
field parameters, and partial charges were assigned from the AMBER
ff14SB force field.[9] Parameters for the
inhibitors were generated using Antechamber[45] with AMBER GAFF 2[46] and AM1-BCC.[47] These structures and parameters were then passed
to YANK’s[48] 0.23.7 automatic setup
pipeline to build solvated ligand–protein and ligand systems.
For solvation, TIP4P-EW[49] was used. A salt
concentration of 150 mM and any required counterion were added using
sodium and chloride ions. In every case, the edge of the solvation
box was set to be 15 Å from any atom of the receptor and ligand.
Molecular Dynamics
MD sampling in this work is collected
to compute SSP and FEP relative free energies, and the amount of MD
sampling varies for each application in this work and so is stated
explicitly for each case in the relevant section. All simulations
were performed with OpenMM 7.3.0[43] as follows.
First, OpenMM’s default minimizer was used to minimize all
structures. Then, equilibration was performed in the NPT ensemble at 300 K and 1 atm using a Langevin integrator and Monte
Carlo barostat for 250 ps. MD simulations were performed in the NPT ensemble using a time step of 2 fs. van der Waals interactions
were truncated at 11.0 Å with switching at 9.0 Å. Electrostatics
were modeled using the particle mesh Ewald method with a cutoff of
11.0 Å. All other simulation parameters were left as default.
Snapshots were collected every 5 ps and all simulations were performed
with constrained hydrogens.
Charge Optimization
The objective
function of this
optimization was ΔGbinding(q) – ΔGoriginal. ΔGbinding(q) was defined as
the difference in free energy between the bound and unbound states
of the ligand and target for a ligand with charges q. ΔGoriginal was defined as the difference in free energy between the bound and
unbound states of the ligand and target for a ligand with the charges
of the original unoptimized ligand. ΔGoriginal is thus a constant. We therefore constructed the optimization
problem as follows.where q denotes the charges of the
ligand and q denotes the charges for
iteration n of the optimization, m is the number of atoms in the ligand, and net charge is the total
net charge of the ligand. Equation constrains the net charge of the ligand to be constant. Equation constrains the root
mean square difference between the ligands′ original charges, q, and the q charges to be
less than some value of the root-mean-square deviation (rmsd) limit.
These limits were chosen to limit the change in ΔΔGtotal to a sensible range <10 kcal/mol. Without
this limit, the optimization continued to very large unphysical values
of ΔΔGtotal because atomic
partial charges can reach unphysical values. The results of an unconstrained
optimization can be seen in Figures S17–S19. Equation constrains
the perturbation to each atom to be less than 0.01e per iteration, where e is the elementary charge.
With this limit of 0.01e, a determination of how
much sampling was required to give converged calculation of ΔGbinding with a perturbation to each atom of
0.01e is shown in Figure S14. The amount of sampling needed was determined to be 2.5 ns, and
this was then the amount of sampling used in this work to calculate
the objective and gradient for each optimization step. The algorithm
which was used to find a local minimum in this objective function
was the SciPy 1.1.0[50] implementation of
the sequential least squares programming algorithm.[51]The primary calculations in this work were those
of ΔΔGbindings, calculated
as follows: first we applied SSP theory to calculate ΔGunperturbed→perturbed in the bound and
unbound states as shown in eq .Eperturbed and Eunperturbed are the potential energies of the
system calculated using the Hamiltonian of the perturbed system and
the unperturbed system, respectively. To change Hamiltonians, the
charges were switched from unperturbed to perturbed values; however,
Lennard-Jones, bonded, angle, and torsion parameters did not change.
ΔΔGbinding was then constructed
as shown in eq , with
ΔGunperturbed→perturbedbound and ΔGunperturbed→perturbedunbound as ΔGunperturbed→perturbed calculated in the bound and unbound states of the ligand and target.Note that ΔΔGbinding is
equal to the objective function in eq , if we take eqs and 4 and set the unperturbed state
as the ligand with the original unoptimized charges and the perturbed
state as the ligand with charges q.To calculate the gradient of the objective function
in each direction
in charge space, a finite difference was calculated as shown in eq .where h is a finite difference
of 0.00015e. The numerator of the rhs of eq is a ΔΔGbinding and can be calculated using an SSP approach
as detailed in eqs and 4 where q are the unperturbed charges and q + h are the perturbed charges. This calculation
of the gradient shows the advantage of SSP, as numerous (10–100
s) evaluations of ΔΔGbinding are required, one for each direction in charge space. This ΔΔGbinding is between molecules that are extremely
similar, differing only by 0.00015e in one atom’s
partial charge. There is therefore likely to be a large sampling overlap
between these states allowing SSP to be applied. Of note is that for
each finite difference calculation, the charge of the simulation box
has been changed by 0.00015e. The potential for finite
size effects[52] caused by this change was
investigated and the padding of the simulation with solvent was chosen
to negate these effects, see Figure S15.To delineate between the free-energy change for individual
optimization
steps and the free-energy change between the original and final optimized
ligands, these will be defined as ΔΔGstep and ΔΔGtotal, respectively. Because an SSP method is being used, efforts were
made to avoid poor overlap between the end states of any perturbations.
To achieve this, the system was resampled after every optimization
step. With the new sampling, the perturbed system could be redefined
as a new unperturbed system after each optimization step. If this
was not done, then the difference between the perturbed and unperturbed
systems would grow over the course of the optimization, reducing the
overlap in sampling and so reducing the applicability of SSP. Resampling
had an additional advantage as it allowed for a calculation of a reverse
alchemical step. Therefore, ΔΔGstep for both the forward and backward alchemical transformation was
calculated for every step, and the ΔΔGstep in the results are reported as an average of the
forward and backward transformations.
FEP Calculations
The optimization in this work gave
a set of optimized charges and an associated ΔΔGbinding for going from the original set of ligand
charges to the optimal set, named above as ΔΔGtotal. To validate the ΔΔGtotal given by the optimization, we compared it against
standard alchemical relative binding free-energy calculations using
the MBAR[7] estimator applied to the system
with the original and optimized set of charges as end states. These
calculations were performed using the Ligand Charge Optimiser package
as follows. For these calculations, 12 equally spaced lambda windows
(except when explicitly stated otherwise) were used. Between these
windows, the charge parameters were interpolated simultaneously from
the original to optimized charges. All windows were sampled independently
with 2 ns of Langevin dynamics totaling 24 ns of sampling. All simulation
conditions were identical to the optimization MD calculations described
above. The samples collected in each intermediate state were decorrelated
based on an estimate of the statistical inefficiency of the reduced
potential time series before carrying out the MBAR analysis with PyMBAR
3.0.1.[7]In addition to testing the
ΔΔGtotal, we also wished to
calculate the ΔΔGbinding for
a set of chemical changes informed by the optimal charges. This ΔΔGbinding is defined as ΔΔGdesigned and is the change in binding free energy
between the original ligand and a designed ligand with some chemical
mutation. To perform these calculations, the protocol was the same
for the full FEP calculation of ΔΔGtotal but with the following additional considerations. Now,
in addition to the charges, the van der Waals and bonded terms were
all interpolated simultaneously from the original to the designed
state. In the case of hydrogen to fluorine mutations, the original
hydrogen was constrained; therefore, its associated C–H bond
could not be interpolated to a C–F bond. When neglecting the
interpolation of this bond, the fluorine appeared at the position
of the hydrogen, instead of the true physical position of the fluorine.
To avoid this issue, for fluorinations, we used a hybrid topology
approach where a massless interaction site at the position of the
mutation was added. This virtual site represents the fluorine and
its position was defined relative to the position of the parent hydrogen
such that the C–F distance is always 1.24 times the C–H
distance.[53] When mutating a hydrogen to
fluorine, the relevant hydrogen was turned off and fluorine Lennard-Jones
and charge parameters were applied to the additional site. When simulating
these systems, all bonds to hydrogen were constrained. Because the
position of the fluorine was defined relative to the position of its
parent hydrogen, it was also implicitly constrained. We therefore
made the assumption that the C–F bond length oscillations were
negligible. To prevent the hybrid topologies from interacting, the
additional sites were excluded from interacting with their parent
hydrogens.
Results
Using the optimization methodology
mentioned above, the partial
atomistic charges of three ligand–target systems are optimized.
For each optimization, we would like to inspect the convergence over
the number of steps. A good metric to analyze the results of the optimization
is the set of optimized charges taken as a vector. In the methodology,
a mention was made to limiting the rmsd between the original and optimized
charges to some value of the rmsd limit. The values which are chosen
for the rmsd limit are 0.01, 0.03, and 0.05e. The
optimization is therefore repeated with the rmsd bound of these three
values. With an rmsd bound of 0.01e, the optimizer
is limited to seven steps as adequate convergence is seen at this
point. For a larger rmsd, the convergence is slower and therefore
for optimizations with rmsd bounds of 0.03 and 0.05e, the optimizer is limited to 20 steps. To assess convergence across
simulation steps, we take the dot product of the normalized vector
of new charges with the normalized vector of original charges for
each step of the optimization (see Figure ).
Figure 1
Dot product of the normalized optimized charges
with the normalized
original charges showing variation of charge vector direction with
step. Results are shown for rmsd limits 0.01, 0.03, and 0.05e with the FXa, P38, and androgen receptor systems labeled
(a–c), respectively.
Dot product of the normalized optimized charges
with the normalized
original charges showing variation of charge vector direction with
step. Results are shown for rmsd limits 0.01, 0.03, and 0.05e with the FXa, P38, and androgen receptor systems labeled
(a–c), respectively.Figure shows that
the direction of the charge vectors over all systems and rmsds are
well converged. The direction of these charge vectors represents where
the charge is being applied on the molecule and this is the information
that will be used in the following section to make chemical mutations
to improve ΔΔGbinding. It
can also be seen that the dot product between the original and optimized
charges is different for different rmsds. To quantify this difference,
the dot product between the set of optimal charges obtained for rmsd
0.01 with 0.03 and 0.05e can be taken and the results
of these projections can be seen in Table . Here, we see that sets of charges for the
same system are pointing in the same direction. Thus, only the value
of the charge changes are dependant on the rmsd, while the direction
and relative magnitude of the charge changes are completely consistent.
This is an important result because it shows that the design principles
identified by the approach will not depend on the arbitrary choice
of the rmsd. The invariance in where the charge is being applied can
also be seen by the eye if the atoms are colored by change in charge.
Figures illustrating this are presented in Figures S5–S13 for all sets of optimized charge.
Table 2
Dot Products of the Normalized Vector
of Optimal Charges Using an rmsd of 0.01 q with the Normalized Vector of Optimal Charges Using
Different rmsds
rmsd (e)
FXa
P38
androgen receptor
0.01
1.00
1.00
1.00
0.03
1.00
1.00
1.00
0.05
0.99
0.98
0.99
In Table , we can
see the dot product of the optimized charges from the optimization
with an rmsd of 0.01e with themselves returns 1.00
as expected. The dot product of the vector of charges with rmsd =
0.01e and with rmsd = 0.03e also
returns 1.00 as these vectors are extremely similar in direction.
The dot product of the vector of charges with rmsd = 0.01e and with rmsd = 0.05e returns approximately 1.00
as these vectors are extremely similar in direction but not as close
as 0.01e with 0.03e. To summarize,
each optimization for a system added the charge in approximately the
same place. It is only the magnitude of this charge that varies with
rmsd. We can also look at the convergence of ΔΔGstep with the optimization step, and this can
be seen in Figure .
Figure 2
Cumulative sum of ΔΔGstep averaged over three replicates for each step of the optimizer. Three
optimizations are shown with rmsd bounds of 0.01, 0.03, and 0.05e, with the FXa, P38, and androgen receptor systems labeled
(a–c), respectively.
Cumulative sum of ΔΔGstep averaged over three replicates for each step of the optimizer. Three
optimizations are shown with rmsd bounds of 0.01, 0.03, and 0.05e, with the FXa, P38, and androgen receptor systems labeled
(a–c), respectively.With an rmsd of 0.01e, Figure demonstrates that ΔΔGstep is well converged for all systems. For
an rmsd of 0.03 or 0.05e, the results are only well
converged for the androgen receptor system, Figure c. This suggests that ΔΔGstep for the optimized set of charges is dependent
on the rmsd and ΔΔGstep is
slow to converge for larger ligands such as those in the P38 and FXa
test cases. ΔΔGtotal can be
calculated between the original and the optimized charges with the
cumulative sum of the SSP ΔΔGstep for all optimization steps. This gives a ΔΔGtotal that will be termed ΔΔGtotalSSP. We
compare this SSP ΔΔGtotal for
each set of optimized charges with full FEP calculations, see Table . These FEP calculations
use the original and optimized charges as the two end states and the
resulting ΔΔGtotal will be
termed ΔΔGtotalFEP.
Table 3
Calculated ΔΔGtotal for the Set of Optimal Chargesa
FXa
rmsd (e)
0.05
0.03
0.01
ΔΔGtotalFEP [kcal/mol]
–8.7,
–6.3,
–3.1,
[−9.2, −8.2]
[−6.5, −6.1]
[−3.4, −2.9]
ΔΔGtotalSSP [kcal/mol]
–11.3,
–8.1,
–3.9,
[−12.4, −10.1]
[−9.1, −7.0]
[−4.4, −3.3]
P38
rmsd (e)
0.05
0.03
0.01
ΔΔGtotalFEP [kcal/mol]
–9.4,
–6.6,
–3.2,
[−10.8, −8.0]
[−7.2, −6.0]
[−3.4, −3.1]
ΔΔGtotalSSP [kcal/mol]
–11.1,
–8.3,
–3.5,
[−11.5, −10.8]
[−8.7, −7.9]
[−4.0, −3.1]
Androgen Receptor
rmsd
(e)
0.05
0.03
0.01
ΔΔGtotalFEP [kcal/mol]
–11.5,
–8.8,
–4.2,
[−12.0, −10.9]
[−8.8, −8.8]
[−4.2, −4.1]
ΔΔGtotalSSP [kcal/mol]
–11.9,
–8.9,
–4.2,
[−12.3, −11.5]
[−9.2, −8.7]
[−4.4, −4.0]
SSP ΔΔGtotal values are calculated by summing the average
of
forward and backward SSP calculations made for each step of the optimizer.
FEP ΔΔGtotal values are calculated
from an alchemical transformation from the original charges to the
optimal charges. SSP and FEP predictions are reported as the mean
of three replicates with 95% confidence interval reported between
square brackets computed as mean ± t2·SEM,
where t2 is the t-distribution statistic
with two degrees of freedom, and SEM is the standard error of the
mean computed from the sample standard deviation of the three independent
replicate predictions.
SSP ΔΔGtotal values are calculated by summing the average
of
forward and backward SSP calculations made for each step of the optimizer.
FEP ΔΔGtotal values are calculated
from an alchemical transformation from the original charges to the
optimal charges. SSP and FEP predictions are reported as the mean
of three replicates with 95% confidence interval reported between
square brackets computed as mean ± t2·SEM,
where t2 is the t-distribution statistic
with two degrees of freedom, and SEM is the standard error of the
mean computed from the sample standard deviation of the three independent
replicate predictions.Table shows that
the SSP and FEP calculations well agree with an rmsd of 0.01e (differing by less than 1.0 kcal/mol in all cases). For
an rmsd of 0.03 and 0.05e, SSP and FEP are less well
agreed (differing by more than 1.0 kcal/mol in some cases). Table also shows clearly
that changing the rmsd changes the calculated ΔΔGtotal. The reason is that increasing the rmsd
bound increases how much the charges can be changed and so increases
the change in ΔΔGtotal. However,
as discussed above, the convergence of ΔΔGtotal is an unnecessary condition, providing no additional
information. It is only critical that the direction of the charge
vectors are well converged and consistent for all rmsd values for
all test cases, which has been shown in Figure and Table , as it is this information that will inform what chemical
mutations are proposed for the ligands. As such, we then use where
the optimizer has placed the charge as a design tool to generate ideas
for beneficial chemical mutations and this is presented in Figure .
Figure 3
Panels (a–c) show
the FXa, P38, and androgen receptor ligands
with atoms colored by change in charge relative to the original partial
charges. The optimized charge is taken from the optimization with
an rmsd bound of 0.03e. Blue represents atoms which
are more negative, and red represents atoms which are more positive.
Selected sites for chemical modification are numbered.
Panels (a–c) show
the FXa, P38, and androgen receptor ligands
with atoms colored by change in charge relative to the original partial
charges. The optimized charge is taken from the optimization with
an rmsd bound of 0.03e. Blue represents atoms which
are more negative, and red represents atoms which are more positive.
Selected sites for chemical modification are numbered.We developed specific design ideas to improve ΔΔGbinding based on the changes in charge. First,
analyzing the FXa ligand, three options are selected:Replacing the hydrogen with a fluorine
at position 1a.Replacing the nitrogen
with a carbon at position 2a.Replacing
one or more hydrogen atoms with a fluorine
atom on the methyl group at position 3a.Analyzing the P38 ligand, four options are selected:Replacing the hydrogen with a fluorine
at position 1b
or 4b.Replacing the nitrogen with a
carbon at position 1b
or 4b.Replacing the oxygen with a sulfur
at position 2b.Replacing one or more
hydrogen atoms with a fluorine
atom on the methyl group at position 3b.The final set of changes applies to the ligand of the androgen
receptor with three options selected:Replacing the oxygen with a sulfur at position 1c and
2c.Replacing the hydrogen with a fluorine
at position 3c,
4c, or 5c.Replacing the bonded carbon
with a nitrogen at positions
4c or 5c.ΔΔGdesigned for all these
changes was calculated using the FEP protocol discussed in the Methods section. Each FEP calculation was performed
in triplicate, and the averaged results of these calculations can
be seen in Table .
Table 4
ΔΔGdesigned for Proposed Chemical Mutations to the FXa, P38, and
Androgen Receptor Ligands Calculated with FEPa
The positions denoted numerically
correspond to numerical positions in Figure . 2D structures of mutation are presented
in the column-labeled mutant. FEP predictions are reported as the
mean value of three replicates with 95% confidence interval reported
between square brackets computed as mean ± t2·SEM, where t2 is the t-distribution
statistic with two degrees of freedom, and SEM is the standard error
of the mean computed from the sample standard deviation of the three
independent replicate predictions. The asterisk label * indicates
single or double fluorinations of a methyl group. These are averaged
over every hydrogen or pair of hydrogen in the methyl group and as
such, these data represent the averaging of nine replicates with the
confidence interval reported such that t2 is now
the t-distribution statistic with eight degrees of
freedom. The obelisk label † denotes calculations that were
slow to converge and run with 24 lambda windows of 2 ns. The diesis
‡ label denotes data taken from previous work of the authors.[23]
The positions denoted numerically
correspond to numerical positions in Figure . 2D structures of mutation are presented
in the column-labeled mutant. FEP predictions are reported as the
mean value of three replicates with 95% confidence interval reported
between square brackets computed as mean ± t2·SEM, where t2 is the t-distribution
statistic with two degrees of freedom, and SEM is the standard error
of the mean computed from the sample standard deviation of the three
independent replicate predictions. The asterisk label * indicates
single or double fluorinations of a methyl group. These are averaged
over every hydrogen or pair of hydrogen in the methyl group and as
such, these data represent the averaging of nine replicates with the
confidence interval reported such that t2 is now
the t-distribution statistic with eight degrees of
freedom. The obelisk label † denotes calculations that were
slow to converge and run with 24 lambda windows of 2 ns. The diesis
‡ label denotes data taken from previous work of the authors.[23]The
atoms indicated by the optimization to beneficially be more
negative in Figure line up with experimental work on these test cases.[40−42] Mutants 1, 6, and 19 are predicted by FEP to be beneficial (−2.2,
−2.2, and −2.5 kcal/mol, respectively), and this is
in good agreement with experimental data (−2.1, −2.3,
and −1.1 kcal/mol, respectively). Experimental data do not
exist for the remaining proposed mutations. However, 73% of the mutations
in Table are predicted
to be favorable by FEP. Both the FXa and androgen systems have a higher
success rate with 80 and 89% of ideas, respectively, from charge optimization
being beneficial as assessed by FEP. P38 has a lower (though still
promising) success rate with 50% of mutations being beneficial as
assessed by FEP.
Conclusions
We have demonstrated
ligand charge optimization in explicit solvent
to be a useful tool to rationally design ligands with improved binding
affinities. The electrostatics of three ligand–receptor systems
were systematically optimized using the alchemical SSP method, yielding
sets of optimal ligand charges. These sets of optimal charges were
used to generate design principles for chemical mutations to the ligands
that would yield improved receptor binding affinity. These chemical
mutations were assessed with a more rigorous FEP method. Using FEP,
73% of the predicted chemical mutations were found to be beneficial.
The average improvement of the beneficial mutations was approximately
1 kcal/mol. In three of these cases, experimental data exist and are
in excellent agreement with calculations, with mutants 1, 6, and 19
in Table predicted
by FEP to be beneficial (−2.2, −2.2, and −2.5
kcal/mol, respectively) compared to the experimental data (−2.1,
−2.3, and −1.1 kcal/mol, respectively).The major
advantage of SSP shown in this work is the calculation
of the gradient. SSP allows for many highly related mutations to be
assessed quickly, as is required to calculate the gradient via a finite
difference method. For comparison, the collection of 2.5 ns of sampling
for the FXa system with 99 000 and 13 000 atoms in the
complex and solvent systems, respectively, takes 29 min. To calculate
a gradient from this sampling takes 15 min and so, including sampling,
this totals to 44 min per gradient. The calculation of a perturbation
of 0.00015e with full FEP (assuming 1.0 ns of sampling
converges ΔΔGbinding, see Figure S16 for convergence plot) takes 14 min.
With 58 charges in the FXa ligand, which must all be perturbed in
the complex and solvent systems, this gives approximately 27 h to
calculate one gradient. This advantage would only be compounded if
a more complex optimization scheme, which required a calculation of
the Hessian, was used. Both these calculations of the gradient are
run in parallel (see the Supporting Information for parallelization strategy) across 4 NVIDIA P100 GPUs using OpenMM
7.3.0 and CUDA 10.0.This ligand charge optimization methodology
could be easily extended
by considering the optimization of any other parameters of the force
field with respect to the binding free energy. For example, the van
der Waals parameters could be “optimized”. Additionally,
the calculation for the gradient of force field parameters with respect
to potential energies could be used in the systematic optimization
of a small molecule force field. Because here we have demonstrated
a methodology for quickly calculating gradients of force field parameters
with respect to free energy, this could lead to some interesting studies
of force field optimization using experimental ligand–receptor
binding energies as a target data set. This method is relatively unique
in providing drug design principles from alchemical free-energy calculations
along with a rationale for increase or decrease in binding because
of specific changes to the ligand.[54]In summary, we have presented a novel technique for identifying
partial charges that optimize protein–ligand binding affinities
and highlighted ways in which these predictions can be turned into
design principles for drug discovery. The method is fast compared
to traditional FEP, easy to interpret, and simple to test by using
more thorough approaches such as MBAR.
Authors: Hans W Horn; William C Swope; Jed W Pitera; Jeffry D Madura; Thomas J Dick; Greg L Hura; Teresa Head-Gordon Journal: J Chem Phys Date: 2004-05-22 Impact factor: 3.488
Authors: Lingle Wang; Yujie Wu; Yuqing Deng; Byungchan Kim; Levi Pierce; Goran Krilov; Dmitry Lupyan; Shaughnessy Robinson; Markus K Dahlgren; Jeremy Greenwood; Donna L Romero; Craig Masse; Jennifer L Knight; Thomas Steinbrecher; Thijs Beuming; Wolfgang Damm; Ed Harder; Woody Sherman; Mark Brewer; Ron Wester; Mark Murcko; Leah Frye; Ramy Farid; Teng Lin; David L Mobley; William L Jorgensen; Bruce J Berne; Richard A Friesner; Robert Abel Journal: J Am Chem Soc Date: 2015-02-12 Impact factor: 15.419
Authors: Katarina Roos; Chuanjie Wu; Wolfgang Damm; Mark Reboul; James M Stevenson; Chao Lu; Markus K Dahlgren; Sayan Mondal; Wei Chen; Lingle Wang; Robert Abel; Richard A Friesner; Edward D Harder Journal: J Chem Theory Comput Date: 2019-03-04 Impact factor: 6.006
Authors: Peter Eastman; Mark S Friedrichs; John D Chodera; Randall J Radmer; Christopher M Bruns; Joy P Ku; Kyle A Beauchamp; Thomas J Lane; Lee-Ping Wang; Diwakar Shukla; Tony Tye; Mike Houston; Timo Stich; Christoph Klein; Michael R Shirts; Vijay S Pande Journal: J Chem Theory Comput Date: 2012-10-18 Impact factor: 6.006