Eko Aditya Rifai1, Marc van Dijk1, Nico P E Vermeulen1, Arry Yanuar2, Daan P Geerke1. 1. AIMMS Division of Molecular and Computational Toxicology, Department of Chemistry and Pharmaceutical Sciences , Vrije Universiteit Amsterdam , De Boelelaan 1108 , 1081 HZ Amsterdam , The Netherlands. 2. Faculty of Pharmacy , Universitas Indonesia , Depok 16424 , Indonesia.
Abstract
Binding free energy (ΔGbind) computation can play an important role in prioritizing compounds to be evaluated experimentally on their affinity for target proteins, yet fast and accurate ΔGbind calculation remains an elusive task. In this study, we compare the performance of two popular end-point methods, i.e., linear interaction energy (LIE) and molecular mechanics/Poisson-Boltzmann surface area (MM/PBSA), with respect to their ability to correlate calculated binding affinities of 27 thieno[3,2-d]pyrimidine-6-carboxamide-derived sirtuin 1 (SIRT1) inhibitors with experimental data. Compared with the standard single-trajectory setup of MM/PBSA, our study elucidates that LIE allows to obtain direct ("absolute") values for SIRT1 binding free energies with lower compute requirements, while the accuracy in calculating relative values for ΔGbind is comparable (Pearson's r = 0.72 and 0.64 for LIE and MM/PBSA, respectively). We also investigate the potential of combining multiple docking poses in iterative LIE models and find that Boltzmann-like weighting of outcomes of simulations starting from different poses can retrieve appropriate binding orientations. In addition, we find that in this particular case study the LIE and MM/PBSA models can be optimized by neglecting the contributions from electrostatic and polar interactions to the ΔGbind calculations.
Binding free energy (ΔGbind) computation can play an important role in prioritizing compounds to be evaluated experimentally on their affinity for target proteins, yet fast and accurate ΔGbind calculation remains an elusive task. In this study, we compare the performance of two popular end-point methods, i.e., linear interaction energy (LIE) and molecular mechanics/Poisson-Boltzmann surface area (MM/PBSA), with respect to their ability to correlate calculated binding affinities of 27 thieno[3,2-d]pyrimidine-6-carboxamide-derived sirtuin 1 (SIRT1) inhibitors with experimental data. Compared with the standard single-trajectory setup of MM/PBSA, our study elucidates that LIE allows to obtain direct ("absolute") values for SIRT1 binding free energies with lower compute requirements, while the accuracy in calculating relative values for ΔGbind is comparable (Pearson's r = 0.72 and 0.64 for LIE and MM/PBSA, respectively). We also investigate the potential of combining multiple docking poses in iterative LIE models and find that Boltzmann-like weighting of outcomes of simulations starting from different poses can retrieve appropriate binding orientations. In addition, we find that in this particular case study the LIE and MM/PBSA models can be optimized by neglecting the contributions from electrostatic and polar interactions to the ΔGbind calculations.
A quantitative knowledge
of protein–ligand binding affinities
is essential in understanding molecular recognition; hence, efficient
and accurate binding free energy (ΔG) calculations are a central goal in computer-aided
drug design.[1] An arsenal of ΔG calculation approaches
is available, ranging from rigorous alchemical methods such as free
energy perturbation (FEP)[2] and thermodynamic
integration (TI)[3] to fast empirical scoring
functions featured in molecular docking.[4] The latter are prominent in predicting protein–ligand binding
poses and in discriminating binders and nonbinders within large chemical
databases,[5] but they typically lack accuracy
in quantitatively ranking and predicting ΔG values.[6] In contrast, rigorous alchemical methods may provide reliable estimates
for ΔG but
require extensive sampling of multiple intermediate nonphysical states
and are thus computationally more expensive and still impractical
for use in high-throughput scenarios.[7]Compared to these counterparts, the alternate end-point methods
offer an intermediate in terms of effectiveness and efficiency in
ΔG computation
by allowing one to explicitly explore ligand, protein–ligand,
and solvent configurational space in the protein–ligand bound
and unbound states only.[8] This provides
advantages both over rigorous free energy calculations (in terms of
efficiency) and empirical scoring functions (in terms of potential
accuracy). Frequently applied end-point methods make use of linear
interaction energy (LIE) theory[9] and the
molecular mechanics/Poisson–Boltzmann surface area (MM/PBSA)
approach.[10]In LIE, ΔG is assumed to be linearly
proportional with the differences in van
der Waals and electrostatic interactions involving the ligand and
its environment, as obtained from simulations of its protein-bound
and unbound states in solvent. Differences in these interactions (modeled
using Lennard-Jones (LJ) and Coulomb potential-energy functions) are
scaled by LIE parameters α and β, respectively.[9] Originally, β was set to several fixed
values according to a series of studied systems.[9,11−15] Later, it turned out that fixed values for β are usually only
suitable for particular systems of interest and not generally transferable
between different systems.[16] To mitigate
this, a proposal was made to treat both α and β as freely
adjustable parameters that can be fitted based on a set of experimentally
determined ΔG values.[17,18] The fitted values of α and β
(and optionally offset parameter γ) determine the LIE scoring
function to be used for predicting ΔG for ligands with unknown affinity, which is given
by[9]with γ set to zero in this
work (unless
noted otherwise), andandThe
terms on the right-hand side in eqs and 3 are the MD-averaged van
der Waals (vdw) and electrostatic
(ele) interaction energies, respectively, between
the ligand (lig) and its surroundings (surr) as obtained in complex with the protein bound or unbound in solvent.Another popular end-point method
is molecular mechanics combined
with Poisson–Boltzmann and surface area (MM/PBSA). MM/PBSA
calculations can be performed using either results from protein–ligand
complex simulations in a single-trajectory (one-average) setup or
from three separate simulations per compound (i.e., of the complex,
the protein, and the unbound ligand) in a multi-trajectory or three-average
setup.[19] Use of the single-trajectory approach
is more widespread owing to its simplicity, efficiency, precision,
and accuracy compared to the multi-trajectory setup.[6,20,21] This single-trajectory approach
of MM/PBSA resembles LIE in a way that they both do not account explicitly
for changes in internal energy and configurational entropy of the
ligand and protein upon binding. A difference is that single-trajectory
MM/PBSA assumes the conformational distribution for the bound and
unbound ligand to be the same, while LIE does not.[22] Molecular dynamics (MD) trajectories of the protein–ligand
complex as obtained in the one-average strategy can be used to evaluate
each free energy term on the right-hand side of ΔG within the following equation[10,23]In eq , G represents
the free energy of
the complex, and G and G represent
the free energies of the unbound protein and ligand, respectively.
The separate free energy terms G for the protein, ligand, or protein–ligand complex
are each quantified as[23]where V comprises bond-stretch, angle-bend, torsion, and improper-dihedral
energies, and V and V are the van der Waals and
electrostatic nonbonded interaction energies, respectively. Together,
the sum of these terms make up the vacuum MM energy terms, while G and G constitute the solvation free
energies calculated using a continuum (implicit) solvation model,
representing the free energy change due to converting a solute in
vacuum into its solvated state. In MM/PBSA, G is obtained by solving the Poisson–Boltzmann
equation, while the G term is often approximated by a solvent accessible surface
area (SASA) term.[24] Replacing Poisson–Boltzmann
with Generalized Born[25] for calculating G is on the basis of the
alternative MM/GBSA approach.[10] Here, T and S denote the temperature of the system
and the entropy of the solute in vacuum, respectively. The TS terms entering eq via G, G, and G account for the change
in conformational entropy of the protein–ligand complex upon
ligand binding. This can be approximated, for example, via normal-mode
analysis of the protein–ligand coordinates obtained from the
simulation of the complex. Often this term is neglected,[26] which is justified, for example, by the typical
interest in relative binding free energies of similar compounds.As previously reported by Genheden and Ryde[27] among others, LIE can be more compute efficient up to almost
1 order of magnitude when compared to MM/GBSA (which is due to the
entropy[27] and polar term[28] in eq ),
while the relative accuracy of both methods in predicting experimental
data can be system dependent, and this conclusion also applies to
MM/PBSA.[27] In the current work, we evaluate
if LIE can achieve similar (or higher) accuracy compared to MM/PBSA
in calculating binding affinities of 27 thieno[3,2-d]pyrimidine-6-carboxamide
analogs for the sirtuin 1 (SIRT1) receptor. SIRT1 is a member of the
sirtuin family which is responsible for regulating and/or deacetylating
stress resistance and metabolism processes,[29] and inhibition of SIRT1 has been proposed for treating cancer, HIV-1
infections, mental retardation syndrome, and parasitic diseases.[30] Results will also be compared to a MM/GBSA model
of Karaman and Sippl[31] for these ligands
binding to SIRT1.Although LIE needs experimental data to train
the model parameters
in eq , their calibration
gives access to direct estimates of ΔG for individual (query) compounds. This
makes it straightforward to use a Boltzmann-like weighting scheme
to combine results of multiple MD simulations starting from different
protein conformations and/or ligand binding modes into a single calculation
per compound.[32,33] Previously, we successfully used
this strategy to compute binding affinities for flexible proteins
such as Cytochrome P450s that can bind substrates and/or inhibitors
in multiple ways.[33−36] In the current work, we evaluate the potential of this strategy
to compute SIRT1 binding free energies by concatenating results of
MD simulations starting with binding poses obtained from unsupervised
docking and clustering. In addition, we investigate if the Boltzmann-like
weighting scheme can identify binding poses as those manually selected
by Karaman and Sippl[31] in their MM/GBSA
study as the most probable ones. In a final step, we monitor the contributions
from the different terms in the central LIE and MM/PBSA equations
to the calculations in (trends in) binding affinity, in order to evaluate
the possibility to further increase efficiency and accuracy in computing
SIRT1 affinities.
Methods
Reference Data and Input
File Preparation
Reference
data for the SIRT1 ligand set (i.e., experimentally determined values
ΔG for protein–ligand
binding free energies) were taken from Disch et al.[37] (Table ). All 27 compounds have a thieno[3,2-d]pyrimidine-6-carboxamide
scaffold, and ΔG ranges between −29.3 and −50.3 kJ mol–1. Values for ΔG were consistently obtained from the set of IC50 inhibition
constants reported by Disch et al., using the Cheng–Prusoff
relation[38,39]where R is the gas
constant, T is the temperature, and the concentration
of agonist used
to measure the IC50 data is assumed to be equal to its
EC50 value.[39]
Table 1
Compound ID, Molecular Structure,
and Experimental Values for Binding Free Energies (ΔG) of SIRT1 Inhibitors Used
in the Current Study, with ΔG Values Derived from Experimental Data Obtained by Disch et
al.[37],a
The compound numbering (IDs)
as listed below is the same as used by Disch et al.[37]
The compound numbering (IDs)
as listed below is the same as used by Disch et al.[37]Analogous to
previous MD studies,[31,40] the SIRT1
crystal structure with PDB ID 4I5I(41) was used
as the protein template structure for docking and subsequent MD. Along
with the protein atom coordinates, the positions of the zinc ion and
one of the water molecules in the binding pocket[31] were retained to obtain the template. Protein structure
preparation was performed using the QuickPrep function of Molecular
Operating Environment (MOE) version 2015.10[42] to add hydrogens, assign protonation states and partial charges,
and perform energy minimization prior to docking. During minimization,
the protein and zinc ion were described using the Amber14SB force
field.[43]To obtain starting structures
for ligand binding poses, two alternative
strategies were used: ligand–binding poses were either taken
directly from Karaman and Sippl[31] or obtained
from our docking. An advantage of using the Karaman poses can be that
these were carefully selected for all ligands such that their scaffold
showed maximal structural overlap with the scaffold of ligands 11c, 28, and 31, which were co-crystallized
in PDB structures 4JSR, 4JT8, and 4JT9 of the homologous
SIRT3 protein, respectively,[37] after fitting
these onto the 4I5I SIRT1 structure. In this way, the superimposed poses of all ligands
resemble the poses of the co-crystallized inhibitors, which suggests
that they occupy both the acetyl lysine substrate channel and the
nicotinamide C-pocket of SIRT1 that overlaps with the NAD+ binding channel.[31,37]Ligand-binding poses were
also generated using the PLANTS (Protein–Ligand
Ant System) docking software version 1.2[44] with the speed 1 setting and evaluated using the
ChemPLP scoring function.[45] Coordinates
for the docking center were defined as the average of the center-of-mass
of ligands in the Karaman binding poses, and a docking radius of 0.8
nm was used. For each ligand, 100 docking poses were generated and
subjected to hierarchical clustering with the maximum number of clusters
set to 3. For each cluster, a representative pose with the smallest
deviation to the average cluster coordinate set was selected as an
initial structure for MD simulations. Ligand topologies for MD were
generated according to the general Amber force field (GAFF)[46] at the AM1-BCC[47] level
of semi-empirical theory by using the antechamber(48) and sqm packages of
AmberTools15.[49] This topology generation
was performed and converted to GROMACS format by using ACPYPE (Rev:
7268).[50]
MD Simulations
Prior to MD, the protein–ligand
complex was immersed in a dodecahedral periodic box and solvated in
approximately 15,300 TIP3P[51] water molecules.
Nine Na+ counterions were added to neutralize the net charge
of the system, and the system was energy minimized using a steepest
descent algorithm. Thermal pre-equilibration and (grid-based) pair-list
updates during MD were performed as described previously.[34] The equations of motion were integrated using
a leap-frog algorithm.[52] Particle mesh
Ewald (PME)[53] was used to treat the long-range
electrostatic interactions during MD. The cutoff for nonbonded interactions
was set to 0.9 nm. Hydrogen atoms were assigned a mass of 4.032 amu,[54] and all bonds were constrained using the LINCS
algorithm,[55] allowing a time-step of 4
fs. Coordinative bonds with a length of approximately 0.2 nm were
added between the Zn2+ ion and sulfur atoms of its coordinating
residues (Cys131, Cys134, Cys155, Cys158) to ensure coordination during
simulation and kept constant using the LINCS algorithm as well. A
Berendsen thermostat and barostat[56] were
used to maintain the system temperature and pressure close to their
reference values of 300 K and 1.01325 bar, using coupling constants
of 0.1 and 0.5 ps, respectively. All MD stages, including 20 ns production
runs in explicit solvent were carried out using GROMACS 4.5.5,[57] in which energy and coordinate trajectories
were written out to disk every 10 ps, and center-of-mass motion was
removed every 10 ps. Every ligand pose was used twice as the starting
structure of replicated simulations[33] starting
from different Maxwell–Boltzmann distributions of randomly
assigned atomic velocities. Separate duplicated 20 ns production simulations
were performed of the unbound ligands in water in periodic dodecahedral
boxes with minimal solute atom-box edge distances of 1.8 nm, using
identical settings as in the protein–ligand complex simulations
described above. Energy minimization and MD simulations were run in
an automated fashion as implemented in eTOX ALLIES,[58] and the trajectories obtained from
the production runs were used for subsequent LIE and MM/PBSA calculations
as described below.
LIE Calculations
Average electrostatic
and van der
Waals ligand–environment interaction energies were obtained
by averaging over concatenated duplicated simulations performed per
protein–ligand binding pose or per ligand free in solvent.
LIE model parameters were calibrated based on experimental ΔG values (Table ) in an ordinary least squares
(OLS) fitting procedure using the Python scikit-learn 0.17 package.[35,59]Inspired by the ability
to deal with flexible proteins and/or possible multiple ligand-binding
modes in LIE binding affinity computation for, for example, Cytochrome
P450s (CYPs)[32,34,35] or nuclear receptors,[60,61] we also report here
SIRT1 LIE models in which results from MD simulations starting from
different binding poses are combined into a single ΔG calculation per ligand.
For this purpose, the weight W of the contribution from simulations starting from an individual
binding pose i to the binding free energy is calculated
as[32,62]where k is Boltzmann’s
constant, T is the
temperature of the simulated system, and ΔG is the calculated
binding free energy for a given binding pose according to eq . The W’s can be used to combine results
from simulations starting from N different binding
poses by using the following equationwhere α and β (and the optional
offset parameter γ) are again parametrized based on curated
experimental values for ΔG, but they need to be trained in an iterative fashion (in which
simulation data for multiple binding modes are combined into individual
computations) when N > 1[32] because the W’s
depend on the LIE model parameter values.
MM/PBSA Calculations
MM/PBSA models were generated
using g_mmpbsa(24,63) by first PBC correcting
the MD trajectories. The Zn2+ ion was included as part
of the protein in the MM/PBSA calculations. We used a single-trajectory
MM/PBSA protocol which makes use of protein–ligand complex
simulations only and assumes the protein and ligand conformation in
the bound and unbound states to be identical. Thus, in eq , the V terms as well as the intramolecular nonbonded
interaction energies cancel out,[64] resulting
in the final equationwith V and V being the protein–ligand
van der Waals and electrostatic interaction energies, respectively,
whileandThe Poisson–Boltzmann
equation
for the polar term is evaluated using the following equation[10,63,65]where φ(r) is the solute’s
electrostatic potential, ε(r) is the dielectric
constant, and ρf(r) is the fixed
charge density. As the calculation of the polar solvation energy is
known to depend on the chosen value for the dielectric constant ε of the complex,[66] its value was varied in order to inspect its effect on
predicted ΔG values. By default, ε was
set to 2 and tuned to 1, 4, or 8 to examine the sensitivity of calculated
binding free energies to the solute dielectric constant. Other default
settings in g_mmpbsa were left unaltered (i.e., use
of atomic radii as proposed by Bondi,[67] linear PB equation solver, 0.05 nm grid resolution, and smoothed
van der Waals surface).[24]To estimate
nonpolar solvation energies, g_mmpbsa assumes a linear
dependence of G on the SASA, expressed as follows[68]where γ is related to the surface tension of the solvent and is set
to its
default value of 2.26778 kJ mol–1 nm–2, while a default value (3.84982 kJ mol–1) was
also used for fitting parameter b. A solvent probe
radius of 0.14 nm was employed to determine the SASA.[24] Due to its high computational demand and large standard
error compared to other contributing terms,[6,20,31,66,69−72] the entropic term was not included in the MM/PBSA
models presented in this work.
Protein–ligand Interaction
Profiling
To analyze
protein–ligand interactions that occurred during the MD simulations,
we built a dedicated Python-based biomolecular analysis library MDInteract, which we made freely available at https://github.com/MD-Studio/MDInteract. MDInteract adds structure analysis methods to
the API of the popular pandas data analysis toolkit.[73] The MD trajectory and structure file formats
are imported using the MDTraj library.[74] MD trajectories were analyzed for the following
biomolecular interaction types at a 100 ps resolution similar to previous
studies performed by us:[35,61,75]Hydrophobic interactions: identified
as all contacts between carbon atoms within 0.4 nm having only C or
H atoms as neighbors. Interacting pairs from the same source atom
to multiple target atoms in the same residue, in both directions,
were filtered to retain only the pair with the smallest distance.
Note that atoms involved in π- or T-stacking were separately
considered (see below).π-
and T-stacking: identified
as the geometric centers of two aromatic rings within 0.55 nm, with
an offset of no more than 0.2 nm after projection of the geometric
center of one ring onto the other and an angle between the normals
of both rings of 180° ± 30° for π-stacking or
90° ± 30° for T-stacking.[76] Aromatic rings were detected using a SSSR algorithm (Smallest Set
of Smallest Rings) filtering for aromatic planar rings.Hydrogen bonds (H-bond): identified
as all interactions between H-bond donor–acceptor pairs within
0.41 nm where the donor–H–acceptor angle was within
50° from the ideal in-plane 180° orientation and the heavy
atom acceptor neighbor–acceptor–H angle did not deviate
more than 90°.[77] Donor atom SYBYL
types: N.3, N.2, N.acid, N.ar, N.am, N.4, N.pl3, N.plc, and O.3 with
at least one attached hydrogen. Acceptor atom SYBYL types: N.3, N.2,
N.1, N.acid, N.ar, O.3, O.co2, O.2, S.m, and S.a not of a donor character.
H-bonds part of a salt bridge were labeled as such by the salt-bridge
detection method.Salt bridges: identified as all interactions
between the heavy atoms representing the geometric centers of two
charged groups of opposite signs within 0.55 nm distance.[78] The H-bond component of the salt bridge was
added by the H-bond detection method.Cation−π interactions:
identified as all interactions between the center of an aromatic ring
and a center of positive charge if the distance was within 0.6 nm
and the offset between charge center and ring center after projection
on the same plane was less than 0.2 nm. In case the charge center
was a tertiary amine, the angle between charge center and ring plane
should be within 90° ± 30°.[79]Halogen bonds: identified
as pairs
of halogen bond donor (I, Br, Cl, F) and acceptor atoms (O, N, and
S having a single atom neighbor of type P, C, or S) within 0.41 nm
where the donor angle (C–X–O) was within 165° ±
30° and the acceptor angle (Y–O–X) was within 120°
± 30°.[80]The frequency of interactions per ligand pose to interacting
amino acid residues was normalized by the number of replicates (i.e.,
by dividing interaction counts by two) and was represented as heat
maps using the seaborn Python package version 0.8.[81] Ligand–residue interaction counts that
are less than 20 were discarded for further analysis.
Results
and Discussion
LIE and MM/PBSA Models
Tables and 3 summarize the
predictive ability of our LIE and MM/PBSA models, which was evaluated
in terms of obtained deviations from and correlation with experimental
data. The latter was assessed by calculating r2, Pearson’s r (r), and Spearman’s ρ (ρ) coefficients.[82] For the calibrated LIE models, we also report correlation coefficients q2 obtained from leave-one-out cross validation
(LOO-CV). Deviations of computed ΔG values from experiments were analyzed in terms
of root-mean-square errors (RMSEs) and standard deviations of errors
in predictions based on LOO-CV (SDEPs). In Tables and 3, RMSEs and
SDEPs are only reported for our LIE models because the MM/PBSA calculations
do not allow one to directly compare predicted and experimental ΔG values for individual
ligands.[66,83−86] This is exemplified by the range
of predicted values obtained from one of the MM/PBSA models discussed
below, as presented in Figure (right panel).
Table 2
Model Parameters
α and β
of LIE Models for SIRT1 and Their Respective Root-Mean-Square Error
(RMSE), Standard Deviation of Error in Prediction (SDEP, based on
leave-one-out cross validation), and Correlation Metrics Represented
by r2, q2,
Pearson’s r (r), and Spearman’s ρ (ρ) for the Data Set of 27 Compoundsa
α
β
RMSE
SDEP
r2
q2
rP
ρS
LIEsingle
0.40
–0.04
4.21
4.66
0.52
0.35
0.72
0.72
LIEγ=–10.24
0.30
–0.05
4.00
4.54
0.52
0.38
0.72
0.73
LIEmult
0.42
–0.19
6.16
6.59
0.16
–0.30
0.40
0.43
LIEcomb
0.40
–0.05
4.30
4.88
0.49
0.29
0.70
0.70
LIEBoltzmann4
0.40
–0.02
4.21
4.67
0.49
0.35
0.70
0.69
LIEfirst1ns
0.41
–0.04
3.96
4.39
0.54
0.42
0.74
0.70
LIEfirst5ns
0.41
–0.02
4.41
4.91
0.46
0.28
0.68
0.68
LIEfirst10ns
0.40
–0.04
4.27
4.73
0.51
0.33
0.71
0.72
LIEfirst15ns
0.40
–0.04
4.27
4.71
0.51
0.34
0.71
0.71
LIElast1ns
0.41
0.03
4.58
5.01
0.48
0.25
0.69
0.65
LIElast5ns
0.40
–0.03
4.28
4.75
0.52
0.32
0.72
0.69
LIElast10ns
0.40
–0.05
4.41
4.83
0.50
0.30
0.70
0.70
LIElast15ns
0.40
–0.05
4.30
4.74
0.52
0.33
0.72
0.72
LIEequil
0.41
–0.02
4.50
5.30
0.45
0.16
0.67
0.66
LIEβ=0
0.41
–
4.24
4.40
0.51
0.42
0.71
0.67
LIEα=0
–
–1.67
31.3
32.1
0.05
–29.9
0.23
0.18
vdwbound
0.19
–
3.79
3.94
0.58
0.54
0.76
0.76
vdwunbound
0.35
–
4.39
4.56
0.49
0.38
0.70
0.67
RMSEs and SDEPs are given in
kJ mol–1.
Table 3
Performance of MM/PBSA Models for
SIRT1 in Terms of Their Correlation Metrics, Represented by r2, q2, Pearson’s r (r), and
Spearman’s ρ (ρ) for
the Data Set of 27 Compounds
r2
rP
ρS
MM/PBSAε=2
0.41
0.64
0.61
MM/PBSAε=1
0.39
0.63
0.60
MM/PBSAε=4
0.22
0.47
0.45
MM/PBSAε=8
0.11
0.33
0.31
MM/PBSAfirst1ns
0.28
0.53
0.46
MM/PBSAfirst5ns
0.41
0.64
0.64
MM/PBSAfirst10ns
0.42
0.65
0.63
MM/PBSAfirst15ns
0.41
0.64
0.63
MM/PBSAlast1ns
0.36
0.60
0.56
MM/PBSAlast5ns
0.35
0.59
0.56
MM/PBSAlast10ns
0.36
0.60
0.58
MM/PBSAlast15ns
0.39
0.62
0.60
MM/PBSAele
0.19
0.43
0.08
MM/PBSAvdW
0.60
0.77
0.76
MM/PBSApolar
0.36
–0.60
–0.57
MM/PBSASASA
0.57
0.76
0.75
MM/PBSAnoele
0.04
0.20
0.14
MM/PBSAnopolar
0.49
0.70
0.71
Figure 1
Scatter plots for experimental (exp) and calculated (calc) free
energies ΔG of ligand–SIRT1 binding. Calculated values are obtained using
the LIE (left panel) or MM/PBSAε=2 model (right panel). The solid blue lines represent
linear fits to the data, while the distributions of data are represented
by kernel density plots.
RMSEs and SDEPs are given in
kJ mol–1.Scatter plots for experimental (exp) and calculated (calc) free
energies ΔG of ligand–SIRT1 binding. Calculated values are obtained using
the LIE (left panel) or MM/PBSAε=2 model (right panel). The solid blue lines represent
linear fits to the data, while the distributions of data are represented
by kernel density plots.When calibrating a LIE
model using per ligand the binding pose
proposed by Karaman and Sippl[31] as a (single)
starting pose for MD, we obtain a model (LIE in Table )
with RMSE = 4.21 kJ mol–1, which is within the typical
experimental uncertainty of 4–5 kJ mol–1.[87,88] Overall, this suggests satisfactory performance of the model parameters
used. Including the optional offset γ parameter did not improve
model performance further: The resulting LIEγ=–10.24 model (with γ = −10.24 kJ mol–1)
showed a slight reduction in RMSE and SDEP of 0.21 and 0.12 kJ mol–1, respectively, but it did not show an increase in
correlation coefficients compared to LIE (Table ). Figure illustrates that
LIE shows a slightly better correlation
between calculated and experimental ΔG values than the MM/PBSA model in which
ε is set to its default g_mmpbsa value of 2 (MM/PBSAε=2 in Table ). This is confirmed
by the slightly higher r2 (0.52 vs 0.41)
and Pearson’s and Spearman’s coefficients (0.72 vs 0.64
or 0.61) of LIE compared to MM/PBSAε=2 (Tables and 3). It should be kept in mind
however that α and β in the LIE equations are fitted parameters.
Here, q2 of LIE (0.35) is in fact slightly lower than r2 for MM/PBSAε=2 (0.41), and the value of
−0.04 for β hints at possible overfitting. As discussed
in more detail below, this could be resolved by constraining β
to 0 in a separate LIEβ=0 model, which has a q2 value of 0.42 (Table ).In accordance with previous findings
from MM/PBSA studies on other
protein systems,[66] we found a clear effect
of ε on obtained correlations
in the MM/PBSA models. Setting its value to 1 or 4 led to slightly
and significantly lower correlation coefficients, respectively, while
using ε = 8 further decreased
the quality of the model when compared to MM/PBSAε=2 (Table ). Even though,
values of 6–7 or higher have been reported as average estimates
for dielectric permittivities of protein binding pockets,[89] ε is
often set to 1 or 2.[24,90] Our finding that the most successful
models of MM/PBSA in predicting SIRT1–ligand binding use a
low value of ε is in line
with the hydrophobic acetyl–lysine binding interface in SIRT1;[91] as discussed by Karaman and Sippl,[31] the binding site of all inhibitors is characterized
by six hydrophobic (Ala22, Ile30, Phe33, Phe57, Ile107, Phe174) and
two charged side chains (Asp108, His123).Our MM/PBSAε=2 model performs similarly as one
of the MM/GBSA models presented by Karaman and Sippl,[31] who used a ε value
of 1 and reported r2 values ranging between
0.4 and 0.6 in their MM/GBSA study on SIRT binding. In addition, the
similar performance of LIE and MM/PBSAε=2 is in line with the earlier work of Uciechowska et
al., who found comparable performance of LIE and MM/PBSA in reproducing
experimental data for a smaller set of SIRT2 binders.[72]
Inclusion of Multiple Binding Poses
In a next step,
we investigated if we could use the iterative LIE scheme to improve
calculations and to develop a binding affinity model in an unsupervised
manner. For that purpose, multiple binding poses were directly obtained
from docking and clustering. These were used as starting structures
in MD simulations i in eq . The resulting LIE model showed significantly higher RMSE and lower correlation
coefficients than the LIE model
(Table ). We analyzed
if this decrease in predictive quality was due to either a lack of
correct MD starting poses in the LIE model or the fact that the protein does not bind its ligands in
multiple orientations (or due to a combination of both).We
first explored the interactions between the ligands and SIRT1 residues
during simulations used in the LIE and LIE models. The obtained interaction
heat maps are depicted in Figure . They show high frequencies in three ligand–SIRT1
hot-spot interactions that were present in the starting poses[31] and persistent during the simulations used in
the LIE and MM/PBSA models (i.e.,
with Phe33, Ile107, and Asp108; Figure ). Figure shows that these frequencies are significantly lower for
most of the LIE simulations, especially
those involving interactions with Asp108. This finding indicates that
the unsupervised selection of starting poses for MD leads to a larger
spread in protein–ligand interactions during simulation than
in the LIE model. For most ligands,
we found a relatively large root-mean-square deviation (RMSD) in atomic
positions between the MD starting poses used in LIE and the LIE pose; for only
eight of the ligands, at least one of the three docked starting poses
showed an RMSD of less than 0.1 nm (Table S1). Only when combining results of the MD simulations initiated from
the starting poses of LIE with those
from LIE, we could train a combined
model LIE with similar accuracy and
correlation coefficients (and α and β) as LIE (Table ). These findings indicate that docking and clustering for
the purpose of LIE training did not
lead to sufficiently appropriate binding poses, while the retained
predictive quality of the LIE model
(compared to LIE) implies that incorporating
too many binding poses does not necessarily lead to lower predictive
quality. Thus, when being able to generate and select appropriate
binding poses from docking and/or experimental information on protein–ligand
interactions, iterative LIE training can subsequently be performed
in an unsupervised manner.
Figure 2
Ligand–residue interaction profiles for
MD simulations used
in the LIE and MM/PBSA models (upper
panel) and in the LIE model (lower
panel). Interaction counts are presented for individual SIRT1 residues
(x-axis labels) and averaged over duplicated 20-ns
runs per ligand (y-axis labels in upper panel) or
per combination of ligand and starting pose used for MD (y-axis labels in lower panel).
Figure 3
Residues
of SIRT1 (Phe33, Ile107, Asp108) that show hot-spot interactions
with the ligands during the simulations used, for example, in the
LIE model. All ligands included
in the model are depicted in stick representation using different
colors (in their starting structure used for MD), while the interacting
residues are labeled and shown in ball-and-stick representation.
Ligand–residue interaction profiles for
MD simulations used
in the LIE and MM/PBSA models (upper
panel) and in the LIE model (lower
panel). Interaction counts are presented for individual SIRT1 residues
(x-axis labels) and averaged over duplicated 20-ns
runs per ligand (y-axis labels in upper panel) or
per combination of ligand and starting pose used for MD (y-axis labels in lower panel).Residues
of SIRT1 (Phe33, Ile107, Asp108) that show hot-spot interactions
with the ligands during the simulations used, for example, in the
LIE model. All ligands included
in the model are depicted in stick representation using different
colors (in their starting structure used for MD), while the interacting
residues are labeled and shown in ball-and-stick representation.Here, we evaluated if Boltzmann-like weighting
(eq ) within the LIE model predicted the manually prepared
binding poses of Karaman
and Sippl as the most probable ones by inspecting per ligand the starting
pose used in the MD simulation with the highest weight W (Figure ). We found that for 22 of the 27 ligands, the simulation
starting from the binding pose used in the LIE model shows the largest weight W. For the five ligands for which the largest weight
comes from a simulation starting from one of the docked poses used
in LIE (19, 20, 21, 27, 38), we inspected
the corresponding pose (Figure ) and discovered that the scaffolds of three out of five ligands
(19, 20, 21) show clear structural
overlap with the ligand pose used for LIE (cf. Figure and the relatively low RMSDs in Table S1). Interestingly, docked starting poses of ligands 27 (pose 1) and 38 (pose 1) for the simulations with the
highest weight W in
the LIE model show less overlap (Figure ) and higher RMSDs
(Table S1), but they still share similar
interaction profiles with the simulations used for these ligands in
LIE (Figure ). This highlights that similarity in protein–ligand
interactions may well be a better measure for comparable modes and/or
affinities of binding than structural overlap alone, in agreement
with results from previous studies in which we used iterative LIE
to evaluate binding to flexible proteins such as CYPs or nuclear receptor
FXR.[34,35,61] This is further
exemplified by MD simulations starting from poses which show relatively
high contributions to LIE calculations
such as in the case of pose 1 of compound 26, which does
not have the lowest RMSD of poses 1–3 when compared to the
LIE starting pose (Table S1), but shows a similar pattern of protein–ligand
interactions (Figure ). Subsequently, we trained a LIE model using per ligand the simulation
with the highest weight W from LIE, and we found that this
LIE model performs similarly
as LIE and LIE (Table ),
confirming that the latter does not utilize too many different MD
starting poses.
Figure 4
Boltzmann weights W for the simulations (x-axis) used per ligand
(y-axis) in the combined LIE model. Starting poses 0 and 1–3 were starting poses
for the
simulations used in the LIE and
LIE model, respectively, and the intensity
of the color indicates the size of W per simulation.
Figure 5
Comparison
of ligand starting poses for the MD simulations that
have the highest weights W in the combined LIE model
(solid cyan) with the LIE pose (transparent
green).
Boltzmann weights W for the simulations (x-axis) used per ligand
(y-axis) in the combined LIE model. Starting poses 0 and 1–3 were starting poses
for the
simulations used in the LIE and
LIE model, respectively, and the intensity
of the color indicates the size of W per simulation.Comparison
of ligand starting poses for the MD simulations that
have the highest weights W in the combined LIE model
(solid cyan) with the LIE pose (transparent
green).
Impact of Simulation Length
We also explored the effect
of simulation length on ΔG computations by the LIE and
MM/PBSAε=2 models. For that purpose, models were
developed based on averages for the interaction energy and solvation
terms over either the first or the last 1, 5, 10, or 15 ns of the
production simulations starting from the Karaman and Sippl poses (and
using ε = 2 in the MM/PBSA models). The performance of these
models is listed in Tables and 3 (using subscripts firstXns or lastXns) and was compared to the models based
on 20-ns averages as described above.MM/PBSA studies in the
literature often report simulations of tens of nanoseconds from which
the last few nanoseconds are used to compute averages for the different
terms in eq or 9.[64] Our analysis shows
that for the current system of interest a simulation length of 5 ns
(in the MM/PBSA model) is sufficient to obtain comparable correlation coefficients
for MM/PBSAε=2. However, by running longer simulation
times and taking the last 1 or 5 ns to average over gives a slight
deterioration in model performance (cf. MM/PBSA and MM/PBSA in Table , respectively). It is also worth mentioning that 1-ns LIE
simulations are sufficient to achieve a model (LIE) that shows even slightly better performance
compared to LIE (Table ). The finding that shorter
simulation lengths can improve LIE calculations is in line with previous
results in iterative models that showed improved performance when
restricting sampling to local parts of conformational space,[62,92] and it can be seen as a clear advantage of LIE in terms of its efficiency.
It prompted us to generate a LIE model (LIE) using the output of our thermal equilibration simulation
at 300 K only, but this in turn showed a slightly higher RMSE (4.50
kJ mol–1) and lower r (0.67) than those for LIE and LIE (Table ). The reason
that longer equilibration may not necessarily lead to improved correlation
for the MM/PBSA model (which is in line with previous findings in
literatures[66,93]) and that longer simulations
are more required for MM/PBSA than for LIE is due to the relatively
slow convergence of the MM/PBSA polar term.[90,94] This is illustrated in Figure , which shows an example of a typically observed time
series of the different MM/PBSA and LIE terms. In conclusion, longer
equilibration times do not necessarily lead to an improved agreement
with experimental data which makes the selection of the optimal simulation
time to be an effective model parameter.
Figure 6
Time series of different
ligand–environment interaction
energies and solvation free energy terms used in the LIE and/or MM/PBSA
models, as obtained in one of the MD simulations of ligand 11a either bound to SIRT1 or free in water (red, ΔG; orange, ΔG; light green, ⟨V⟩; green, ⟨V⟩; dark green, V; light blue, ⟨V⟩; blue, ⟨V⟩; dark blue, V). Plotted values are averages over 10 consecutive data points.
Time series of different
ligand–environment interaction
energies and solvation free energy terms used in the LIE and/or MM/PBSA
models, as obtained in one of the MD simulations of ligand 11a either bound to SIRT1 or free in water (red, ΔG; orange, ΔG; light green, ⟨V⟩; green, ⟨V⟩; dark green, V; light blue, ⟨V⟩; blue, ⟨V⟩; dark blue, V). Plotted values are averages over 10 consecutive data points.
Contributions of Different LIE and MM/PBSA
Terms to ΔG Calculations
In the LIE models discussed above, β
is close or practically
equal to zero (Table ). This indicates that electrostatic interactions do not significantly
contribute to trends in binding affinity, which can be in line with
the hydrophobic character of the SIRT1 binding pocket.[91] In a previous LIE study on a protein with a
hydrophobic active site (i.e., Cytochrome P450 1A2 or CYP1A2), fitted
values for β were similarly small and constrained to zero in
the final presented model.[95] As for CYP1A2,
a SIRT1 LIEβ=0 model with β set to zero shows
nearly identical performance as LIE when considering RMSEs and correlation metrics and a slightly lower
SDEP (and higher q2) (Table ). The low contribution of electrostatic
interactions to predicted trends in binding affinity can be understood
from the relatively low absolute values for ΔV obtained from MD (Table S2), suggesting that hot-spot and other hydrogen-bonding
interactions with SIRT1 residues compensate for the ligand–solvent
interactions in the unbound state. As a result, training a LIE model
with α set to zero (and using only β as fitting parameter)
is of no meaning at all (cf. LIEα=0 in Table ).Similarly as for LIE,
we tried to simplify the MM/PBSA models by only taking van der Waals
interactions or the nonpolar/SASA term into account. Table shows that models that only
include the van der Waals V interactions (MM/PBSA, r = 0.77) or the SASA ΔG term in eq (MM/PBSA, r = 0.76) show
higher predictivity than models based on electrostatic V interactions (r = 0.43) or the polar ΔG term only (r = −0.60) and even than the overall MM/PBSAε=2 model. Our finding that LIE and MM/PBSA models with
a reduced number of model terms can give similar or slightly improved
performance would make it of interest to explore the use of extended
LIE approaches such as the one recently proposed by He and co-workers,[96] in which MM/PBSA-like terms are included together
with six fitting parameters. Based on our current findings, this number
may also be effectively reduced in a model for ligand binding to SIRT1.
In addition to the small electrostatic contribution to trends in binding
to the hydrophobic SIRT1 pocket (Table S2), the reason that neglecting electrostatics and polar contributions
can improve our MM/PBSA predictions can be explained by the relatively
large uncertainty in determining V and especially ΔG (cf. Figure ). The difficulty to reach convergence in the latter can be due to
fluctuations in contributions not only from the protein binding site
but also from distant protein residues that may be less relevant for
ligand binding.[22] The uncertainty in determining
ΔG was also
highlighted in a previous benchmarking study on g_mmpbsa,[24] in which it was observed that ΔG is more prone to be different
(by more than 20 kJ mol–1) between results of g_mmpbsa and AMBER simulation tools[97,98] than the other MM/PBSA terms. This was attributed to the different
algorithms implemented in APBS[63] (used
in g_mmpbsa) and the PBSA package of AMBER and/or
subtle differences in the implementation of the AMBER force field
used in AMBER and GROMACS.As the MM/PBSA method only considers
protein–ligand interactions
and MM/PBSA was found to be the best
performing MM/PBSA model in Table , we conducted further investigation on a LIE model
with β set to zero and in which ΔV in eq is replaced by ⟨V⟩. This LIE model is annotated as vdw in Table and thus considers van der Waals interaction energies in
the protein–ligand bound state only. It shows the highest accuracy
of the LIE models presented in Table , with RMSE = 3.79 kJ mol–1 and r = 0.76. Interestingly, when
calibrating a LIE model vdw with
β = 0 and ΔV in eq replaced
by ⟨V⟩, comparable performance
was obtained as for, for example, LIE (Table ). Comparable
correlation coefficients could even be obtained by considering ligand
SASAs only (for which r2 = 0.43 and r = 0.66). In this particular
case, binding prediction can apparently be made highly efficient by
considering ligand simulations or properties only. This is probably
again due to the dominance of nonpolar protein–ligand contacts
in determining trends in binding affinity, but we note that the most
predictive LIE and MM/PBSA models are the ones based on protein–ligand
van der Waals interactions (Tables and 3), as exemplified, for
example, by the high q2 value for the
vdw LIE model.
Conclusions
By comparing fitted LIE models for a set of 27 compounds binding
to SIRT1 with single-trajectory-based MM/PBSA models, we find calculated
ΔG values from
our LIE models with a RMSE of 4.2 kJ mol–1, while
the obtained correlations are comparable (Pearson’s r = 0.72 and 0.64 for LIE and MM/PBSAε=2, respectively). This indicates that
if agreement with experimental affinities in absolute terms is not
of interest, both methods have a comparable predictive quality. We
also found that longer equilibration may not necessarily lead to improved
correlation with experiments and that longer simulations are more
required for MM/PBSA than for LIE due to the relatively slow convergence
of the MM/PBSA polar term. In addition to the higher cost in computing
this term when compared to the terms in the central LIE equation,
this makes LIE more efficient.For the studied protein and ligands,
calibration of an iterative
LIE model (that uses weighted contributions from multiple simulations
starting from different ligand-binding poses) did not lead to satisfactory
computations when generating starting poses for MD simulations in
unsupervised docking and clustering. During these simulations, we
found relatively low frequencies in ligand interactions with hot-spot
protein residues, as analyzed using our Python library MDInteract. In other words, careful selection of proper MD starting poses can
be crucial in generating iterative LIE models. Encouragingly, when
combining results from simulations starting from the docked poses
and from poses based on homologous crystal structure data (in the
LIE model), Boltzmann-like weighting
of the multiple simulations was able to identify appropriate starting
configurations.For our simulations with starting structures
obtained in a supervised
way, we found relatively small values for the difference ΔV in electrostatic interactions
in the bound and unbound state. Apparently, ligand interactions with
hot-spot residues can largely compensate for the loss in polar interactions
with the solvent upon binding. As a result, LIE parameter β
was fitted to values close to zero. By constraining it to zero, we
could obtain a model with reduced overfitting compared to models in
which β was used as a free parameter. This was illustrated by
a lower SDEP and higher q2 for the former
model in leave-one-out cross validation. We found that the correlation
between calculated and experimental ΔG values could be maximized by including
protein–ligand van der Waals interactions only (i.e., either
in a MM/PBSA model with V in eq as the only
term or in a LIE model with α as the only free parameter and
which only includes results from the bound-complex simulations). On
the other hand, our models could be made most efficient by only including
ligand properties, e.g., in a SASA-only-based model or in a LIE model
that only includes van der Waals interaction energies involving the
unbound ligand. The finding that predictive models could be obtained
in terms of van der Waals interactions only for this specific protein
is in line with the hydrophobic character of its binding site and
with the uncertainty in determining the ΔG MM/PBSA term.
Authors: P A Kollman; I Massova; C Reyes; B Kuhn; S Huo; L Chong; M Lee; T Lee; Y Duan; W Wang; O Donini; P Cieplak; J Srinivasan; D A Case; T E Cheatham Journal: Acc Chem Res Date: 2000-12 Impact factor: 22.384
Authors: Michael Feig; Alexey Onufriev; Michael S Lee; Wonpil Im; David A Case; Charles L Brooks Journal: J Comput Chem Date: 2004-01-30 Impact factor: 3.376
Authors: Nad'a Spacková; Thomas E Cheatham; Filip Ryjácek; Filip Lankas; Luc Van Meervelt; Pavel Hobza; Jirí Sponer Journal: J Am Chem Soc Date: 2003-02-19 Impact factor: 15.419
Authors: Kay S Schaller; Jeppe Kari; Gustavo A Molina; Kasper D Tidemand; Kim Borch; Günther H J Peters; Peter Westh Journal: ACS Omega Date: 2021-01-06