Lucia Fusani1,2, David S Palmer2, Don O Somers3, Ian D Wall1. 1. Molecular Design UK, GSK Medicines Research Centre, Gunnels Wood Road, Stevenage, Hertfordshire SG1 2NY, U.K. 2. Department of Pure and Applied Chemistry, University of Strathclyde, 295 Cathedral Street, Glasgow G11XL, U.K. 3. Protein, Cellular and Structural Sciences, GSK Medicines Research Centre, Gunnels Wood Road, Stevenage, Hertfordshire SG1 2NY, U.K.
Abstract
Identification of correct protein-ligand binding poses is important in structure-based drug design and crucial for the evaluation of protein-ligand binding affinity. Protein-ligand coordinates are commonly obtained from crystallography experiments that provide a static model of an ensemble of conformations. Binding pose metadynamics (BPMD) is an enhanced sampling method that allows for an efficient assessment of ligand stability in solution. Ligand poses that are unstable under the bias of the metadynamics simulation are expected to be infrequently occupied in the energy landscape, thus making minimal contributions to the binding affinity. Here, the robustness of the method is studied using crystal structures with ligands known to be incorrectly modeled, as well as 63 structurally diverse crystal structures with ligand fit to electron density from the Twilight database. Results show that BPMD can successfully differentiate compounds whose binding pose is not supported by the electron density from those with well-defined electron density.
Identification of correct protein-ligand binding poses is important in structure-based drug design and crucial for the evaluation of protein-ligand binding affinity. Protein-ligand coordinates are commonly obtained from crystallography experiments that provide a static model of an ensemble of conformations. Binding pose metadynamics (BPMD) is an enhanced sampling method that allows for an efficient assessment of ligand stability in solution. Ligand poses that are unstable under the bias of the metadynamics simulation are expected to be infrequently occupied in the energy landscape, thus making minimal contributions to the binding affinity. Here, the robustness of the method is studied using crystal structures with ligands known to be incorrectly modeled, as well as 63 structurally diverse crystal structures with ligand fit to electron density from the Twilight database. Results show that BPMD can successfully differentiate compounds whose binding pose is not supported by the electron density from those with well-defined electron density.
Crystallographic studies
of ligands bound to biological macromolecules
(proteins and nucleic acids) play a key role in structure-based drug
design (SBDD). The presence of ligands in crystal structures must
be supported by convincing experimental evidence, which is represented
by the electron density (ED). The interpretation of the observed ED
in a binding site as the ligand of interest or water or buffer molecules
is far from trivial; it is a subjective work that requires good judgment
and expertise, but sometimes mistakes can lead to erroneous modeling
decisions in crystal structures. The visualization of the ligand–protein
model together with the ED maps can be a useful way for assessing
ligand placement, but a numerical measurement to quantify ligand reliability
is also required to allow a more consistent classification. For example,
the most commonly used comparison metric is the real space correlation
coefficient[1,2] (RSCC). It is a local measure of how well
the calculated ED density of a ligand matches the observed ED ranging
between 1 (perfect correlation) and −1 (perfect anticorrelation)
with values below 0.8, indicating a poor fit where the ligand might
have been incorrectly modeled. The “consumers” of the
structural information are often not expert crystallographers; therefore,
the usage of misinterpreted crystal structures as the starting point
of computational experiments such as ligand docking, active site identification,
and lead optimization could lead to unreliable results from which
erroneous conclusions are consequently drawn. As reported in several
papers,[3−7] there are still cases in the Protein Data Bank (PDB, http://www.pdb.org) in which the presence
and/or location of the ligand is not fully supported by the underlying
ED. Given the fundamental importance of the crystal structures in
the progression of SBDD projects, several tools that rank and assess
their quality based on the fit of the ED have been developed.[8−10] In the present paper, the possibility to identify and separate ligands
that are correctly placed and supported by ED from those that are
misplaced and/or not supported by ED is studied with computational
methods.If the structural flexibility of a biological system
needs to be
studied, molecular dynamics (MD) simulation is the technique of choice.
However, when an efficient sampling of structural dynamics is required
in a limited timescale, enhanced sampling methods such as metadynamics
are often more useful. Metadynamics[11,12] enables sampling
of a complex free-energy landscape by adding a history-dependent bias
into the system as a function of a carefully chosen collective variable
(CV). In this way, the system is discouraged from revisiting previously
sampled regions and at the same time is forced to escape stable free-energy
minima, where it would normally be trapped, facilitating the exploration
of the entire free-energy landscape. Such a methodology has been used
to reconstruct the full free-energy landscape of protein–ligand
binding[13−15] and predict the ligand binding pose. Binding pose
metadynamics[16] (BPMD), a variation of metadynamics,
is an automated, enhanced sampling, metadynamics-based protocol, in
which the ligand is forced to move in and around its binding pose,
whose higher mobility under the biasing potential is considered indicative
of binding mode instability. Clark et al.,[16] the developers of the tool, showed the ability of BPMD to reliably
discriminate between the correct ligand binding pose and plausible
alternatives generated with Induced Fit Docking (IFD).The aim
of this paper is primarily to validate the BPMD tool using
reliable and questionable protein–ligand binding poses obtained
from crystal structures, identified using both RSCC value and analysis
of the ED maps. It is hypothesized that ligands supported by underlying
ED should display stability under the BPMD bias, whereas rapid ligand
fluctuations, indicating instability, are expected for ligands that
are misplaced in the ED and/or are not in a well-defined energy minimum.
These studies will help to understand how the stability of ligand
binding poses could inform on crystal structures in which the ligand
has been incorrectly or questionably modeled and potentially expand
the BPMD tool usage to assess the quality of crystal structures before
undertaking SBDD campaigns.
Methods
Data Sets
The
first three test cases were identified
by searching the primary literature: epothilone[17−20] bound to tubulin alpha-1β
chain protein, ampicillin[21] bound to penicillin-binding
protein, and a ligand bound to IκB kinase β.[22] Extra cases were identified from the RCSB PDB
in a semiautomated fashion. The RSCC[1,2] is a local
measure of how well the calculated density of a ligand matches the
observed ED and is defined aswhere
ρ’s are the ED values at
grid points that cover the residue in question, obs and calc refer
to the experimental and model ED, respectively, and cov and var are
the sample covariance and variance, respectively. RSCC is a statistical
measurement which is publicly available for deposited PDB structures
through the Protein Data Bank in Europe (PDBe; http://pdbe.org/). In the worldwide
Protein Data Bank (wwPDB) X-ray validation report, it is stated that
a RSCC value above 0.95 normally indicates a good fit, whereas a poor
fit results in a RSCC value around or below 0.8. All the ligands present
in the PDB with RSCC annotation were taken from the Twilight[8] database and were merged with their Uniprot (https://uniprot.org/) entry name
as found in the primary structure section of the PDB file (DBREF section).
In this way, all the proteins in which at least one structure was
present with RSCC < 0.8 (ligand not fully supported by the underlying
ED) and one with RSCC > 0.9 (ligand with good ED fit) were selected;
the set of two structures with different ligands but same Uniprot
entry will be referred to as a “pair”. Furthermore,
crystal structures with resolution worse than 2.5 Å were discarded
as well as all the proteins in which the biological assembly was not
monomeric. A total of about 11,000 structures were retrieved, and
ligands that are part of the crystallization buffer or solvent were
removed (7538 total structures). The number of structures was further
reduced by grouping them with their Uniprot entry name and RSCC value.
A representative member of each protein was visually inspected; all
the structures of that protein were discarded if the protein was challenging
to model; for example, it contained unresolved portions or was difficult
to parameterize. This procedure provided a set of 63 structures (61
unique ligands) as a reference set to validate the BPMD tool. Because
it was difficult to find sufficient pairs for the data set, the 2.5
Å resolution criterion was relaxed to allow five extra structures
(PDB: 2ITY, 3W16, 3QCQ, 4QE8, and 5HIB) to be selected.The (2mFo–DFc) maps and
the (mFo–DFc) maps for the
reference data set were downloaded from the PDBe database (http://www.ebi.ac.uk/pdbe/) contoured at +1σ and ±3σ, respectively, and visually
inspected with Coot.[23] Fo and Fc are the
experimentally measured and model-based amplitudes, respectively, m is the figure of merit, and D is the
σA weighting factor.[24] In general, protein–ligand models with RSCC < 0.8 have
poor ligand ED fit. This can range from complete absence of ED for
significant portions of the ligand to broken ED throughout the entirety
of the ligand.[6] A low RSCC score can also
be the result of a combination of good ED fit for the portions of
the ligand interacting with the protein target and the remaining portion(s)
having very poor ED fit because of high ligand moiety mobility. It
is clear that the interpretation of the final ED fit is a subjective
process, and there is likely to be a wider range of individual fitting
interpretations when the ED is poorly defined.In this work,
we want to address the capability of BPMD to discriminate
between high- and low-probability ligand binding modes; therefore,
it is important to separate cases in which the ED is almost absent,
indicating that the ligand presence is not supported by experimental
evidence, from cases where the ED quality does not allow a complete
determination of the ligand pose because of partial disorder, indicating
that the ligand presence is partially supported by ED. In order to
better define the data set, omit maps were calculated with σA style maximum likelihood-weighted mFo–DFc and 2mFo–DFc
map coefficients for all the structures by removing the ligand in
question followed by maximum likelihood refinement in REFMAC.[25] The omit map here refers to the fact that the
ligand is omitted from the model refinement to reduce model bias in
the ED map. If the ligand molecule is present, the shape of the resulting
difference ED will provide corresponding evidence. After the omit
map was created and visually inspected with the original coordinates
overlaid, the data set was more accurately divided in three distinct
categories (Table ): (1) green: ligands with RSCC > 0.9 that show very good fit
with
the ED; (2) amber: ligands with RSCC < 0.8 that are only partially
supported by the ED, that is, the ligand presence showing partially
occupied and/or disordered portions and a fractional positive difference
density is observed in the (mFo–DFc) omit maps; (3) red: ligands with RSCC < 0.8 that have no moieties
supported by the ED; that is, in the (2mFo–DFc) map, there is no ED that could explain the ligand,
no interpretable positive difference ED in the (mFo–DFc) omit maps and/or presence of negative
difference density.
Table 1
Overall Classification
of the Data
Set Used for Validating the BPMD Toola
category
number of
structures
green: ligand supported
by ED
29 (30)
amber: ligand
partially
supported by ED
18
red: ligand with
ambiguous
density
16 (21)
The number in parenthesis represents
the overall number of ligands identified from both manual literature
search and Twilight database filtering.
The number in parenthesis represents
the overall number of ligands identified from both manual literature
search and Twilight database filtering.
System Preparation
Complexes were downloaded from the
PDB and prepared with the Protein Preparation Wizard[26] in Maestro v.2018.04. Hydrogen atoms and missing residues
were added to the initial coordinates; bond orders were assigned to
the ligand in the crystal structures. The protein termini were capped
with ACE and NMA residues. Epik was used to find the most likely protonation
state of the ligand and the energy penalties associated with alternate
protonation states. The protein’s hydrogen bond network was
optimized using the ProtAssign algorithm in the H-Bond Refine Tab
of the Protein Preparation Wizard (Maestro v.2018.04) by correcting
both potentially transposed heavy atoms in asparagine, glutamine,
and histidine side chains and also the protonation states of histidine
residues. Finally, a restrained minimization using the Impref module
of Impact with the OPLS3e[27] force field
was used such that hydrogen atoms were freely minimized while allowing
for sufficient heavy atom movement to relax strained bonds, angles,
and clashes. The Force Field Builder (FFBuilder) tool was used to
automatically generate accurate force field torsional parameters derived
from quantum mechanics for all ligands containing substructures not
fully covered by the standard OPLS3e parameters.
Binding Pose
Metadynamics
BPMD as implemented in Maestro
v.2018.4 is a variation of metadynamics simulation in which 10 independent
metadynamics simulations of 10 ns are performed using CV as the measure
of the root-mean-square deviation (rmsd) of the ligand heavy atoms
relative to their starting position. The alignment prior to the rmsd
calculation was done by selecting protein residues within 3 Å
of the ligand. The CAs of these binding site residues were then aligned
to those of the first frame of the metadynamics trajectory before
calculating the heavy atom rmsd to the ligand conformation in the
first frame. The hill height and width were set to 0.05 kcal/mol (about
1/10 of the characteristic thermal energy of the system, kBT) and 0.02 Å, respectively. Before
the actual metadynamics run, the system was solvated in a box of SPC/E
water molecules followed by several minimization and restrained MD
steps that allow the system to slowly reach the desired temperature
of 300 K as well as releasing any bad contacts and/or strain in the
initial starting structure. The final snapshot of the short unbiased
MD simulation of 0.5 ns is then used as the reference for the following
metadynamics production phase.The key concept in BPMD is that
under the same biasing force, ligands that are not stably bound to
the receptor will experience a higher fluctuation in their rmsd as
compared to the stably bound ones. Three scores are provided by BPMD
that are related to the stability of the ligand during the course
of the metadynamics simulations: (1) PoseScore indicative of the average
rmsd from the starting pose. Rapid increase in the PoseScore is indicative
of ligands that are not in a well-defined energy minimum and hence
might not have been accurately modeled. (2) PersistenceScore (PersScore)
is a measure of the hydrogen bond persistence calculated as the fraction
of the frames in the last 2 ns of the simulation that have the same
hydrogen bonds as the input structure, averaged over all the 10 repeat
simulations. Low PersScore is found in structures in which their contact
network is weakened by the BPMD bias. It ranges between 0, indicating
that either the ligand at the start did not have any interaction with
the protein or that the interactions were lost in due course, and
1, indicating that the interactions between the starting ligand binding
mode and the last 2 ns of the simulations are fully kept. (3) CompositeScore
(CompScore) is a linear combination of PoseScore and PersScore obtained
from fitting the results on 42 different systems from the primary
paper[16] and is calculated as follows: CompScore
= PoseScore – 5 × PersScore.
Results
Each complex
was run using Desmond on a single node with 4 GPU
cards, taking for a typical system (1 complex = 1 × 10 metadynamics
run), 2–3 h. The results were assessed for pose stability based
on the PoseScore, that is, the rmsd of the ligand with respect to
the initial ligand heavy atoms coordinates. A PoseScore ≤ 2
Å was considered stable (this value of rmsd is often used as
a threshold defining success in prospective docking simulations).
In addition, the results were also analyzed looking at the PersScore,
which is an indication of the strength of the hydrogen bonds formed
between the ligand and the protein residues. If 60% of the total hydrogen
bonds were kept during the simulations (e.g., PersScore ≥ 0.6),
it was considered as a sign of good persistence. Finally, the linear
combination of the two scores, CompScore, was also investigated but
not used as a primary metric to assess pose stability as reported
in the Results from Twilight Database section.
Results
from Initial Literature Search
The evaluation
of BPMD started from three well-characterized literature cases which
are representative of problematic situations: (1) ligand with incorrect
ligand binding mode [epothilone A (EpoA) in tubulin alpha-1β
chain]; (2) ligand placed into ED belonging to another entity [ampicillin
in penicillin-binding protein 4]; and (3) ligand modeled with incorrect
geometry [inhibitor in IκB kinase β]. Each case will be
discussed in detail in the following sections.
Tubulin Alpha-1β
Chain
Epothilones are natural
compounds belonging to the microtubule stabilizing antimycotic agent
class that bind to the common binding site in β-tubulin. The
first atomic model of EpoA (Figure B) bound to α,β-tubulin was solved by Nettles
et al.[19] with a combined approach of electron
crystallography, NMR, and molecular modeling in 2004 (PDB: 1TVK). Serious doubts
have been raised for the proposed bound conformation of EpoA in this
model because of the inconsistencies with NMR information. In fact,
in 2013, Prota et al.[20] deposited a 2.3
Å X-ray crystal structure of EpoA in complex with α,β-tubulin,
the stathmin-like protein RB3, and tubulin tyrosine ligase (PDB: 4I50), which showed a
different binding mode for the ligand as compared to the 1TVK model (Figure A). For these reasons,
the structure of tubulin alpha-1β chain in complex with EpoA
was employed to examine the ability of BPMD to differentiate between
the correct and incorrect ligand binding mode.
Figure 1
Summary of EpoA results.
(A) Binding site and surface of tubulin
alpha-1β chain complexed with EpoA. (PDB: 4I50 blue color, upper
left corner; and PDB: 1TVK green color, lower left corner). The protein structures
were aligned using the backbone; the ligand rmsd between the two structures
is 9.35 Å. Hydrogen bonds between EpoA and protein are shown
with yellow dashed lines. (B) Structure of EpoA. (C) Average rmsd
of EpoA during the 10 × 10 ns metadynamics runs in PDB: 4I50 (blue line) and 1TVK (green line).
Summary of EpoA results.
(A) Binding site and surface of tubulin
alpha-1β chain complexed with EpoA. (PDB: 4I50 blue color, upper
left corner; and PDB: 1TVK green color, lower left corner). The protein structures
were aligned using the backbone; the ligand rmsd between the two structures
is 9.35 Å. Hydrogen bonds between EpoA and protein are shown
with yellow dashed lines. (B) Structure of EpoA. (C) Average rmsd
of EpoA during the 10 × 10 ns metadynamics runs in PDB: 4I50 (blue line) and 1TVK (green line).During the metadynamics calculation on the initial
structure (1TVK), EpoA shows an
average rmsd over the 10 runs that increased from the beginning to
the end of the simulation time, and the hydrogen bonds were present
for a limited time of the simulation run (Figure C). The overall PoseScore and PersScore are
4.0 and 0.1, respectively. Both these observations are indicative
of the instability of the binding mode and are consistent with reports
questioning the original structure. On the other hand, EpoA in 4I50
shows a different behavior, where the averaged rmsd reaches a steady
PoseScore of 0.9 that is kept constant until the end of the simulations
(Figure C). At the
same time, the PersScore indicates that the hydrogen bonds identified
at the start of the metadynamics run are kept for 70% of the averaged
time. In this case, the results suggest a stable binding mode. There
is a clear difference between the original structure (1TVK) and the more recent
one (4I50) both in the PoseScore and the overall rmsd profile during
the metadynamics run, showing, in this case, that BPMD can differentiate
between a stable and unstable binding geometry. Another potential
sign of instability for the ligand in 1TVK could also be seen in the high rmsd of
up to 3.6 Å between the 10 structures used as reference for the
metadynamics run; in 4I50, the ligand RMS deviation reaches 0.75 Å
at maximum. Finally, in the case of 4I50, EpoA under the BPMD bias
experiences no drastic rearrangement as compared to the initial pose;
hence, the ligand conformation in 4I50 is in a stable conformation
as opposed to the ligand modeled in 1TVK.
Penicillin-Binding Protein
4
Penicillin-binding proteins
are membrane-associated proteins that catalyze the final step of murein
biosynthesis in bacteria. Penicillins bind irreversibly to the active
site of those enzymes disrupting the cell wall synthesis. The crystal
structure of the penicillin binding protein 4 from staphylococcus
aureus in complex with ampicillin (PDB: 3HUN,[21] resolution
2 Å) was deposited in 2010, showing two separate chains in the
asymmetric unit, although the preferred biological assembly of the
complex is monomeric. In each of the deposited chains, A and B, ampicillin
is modeled with a different binding mode and the RSCC is low, suggesting
that there is poor fit between observed and modeled ED: RSCCchain A = 0.52 and RSCCchain B = 0.73 (Supporting Information). Moreover, as stated by Weichenberger
et al.,[28] the phenyl moiety of ampicillin
(ZZ7-501, chain B) is placed in a density that could better fit a
sulfate ion. Therefore, each chain containing the ligand ampicillin
was submitted to the BPMD protocol. The PoseScore of ampicillin in
chains A and B are 4.6 and 5.1, respectively (Figure ). The high rmsd from the reference conformation
of the ligand and the weakened hydrogen bond network during the simulations
are consistent with the fact that the ED does not support the ligand
presence in either of the chains.
Figure 2
Plot of rmsd estimate averaged over all
10 trials vs simulation
time for ampicillin from PDB: 3HUN chain A in green and chain B in blue.
Plot of rmsd estimate averaged over all
10 trials vs simulation
time for ampicillin from PDB: 3HUN chain A in green and chain B in blue.
IκB Kinase β
The structure
of IκB
kinase β bound to the ligand, as depicted in Figure B, was solved at a resolution
of 3.6 Å and deposited as PDB code 3QAD.[22] This has
been the focus of several papers concerning protein–ligand
crystal structures with poorly refined ligand geometries.[5,6] In this crystal structure, the aminopyrimidine ring of the bound
inhibitor had a pyramidal carbon in the pyrimidine ring instead of
the expected planar one, and the piperazine moiety is in an unfavorable
boat conformation (Figure A). The authors released a second structure (PDB: 3RZF) after re-refinement
of the erroneous one. However, even in the newly released structure,
the ligand is highly strained.[29] Moreover,
by analyzing the deposited ED, as explained in the Methods section, the (2mFo–DFc) map does not support the presence of the ligand. The
BPMD protocol was applied to the ligand as modeled in both crystal
structures. The averaged PoseScore and PersScore are 6.7 and 0 for
the ligand in the obsolete structure (PDB: 3QAD) and 4.9 and 0.009 for the ligand in
the refined structure (PDB: 3RZF), respectively. In both cases, the ligand rmsd increases
significantly from the starting conformation, suggesting that the
ligand binding pose is highly unstable under the BPMD bias supporting
the hypothesis that the ligand is not correctly modeled in the binding
site in either crystal structure (Figure C).
Figure 3
(A) Binding site of IκB kinase β
in complex with inhibitor.
On the left: upper corner, originally deposited crystal structure
(PDB: 3QAD)
in which the ligand geometry is highly strained with a pyramidal aromatic
carbon in the amino-pyrimidine moiety and the piperazine ring in a
boat conformation; lower corner, the redeposited but still highly
strained structure is shown (PDB: 3RZF). (B) Structure of ligand. (C) Average
rmsd of ligand during the 10 × 10 ns metadynamics runs in PDB: 3QAD (green line) and 3RZF (blue line).
(A) Binding site of IκB kinase β
in complex with inhibitor.
On the left: upper corner, originally deposited crystal structure
(PDB: 3QAD)
in which the ligand geometry is highly strained with a pyramidal aromatic
carbon in the amino-pyrimidine moiety and the piperazine ring in a
boat conformation; lower corner, the redeposited but still highly
strained structure is shown (PDB: 3RZF). (B) Structure of ligand. (C) Average
rmsd of ligand during the 10 × 10 ns metadynamics runs in PDB: 3QAD (green line) and 3RZF (blue line).
Basic Features of Ligand Data Set and Their
Protein Target
To further validate the methodology, we identified
and analyzed
a data set of 64 unique ligands bound to 32 different proteins, resulting
in a total of 69 complexes, including the structures from manual literature
search. As reported in Figure , the majority of the structures have a crystallographic resolution
below 2.5 Å, except for the cases identified from the manual
literature search (PDB: 1TVK, 3QAD, 3RZF) and
the five extra cases (see Methods section).
Transferases are the most well-represented structures (about 30% of
the whole data set); the rest of the data set is distributed across
11 protein families including hydrolase, isomerase, DNA-binding protein,
signaling protein, and ligase. The ligands appear to be widely distributed
in drug-like physicochemical space, implying the generality of the
data set used (Figure ).
Figure 4
Characteristics of the protein, structures, and binding pockets.
(1) Distribution of proteins, as displayed in the PDB and according
to the Uniprot access code. (2) Distribution of crystallographic resolution
values. (3) Distribution of ligand-binding pocket volumes using SiteMap.
In (2,3), the distributions are color-coded according to the category
definition: ligand supported by ED (green), partially supported by
ED (amber with tick diagonal stripes), and not supported by ED (red
with diamond grid).
Figure 5
Selected physicochemical
properties of the data set. From the upper
left: number of sp3 carbons, heavy atoms, and aromatic
rings, log D at pH = 7.4, number of rotatable bonds,
and hydrogen bond donor and acceptor. The histograms are color-coded
by the ligand category definition: green, ligand supported by ED,
red with diamond grid, ligand not supported by ED, and amber with
tick diagonal stripes, ligand with ambiguous density.
Characteristics of the protein, structures, and binding pockets.
(1) Distribution of proteins, as displayed in the PDB and according
to the Uniprot access code. (2) Distribution of crystallographic resolution
values. (3) Distribution of ligand-binding pocket volumes using SiteMap.
In (2,3), the distributions are color-coded according to the category
definition: ligand supported by ED (green), partially supported by
ED (amber with tick diagonal stripes), and not supported by ED (red
with diamond grid).Selected physicochemical
properties of the data set. From the upper
left: number of sp3 carbons, heavy atoms, and aromatic
rings, log D at pH = 7.4, number of rotatable bonds,
and hydrogen bond donor and acceptor. The histograms are color-coded
by the ligand category definition: green, ligand supported by ED,
red with diamond grid, ligand not supported by ED, and amber with
tick diagonal stripes, ligand with ambiguous density.
Results from Twilight Database
The results of the complexes
from categories green (ligand supported by ED) and red (ligand not
supported by ED) are first discussed. A total of 51 crystal structures,
30 with RSCC > 0.9 and 21 with RSCC < 0.8 that have been confirmed
by inspection of the ED maps, were subjected to the BPMD protocol
to assess ligand stability. Initially, the Twilight database was searched
for pairs of compounds crystallized in the same protein, in which
one was well supported by the ED and the other was not. By analyzing
the results for pairs of structures only, where each pair contains
one structure with RSCC > 0.9 and one with RSCC < 0.8 as defined
in the Methods section, it is observed that
in 7/11 pairs, a cutoff of PoseScore = 2 clearly distinguishes between
structures supported by ED and those that are not (Figure ). In 2/11 pairs (EPHA3 and
PPARG), the structures cannot be distinguished by PoseScore, while
in 2/11 pairs, the structures can be distinguished by the PoseScore,
but both fall below (BACE1) or above (ANDR) the threshold of 2. All
the test cases with RSCC > 0.9 have PoseScore < 2, except for
the
androgen receptor (Figure ).
Figure 6
PoseScore overview of complexes shown by protein in pairs. Test
cases are color-coded by ED (red and cross shape = the underlying
density is too poor to model the ligand, green and circle shape =
density is present). On the y-axis, a cutoff at a
PoseScore equal to 2 is reported to identify ligands that are stable
during the course of the BPMD runs.
PoseScore overview of complexes shown by protein in pairs. Test
cases are color-coded by ED (red and cross shape = the underlying
density is too poor to model the ligand, green and circle shape =
density is present). On the y-axis, a cutoff at a
PoseScore equal to 2 is reported to identify ligands that are stable
during the course of the BPMD runs.To get a more meaningful picture of the performance of the method,
a larger data set was needed. Because it proved difficult to find
pairs of compounds from the same protein where one of the ligands
was clearly not supported by the density, this requirement was removed,
and the protocol was tested on an expanded set as shown in Figure . Overall, if the
results of category green and red are analyzed ignoring the pair definition,
28/30 crystal structures supported by the ED have a PoseScore below
the cutoff threshold of 2. The only two outliers are the surface-exposed
ligand of the androgen receptor, PDB: 2PIT, and the fragment bound to the PHIP protein
with six heavy atoms, PDB: 3MB3 (see Discussion section).
Conversely, 17/21 crystal structures where the ligand is not supported
by the ED have a PoseScore > 2. In general, a PoseScore of 2 (which
is indicative of the average rmsd deviation from the starting pose)
has been identified as a practical threshold for distinguishing between
stable and unstable ligands as proposed previously by Clark and co-workers.[16] From these results, by using the number of structures
with RSCC > 0.9 and PoseScore < 2 as true positives (TP = 28)
or
PoseScore > 2 as false negative (FN = 2) and with RSCC < 0.8
and
PoseScore > 2 as true negatives (TN = 17) or PoseScore < 2 as
false
positive (FP = 4), combining structures found both in the literature
with manual search and in the Twilight database (Table ), the confusion matrix of the
PoseScore (Table )
gave a sensitivity of 0.94, a specificity of 0.84, and a κ value
of 0.78, which confirmed the ability of BPMD to correctly separate
the crystal structures where the ligand has a satisfactory ED from
crystal structures where the ED does not explain the ligand placement.
The general trend previously observed for the initial test cases identified
by manual search of the literature has proven to be generalizable.
This separation of the two classes of ligands also gives confidence
that the force field is performing well. The stability of the green
ligands under BPMD bias suggests that the instability of the red ligands
is not a consequence of issues with the force field. Indeed, the OPLS3e[27] force field is well validated for drug-like
ligands and has been show to perform well in comparison to other ligand
force fields such as those in CHARMM,[30] AMBER,[31] and older versions of OPLS.[27]
Figure 7
PoseScore of all the test cases supported by ED, in green
and circle,
and not supported by ED in red and cross.
Table 2
Confusion Matrix of the Structures
from the Green and Red Category Which Are Supported and Not by ED
Using PoseScore as a Metric
PoseScore < 2
PoseScore > 2
total
green: RSCC > 0.9
28
2
30
red: RSCC < 0.80
4
17
21
Table 3
Summary Statistics
for the Green and
Red Category Which Are Supported or Not by ED Using PoseScore as a
Metrica
sensitivity
0.94
specificity
0.84
precision
0.88
accuracy
0.88
negative predictive value
0.81
κ
0.78
Statistics were
generated by using
the green scored with PoseScore < 2 as true positive (28) or PoseScore
> 2 as false negative (2), the red scored with PoseScore < 2
as
false positive (4), or PoseScore > 2 as true negative (17).
PoseScore of all the test cases supported by ED, in green
and circle,
and not supported by ED in red and cross.Statistics were
generated by using
the green scored with PoseScore < 2 as true positive (28) or PoseScore
> 2 as false negative (2), the red scored with PoseScore < 2
as
false positive (4), or PoseScore > 2 as true negative (17).The overall results were also analyzed
by PersScore (Figure ). The correct threshold to
separate green and red categories is more difficult to identify unambiguously
for PersScore as compared to PoseScore. However, if a threshold of
0.6 is adopted, which corresponds to 60% of the interactions being
maintained on average across all 10 simulations, 23 out of 30 of the
ligands supported by ED are correctly identified, while in the remaining
7 cases, the interaction networks were kept between 40 and 57% of
the total averaged simulation time. The fragment in PDB: 3MB3, scored as unstable
by the PoseScore metric, is the only example with RSCC > 0.9 that
had no hydrogen bonds at the start of the simulation. The absolute
numbers of ligands in the red category that fell below and above the
threshold of 0.6 were equal to the number of complexes identified
by the PoseScore threshold: 17 red protein–ligand complexes
have an interaction network that is significantly altered by the BPMD
bias and in 4 red cases the network is preserved. The sensitivity,
specificity, and κ value if the PersScore is used to evaluate
ligand stability are 0.81, 0.84, and 0.58, respectively. The PoseScore
appears to be a better metric to separate cases where the ligand is
correctly modeled in the ED from those which are not. Finally, the
CompScore, which is a combination of the PersScore and PoseScore,
gives results that are comparable to PoseScore. As a further comparison
of the scores, a bootstrapping study was carried out to estimate the
standard errors associated with each metric, using three crystal structures
selected from the red and green categories (Supporting Information). The results of this study combined with the analysis
above showed that PoseScore is the preferred metric because it can
be estimated with greater precision than CompScore and is easier to
interpret than the PersScore. We note in passing that CompScore is
normally preferred when BPMD is used to rank docking poses.[32]
Figure 8
PersScore of all the test cases supported by ED, in green
and circle,
and not supported by ED in red and cross.
PersScore of all the test cases supported by ED, in green
and circle,
and not supported by ED in red and cross.
Analysis of Structures in the “Twilight” Data
Set
Inspection of the ED maps of the ligands belonging to
the amber category revealed that the regions of the ligand that were
not supported by ED were often outside the binding site and solvent-exposed
while the regions inside the protein binding site were mainly supported
by the ED (Figure ). For some of these cases, the ligands may be stable even though
they have a low RSCC. Hence, a BPMD PoseScore below the threshold
of 2 might be expected.
Figure 9
Examples of ligands belonging to amber category
in which the portion
not described by the ED is solvent-exposed and not interacting with
the protein. The (2mFo–DFc)
map contoured at 1σ is shown as a wireframe and the protein
is in cartoon representation. (1) PDB: 4D2R,[33] ligand
DYK, RSCC = 0.68; (2) PDB: 5P9H,[34] ligand 7G7, RSCC = 0.76;
(3) PDB: 2ITY,[35] ligand IRE, RSCC = 0.76; and (4) PDB: 2JKO,[36] ligand BJI with RSCC = 0.84.
Examples of ligands belonging to amber category
in which the portion
not described by the ED is solvent-exposed and not interacting with
the protein. The (2mFo–DFc)
map contoured at 1σ is shown as a wireframe and the protein
is in cartoon representation. (1) PDB: 4D2R,[33] ligand
DYK, RSCC = 0.68; (2) PDB: 5P9H,[34] ligand 7G7, RSCC = 0.76;
(3) PDB: 2ITY,[35] ligand IRE, RSCC = 0.76; and (4) PDB: 2JKO,[36] ligand BJI with RSCC = 0.84.The percentage of ligand atoms that are uncovered by the ED in
the amber category varies from about 6–38%, signifying that
most of the ligand is supported by the density with some portion that
is not supported by experimental evidence. It is observed that the
ligand PoseScore increases with increasing disorder in the ligand
structure with a r2 of 0.54 (Figure ). The ligand BJI
in PDB: 2JKO is about 26% uncovered by the ED, and given the trend of the other
structures in the amber category, it could be thought to be unstable.
Its predicted BPMD PoseScore is 1.1, suggesting high ligand stability.
After inspecting the original publication,[36] this is not surprising because the reasoning behind its design was
not to add new protein ligand interactions but to improve the ligand
solubility without affecting potency by including a piperazine ring[36] that is solvent-exposed (Figure , panel 4). Therefore, the high ligand stability
obtained from BPMD confirms the hypothesis that the piperazine ring
would not be detrimental to the stability of the ligand’s binding
mode, and the structural result can be trusted regardless of the missing
ED.
Figure 10
PoseScore of the structures belonging to Amber category (ligand
partially supported by ED) with respect to the percentage of atoms
that are uncovered by ED and color-coded by RSCC from low (light-yellow)
to up to 0.8 (dark-yellow).
PoseScore of the structures belonging to Amber category (ligand
partially supported by ED) with respect to the percentage of atoms
that are uncovered by ED and color-coded by RSCC from low (light-yellow)
to up to 0.8 (dark-yellow).
Discussion
Overall, BPMD showed good ability to identify
ligands whose modeled
pose is not supported by their ED (Figures and 7), even for
the cases in which the ligand is only partially supported by the ED
(Figure ). Investigating
the results in more detail reveals some insights into situations that
can be challenging for the method and caveats users should be aware
of, which are worthy of further discussion. Before being submitted
to the final metadynamics simulation, every complex underwent an equilibration
procedure as explained in the Methods section.
The last step of the last short unbiased MD simulation (0.5 ns) is
used as the reference structure for the PoseScore calculation. Consequently,
the reference structure for BPMD is not the input structure but the
equilibrated one which can, in some cases, differ significantly. Given
this observation, the PoseScore was compared to the rmsd between the
initial crystal structure and the reference equilibrated structure,
which is here called MD-rmsd, to understand if unstable poses could
be flagged in advance from the MD equilibration procedure. The PoseScore
correlates with the MD-rmsd, that is, the average displacement among
the reference structures with a r2 of
0.75 (Figure ):
the higher the MD-rmsd from the originally deposited structure, the
more likely the ligand will be unstable.
Figure 11
PoseScore correlation
with the ligand rmsd of the reference structures
obtained after the equilibration procedure, MD-rmsd. The structures
are color-coded by category: green (ligand supported by ED, circle
shape), amber (ligand partially supported by ED, diamond shape), and
red (ligand not supported by ED, cross shape). The error bar calculated
as the standard deviation of the rmsd from each of the 10 metadynamics
run is reported on top of each point. The black line is the best fit
line between PoseScore and MD-rmsd, whereas the dashed black line
indicates the line y = x.
PoseScore correlation
with the ligand rmsd of the reference structures
obtained after the equilibration procedure, MD-rmsd. The structures
are color-coded by category: green (ligand supported by ED, circle
shape), amber (ligand partially supported by ED, diamond shape), and
red (ligand not supported by ED, cross shape). The error bar calculated
as the standard deviation of the rmsd from each of the 10 metadynamics
run is reported on top of each point. The black line is the best fit
line between PoseScore and MD-rmsd, whereas the dashed black line
indicates the line y = x.It should be noted that the rmsd between the 10
reference structures
can be quite significant, especially in the cases of the red category
(Figure ) and in
the region of MD-rmsd > 3 Å. Here, the conformation of the
ligand
reference structures is not a very good representation of the initial
crystal structures. The high correlation between PoseScore and MD-rmsd
(Figure ) suggests
that MD-rmsd could be used as a preliminary indication that the initial
structure is not correctly modeled. When a shorter equilibration step
of 10 ps instead of 500 ps is used for all the cases in which at least
one reference structure shows a rmsd ≥ 3 Å, a systematic
decrease of MD-rmsd is observed, while the PoseScore does not change
significantly, confirming that the ligand in those structures are
highly unstable (Figure ).
Figure 12
Left panel: (A) PoseScore of the structures in which at
least one
of the reference structures has MD-rmsd > 3 Å. (B) PoseScore
of the same structures, run after a shorter unbiased MD simulation
of 10 ps instead of 500 ps (called NEW_PoseScore). The MD-rmsd is
below 2.5 Å for all cases. (C) PoseScore obtained with shorter
equilibration procedure (NEW_PoseScore) and with default protocol.
The best fit line is displayed as a black line and y = x as a black dotted line. In general, the overall
results did not change. Each structure is colored identically in A,
B, and C.
Left panel: (A) PoseScore of the structures in which at
least one
of the reference structures has MD-rmsd > 3 Å. (B) PoseScore
of the same structures, run after a shorter unbiased MD simulation
of 10 ps instead of 500 ps (called NEW_PoseScore). The MD-rmsd is
below 2.5 Å for all cases. (C) PoseScore obtained with shorter
equilibration procedure (NEW_PoseScore) and with default protocol.
The best fit line is displayed as a black line and y = x as a black dotted line. In general, the overall
results did not change. Each structure is colored identically in A,
B, and C.In the region of MD-rmsd up to
2 Å (Figure ), the average reference structure can be
considered close enough to the initial starting structure as opposed
to the ones in the region with MD-rmsd > 2.5 Å. The ligands
in
the green category have lower structural variability (low MD-rmsd),
in agreement with the PoseScore below the threshold of 2. The presence
of outliers in the region of MD-rmsd between 1.3 and 2 Å (Figure ) shows the advantage
of using the metadynamics production phase. In the case of 5XHK, 5MYM, and 4WZ1, the benefit of
BPMD is clear; in fact, despite the MD-rmsd being below 1.5 Å,
which could be indicative of ligand stability, the BPMD bias was needed
to correctly classify them as unstable (PoseScore > 2) in agreement
with the missing ED from the experiment. A different situation is
observed for 2HWQ and 5Q17 in
which PoseScore is found below the threshold of 2 and also that the
MD-rmsd is lower than 2 Å. The results obtained for this data
set suggest that BPMD can successfully discriminate between crystal
structures that are correctly modeled and those that are not, but
also that the MD-rmsd obtained from the short equilibration step might
be informative of dubious structures. A thorough investigation of
MD as compared to BPMD in identifying questionable crystal structures
is required to identify the optimum protocol to balance accuracy against
computational efficiency.
Figure 13
PoseScore in the range of MD-rmsd up to 2 Å
where the reference
structures can be considered similar to starting structure. The structures
identified with their PDB name represent the advantages (5XHK, 5MYM, and 4ZW1) and potential limitation
(5Q17 and 2HWQ) of using the BPMD
protocol. All the structures in the green category show low MD-rmsd.
PoseScore in the range of MD-rmsd up to 2 Å
where the reference
structures can be considered similar to starting structure. The structures
identified with their PDB name represent the advantages (5XHK, 5MYM, and 4ZW1) and potential limitation
(5Q17 and 2HWQ) of using the BPMD
protocol. All the structures in the green category show low MD-rmsd.Several of the structures that were considered
to be misclassified
by BPMD, that is, there was disagreement between the ED classification
and the PoseScore, deserve further discussion. The ligand 4HY found
in PDB: 2PIT(37) binds to the surfaced-exposed allosteric
pocket called Binding Function 3 of the androgen receptor. It has
a RSCC = 0.93 but resulted in a high PoseScore of 4.1. In the original
paper,[37] it is claimed that it binds weakly
(IC50 ≈ 50 μM) to the surface weakening the
contacts between the androgen receptor and coactivator proteins. In
this case of low binding affinity in an open solvent-exposed site,
the ligand can be readily displaced under the BPMD bias, despite being
supported by the ED. A similar exposed pocket was observed in the
case of three ligands PDB: 5MRD, 2XPA, and 2XP6,
where there is agreement between ED and PoseScore, so this is not
a consistently observed problem.Another interesting case is
that of the MB3 ligand crystallized
in the second bromodomain of PHIP (PDB: 3MB3(38)), which
has a high RSCC value of 0.93 but showed a PoseScore higher than 2.
Interestingly, the ligand is a very small fragment possessing only
seven heavy atoms, therefore, very different from others investigated
here and in previous BPMD studies. It is possible that the metadynamics
parameters may not be well optimized for such a small fragment.Among the incorrectly predicted crystal structures in the red category,
it is worth mentioning PDB: 2HWQ.[39] Despite a poor ligand
RSCC score of 0.23 and ambiguous fragmented density in the omit maps,
it has a PoseScore of 1.21. The binding pocket is narrow, and the
ligand binds deep into it. The same authors have solved structures
of related ligands bound to the same protein (PPARG) such as PDB: 2F4B(40) (RSCC = 0.73) and 2HWR(39) (RSCC = 0.71). Therefore,
the ligand may have been modeled by using prior knowledge from these
studies in addition to the ED. A crystal structure with comparable
ligand binding mode and good RSCC value (PDB: 5GTN(41)) was also solved a few years later, corroborating the deposited
ligand binding mode. Therefore, in this instance, although the ED
does not support the binding mode, the ligand may have been correctly
modeled using other information, explaining why it is miscategorized
in this work.The cases discussed above suggest that particularly
small ligands
or open binding sites may be challenging for BPMD. To investigate
whether these factors are a key influence on the BPMD score, heavy
atom count, protein–ligand H-bond count, and binding pocket
volume have been plotted against PoseScore (Figure ). No correlations were seen, suggesting
that these factors are not driving the prediction from PoseScore.
Figure 14
Relationship
between PoseScore and number of heavy atoms (A), volume
of binding site (B), and hydrogen bonds (C) of the ligands in the
data set. Structures belonging to category green, ligand supported
by ED are colored in green with circle shape; from category amber,
ligand partially supported by ED are in yellow with diamond shape;
and from category red, ligand not supported by underlying ED are in
red and with cross shape.
Relationship
between PoseScore and number of heavy atoms (A), volume
of binding site (B), and hydrogen bonds (C) of the ligands in the
data set. Structures belonging to category green, ligand supported
by ED are colored in green with circle shape; from category amber,
ligand partially supported by ED are in yellow with diamond shape;
and from category red, ligand not supported by underlying ED are in
red and with cross shape.Out of the 69 total structures, 34 had binding affinity data as
retrieved from the primary literature citation where they were first
discussed in the form of IC50. The binding data were converted
into pIC50, and a poor correlation was observed with PoseScore
(r2 = 0.33, see Figure ) as opposed to what was found by Clark
et al.[16] in which they use BPMD as a tool
to prioritize IFD ideas. In this work, despite the binding affinity
of the ligand under investigation, if the starting position is correctly
modeled, it will result in a PoseScore below 2; therefore, BPMD should
not be regarded in this way as a tool to prioritize ideas based on
the PoseScore.
Figure 15
PoseScore of all the crystal structures with pIC50 data
color-coded
by category (green and circle, complexes supported by ED; amber and
diamond, complexes partially supported by ED; and red and cross, complexes
not supported by ED).
PoseScore of all the crystal structures with pIC50 data
color-coded
by category (green and circle, complexes supported by ED; amber and
diamond, complexes partially supported by ED; and red and cross, complexes
not supported by ED).
Conclusions
We
have investigated the capabilities of the BPMD tool to correctly
identify stable ligands using a diverse data set of crystal structures
from published literature and collected from the Twilight database.
Primarily, this study has been used as a validation that BPMD is a
useful predictor of binding mode stability, in agreement with previous
studies looking at docking and scoring. From the conducted validation
on 69 complexes with different physical chemical properties, BPMD
has shown good performance categorizing ligand binding modes supported
by ED from those not supported. In particular, it was identified that
ligands supported by ED show a PoseScore below 2 as opposed to the
unsupported ones, which have a PoseScore above 2. BPMD scores do not
seem to correlate with binding affinity, suggesting that the method
is useful for assessing the stability of individual poses or the relative
stability of different poses of the same ligand, but not for scoring
different ligands.From the more challenging cases in which
the ligand is partially
supported by the underlying ED, it was observed that if the disorder
comes from parts of the ligand that do not participate in any protein
interaction, then the overall stability of the ligand, given that
the rest of it is correctly modeled, will not be affected. Thus, information
from BPMD simulations could help medicinal and computational chemists
in designing more potent compounds maximizing the ligand–protein
interactions where the most stable portion of the ligand lies and
incorporating flexibility and solubilizing groups in the noninteracting
portion without compromising the overall binding stability.Aside from the validation of the methodology, the data presented
here suggest that BPMD may be a useful tool for the crystallographer.
Solving crystal structures is an expert job incorporating information
beyond that contained in the ED, but this makes it open to a degree
of subjectivity. These results suggest that BPMD could be useful in
assessing preliminary poses generated by crystallography and highlighting
those that would need further investigation or confirmation before
undertaking more time-consuming and expensive strategies. Specific
cases have been identified, which can be challenging for BPMD, including
very small fragments, surface-exposed binding pockets, and interactions
with no hydrogen bonds. There are a number of parameters within BPMD
that could be investigated to see whether they give an improvement
in these cases, such as Gaussian width and height, choice of CV, and
run time. However, they are not investigated further here because,
overall, the default parameters offer good discrimination.Finally,
the average rmsd of the equilibrated structures prior
to the metadynamics simulations has appeared to be informative of
inherently unstable ligands. Interestingly, it is observed that the
reference structure used in the BPMD protocol might have substantially
changed from the input structure especially for the cases in which
the ligand is not supported by ED. A brief investigation of using
a shorter equilibration procedure on the cases with MD-rmsd > 3
Å
showed overall unchanged results with a PoseScore that does not change
significantly from the one obtained with the default protocol. It
would be interesting to know in a more systematic way which protocol
of unbiased MD is needed to obtain comparable BPMD results, and this
will be the subject of further studies.
Authors: James H Nettles; Huilin Li; Ben Cornett; Joseph M Krahn; James P Snyder; Kenneth H Downing Journal: Science Date: 2004-08-06 Impact factor: 47.728
Authors: Jason G Kettle; Peter Ballard; Catherine Bardelle; Mark Cockerill; Nicola Colclough; Susan E Critchlow; Judit Debreczeni; Gary Fairley; Shaun Fillery; Mark A Graham; Louise Goodwin; Sylvie Guichard; Kevin Hudson; Richard A Ward; David Whittaker Journal: J Med Chem Date: 2015-03-12 Impact factor: 7.446
Authors: Andrew T Bender; Anna Gardberg; Albertina Pereira; Theresa Johnson; Yin Wu; Roland Grenningloh; Jared Head; Federica Morandi; Philipp Haselmayer; Lesley Liu-Bujalski Journal: Mol Pharmacol Date: 2017-01-06 Impact factor: 4.436
Authors: Jun Young Jang; Minseob Koh; Hwan Bae; Doo Ri An; Ha Na Im; Hyoun Sook Kim; Ji Young Yoon; Hye-Jin Yoon; Byung Woo Han; Seung Bum Park; Se Won Suh Journal: Biochim Biophys Acta Proteins Proteom Date: 2017-03-22 Impact factor: 3.036
Authors: Huy H Nguyen; Yesim A Tahirovic; Valarie M Truax; Robert J Wilson; Edgars Jecs; Eric J Miller; Michelle B Kim; Nicholas S Akins; Lingjie Xu; Yi Jiang; Tao Wang; Chi S Sum; Mary E Cvijic; Gretchen M Schroeder; Lawrence J Wilson; Dennis C Liotta Journal: ACS Med Chem Lett Date: 2021-10-04 Impact factor: 4.632
Authors: Yuchen Zhou; Steven Ramsey; Davide Provasi; Amal El Daibani; Kevin Appourchaux; Soumen Chakraborty; Abhijeet Kapoor; Tao Che; Susruta Majumdar; Marta Filizola Journal: Biochemistry Date: 2020-12-04 Impact factor: 3.162
Authors: Alex Macpherson; Maisem Laabei; Zainab Ahdash; Melissa A Graewert; James R Birtley; Monika-Sarah Ed Schulze; Susan Crennell; Sarah A Robinson; Ben Holmes; Vladas Oleinikovas; Per H Nilsson; James Snowden; Victoria Ellis; Tom Eirik Mollnes; Charlotte M Deane; Dmitri Svergun; Alastair Dg Lawson; Jean Mh van den Elsen Journal: Elife Date: 2021-02-11 Impact factor: 8.140