Crystal N Nguyen1, Anthony Cruz2, Michael K Gilson1, Tom Kurtzman2. 1. Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego , 9500 Gilman Drive, La Jolla, California 92093-0736, United States. 2. Department of Chemistry, Lehman College, The City University of New York , 250 Bedford Park Blvd. West, Bronx, New York 10468, United States.
Abstract
Water molecules in the active site of an enzyme occupy a complex, heterogeneous environment, and the thermodynamic properties of active-site water are functions of position. As a consequence, it is thought that an enzyme inhibitor can gain affinity by extending into a region occupied by unfavorable water or lose affinity by displacing water from a region where it was relatively stable. Recent advances in the characterization of binding-site water, based on the analysis of molecular simulations with explicit water molecules, have focused largely on simplified representations of water as occupying well-defined hydration sites. Our grid-based treatment of hydration, GIST, offers a more complete picture of the complex distributions of water properties, but it has not yet been applied to proteins. This first application of GIST to protein-ligand modeling, for the case of Coagulation Factor Xa, shows that ligand scoring functions based on GIST perform at least as well as scoring functions based on a hydration-site approach (HSA), when applied to exactly the same simulation data. Interestingly, the displacement of energetically unfavorable water emerges as the dominant factor in the fitted scoring functions, for both GIST and HSA methods, while water entropy plays a secondary role, at least in the present context.
Water molecules in the active site of an enzyme occupy a complex, heterogeneous environment, and the thermodynamic properties of active-site water are functions of position. As a consequence, it is thought that an enzyme inhibitor can gain affinity by extending into a region occupied by unfavorable water or lose affinity by displacing water from a region where it was relatively stable. Recent advances in the characterization of binding-site water, based on the analysis of molecular simulations with explicit water molecules, have focused largely on simplified representations of water as occupying well-defined hydration sites. Our grid-based treatment of hydration, GIST, offers a more complete picture of the complex distributions of water properties, but it has not yet been applied to proteins. This first application of GIST to protein-ligand modeling, for the case of Coagulation Factor Xa, shows that ligand scoring functions based on GIST perform at least as well as scoring functions based on a hydration-site approach (HSA), when applied to exactly the same simulation data. Interestingly, the displacement of energetically unfavorable water emerges as the dominant factor in the fitted scoring functions, for both GIST and HSA methods, while water entropy plays a secondary role, at least in the present context.
The
binding of a drug-like molecule to a protein leads to displacement
of water molecules from the protein’s binding pocket, and the
thermodynamics of this displacement process is thought to contribute
significantly to the overall thermodynamics of protein–ligand
binding.[1−14] For example, displacement of water that is tightly bound via multiple
water–protein hydrogen bonds may incur a large energetic penalty,
whereas displacement of water from hydrophobic parts of the binding
pocket may help drive ligand-binding. Intuitively, one may view different
parts of the protein’s surface as imposing different surface
energies on the nearby water, with correspondingly different thermodynamic
consequences for water displacement by various ligands.The
use of molecular distribution functions[15−21] to analyze molecular dynamics (MD) simulations has led to important
advances in the study of binding site water and its role in molecular
recognition; parallel progress with the 3D RISM approach[22−24] also deserves mention but is not considered here. Key early contributions
include development of WaterMap[8,12] (Schrödinger
LLC), STOW,[25] and other approaches,[26,27] which have provided new insight and shown promise as tools to help
discover small molecules that will bind a targeted binding pocket.
Such methods frequently define spherical sites, where water is present
at high density, to represent the distribution of water in the binding
site. This hydration site approach (HSA) is motivated in part by the
practical consideration that, in regions where water is present at
lower density, it becomes more difficult to obtain converged values
of the local orientational entropy of water. This is a simple consequence
of the lower number of water samples available from the simulation
in such low-density locations. The HSA strategy of limiting attention
to hydration sites where water is present at high density maximizes
the chances for good numerical convergence of the orientational entropy.
However, as previously discussed,[28] the
regions in a binding site where water is present at high density can
have a complex shape, which is not easily represented by a collection
of spheres.This limitation has been addressed in a grid-based
implementation
of inhomogeneous solvation theory (IST), termed GIST.[28,29] Instead of constructing hydration sites, GIST discretizes the smooth
distributions of water density and other properties onto a fine, three-dimensional
grid. The problem of converging the local orientational entropy of
water is overcome through the use of a highly efficient nearest-neighbor
(NN) method, as opposed to histogram methods, which require more sampling
to reach adequate convergence.[30,31] GIST can also take
advantage of the fact that regions of lower density contribute proportionately
less than regions of higher density regions to the overall orientational
entropy of the displaced water. This density-weighting means that,
if one is interested in the integral of the orientational entropy
over a volume containing both high and low density regions, one can
converge the overall integral to an acceptable tolerance, so long
as the high-density regions are well converged. Alternatively, the
grid approach makes it straightforward to focus on regions where water
is present at high density, as done in HSA, without simplifying their
shapes.Here, we describe the first test of GIST for a ligand–protein
system. In order to establish a clear basis for comparing methods,
we study coagulation factor Xa (FXa) with a set of small molecule
inhibitors used in early studies of the WaterMap method,[12] and we derive scoring functions based on both
GIST and HSA methods. For this initial test of GIST’s applicability
to protein–ligand modeling, we do not seek to establish a full-fledged
protein–ligand scoring function, suitable for virtual screening
or lead optimization. Instead, as previously done,[12] we ask how well the GIST treatment of hydration can capture
affinity differences between closely related congeneric pairs of ligands,
where differences in binding affinity that result from contributions
other than solvation such as configurational entropy and protein–ligand
energy are minimal. Our results support the applicability of GIST
and, in addition, provide an unexpected outcome regarding the role
of energetically versus entropically unfavorable water.
Methods
We ran explicit-water MD simulations of FXa and
used both GIST
and our local HSA implementation to extract information about the
structure and thermodynamics of the water in the binding site. We
then considered the displacement of this binding site water by various
FXa inhibitors, whose binding site poses are known or could be inferred
from the known poses of very similar compounds. Candidate scoring
functions, based on the computed properties of the water displaced
by each ligand, were trained on subsets of the experimental affinity
data and then tested on separate sets, in order to assess the utility
of the hydration data to resolve the relative binding affinities of
pairs of congeneric ligands. Details of the computational methods
are presented in the following subsections.
Grid
Inhomogeneous Solvation Theory (GIST)
GIST
Implementation
As previously
detailed, GIST uses a three-dimensional rectangular grid of cubic
voxels in the region of interest and processes the snapshots of an
MD trajectory to compute the following thermodynamic quantities for
each voxel, k, centered at location r:• ΔEnorm, the mean solute–water interaction energy of a
water molecule in voxel k.• ΔEnorm, one-half the
mean interaction energy of the water in voxel k with
all other waters. The factor of 1/2, which was not included in the
definition of the corresponding water–water interaction energy
in our initial presentation of GIST,[28] is
customary in liquid-state theory; it allows the total energy of neat
water to be written as the sum of the individual water energies.• −TΔStrans,norm, the single-body (one-water) translational entropy of water in voxel k, relative to bulk, normalized to the mean number of waters
in the voxel.• −TΔSorient,norm, the single-body (one-water) orientational
entropy of water in voxel k, relative to bulk, normalized
to the mean number of waters
in the voxel.Note that we previously used a somewhat different
notation for
these quantities;[29] for example, ΔEnorm was ΔEswnorm(r). Here, a 20.5 × 20.5 × 22.5
Å grid was centered on the active site of FXa. The grid spacing
of 0.5 Å provides voxels large enough to give statistically meaningful
data but small enough to still give a high resolution description
of the density distribution functions.[28] It is worth noting that the volume of each voxel is 33.5 times less
than that of a hydration site (see below), as the latter represents
a sphere of radius 1 Å. The GIST Lennard-Jones and electrostatic
energies were computed from stored MD frames, using the minimum image
convention and no cutoff, and the reference value of the bulk water–water
interaction energy was computed in the same convention. The main GIST
calculations presented here used 100 000 frames saved at 1
ps intervals during a 100 ns production MD run, but shorter durations
were also examined, to study convergence, as detailed below.
Functional Form of GIST-Based Scoring Functions
When
a ligand binds a protein, it displaces water from the protein’s
binding site. If the displaced water was unfavorable relative to bulk,
then water displacement should make a favorable contribution to the
ligand’s binding affinity. With this in mind, we initially
looked for a correlation between measured ligand binding affinities
and regional hydration free energies (eq 25 of ref (28)), where the region was
defined as those voxels covered by each ligand in a bound pose. However,
finding little correlation (data not shown), we conjectured that any
underlying correlation had been obscured by noise, due to sharp variations
in the hydration energies with even small changes in ligand position.
This sharpness traces, at least in part, to our use of a single pose
for each ligand, and a restrained protein structure in the water simulations.
We addressed this issue by constructing scoring functions which are
based on the GIST data but are less sensitive to the details of local
water properties, due to the use of cutoffs in local water density,
energy, and entropy. The use of cutoffs to construct a well-behaved
scoring function from local hydration data was first introduced in
the context of a hydration site model.[12]We tested three such scoring functions based on the GIST hydration
data available from the grid described above. In all three cases,
voxel k can contribute to a ligand’s score
only if the voxel’s center, r, lies within the van der Waals radius of any atom
of the ligand. For a given ligand, then, each voxel is assigned a
binary displacement indicator, d, which equals 1 if the center of the voxel lies within the
van der Waals radius of any ligand atom and 0 if it does not. The
van der Waals radii are drawn from the software package Crystal Maker
(CrystalMaker Software Ltd.), which in turn relies on Bondi.[32] We also allowed the scoring function to focus
on voxels where water is present at high density by setting up an
additional binary indicator, g, which is set to 1 if the water density in voxel k exceeds a cutoff gco, and 0 if it does
not. This cutoff is one of the trained parameters, so it will be greater
than zero only if imposing a density cutoff actually improves the
accuracy of the scoring function. Finally, we set up similar cutoffs
for the total energy and entropy, Enorm and −TΔSnorm, associated with each voxel k, and used these to define additional binary masks based
on energy and entropy thresholds. Thus, the binary mask, e, equals 1 if Enorm exceeds the cutoff, Eco, and 0 otherwise;
and the binary mask, s, equals 1 if −TΔSnorm exceeds the cutoff, Sco, and
0 otherwise. Like the density cutoff, gco, the values of Eco and Sco are fitted parameters and hence are free to go to zero
if imposing these cutoffs does not improve the accuracy of the scoring
function. The total energy and entropy were computed as Enorm ≡ ΔEnorm + 2ΔEnorm – 2Enorm and −TΔSnorm ≡ −TΔStrans,norm – TΔSorient,norm, respectively. The quantity Enorm is the mean interaction energy of the water in voxel k with the protein and all other waters, relative to what its interactions
would be in bulk, 2Eww,bulknorm, computed in the same convention
as the other GIST quantities.With the voxels’ binary
masks in place for density, energy,
and entropy, we now define the three candidate scoring functions,
one using both the energy and entropy data from GIST, the second using
only the energy data, and the third using only the entropy data:Here, Eaff and Saff are additional
fitted parameters, which specify the affinity
increments provided by voxels surpassing the energy and entropy cutoffs,
respectively, and also meeting the criteria d = g = 1. Note that in this study, each of these scoring functions
was trained separately and has its own fitted values of gco and C, as well as Eco and Eaff and/or Sco and Saff.
Training and Testing of the GIST-Based Scoring
Function
We adjusted the scoring functions described above
to fit the measured relative binding free energies, ΔΔGexpt, of 28 different congeneric pairs of FXa
inhibitors (see below). Separate training and test sets were used,
in order to avoid overfitting of the parameters. Thus, we used a random
number generator to split the 28 pairs into two arbitrarily selected
sets of 14 apiece. Ten such random splits were carried out, creating
10 distinct training and test sets. Parameters were optimized for
each training set and then tested on the corresponding test set. We
report means and standard deviations over these 10 splits for the
resulting fitted parameters and accuracy metrics. We further assess
the significance of these results by comparing them with results obtained
after a shuffling operation, which used the gsl_permutation function
in the GNU Scientific Library, to randomly exchange the entropy and
enthalpy values among pairs of voxels.For each training set,
the parameters were adjusted as follows, using ΔES as an example. We scanned values
of Eco and Sco from 0 to 4 kcal/mol and values of gco from 0 to 4 (in units of neat water density), each in increments
of 0.1. This scan yields 41 × 41 × 41 = 68 921 combinations
of the three cutoff values. For each combination, the sums in eq 1 were computed for all ligands. Linear regression
was then used to obtain values of Eaff and Saff that provide the highest correlation
coefficient (R2) of the relative scores
for the congeneric pairs, ΔΔGES, to the corresponding experimental values, ΔΔGexpt, for the training set. The optimized values
of the five fitted parameters were then used to compute ΔΔGES for the congeneric pairs in the training
set, and the reliability of the scoring function was evaluated based
on the resulting value of R2 for the test
set. Analogous procedures were used for the other two scoring functions,
ΔE and ΔS. These require scanning only
41 × 41 = 1681 cutoff combinations and yield only one of the
two affinity parameters, Eaff or Saff, rather than both as for ΔES.
Hydration Site Analysis (HSA)
Assignment
of Thermodynamic Properties to
Hydration Sites
Hydration sites in the FXa binding site were
defined and analyzed thermodynamically based on the same MD simulation
used for the GIST calculations, using the first 10 ns (10 000
frames), in accord with the common practice of using approximately
2–10 ns simulations for HSA calculations.[12,27,33−35] We used every 10th frame
of this segment to identify the hydration sites. We first collected
all instances, in these 1000 frames, of water molecules within 5 Å
of any heavy atom of any bound ligand (see below). For each water
molecule in this set, we counted the number of neighboring waters
from the same set, using the criterion of an oxygen–oxygen
distance within 1 Å. With this definition, a water molecule can
count as its own neighbor, if two instances of it in different frames
meet the distance criterion. The location of the first hydration site
was then set to the coordinates of the wateroxygen with the most
neighbors. This water molecule and all of its neighbors were then
removed from consideration as potential hydration sites, and the location
of the next hydration site was set to the coordinates of the remaining
wateroxygen with the most neighbors, based on the initial counts.
This removal process was iterated until the number of neighbors of
all remaining waters was less than twice that expected for a 1000
frame simulation of bulk water (i.e., < 280 from 1000 frames).
Each hydration site then was associated with all water instances,
from the full 10 000 MD frames (above), whose oxygens lay within
1 Å of the site.Each hydration site i was associated with a mean energy E and a one-body entropy S. The energy of a water molecule in a given hydration
site was calculated as half the difference between the total energy
of the water–protein system with the water present and without
it. A script invoking the program AMBER,[36] with settings matched to those of the MD simulation, was used to
compute these energies. The mean energy of the hydration site then
is the average of these energies for all water molecules that populate
the site, minus the average energy of a water molecule in neat water
from matched calculations. The water entropy S associated with hydration site i is the sum of its one-body translational and orientational entropies, Strans ≡ −kρ g(r) ln g(r) dr and Sorient ≡ ((−kN)/Ω) g(ω) ln g(ω) dω, where r is position in
the protein frame of reference, k is Boltzmann’s
constant, ρ is bulk water density, g(r) is the local water density referenced to bulk, N is the number of water molecules
associated with hydration site i, Ω ≡
8π2, V indicates an
integral restricted to the spherical hydration site, and ω
defines orientational coordinates in the protein frame of reference.
The translational entropy was computed by the histogram method, where
spherical coordinates r, θ, ϕ centered
on the hydration site were divided into uniformly spaced bins in r, cos θ, ϕ to generate 512 three-dimensional
bins of equal volume, with r ∈ [0, 1] Å,
θ ∈ [0, π], and ϕ ∈ [0, 2π].
The orientational entropy associated with a hydration site was computed
via the same nearest neighbor method used by GIST for individual voxels.[28]
HSA-Based Scoring Function
We used
the hydration sites described above as the basis for three cutoff-based
scoring functions, whose functional forms build on prior work.[12] Like the three GIST scoring functions (above),
the free HSA-based scoring functions are based on, respectively, both
energy and entropy, energy alone, and entropy alone:Here, the sums
range over hydration
sites i; e and s equal
1 if E and –TS are greater than cutoff values Eco and Sco, respectively,
and 0 otherwise; d is
a displacement function, defined below, which accounts for the overlap
of the ligand, in a given pose, with hydration site i; Eaff and Saff are fitted constants; and C is a constant offset.
Note that the HSA scoring parameters in eq 2 are set independently of the GIST scoring parameters in eq 1, despite the use of some equivalent symbols.We tried two forms of the displacement function. One is identical
to that used in the previous paper,[12] while
the second, as discussed below, applies a physically motivated cap
to this quantity:Here, Θ(x),
the Heaviside step function,
equals 0 if x ≤ 0 and 1 otherwise; Rco is a distance cutoff; R is the distance between hydration site i and atom j of the ligand being scored;
and the sum runs over all atoms belonging to the ligand being scored, i. For dnocap, which is modeled on the prior
hydration site scoring function,[12] each
ligand atom within Rco of the hydration
site makes a contribution that scales between 0 and 1 as it approaches
the center of the site, so the displacement function accounts for
the degree to which a ligand displaces the water in the site. However,
this approach is nonphysical in the sense that, if R < Rco for multiple ligand atoms j and a single site i, then the displacement of solvent from the site may be
multiply counted. That is, for the energy/entropy scoring function,
a site might contribute more than Eaff + Saff; for the energy-only scoring
function, it might contribute more than Eaff; and for the entropy-only scoring function, it might contribute
more than Saff. Indeed, in the present
study, we found values of dnocap up to 3.2, implying
that a single site might contribute over three times. We therefore
also considered the alternative displacement function, dcap, which is capped at a value of 1, so that no hydration
site may contribute more than one-fold to a ligand’s score,
in accordance with the fact that the site cannot be displaced more
than once.
Training and Testing
of the HSA-Based Scoring
Function
The parameters of the HSA-based scoring functions
were adjusted in the same manner as those of the GIST-based scoring
functions, except that Rco took the place
of gco. Thus, values of Rco were scanned from 2 to 3 Å, in steps of 0.1 Å
and, as for GIST, values of Eco and Sco were scanned from 0 to 4 kcal/mol in increments
of 0.1 kcal/mol. (Note that no hydration sites had energies greater
than 3.7 kcal/mol.) The sums in eq 2 were evaluated
for each ligand and with each combination of Rco, Eco, and Sco. For each set of cutoffs scanned, values of Eaff and Saff were
obtained by linear regression against the differences in measured
binding free energies for a training set of congeneric ligand pairs,
for each scoring function in eq 2. The parameters
that yielded the highest correlation coefficients were chosen and
were tested for their ability to reproduce the difference in binding
affinities for the congeneric pairs in the test set. This procedure
was applied to the same 10 training and test sets used for the GIST
scoring function. As done for GIST, we assessed the significance of
the HSA results by comparing them with results obtained after a shuffling
operation, which randomly exchanges the entropy and enthalpy values
across pairs of hydration sites.
Molecular
Systems and Modeling
Molecular Dynamics Simulations
of Binding
Site Water
Both GIST and the HSA methods take as input a
Boltzmann sample of water configurations for a given configuration
of the protein. Here, we used molecular dynamics (MD) simulations
to generate this sample from the canonical ensemble (NVT), as follows.
We used the structure of FXa from Protein Data Bank[37,38] entry 1FJS,[39] as previously done,[12] and our assignment of protonation states was also consistent
with this prior study. We removed the ligand from the binding site
and used Tleap and other Amber Tools[36] to
assign protein parameters from the AMBER99sb force field[40] and solvate the protein with 8557 TIP3P water
molecules.[41] The simulations used a periodic
box with dimensions 66.5 Å × 72.2 Å × 60.9 Å,
which afforded at least 10 Å between any protein atom and the
edge of the periodic box. Four disulfide bonds were set up to join
cysteine pairs 7/12, 27/43, 156/170, and 181/209, and two ions observed
in the crystal structure (Ca2+ and Cl–) were restrained to their original positions. The resulting simulation
system had 29 338 atoms, comprising the protein, the two ions,
and the water molecules.Energy minimization, followed by MD
simulation, was carried out with the Amber 12 software using pmemd.cuda[42] on a single GPU. First, the energy of the system
was minimized in two rounds; both used 1500 steps of the steepest
descents algorithm followed by the conjugate gradient method for a
maximum of 2000 steps. In the first round, all protein atoms were
harmonically restrained to their initial positions with a force constant
of 100 kcal/mol/Å2. In the second round, the system
was further relaxed keeping only non-hydrogen protein atoms restrained,
with the same force constant. The energy minimized system was then
heated with a series of 20 ps constant-volume and -temperature MD
simulations with the first simulation at 50 K and the temperature
incremented by 50 K every 20 ps until 300 K was reached. The system
was then equilibrated for 10 ns at 300 K at a constant pressure of
1 atm. At the final volume, the system was then equilibrated for an
additional 5 ns at constant volume. The final MD production run of
100 ns was at constant number of particles, volume, and temperature
(NVT), and system configurations were stored every 1 ps, for a total
of 100 000 stored configurations. During all MD simulations,
all protein atoms were harmonically restrained to their positions
following the energy minimization step, with a force constant of 100
kcal/mol/Å. The SHAKE algorithm[43] was
used to constrain the lengths of all bonds involving hydrogen atoms.
Temperature was regulated by Langevin dynamics with a collision frequency
of 2.0 ps–1. A 9 Å cutoff was applied to all
nonbonded interactions. Particle mesh Ewald was implemented to account
for long-range electrostatic interactions, and the Leapfrog algorithm
was used to propagate the trajectory. For the constant pressure simulations,
isotropic position scaling was implemented with a pressure relaxation
time of 0.5 ps. The main GIST and HSA solvation maps were produced
from these configurations. However, in order to study the convergence
properties of GIST, we performed two additional 20 ns NVT production
runs storing configurations every 0.05 ps; one was begun identically
with the 100 ns run, and the other was begun with the last MD configuration
of the 100 ns run.
Ligand Data Set and Preparation
We trained and tested the scoring functions with a set of 28 congeneric
ligand pairs (see Supporting Information), where the members of each pair differ only by small, localized
chemical changes in rigid moieties, leading to differential displacement
of solvent. As previously discussed,[12] this
approach minimizes the contributions of free energy terms other than
hydration and thus allows a focus on the quantity of central interest
in this study. The 28 pairs used here are a subset of 31 drawn from
several experimental series[44−55] for use in a previous computational study:[12] we eliminated three pairs (Matter:25/Matter:28; Matter:28/2BMG:I1H;
Mueller:3/Mueller:2), because we were not confident of the conformation
of at least one member of the pair, and hence of the location of the
displaced solvent. In particular, Mueller:3 differs from Mueller:2
by a phenyl group whose orientation is not clear, because it is attached
by a rotatable bond; and for the other two pairs, an aromatic ring
changes to a nonaromatic ring, whose conformation is uncertain. It
is worth noting that, for this set of ligands, there is essentially
no correlation between binding free energy and molecular weight (R2 = 0.12).Ligand poses were drawn from
available cocrystal structures or generated from a cocrystal structure
of a closely related ligand by a small chemical adjustment. In all
cases, the cocrystal structure was aligned with the simulated protein
(above) to generate an initial pose in the binding site for which
the hydration structure was computed. Final poses for modeling solvent
displacement were generated by protonating the ligands, then minimizing
the initial poses in the simulated protein structure while allowing
the ligand and protein hydrogen atoms to relax. The atomic partial
charges for each ligand were obtained from the restrained electrostatic
potential (RESP) method,[56] using quantum-mechanically
derived electrostatic potentials with the 6-31G* basis set. Other
ligand force field parameters were obtained using GAFF.[57] Note that these parameters were used only to
generate the ligand poses studied with the GIST and HSA scoring functions.
Results
Comparison of Grid and
Site Representations
of Water Density
The HSA hydration sites are informative
about water density but do not capture the level of detail available
from GIST’s grid-based method. The two approaches are compared
in Figure 1, which displays the HSA sites (blue
spheres) computed for the binding pocket of Factor Xa, along with
GIST’s gridded representation of water density, contoured at
three different levels. Contours of water density at 6 times that
of bulk (g = 6) appear as discrete, mostly convex
droplets (yellow contours, left panel), although a few of these high-density
regions are elongated, rather than round. Every high-density droplet
is matched by a spherical HSA site, but there are many HSA sites in
the binding pocket that do not enclose one of these high-density droplets.
Contours at 4 times bulk density (g = 4) appear as
more and larger droplets and match the HSA sites rather well (gold
contours, middle panel). Contours at twice bulk density (g = 2) form long, curved strands, which follow the contours of the
protein surface (orange contours, right panel), and are not well represented
by the HSA sites. These begin to delineate the first hydration shell
of the protein. Finally, contours at still lower density (e.g., g = 1.5) include parts of the second hydration shell (not
shown). Overall, the HSA representation captures the droplet-like
distribution of the highest water densities but does not distinguish
between high and medium density regions and does not capture the complex
distributions of water density that become apparent at densities roughly
twice that of bulk. This observation has practical relevance, because
the scoring functions developed here include regions where water density
is of this order, as detailed in the next subsection.
Figure 1
Comparison of the GIST
and HSA representations of water density
in the Factor Xa binding pocket. HSA sites (blue spheres) are the
same in all three panels. From left to right, the GIST contour levels
are at g = 6, 4, and 2. The GIST water densities
are based on the occupancy of grid voxels by water oxygens, and the
boundaries of the grid box may be discerned in the right-hand panel.
A smoothed protein surface is shown in order to highlight the water
data.
Comparison of the GIST
and HSA representations of water density
in the Factor Xa binding pocket. HSA sites (blue spheres) are the
same in all three panels. From left to right, the GIST contour levels
are at g = 6, 4, and 2. The GIST water densities
are based on the occupancy of grid voxels by wateroxygens, and the
boundaries of the grid box may be discerned in the right-hand panel.
A smoothed protein surface is shown in order to highlight the water
data.It is also worth remarking that
a hydration site should not be
directly equated with a single bound water, as the sites’ occupancies
are typically well below one. Thus, for the HSA sites within 5 Å
of the bound ligands, we find a mean occupancy of 0.58 waters molecules,
with a standard deviation of 0.22. These values are similar to those
reported previously for Factor Xa in an early implementation of WaterMap:[12] the occupancies reported in Table 1 of the prior
report correspond to a mean of 0.51 water molecules per hydration
site, with a standard deviation of 0.22.
GIST
scoring functions
A GIST-based
scoring function based on both the local energy and one-body entropy
of displaced water yields good correlations with the measured binding
affinity differences for the 28 congeneric ligand pairs (Table 1, top row). The mean R2 value is 0.94 across 10 training sets drawn at random from the full
set of 28 pairs, and a high correlation (R2 = 0.84) is preserved when the optimized parameters are applied to
the respective test sets. The best results are obtained when voxels
are excluded if their water density is less than about twice the bulk
density (gco = 1.9). Each voxel which
furthermore meets the energy cutoff (about 0.6 kcal/mol/water above
bulk) contributes about −0.2 kcal/mol of affinity. The entropy
terms are somewhat puzzling: the entropy cutoff of 3.3 kcal/mol/water
is much higher than the energy cutoff of 0.6 kcal/mol/water, and the
positive value of Saff seemingly indicates
that displacing low entropy water disfavors binding, rather than favoring
it, as might be anticipated. These results suggest that the energy
term may be more meaningful; this topic is further discussed below.
Table 1
Scoring Functions Based on GIST and
HSA Water Calculationsa
test R2
train R2
gco or Rco
Eco
Sco
Eaff
Saff
C
actual
GIST
E/S
0.84 ± 0.06
0.94 ± 0.02
1.90 ± 0.32
0.62 ± 0.42
3.34 ± 0.46
–0.18 ± 0.05
0.08 ± 0.05
–0.23 ± 0.20
E
0.85 ± 0.05
0.92 ± 0.02
1.62 ± 0.57
0.65 ± 0.42
N.A.
–0.14 ± 0.04
N.A.
–0.23 ± 0.19
S
0.35 ± 0.11
0.49 ± 0.11
2.52 ± 0.69
N.A.
2.13 ± 0.44
N.A.
–0.10 ± 0.08
–0.50 ± 0.31
HSA Nocap
E/S
0.80 ± 0.07
0.88 ± 0.05
2.81 ± 0.28
0.28 ± 0.08
1.87 ± 0.50
–1.70 ± 0.31
0.09 ± 0.60
–0.36 ± 0.26
E
0.83 ± 0.07
0.85 ± 0.06
2.91 ± 0.07
0.30 ± 0.06
N.A.
–1.50 ± 0.15
N.A.
–0.43 ± 0.14
S
0.67 ± 0.10
0.66 ± 0.11
2.57 ± 0.17
N.A.
2.25 ± 0.05
N.A.
–1.52 ± 0.33
–0.68 ± 0.33
HSA Cap
E/S
0.66 ± 0.24
0.86 ± 0.07
2.35 ± 0.19
0.48 ± 0.52
2.10 ± 0.62
–2.44 ± 2.51
–0.18 ± 1.75
–0.47 ± 0.23
E
0.80 ± 0.08
0.83 ± 0.06
2.38 ± 0.06
0.28 ± 0.08
N.A.
–2.31 ± 0.31
N.A.
–0.53 ± 0.19
S
0.69 ± 0.12
0.72 ± 0.08
2.53 ± 0.09
N.A.
2.09 ± 0.26
N.A.
–2.38 ± 0.38
–0.50 ± 0.26
shuffled
GIST
E/S
0.42 ± 0.20
0.85 ± 0.06
2.45 ± 0.80
1.74 ± 1.14
1.19 ± 1.01
–0.52 ± 1.94
–0.44 ± 1.12
–0.64 ± 0.61
E
0.35 ± 0.20
0.62 ± 0.15
2.50 ± 0.75
0.36 ± 0.93
N.A.
0.24 ± 2.17
N.A.
–0.72 ± 0.70
S
0.44 ± 0.16
0.69 ± 0.11
1.91 ± 1.24
N.A.
1.56 ± 1.17
N.A.
–0.89 ± 0.65
–0.59 ± 0.45
HSA Nocap
E/S
0.44 ± 0.18
0.82 ± 0.08
2.89 ± 0.13
–1.52 ± 1.62
1.84 ± 1.39
0.12 ± 1.84
–0.88 ± 1.75
–0.57 ± 0.30
E
0.46 ± 0.14
0.50 ± 0.16
2.81 ± 0.06
–3.30 ± 2.20
N.A.
–0.90 ± 0.34
N.A.
–0.79 ± 0.32
S
0.38 ± 0.18
0.57 ± 0.19
2.85 ± 0.10
N.A.
2.02 ± 1.17
N.A.
–1.26 ± 0.90
–0.75 ± 0.31
HSA Cap
E/S
0.36 ± 0.20
0.78 ± 0.13
2.85 ± 0.09
–1.54 ± 2.26
1.96 ± 1.49
–0.34 ± 4.79
–3.32 ± 4.05
–0.67 ± 0.30
E
0.17 ± 0.16
0.36 ± 0.17
2.86 ± 0.12
–2.48 ± 1.78
N.A.
–1.48 ± 1.30
N.A.
–0.95 ± 0.37
S
0.28 ± 0.24
0.47 ± 0.22
2.89 ± 0.13
N.A.
2.00 ± 0.99
N.A.
–2.06 ± 1.36
–0.90 ± 0.28
The quality
of each scoring function
is reported in terms of coefficients of determination (R2). The test-set results (test R2) are the most meaningful; the training-set results (train R2) are included for completeness. All quantities
are averages across the 10 splits, with associated standard deviations.
Results are presented for the actual hydration data and for hydration
data shuffled among voxels (GIST) or hydration sites (HSA), and HSA
results are presented for capped and uncapped displacement functions
(see Methods). Scoring functions were constructed
based on both energy and entropy data (E/S), energy data only (E), and entropy data
only (S); the parameters are defined in the Methods section.
The quality
of each scoring function
is reported in terms of coefficients of determination (R2). The test-set results (test R2) are the most meaningful; the training-set results (train R2) are included for completeness. All quantities
are averages across the 10 splits, with associated standard deviations.
Results are presented for the actual hydration data and for hydration
data shuffled among voxels (GIST) or hydration sites (HSA), and HSA
results are presented for capped and uncapped displacement functions
(see Methods). Scoring functions were constructed
based on both energy and entropy data (E/S), energy data only (E), and entropy data
only (S); the parameters are defined in the Methods section.It is of interest to visualize the energy and entropy
scoring regions;
i.e., those voxels which meet either both the density and energy cutoffs
or both the density and entropy cutoffs, respectively. As shown in
Figure 2 (left), the energy scoring region
tends to localize at extended regions of the nonpolar surface. These
are places where water molecules lose more energy due to desolvation
by the protein than they gain by making new interactions with the
protein. (The projection from three to two dimensions makes some scoring
regions appear deceptively close to polar protein atoms.) The analogous
visualization for the energy-only scoring function, which is discussed
below, is very similar (data not shown). The locations of the entropy
scoring regions (Figure 2, middle) are more
complicated. Some (white arrows) lie at the surface of polar atoms;
others (pink arrows) lie at hydrophobic locations. The frequent localization
of entropy scoring regions at polar surfaces, and the unexpectedly
positive value of Saff, likely reflects
the fact that polar surfaces can tightly bind waters, leading to unfavorable
entropies but favorable energies (Figure 2,
right) typical of traditional entropy–enthalpy compensation.
Displacement of water from such regions may be net unfavorable, and
this might help account for the positive value of Saff. On the other hand, the displacement of entropically
disfavored water from subregions where energy is not particularly
favorable should favor ligand binding. Thus, the energetically mixed
nature of the entropic scoring regions further suggests that it may
not give a clear signal in the overall scoring function.
Figure 2
GIST contours
in binding site of FXa, in molecular surface representation.
Left: Energy scoring region, which meets both density and energy cutoff
criteria for the combined energy–entropy scoring function.
Middle: Entropy scoring region, which meets both density and entropy
cutoff criteria for the combined energy–entropy scoring function.
Right: Normalized total water energy, contoured at −2.6 kcal/mol/water.
White arrows: entropy scoring subregions at the polar surface. Magenta
arrows: entropy scoring subregions at hydrophobic surface. Red: oxygen.
Blue: nitrogen. Pale blue: carbon. Graphic generated with VMD.[58]
GIST contours
in binding site of FXa, in molecular surface representation.
Left: Energy scoring region, which meets both density and energy cutoff
criteria for the combined energy–entropy scoring function.
Middle: Entropy scoring region, which meets both density and entropy
cutoff criteria for the combined energy–entropy scoring function.
Right: Normalized total water energy, contoured at −2.6 kcal/mol/water.
White arrows: entropy scoring subregions at the polar surface. Magenta
arrows: entropy scoring subregions at hydrophobic surface. Red: oxygen.
Blue: nitrogen. Pale blue: carbon. Graphic generated with VMD.[58]In order to study the significance of the energy and entropy
terms
in more detail, we also considered a scoring function based only on
density and energy, and another based only on density and entropy.
As shown in Table 1 (second and third rows),
the energy-only scoring function performs just as well (R2 = 0.85 for the test sets) as the original one based
on both energy and entropy. In addition, the fitted parameters are
similar in magnitude and sign to those of the original energy term.
In contrast, the entropy-only scoring function yielded a poor correlation
with experimental data (R2 = 0.35 for
the test sets), and the sign of Saff is
reversed relative to that in the original scoring function. Thus,
the hydration energy alone carries all of the predictive power of
the GIST-based scoring function, at least for FXa with this ansatz.
This result is consistent with the analysis of the combined energy/entropy
scoring function, above.As a further check of the statistical
significance of the present
results, we shuffled the GIST data among voxels and then refitted
all three GIST-based scoring functions using the shuffled data. The
results are presented in the lower half of Table 1. This procedure was repeated with three independent randomizations.
Although correlations as high as R2 =
0.82 are obtained for the training sets, the test-set results are
all poor (R2 ≈ 0.4 ± 0.2).
This result supports the significance of the high correlations obtained
with the true (unshuffled) data and indicates that the low correlations
observed for the entropy-only scoring function should be viewed as
statistically insignificant.Finally, we examined the amount
of MD simulation data required
to generate the high correlations observed above. First, we reanalyzed
the original set of MD frames, which had been saved at 1 ps intervals.
For increasing numbers of frames from this set, we reran the 10 training
and test calculations and computed the mean and standard deviation
of the resulting 10 values of R2. As shown
in Figure 3 (left), the value of R2 and its standard deviation (error bars) appear to plateau
at about 60 ns for the combined energy–entropy scoring function
(top left). Interestingly, the plateau starts much earlier, at about
30 ns, for the energy-alone scoring function (lower left). It appears
that the more slowly convergent entropy term delays convergence of
the energy–entropy scoring function, so that leaving out the
entropy term in the energy-only scoring function speeds convergence.
This result is, again, consistent with the irrelevance of the entropy
term in the GIST scoring functions. We then asked whether shorter
MD simulations might give better convergence if frames were saved
at shorter time intervals. First, we reprocessed the first 20 ns of
the same simulation, now processing frames saved at 0.05 ps. As shown
in Figure 3 (top middle), the combined energy/entropy
scoring function now converges somewhat more quickly, especially as
measured by the reduction in the standard deviation of R2 across the 10 train/test calculations. The improvement
is more marked for the energy-only scoring function, as well-converged
results are now available after only 10 ns of simulation time, although
the standard deviation of R2 remains slightly
higher than that from the 100 ns simulation. We then extended the
100 ns simulation by 20 ns, saving frames at 0.05 ps intervals, and
examined convergence over this short simulation. The results are further
improved, with good convergence and tight error bars achieved within
about 5 ns for the energy-only scoring function, and 10 ns for the
combined energy/entropy one. The improvement in these results, relative
to those from the early 20 ns segment, suggests that the water structure
continued to equilibrate somewhat during the 100 ns run, so it would
have been appropriate to use a somewhat longer equilibration period
in the MD protocol. In summary, at least for FXa, a simulation of
10 ns or less suffices to gain all the benefit of these GIST scoring
functions.
Figure 3
Convergence of R2 values for GIST scoring
functions, as a function of simulation duration. Top row: combined
energy/entropy scoring function. Bottom row: energy-only scoring function.
Left: 100 ns simulation, frames saved at 1 ps intervals. Middle: The
first 20 ns of the same 100 ns simulation, frames saved every 0.05
ps. Right: 20 ns simulation initiated from the last frame of the 100
ns simulation, frames saved every 0.05 ps.
Convergence of R2 values for GIST scoring
functions, as a function of simulation duration. Top row: combined
energy/entropy scoring function. Bottom row: energy-only scoring function.
Left: 100 ns simulation, frames saved at 1 ps intervals. Middle: The
first 20 ns of the same 100 ns simulation, frames saved every 0.05
ps. Right: 20 ns simulation initiated from the last frame of the 100
ns simulation, frames saved every 0.05 ps.
HSA-Based Scoring Functions
As discussed
in the Methods section, a prior HSA scoring
function was constructed in such a way that the thermodynamic contribution
of each hydration could be counted multiple times, if more than one
ligand atom lay within a cutoff distance.[12] This function gave good results but is arguably nonphysical, because
once the water in a hydration site has been displaced, it cannot be
displaced again. Here, we present results for a set of similarly constructed
HSA-based scoring functions in which, as before, a hydration site
can be counted multiple times; as well as a second set, in which the
contribution of each hydration site is capped, so that it can only
contribute once. For both the uncapped and capped models, we examine
scoring functions based on energy and entropy, energy only, and entropy
only, as also done for GIST.With the original uncapped approach,
where a hydration site can contribute multiple times, the combined
energy/entropy scoring function provides high correlation coefficients
for both the training (R2 = 0.88) and
test (R2 = 0.80) sets (Table 1, fourth row, marked E/S). These values of R2 are slightly
lower than those for GIST, but the difference is not statistically
significant. Interestingly, the fitted value of Eaff (−1.70 kcal/mol) is much greater than that
of Saff, which in fact is assigned a positive
sign (0.09 kcal/mol). In addition, the training procedure puts much
sharper constraints on the energy terms than the entropy terms, as
indicated by the fact that the standard deviations of Eo and Eaff (0.08 and 0.31
kcal/mol, respectively) are lower than those for So and Saff (0.50 and 0.60
kcal/mol). Thus, the training results suggest that the entropy term
is of lower importance than the energy term, much as observed in the
case of GIST. Accordingly, a scoring function based entirely on energy
(Table 1, row 5, marked E)
performs as well as the one with both energy and entropy, and furthermore
yields values of Rco, Eco, and Eaff very similar
to those of the combined energy–entropy scoring function. On
the other hand, an entropy-only scoring function (Table 1, row 6, marked S) also performs fairly well, with a value
of R2 = 0.66 for the test sets, and the
fitted value of Saff, −1.52 kcal/mol,
is essentially the same as the fitted value of Eaff for the energy-only scoring function, −1.50 kcal/mol.
Thus, the entropy-only scoring function appears to largely replicate
the energy-only scoring function, with some drop in the correlation
with experimental data. These results are similar to those for GIST,
as in both cases, the energy-only scoring function performed as well
as the energy/entropy one, while the entropy-only one is worse. However,
the decline in performance on going to entropy-only is much greater
for GIST than for HSA.In the second variant of the HSA-based
scoring functions, no hydration
site can contribute more than one-fold to the difference between two
ligands’ affinities. Imposing this physically reasonable cap
on the contribution of each site reduced the experimental correlation
of the scoring function with experimental data to 0.66 for the test
sets with the combined energy–entropy scoring function but
had essentially no effect on the correlations for the energy-only
and entropy-only test-set results. Thus, the results remain consistent
with a conclusion that the energy term alone is enough to gain all
the benefit of the scoring functions. The only other major change,
relative to the uncapped version of the scoring function, is that
the fitted values of Eaff and Saff changed from about −1.5 kcal/mol
to about −2.3 kcal/mol, except that Saff for the combined energy/entropy scoring function remained
small. These increases presumably have the effect of compensating
for the reduced values of the displacement function, i.e., for the
fact that dcap ≤ dnocap in eqs 2 and 3.In order to further test the statistical significance of the HSA
scoring results, we shuffled the energies and entropies among hydration
sites and then refitted all six HSA-based scoring functions with the
shuffled data. As summarized in the last six rows of Table 1, the uncapped scoring functions now yield poor
correlations with experimental data (R2 ≈ 0.4), just as observed for GIST, and the correlations for
the capped scoring functions fall even lower. These results support
the significance of the correlations obtained with the actual (unshuffled)
data. The fact that the shuffled results are worse for the capped
HSA scoring functions suggests, but does not prove, that applying
the physically reasonable cap might reduce spurious correlations.It is of interest to examine the locations of HSA scoring sites
relative to the GIST scoring regions. We focus here on the energy-only
scoring functions, since it is not clear that including entropy adds
useful information, at least for FXa within the present functional
form. As shown in Figure 4 (left), the HSA
sites do not capture the complex shapes of the GIST scoring regions,
and substantial parts of the GIST scoring region are entirely missed
by the HSA scoring sites. These regions could presumably be used to
direct rational lead drug design. The spatial relationship of the
scoring sites to the ligands studied here may be appreciated from
Figure 4 (right), which shows the van der Waals
surface of a representative ligand.
Figure 4
Comparison of GIST and HSA scoring locations,
in the context of
the FXa binding site (gray). Left: Scoring sites from the capped HSA
energy-only scoring function (blue spheres) and energy scoring regions
from the GIST energy-only scoring function (orange contours). GIST
results are present only within the limits of the GIST grid (orange
lines). Right: a representative ligand, PDB HET ID 4QC, in van der
Waals representation (pink).
Comparison of GIST and HSA scoring locations,
in the context of
the FXa binding site (gray). Left: Scoring sites from the capped HSA
energy-only scoring function (blue spheres) and energy scoring regions
from the GIST energy-only scoring function (orange contours). GIST
results are present only within the limits of the GIST grid (orange
lines). Right: a representative ligand, PDB HET ID 4QC, in van der
Waals representation (pink).
Discussion
A key result of this first
application of GIST to protein–ligand
modeling is that the detailed representation of water structure and
thermodynamics it affords works at least as well in a simple scoring
function as the prior[12] and present site-based
HSA implementations. In addition, the GIST results converge well within
5–10 ns of MD simulation time, depending upon whether one uses
the energy-only model or the energy/entropy model. These simulation
times are commensurate with those normally used for the HSA approach;
see the Methods. Thus, the two approaches
provide similar overall performance in the present application. Additional
considerations include the fact that site-based approaches may require
less data storage and that they paint a simple picture of water structure
in a protein binding site. On the other hand, the GIST grid files
produced are still small (<1 MB), and we anticipate that the more
detailed rendering of hydration structure and thermodynamics afforded
by the grid approach will be useful for insight and prediction. Note,
in particular, that there are HSA sites which are only partly occupied
by the more refined GIST scoring regions, as well as GIST scoring
regions that are not identified by the HSA model, as evident from
Figure 4. More generally, the fact that GIST
is a more direct representation of the common underlying inhomogeneous
solvation theory facilitates the interpretation of its results and
provides clear pathways to future enhancements, such as the incorporation
of higher-order correlations, as touched on below. A fast implementation
of GIST, based upon focused grand canonical Monte Carlo sampling,[59] may be of particular interest in the near term
for high-throughput applications.Another novel and striking
result of the present study is that
both the GIST and HSA models provide clear signals that ligands can
gain affinity by displacing energetically unfavorable binding site
water, whereas the displacement of entropically unfavorable water
seems to play a negligible role. The energetically unfavorable water
highlighted by this study localizes at hydrophobic patches of the
protein surface, perhaps especially in concave regions where water
molecules are expected to lose hydrogen bonds.[60] The concept that energy may outweigh entropy in cases of
strong hydrophobic binding has been raised before,[61,62] in both experimental[8,63−66] and computational[67,68] contexts. Nonetheless, water entropy is typically thought to play
a central role in hydrophobic binding.[69−72] Here, interestingly, neither
the GIST or HSA models made a compelling case that the displacement
of entropically unfavorable water consistently enhances affinity.
We conjecture that the lack of a clear correlation of water entropy
with affinity may reflect the fact that low water entropy often results
from energetically favorable water–protein interactions, so
that water may actually be quite stable in many locations where its
entropy is low. This view is consistent with the experimental observation
that the entropy of hydration of small ions is strongly negative,[73] although the free energy is also strongly negative.
It may be possible in the future to devise a more sophisticated scoring
function that would account for the enthalpy–entropy compensation
between stabilizing energy and destabilizing entropy and focus on
regions where this compensation breaks down, such as in the binding
cavity of the synthetic host molecule cucurbit[7]uril.[28] It is also worth mentioning that different protein
binding sites affect water differently, so a different result might
be obtained for a different protein. Finally, it may be that capturing
the entropic aspect of the hydrophobic effect requires accounting
for water–water correlations, which are absent from the one-body
entropy considered here. If so, the entropy term may become more important
once pairwise correlations have been incorporated into GIST’s
entropy calculations.Our observation that the displacement
of high energy water plays
a greater role in ligand scoring than displacement of low entropy
water appears to contrast with a prior HSA-based study of the same
system, where the fitted scoring function placed approximately equal
weight on water energy and entropy.[12] However,
the range of fitted values for Saff in
our HSA models nearly spans the value of 0.66 kcal/mol for the corresponding
parameter in the prior study, Srwd. In
addition, the prior study did not examine the uncertainty in its fitted
energy and entropy parameters or evaluate a scoring function based
purely on water energy. Therefore, the results of these two studies
should not be regarded as inconsistent.In summary, the grid-based
GIST method of extracting information
about hydration thermodynamics from MD simulations with explicit water
has provided encouraging results in its first application to protein–ligand
binding. It thus appears to hold significant promise as a broadly
applicable method of understanding the role of binding site water
in protein–ligand binding, and as a tool to improve the accuracy
of methods for discovering high affinity targeted ligands. It is anticipated
that the detailed representation of water distributions and thermodynamics
which GIST affords will make it particularly informative. We are currently
working to develop such applications and to release an open-source
implementation of GIST within the AmberTools[36] software package for others to study and use.
Authors: H M Berman; J Westbrook; Z Feng; G Gilliland; T N Bhat; H Weissig; I N Shindyalov; P E Bourne Journal: Nucleic Acids Res Date: 2000-01-01 Impact factor: 16.971
Authors: M Adler; D D Davey; G B Phillips; S H Kim; J Jancarik; G Rumennik; D R Light; M Whitlow Journal: Biochemistry Date: 2000-10-17 Impact factor: 3.162
Authors: Donald J P Pinto; Michael J Orwat; Mimi L Quan; Qi Han; Robert A Galemmo; Eugene Amparo; Brian Wells; Christopher Ellis; Ming Y He; Richard S Alexander; Karen A Rossi; Angela Smallwood; Pancras C Wong; Joseph M Luettgen; Alan R Rendina; Robert M Knabb; Lawrence Mersinger; Charles Kettner; Steven Bai; Kan He; Ruth R Wexler; Patrick Y S Lam Journal: Bioorg Med Chem Lett Date: 2006-05-30 Impact factor: 2.823
Authors: H Nar; M Bauer; A Schmid; J M Stassen; W Wienen; H W Priepke; I K Kauffmann; U J Ries; N H Hauel Journal: Structure Date: 2001-01-10 Impact factor: 5.006
Authors: Michael E Wall; Gaetano Calabró; Christopher I Bayly; David L Mobley; Gregory L Warren Journal: J Am Chem Soc Date: 2019-03-11 Impact factor: 15.419
Authors: Ido Y Ben-Shalom; Charles Lin; Tom Kurtzman; Ross C Walker; Michael K Gilson Journal: J Chem Theory Comput Date: 2019-03-13 Impact factor: 6.006
Authors: Trent E Balius; Marcus Fischer; Reed M Stein; Thomas B Adler; Crystal N Nguyen; Anthony Cruz; Michael K Gilson; Tom Kurtzman; Brian K Shoichet Journal: Proc Natl Acad Sci U S A Date: 2017-07-31 Impact factor: 11.205
Authors: Camilo Velez-Vega; Daniel J J McKay; Tom Kurtzman; Vibhas Aravamuthan; Robert A Pearlstein; José S Duca Journal: J Chem Theory Comput Date: 2015-10-21 Impact factor: 6.006