Harrison Green1, Jacob D Durrant1. 1. Department of Biological Sciences, University of Pittsburgh, Pittsburgh, Pennsylvania 15260, United States.
Abstract
Lead optimization, a critical step in early stage drug discovery, involves making chemical modifications to a small-molecule ligand to improve properties such as binding affinity. We recently developed DeepFrag, a deep-learning model capable of recommending such modifications. Though a powerful hypothesis-generating tool, DeepFrag is currently implemented in Python and so requires a certain degree of computational expertise. To encourage broader adoption, we have created the DeepFrag browser app, which provides a user-friendly graphical user interface that runs the DeepFrag model in users' web browsers. The browser app does not require users to upload their molecular structures to a third-party server, nor does it require the separate installation of any third-party software. We are hopeful that the app will be a useful tool for both researchers and students. It can be accessed free of charge, without registration, at http://durrantlab.com/deepfrag. The source code is also available at http://git.durrantlab.com/jdurrant/deepfrag-app, released under the terms of the open-source Apache License, Version 2.0.
The
process of discovering and developing a new drug is both expensive
and time-consuming. In the earliest steps, researchers seek to identify
hit compounds that are active against a disease-implicated protein
of interest. These hits must then undergo lead optimization, which
involves adding or swapping chemical moieties with the goal of improving
binding affinity or other chemical properties related to absorption,
distribution, metabolism, excretion, and toxicity.[1]

Computer-aided drug discovery (CADD) can accelerate these early-stage
steps. For example, structure-based virtual screening (i.e.,
computer docking) can identify compounds that are promising candidate
hits for subsequent experimental testing. Once a hit has been identified,
a number of computational techniques can also further lead optimization,
ranging from docking-based methods such as AutoGrow[2−4] to more advanced,
molecular dynamics (MD) “alchemical” methods[5] such as thermodynamic integration,[6] single-step perturbation,[7] and free energy perturbation.[8]

We recently created a 3D convolutional neural network called DeepFrag[9] that aims to further lead optimization. To train
DeepFrag, we assembled a large set of crystal structures and systematically
removed fragments from the cocrystallized ligands. We then asked DeepFrag
to predict a molecular fingerprint (vector) describing the missing
fragment. The predicted fingerprints most closely matched the corresponding
missing fragments roughly 60% of the time when selecting from a reference
library of ∼6,500 fragments. Remarkably, even when the network
predicts the wrong fragment, the top predictions are often chemically
similar and may well be more optimal. In prospective practice, DeepFrag
can also be used to add novel fragments to an identified lead, in
addition to swapping existing moieties.

To ensure usability,
we took great care to document the DeepFrag
Python source code and even created a Google Colab notebook so users
can test the network without having to download or locally install
any software, libraries, or dependencies,[10] but even this approach limits accessibility to those who are experts
in the field. While the negative impact of poor usability on software
adoption among scientists should not be understated, it is particularly
problematic in educational settings. Many students are unfamiliar
with Python, and expecting students to download, install, and use
a command-line program is often impractical.

To address these
usability challenges, we have created the DeepFrag
browser app. By “browser app”, we mean software that
runs on users’ local computers, entirely in a web browser.
Browser apps have some notable advantages over server apps, which
run calculations on remote resources (“in the cloud”).
For example, rather than require users to upload their proprietary
data to a third-party server, browser apps download the software required
to run the calculations locally in the browser’s secure sandboxed
environment. Thanks to this de facto distributed
approach, browser apps do not require an extensive and difficult-to-maintain
remote computer infrastructure. Furthermore, calculations begin immediately
on the user’s own computer, so there is no need to wait in
lengthy queues for limited remote resources to become available.

The DeepFrag browser app will be a useful tool for the CADD research
and educational communities. A working implementation can be accessed
free of charge at http://durrantlab.com/deepfrag, without registration. Its source code is available at http://git.durrantlab.com/jdurrant/deepfrag-app, released under the terms of the Apache License, Version 2.0.
Results and Discussion
Input Parameters Tab
To run the
DeepFrag browser app,
users need only visit http://durrantlab.com/deepfrag, where they will encounter the “Input Parameters”
tab illustrated in Figure 1, on the left. In the “Input Receptor and Ligand Files”
subsection (Figure 1A), users can specify the protein receptor and ligand files for optimization
in any of several popular formats. The contents of these files are
loaded into the browser’s memory, but they are never transmitted/uploaded
to any third-party server. Users who wish to simply test DeepFrag
can instead click the “Use Example Files” button (not
shown) to load a preprepared structure of H. sapiens peptidyl-prolyl cis–trans isomerase NIMA-interacting 1 (HsPin1p) bound to a small-molecule ligand (PDB 2XP9[11]).
Figure 1
Input parameters tab (on the left) includes the (A) “Input
Receptor and Ligand Files” and (B) “Molecular Viewer”
subsections, as well as the (C) save/load and “Start DeepFrag”
buttons. The Output tab (on the right) includes the (D) suggested-fragments
table and (E) “Output Files” subsections. Some components
are not shown to simplify the presentation.
The “Molecular Viewer” subsection (Figure 1B) contains a 3Dmol.js molecular
viewer[12] where the specified files are
displayed. This subsection also includes two toggle buttons. The “Delete
Atom” button allows users to remove ligand atoms from the structure
by clicking on them. We included this optional feature anticipating
that many users will wish to use DeepFrag to replace existing ligand
moieties. The “Select Atom as Growing Point” toggle
button allows users to indicate which ligand atom should serve as
the growing point (i.e., connection point) that connects the predicted
fragments to the parent ligand molecule. After users click the appropriate
ligand atom, a yellow transparent sphere indicates the location of
the growing point.

A slider allows the user to control the grid-ensemble size for
ensemble (consensus) predictions (not shown in Figure 1). In the original DeepFrag publication,[9] we evaluated the impact of sampling multiple
random grid rotations for each protein/ligand input. A final ensemble
fragment fingerprint was then calculated by averaging the predicted
fingerprints associated with each rotated grid. This approach led
to modest improvements in accuracy (∼1.5% TOP-1 accuracy in
our tests when considering 32 rotations vs one[9]). To match the original DeepFrag implementation, the browser app
performs the full 32 rotations by default. Users who wish to accelerate
the in-browser calculation can optionally specify fewer rotations.

A checkbox allows users to control how the DeepFrag browser app
performs grid rotations (not shown in Figure 1). In the original DeepFrag implementation,[9] each of the ensemble-member grids is rotated
randomly. To match the original implementation, the browser app also
performs random rotations by default, but users can optionally instruct
the browser app to (1) always rotate in 90° increments and (2)
further consider grid reflections. These operations can be performed
rapidly using the TensorFlow.js reverse and transpose functions, thus speeding the in-browser calculation.
Given that the input structures are unlikely to have rotational or
reflection symmetry along any of the primary axes, using 90°
rotations and reflections should not introduce any bias into the predictions,
but this optional, accelerated approach does differ from the thoroughly
tested DeepFrag implementation described in our previous manuscript.[9] We therefore suggest using the default browser-app
settings to match the original implementation precisely.

Several buttons are present at the bottom of the Input Parameters
tab (Figure 1C). The
“Temporary Save” button saves the specified parameters
(i.e., receptor/ligand files, growing point, etc.) to the browser’s
session storage. These same parameters can be later restored using
the “Load Saved Data” button. Otherwise, the user simply
clicks the “Start DeepFrag” button to begin the DeepFrag
run.

The DeepFrag browser app then generates tensor(s) from
the input
molecular structures and uses the trained model to predict appropriate
molecular fragments. The prediction typically takes at most half a
minute, even when running the DeepFrag app on a mobile phone.
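The ensemble option described above can be illustrated with a minimal sketch. It is not the browser app's code: it assumes a single-channel grid stored as nested Python lists, a hypothetical stand-in `toy_model`, and four 90° rotations about one axis rather than the 32 random rotations used by default. It only shows how a 90° rotation reduces to a transpose followed by a reversal, and how per-rotation fingerprints are averaged.

```python
from typing import Callable, List

Grid = List[List[List[float]]]  # indexed as grid[x][y][z]; single channel for simplicity


def transpose_xy(grid: Grid) -> Grid:
    """Swap the first two axes: out[y][x][z] = grid[x][y][z]."""
    return [[list(col) for col in row] for row in zip(*grid)]


def reverse_x(grid: Grid) -> Grid:
    """Reverse the first axis (a reflection of the grid)."""
    return [[list(col) for col in plane] for plane in reversed(grid)]


def rotate90_about_z(grid: Grid) -> Grid:
    """Rotate by 90 degrees about z: one transpose followed by one reversal."""
    return reverse_x(transpose_xy(grid))


def ensemble_fingerprint(
    grid: Grid, model: Callable[[Grid], List[float]], n_rotations: int = 4
) -> List[float]:
    """Average the fingerprints predicted for successive 90-degree rotations."""
    fingerprints = []
    current = grid
    for _ in range(n_rotations):
        fingerprints.append(model(current))
        current = rotate90_about_z(current)
    length = len(fingerprints[0])
    return [sum(fp[i] for fp in fingerprints) / len(fingerprints) for i in range(length)]


def toy_model(grid: Grid) -> List[float]:
    """Hypothetical stand-in for the trained network (NOT DeepFrag)."""
    total = sum(v for plane in grid for row in plane for v in row)
    return [grid[0][0][0], total]


example_grid = [[[1.0], [2.0]], [[3.0], [4.0]]]
averaged = ensemble_fingerprint(example_grid, toy_model, n_rotations=4)
```

In this toy case the first fingerprint element depends on orientation and is smoothed by the averaging, whereas the second (the grid total) is rotation invariant and is unchanged.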
Output Tab
DeepFrag displays the “Output”
tab once the calculations are complete (Figure 1, illustrated on the right). The “Visualization”
subsection again displays the specified receptor, ligand, and growing
point for user convenience (not shown). Below the molecular visualization,
a table shows the SMILES strings, molecular structures (generated
using SmilesDrawer[13]), and DeepFrag scores
of the top 20 predicted fragments, sorted from most to least promising
(Figure 1D). To generate
a 3D structure of a given “fused” parent/fragment composite
molecule (e.g., for computer docking), users can simply click the
corresponding fragment SMILES string to launch a separate molecule-preparation
web app called Fuser (see the Git repository for details). As with
the original DeepFrag implementation, fragment scores are calculated
by considering the cosine similarity[14] between
the predicted fingerprint vector and the fingerprint vector of the
corresponding fragment.

The “Output Files” subsection (Figure 1E) allows
users to directly view DeepFrag output files. Users can also press
the associated “Download” buttons to save the files
to disk. These files include a more complete list of the predicted
fragments (TSV format), the 3D coordinates of the selected growing
point (JSON format), and the receptor and ligand files used for analysis
(PDB format).
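The cosine-similarity scoring described above can be sketched as follows. This is an illustrative outline, not the app's implementation: the four-element fingerprints are invented toy values, and the function names `cosine_similarity` and `rank_fragments` are hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two fingerprint vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_fragments(
    predicted: Sequence[float],
    library: Dict[str, Sequence[float]],
    top_k: int = 20,
) -> List[Tuple[str, float]]:
    """Score every library fragment against the predicted fingerprint, best first."""
    scored = [(smiles, cosine_similarity(predicted, fp)) for smiles, fp in library.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]


# Toy fingerprints (assumed values for illustration only).
toy_library = {
    "*C": [1.0, 1.0, 0.0, 0.0],
    "*CC": [0.0, 1.0, 0.0, 1.0],
    "*CC(=O)O": [1.0, 0.0, 1.0, 0.0],
}
predicted_fingerprint = [1.0, 0.0, 1.0, 0.0]
ranking = rank_fragments(predicted_fingerprint, toy_library, top_k=20)
```

Because cosine similarity depends only on the direction of the vectors, a fragment whose fingerprint is a scaled copy of the prediction would receive the same score as an exact match.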
Compatibility
We have tested the
DeepFrag browser app
on the browser/operating-system combinations shown in Table 1. It works well on both desktop
and mobile operating systems, as well as on all major browsers (e.g.,
Chrome, Edge, Firefox, and Safari).
Table 1
DeepFrag Browser and Operating-System Compatibility Tests
browser                   operating system
Chrome 88.0.4324.87       macOS 10.14.5
Firefox 84.0              macOS 10.14.5
Safari 13.1.1             macOS 10.14.5
Chrome 87.0.4280.141      Windows 10.0.19041 Home
Firefox 84.0.2            Windows 10.0.19041 Home
Edge 87.0.664.75          Windows 10.0.19041 Home
Chrome 87.0.4280.141      Android 10
Firefox 84.1.4            Android 10
Safari 14                 iPhone SE iOS 14.3
Chromium 87.0.4280.141    Ubuntu Linux 18.04.5 LTS
Firefox 84.0.2            Ubuntu Linux 18.04.5 LTS
Example of Use: HsPin1p
In our original
manuscript describing the DeepFrag model,[9] we provided several test cases showing how it can be used for lead
optimization. To show that the original
implementation and the browser app give comparable results, we here
reproduce one of those tests, which focused on the cancer target HsPin1p[15] bound to a phenyl-imidazole
ligand (IC50: 8 μM; PDB 2XP9[11]). We chose
this protein/ligand complex because neither the protein nor the ligand
was included in the DeepFrag training or validation sets. DeepFrag
is nondeterministic because it randomly rotates the voxel grids used
as input; that is, the program by design gives slightly different
results every time it is run. We thus do not expect the browser implementation
to always suggest fragments that are identical to those reported previously;
rather, we expect the fragments to be identical in many cases and
at least similar otherwise (Figure 2).
Figure 2
A crystal structure of HsPin1p bound
to a phenyl-imidazole
ligand (PDB 2XP9[11]). We reassessed a carboxyl fragment
(A, in pink) and two phenyl fragments (B, in yellow; C, in green)
using the DeepFrag browser app. Select labeled protein residues are
shown in a sticks representation. The figure was rendered using BlendMol.[16]
We first used the DeepFrag browser app to remove carboxylate A
(Figure 2, highlighted
in pink) and to predict appropriate replacement moieties at the same
position. Like the original DeepFrag implementation,[9] the browser app also predicted the correct (known) carboxylate
moiety. This choice is sensible given that the carboxylate enables
electrostatic interactions with K63 and R69 and hydrogen bonds with
C113 and S114.

We used the same approach to evaluate phenyl
B (Figure 2, highlighted
in yellow). The
original DeepFrag implementation suggested a bicyclic replacement,
*c1ccnc2c1C(=O)N(C)C2, at this position.[9] When run from the browser app, this same fragment
ranked second. It likely scores well because it preserves π–π
interactions with H59 while improving hydrophobic interactions with
L122 and M130.[9] The top-ranked browser-app fragment was also a bicyclic replacement, *c1ncnc2c1C(C)CC2O, that is similarly composed of a six-membered aromatic ring fused to
a five-membered nonaromatic ring. Given these structural similarities,
the top browser-app fragment may adopt a similar binding pose within
the HsPin1p pocket.

Finally, we removed phenyl
C (Figure 2, highlighted
in green) and similarly used
the browser app to predict appropriate replacements. The original
DeepFrag implementation suggested methyl and ethyl replacements at
this location, likely because they maintain potential hydrophobic
interactions with the R68 side chain.[9] The
browser app suggested the same two fragments, though it preferred
the ethyl replacement over the methyl.

This work demonstrates
that the original DeepFrag implementation
and the browser app make comparable fragment predictions, as expected
given that the two implementations are functionally identical.
Example of Use: DNA Gyrase B (24 kDa Domain)
Having
applied the DeepFrag browser app to an established test case (HsPin1p), we now provide a second, novel example that illustrates
how DeepFrag can suggest fragment additions that improve binding affinity.
We considered a recent fragment-based lead-optimization project undertaken
by Ushiyama et al., which identified several E. coli DNA gyrase B (24 kDa domain) inhibitors, including one in the low-nanomolar
range.[17] Importantly, none of the crystal
structures associated with the Ushiyama study (PDB IDs: 6KZV, 6KZX, 6KZZ, and 6L01[17]) were included in the original training, validation, or
testing sets used to create the DeepFrag model. While some unassociated
structures of E. coli DNA gyrase B were included
in the training set, they were bound to ligands that are quite distinct
(PDB IDs: 4DUH,[18] 6F86,[19] 6F94,[19] and 6F8J[19]).

A number of the inhibitors
that Ushiyama et al. identified share the same inhibitory 8-(methylamino)-2-oxo-N-phenyl-1,2-dihydroquinoline-3-carboxamide scaffold, which
itself has an IC50 value of 0.24 μM per isothermal
titration calorimetry.[17] We applied DeepFrag
to this scaffold to see if it could identify the same fragment additions
that Ushiyama et al. selected and experimentally tested. There is
no crystal structure of the scaffold itself bound to DNA gyrase B,
but multiple crystal structures of bound analogues containing the
scaffold are nearly superimposable. We therefore created a model of
the receptor/scaffold complex from the 6KZZ structure[17] by simply removing any ligand atoms that did not belong to the scaffold
itself.

Ushiyama et al. tested a number of fragment additions
at the phenyl
para position. We used the DeepFrag browser app to evaluate this same
position (Figure 3).
The top DeepFrag-suggested addition was an acetic acid, *CC(=O)O,
which Ushiyama et al. had also selected and tested. In their hands,
this addition improved the IC50 value to 0.018 μM,[17] likely because it enables electrostatic interactions
with R136 and perhaps R76. The second DeepFrag-suggested fragment
addition at the para position was a carboxyl group. This compound
had also been tested and was found to have an improved IC50 value of 0.0017 μM.[17]
Figure 3
An inhibitory
scaffold bound to E. coli DNA gyrase
B. The protein is shown as a blue ribbon, and key amino acids are
shown as thin sticks (PDB ID 6KZZ[17]). The scaffold is shown
as thick sticks, with the relevant atomic coordinates taken from the 6KZZ ligand.[17] The benzene para and meta positions are marked
with a dagger and a double dagger, respectively. A 2D depiction of
the scaffold is overlaid, with its experimentally measured IC50 value (per Ushiyama et al.[17]).
DeepFrag-suggested fragment additions at the para and ortho positions
are ranked in the table below, with the associated IC50 values.
Ushiyama et al. also found that
fragment additions at the ortho
position improved IC50 values, albeit more modestly. The
top DeepFrag-suggested addition at this position was a carboxyl group
(Figure 3), an addition
that Ushiyama et al. had also tested (IC50 0.21 μM).[17] The second DeepFrag addition was an acetic acid,
*CC(=O)O, which had an experimentally measured IC50 value
of 0.16 μM.[17]

These examples
illustrate that the DeepFrag browser app can suggest
fragment additions similar to those a trained medicinal chemist might
select and that those additions can in some cases dramatically improve
binding affinity.
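To put the reported potencies on a common scale, the short sketch below computes each analogue's fold improvement relative to the scaffold. The IC50 values are those quoted above from Ushiyama et al.[17]; the variable and function names are illustrative only.

```python
# IC50 values in micromolar, as quoted in the text above.
SCAFFOLD_IC50_UM = 0.24

ANALOG_IC50_UM = {
    ("para", "acetic acid"): 0.018,
    ("para", "carboxyl"): 0.0017,
    ("ortho", "carboxyl"): 0.21,
    ("ortho", "acetic acid"): 0.16,
}


def fold_improvement(reference_ic50: float, analog_ic50: float) -> float:
    """Ratio of reference to analogue IC50; values above 1 indicate a more potent analogue."""
    return reference_ic50 / analog_ic50


for (position, fragment), ic50 in ANALOG_IC50_UM.items():
    fold = fold_improvement(SCAFFOLD_IC50_UM, ic50)
    print(f"{position:5s} {fragment:12s} IC50 = {ic50} uM ({fold:.1f}-fold vs. scaffold)")
```

By this measure the para additions improve potency by roughly 13-fold (acetic acid) and 141-fold (carboxyl), whereas the ortho additions yield gains of about 1.1-fold and 1.5-fold, consistent with the more modest effect noted above.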
Conclusions
Our original DeepFrag
model serves as a useful tool that aims to
help trained medicinal chemists and structural biologists in their lead-optimization
efforts, but as originally implemented, DeepFrag is a stand-alone
Python program tailored primarily to expert computationalists. To
enable use by a broader audience, we have implemented DeepFrag as
a browser app. Researchers, educators, and students can easily experiment
with DeepFrag optimization in their browsers, without ever having
to upload possibly proprietary structures to a third-party server
and without ever having to install any separate software.

The
DeepFrag browser app will be a useful tool for the CADD research
and education community. It is functionally identical to the original
implementation and so yields comparable results, but the browser-app
version additionally provides a user-friendly interface for setting
up a DeepFrag run and for viewing predicted fragments. It is freely
accessible at http://durrantlab.com/deepfrag. A copy of the source code can be obtained free of charge from http://git.durrantlab.com/jdurrant/deepfrag-app, released under the terms of the Apache License, Version 2.0.
Implementation
We relied on several web technologies to implement the original
DeepFrag model as a browser app. Our implementation can be broadly
divided into setup, calculation, and results. The workflow is illustrated
in Figure 4 and described
in detail below.
Figure 4
DeepFrag browser-app workflow can be broadly divided into
components
for setup, calculation, and displaying results. The setup and results
include graphical user interfaces (GUIs) powered by Vue.js, and the
calculations themselves depend on Transcrypt-compiled code and TensorFlow.js.
3Dmol.js[12] and SmilesDrawer[13] are useful JavaScript libraries for molecular
visualization.
Setting up a DeepFrag Calculation
To simplify the process
of setting up a DeepFrag calculation, we created a browser-based GUI
so users can easily (1) load their receptor and ligand structures
into the browser’s memory, (2) select the growing point, and
(3) specify other DeepFrag parameters (Figure 4). To build the GUI, we used the same approach
that we have used previously.[20] In brief,
the interface is written in the open-source Microsoft TypeScript programming
language, which compiles to JavaScript and so can run in any modern
web browser. It uses the open-source Vue.js framework (https://vuejs.org/) to provide reusable,
consistently styled HTML-like components (e.g., buttons, input fields,
etc.). Many of these components are derived from the open-source BootstrapVue
library (https://bootstrap-vue.js.org/), which makes it easy to implement the color, size, and typography
specifications of the Bootstrap 4 framework (https://getbootstrap.com/).
We also adapted our existing molecular-visualization Vue.js component[20] for use in the DeepFrag app. This component
leverages the 3Dmol.js JavaScript library,[12] which displays molecular structures without requiring any separate
installation or browser plugin.

To compile and assemble our
TypeScript codebase and the third-party libraries described above,
we used Webpack, an open-source module bundler (https://webpack.js.org/). This
compilation process included Google’s Closure Compiler (https://developers.google.com/closure/compiler), which automatically optimizes TypeScript/JavaScript code for size
and speed.
Running a DeepFrag Calculation
The
DeepFrag model requires
tensor grids as input. To convert molecular structures to grids in
the browser, we first created a pure-Python implementation of the
GPU-accelerated grid-generation code described in the original DeepFrag
publication[9] and transpiled it to JavaScript
using the Transcrypt compiler (https://www.transcrypt.org/) (Figure 4).

To run the DeepFrag model itself
in a browser environment, we used the Open Neural Network Exchange
framework[21] to convert our PyTorch model
to the equivalent TensorFlow model.[22] We
then used TensorFlow.js (https://www.tensorflow.org/js) to run the model in the browser.
Internally, the browser implementation is functionally identical to
the stand-alone version.
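As a rough illustration of what pure-Python grid generation involves, the sketch below converts a list of atoms into a multichannel voxel grid centered on a chosen point. It is not the DeepFrag grid code: the element channels, grid width, resolution, and Gaussian density used here are assumptions chosen for brevity, and the function name `voxelize` is hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Atom = Tuple[str, float, float, float]  # (element, x, y, z) in angstroms


def voxelize(
    atoms: Sequence[Atom],
    center: Tuple[float, float, float],
    channels: Sequence[str] = ("C", "N", "O"),  # assumed channel layout
    width: int = 9,                             # assumed voxels per side
    resolution: float = 1.0,                    # assumed angstroms per voxel
    sigma: float = 1.0,                         # assumed Gaussian width
) -> List[List[List[List[float]]]]:
    """Return grid[channel][i][j][k], summing a Gaussian density for each atom."""
    half = (width - 1) / 2.0
    grid = [
        [[[0.0] * width for _ in range(width)] for _ in range(width)]
        for _ in channels
    ]
    channel_index = {element: c for c, element in enumerate(channels)}
    for element, ax, ay, az in atoms:
        if element not in channel_index:
            continue  # elements without a channel are ignored in this sketch
        layer = grid[channel_index[element]]
        for i in range(width):
            x = center[0] + (i - half) * resolution
            for j in range(width):
                y = center[1] + (j - half) * resolution
                for k in range(width):
                    z = center[2] + (k - half) * resolution
                    d2 = (x - ax) ** 2 + (y - ay) ** 2 + (z - az) ** 2
                    layer[i][j][k] += math.exp(-d2 / (2.0 * sigma ** 2))
    return grid


# A single carbon atom placed at the grid center (toy input).
toy_grid = voxelize([("C", 0.0, 0.0, 0.0)], center=(0.0, 0.0, 0.0), width=5)
```

Code of this style uses only plain loops and the standard `math` module, which is the kind of Python that a transpiler such as Transcrypt can plausibly handle; the triple loop per atom also makes clear why fewer ensemble rotations shorten the in-browser calculation.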
Viewing DeepFrag Results
To view
DeepFrag results (i.e.,
suggested fragments), we again created/reused the appropriate Vue.js
components (Figure 4). Graphical depictions of the suggested fragments are generated
from the output SMILES strings using the SmilesDrawer JavaScript library.[13]