Peng Gao1, Miao Xu1, Qi Zhang1,2, Catherine Z Chen1, Hui Guo1, Yihong Ye2, Wei Zheng1, Min Shen1. 1. The National Center for Advancing Translational Sciences (NCATS), National Institutes of Health (NIH), Bethesda, Maryland 20850, United States. 2. National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK), National Institutes of Health (NIH), Bethesda, Maryland 20892, United States.
Abstract
The cell entry of SARS-CoV-2 has emerged as an attractive drug development target. We previously reported that the entry of SARS-CoV-2 depends on the cell surface heparan sulfate proteoglycan (HSPG) and the cortex actin, which can be targeted by therapeutic agents identified by conventional drug repurposing screens. However, this drug identification strategy requires laborious, time-consuming library screening, and often only a limited number of compounds can be screened. As an alternative approach, we developed and trained a graph convolutional network (GCN)-based classification model using information extracted from experimentally identified HSPG and actin inhibitors. This method allowed us to virtually screen 170,000 compounds, resulting in ∼2000 potential hits. A hit confirmation assay with the uptake of a fluorescently labeled HSPG cargo further shortlisted 256 active compounds. Among them, 16 compounds had modest to strong inhibitory activities against the entry of SARS-CoV-2 pseudotyped particles into Vero E6 cells. These results establish a GCN-based virtual screen workflow for rapid identification of new small molecule inhibitors against validated drug targets.
Since the outbreak
of the COVID-19 pandemic, global communities
have suffered a significant loss of lives and economic growth. Although
the development of COVID vaccines can significantly inhibit the spreading
of SARS-CoV-2, the virus is constantly evolving into more infectious
and transmissible variants (e.g., the delta strain), resulting in
frequent breakthrough infections among vaccinated people.[1−5] The constant increase of hospitalized patients in the USA and around
the world despite the rollout of vaccination programs has underscored
the need to develop potent small molecule therapeutics for COVID patients.

The cellular entry of SARS-CoV-2 is one of the key steps in the
viral life cycle and represents an attractive target for small molecule inhibitors.[6,7] The most popular target for blocking SARS-CoV-2 entry is the
viral spike protein, given that drugs targeting spike are less likely
to disrupt cellular processes in the host cells. On the host side,
a variety of proteins are being evaluated as potential anti-SARS-CoV-2
targets, which include angiotensin-converting enzyme 2 (ACE2), the
major receptor for spike on the cell surface,[8−12] and transmembrane serine protease 2 (TMPRSS2), the
protease involved in activating spike and other factors regulating
the endocytosis of virion particles.[13] In
addition, a recent report also discussed that alternative entry points
have been identified using neuropilin-1 (NRP1), which was found to
significantly enhance the infectivity of SARS-CoV-2 by increasing
viral entry into host cells rather than strengthening viral binding.[14] Previously, we and others identified the cell
surface heparan sulfate proteoglycans (HSPGs) as critical cofactors
that facilitate the entry of SARS-CoV-2 virions.[8,9] We
further showed that HSPGs, as negatively charged biopolymers, also
facilitate the uptake of other positive charge-bearing endocytic cargos
such as supercharged green fluorescence protein (GFP) and preformed
α-synuclein pathogenic fibrils.[15] HSPGs are a family of glycoproteins bearing one or more negatively
charged polysaccharide chains consisting of repeated heparan sulfate
disaccharide units. Most HSPG family members are anchored to the cell
surface either as a single spanning membrane protein (e.g., syndecans)
or glycosylphosphatidylinositol (GPI)-anchored protein (e.g., glypicans).
Due to the enrichment of negatively charged sulfate groups, HSPGs
can effectively serve as an attachment anchor to increase the surface
dwell time for endocytic cargos bearing positive charges, facilitating
their engagement with a downstream receptor.[6,15,16] The internalization of HSPG cargos also
requires the cortex actin, a specialized layer of proteins associated
with the inner surface of the plasma membrane. A major component of
the cortex is actin filament, which associates with myosin motor proteins
and other actin-binding proteins. These proteins together maintain
plasma membrane dynamics to promote the maturation of clathrin-coated
pits.[15] The specific mechanism of the cellular
entry of SARS-CoV-2 is presented in Figure 1.
Figure 1
Schematic description of cellular entry of SARS-CoV-2.
We recently conducted a drug repurposing screen
based on our previous
study[6] and identified eight drugs that
inhibited HSPG-dependent entry of SARS-CoV-2 virions.[6] Intriguingly, despite structural dissimilarity, several
of the identified drugs display a high potential to bind heparin,
a heparan sulfate analogue, suggesting that they may target the polysaccharide
chains of cell surface HSPGs to inhibit viral entry; however,
alternative host protein targets may also exist.[14] In addition to heparin-binding drugs, two structurally
unrelated drugs, Sunitinib and BNTX, can both effectively disrupt
the actin filaments underlying the plasma membrane (cortex actin)
to inhibit HSPG-mediated endocytosis.[9−12] Compared to drugs that target
viral proteins, drugs targeting host factors essential for viral entry
and replication are less likely to generate drug resistance as a result
of viral mutations. While a drug repurposing screen is an effective
strategy to rapidly adopt existing drugs for new therapeutic uses,
the original target(s) of the approved drugs often reduces their therapeutic
specificity, which may cause undesired side effects for treating viral
infection. For example, as a heparan sulfate-binding compound, mitoxantrone
delivers the most potent antiviral activity in vitro. However, because
mitoxantrone was originally approved as anticancer chemotherapy via
targeting DNA topoisomerases,[17] cytotoxicity
associated with DNA replication inhibition is an obvious concern.

To identify additional inhibitors targeting HSPG-mediated viral
entry, we developed a graph convolutional network (GCN)-based classification
approach. GCN can efficiently translate 3D structures into molecular
graphs composed of nodes and edges and then utilize these graphs to
extract spatial information to achieve accurate molecular classification
and property prediction.[18−21] Compared to traditional computational methods
based on molecular dynamics (MD) simulations or density functional
theory (DFT), the computational cost of GCN is substantially lower.
These features allowed us to rapidly screen 170,000 compounds in several
NCATS libraries. From these libraries, we identified and confirmed
a set of compounds (256) as inhibitors of HSPG-dependent endocytosis
with the most potent IC50 value at 0.95 μM. Further
testing with a SARS-CoV-2 pseudotyped particle (PP) entry assay confirmed
16 compounds as entry inhibitors.
Methods
Computational Details
GCN Model
GCN-based approaches display considerable
robustness for structural elucidation,[18−21] because they can fully utilize
molecular graphs for information extraction at substantially
reduced computational cost compared to MD- or DFT-based methods.[22−29] In addition, such an architecture works directly on graph-based
inputs instead of depending on collected descriptors; thus, it is
more suitable for spatial information processing.[30] At the same time, it is still flexible enough to include
different chemical knowledge as extra descriptors for specific assignments.[22,31−40] In this study, we employed a self-developed GCN package for activity
classification, built on the SchNet architecture.[41] The workflow of the applied GCN is described
in Figure 2. For any
given drug molecule, the structural information is contained in its
simplified molecular-input line-entry system (SMILES) string, and
the GCN transforms the corresponding molecular graph into a set of numerical descriptors
for computational processing.
Figure 2
Architecture of GCN classification model for
virtual screenings.
All the collected SMILES
strings of drug molecules were first translated
into molecular graphs through the TencentAlchemyDataset class within the
Deep Graph Library (DGL).[42,43] Each drug
molecule is represented as a set of nodes and edges in 3D space. Within the
framework of GCN, the nodes represent the atoms of
a molecule, while the edges correspond to interatomic connections.
With these numerically encoded features, structural similarity can
be well summarized, and related molecular properties can be mapped
correspondingly. Within any molecular or fragment graph,
the connections between every pair of atoms are taken into account for information
extraction; the specific values are recorded in distance tensors
at the radial basis function (RBF) layer, so that no
important structural information is omitted. In addition, within the
GCN model, multiple continuous-filter convolution (cfconv) layers were employed
to resolve molecular graphs at the atomic level and to update the atomic representations. For instance, at
layer k + 1, the update of the ith atom can be expressed, following the SchNet formulation,[41] as

$$x_i^{k+1} = \sum_{j} x_j^{k} \circ \omega^{k}(r_i - r_j)$$

in which "∘" represents element-wise multiplication, r_i − r_j is the displacement between atoms i and j, and ω is the filter-generating function that maps the atoms' descriptions
to the filter bank. To control the accuracy of the update
via the applied filter values, a Gaussian-type function, gauss, was employed, which can be expressed as

$$\mathrm{gauss}(l) = \exp\left(-\alpha\,(l - \mu)^{2}\right)$$

where μ is the preset value of the cutoff, and l represents the distance between the ith atom and the jth atom. α is a hyperparameter, which was set to 0.1 in this study.[41]

For any prediction or classification task,
the value computed by the GCN model,
Pro, is calibrated with respect to the experimental measurement,
Pro′, and the accuracy is indicated by the squared
loss function

$$\mathrm{Loss} = \lVert \mathrm{Pro} - \mathrm{Pro}' \rVert^{2}$$

In this study, we applied the developed GCN package for drug
activity
classification; however, it is worth noting that this architecture
is also able to include various kinds of chemical and physical knowledge
for more challenging structural assignments.
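The fully connected distance graph, the Gaussian-type expansion, and the continuous-filter update described above can be illustrated with a minimal pure-Python sketch. It is not the authors' package: the Gaussian form exp(−α(l − μ)²), the linear filter generator, the grid of centers, the toy coordinates, and all weights are assumptions chosen for illustration, with only α = 0.1 taken from the text.

```python
import math

ALPHA = 0.1  # Gaussian hyperparameter; 0.1 is the value reported in the text


def pairwise_distances(coords):
    """Distances between every pair of atoms, i.e. a fully connected graph."""
    return [
        [math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b))) for b in coords]
        for a in coords
    ]


def gauss(l, centers, alpha=ALPHA):
    """Expand a distance l on a grid of centers mu (assumed Gaussian form)."""
    return [math.exp(-alpha * (l - mu) ** 2) for mu in centers]


def filter_values(rbf, weights):
    """Toy filter generator omega: one linear map from the RBF expansion to
    one filter value per feature channel (SchNet itself uses a small network)."""
    return [sum(w * r for w, r in zip(row, rbf)) for row in weights]


def cfconv(features, coords, centers, weights):
    """One continuous-filter convolution: x_i <- sum_j x_j o omega(d_ij)."""
    dist = pairwise_distances(coords)
    updated = []
    for i in range(len(features)):
        acc = [0.0] * len(features[i])
        for j in range(len(features)):
            if i == j:
                continue  # this sketch skips the self-interaction term
            omega = filter_values(gauss(dist[i][j], centers), weights)
            for c, w in enumerate(omega):
                acc[c] += features[j][c] * w  # element-wise multiplication
        updated.append(acc)
    return updated


def squared_loss(pro, pro_ref):
    """Squared difference between the prediction and the measured value."""
    return (pro - pro_ref) ** 2


# Hypothetical three-atom example (water-like geometry, in angstroms).
coords = [(0.0, 0.0, 0.0), (0.96, 0.0, 0.0), (-0.24, 0.93, 0.0)]
features = [[1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]  # two channels per atom
centers = [0.0, 1.0, 2.0]                         # illustrative mu grid
weights = [[0.5, 0.2, 0.1], [0.1, 0.3, 0.2]]      # illustrative, untrained

new_features = cfconv(features, coords, centers, weights)
print(new_features)
```

In a trained model the filter weights would be learned by minimizing the squared loss over the training compounds; here they are fixed only so that the data flow through one layer can be followed.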
Data Set
We applied
the above-described GCN model to
a previously reported COVID-19 related drug screening, which identified
drugs that block HSPG-dependent entry of α-synuclein fibrils.
A classification algorithm was based on NCATS’ collected activity
values. The model was first trained by the collected data, which consisted
of 3832 compounds.[6] Among them, 367 compounds
show activities, and 3465 are inactive. These compounds were randomly
divided with a ratio of 9:1; 90% was used as the training set and
the remaining 10% as the test set. The trained GCN model was validated
by the compounds in the test set, which scored an accuracy of 99.5%.
The trained model was then used to screen more than 170,000 compounds
contained in three independent libraries, Genesis, Sytravon, and NCATS
Pharmacologically Active Chemical Toolbox (NPACT), none of which had
been experimentally screened by endocytosis or SARS-CoV-2 PP entry
assays. NCATS has assembled the Genesis collection with 100,000 compounds
to provide a novel modern chemical library that emphasizes high-quality
chemical starting points, sp3-enriched chemotypes, and core scaffolds
that enable rapid purchase and derivatization via medicinal chemistry.
The Sytravon library is a retired Pharma screening collection that
contains 44,000 diverse and novel small molecules, with an emphasis
on medicinal chemistry-tractable scaffolds. The NPACT is a library
of about 5,000 annotated compounds that inform on novel phenotypes,
biological pathways, and cellular processes. There are more than 7,000
mechanisms and phenotypes identified in the literature and worldwide
patents that cover biological interactions within mammalian, microbial,
plant, and other model systems. The physicochemical properties and
the chemical diversity coverage of these three libraries are shown
in Figure 3.
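The 9:1 random split of the 3832 training compounds can be sketched as follows. The class counts come from the text; the random seed, the rounding of the test-set size, and the function name are assumptions made for illustration. The last lines also work out the class balance: about 9.6% of the compounds are active, so a trivial model that labels every compound inactive would already reach roughly 90% accuracy, which is useful context when reading accuracy figures on this data set.

```python
import random

N_ACTIVE, N_INACTIVE = 367, 3465  # class counts reported in the text


def random_split(labels, test_fraction=0.1, seed=0):
    """Shuffle the indices and hold out test_fraction of them as a test set."""
    indices = list(range(len(labels)))
    random.Random(seed).shuffle(indices)  # seed is an arbitrary assumption
    n_test = round(len(labels) * test_fraction)
    return indices[n_test:], indices[:n_test]


labels = [1] * N_ACTIVE + [0] * N_INACTIVE
train_idx, test_idx = random_split(labels)

# Fraction of actives, and accuracy of an "always inactive" baseline.
active_fraction = N_ACTIVE / len(labels)
majority_baseline = N_INACTIVE / len(labels)

print(len(train_idx), len(test_idx))
print(round(active_fraction, 3), round(majority_baseline, 3))
```

A purely random split does not guarantee the same active/inactive ratio in both subsets; a stratified split would be one alternative, although the text does not state that one was used.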
Figure 3
(a–c)
Distribution of the chemical properties, molecular
weight (MW), SlogP, and logS for Genesis library. (d–f) Distribution
of the chemical properties, molecular weight, SlogP, and logS for
Sytravon library. (g–i) Distribution of the chemical properties,
molecular weight, SlogP, and logS for NPACT library. (k) PCA analysis
for the three independent libraries: Genesis, Sytravon, and NPACT.
α-Synuclein Fibrils Uptake Assay and Drug Verification
Fluorescence-labeled α-synuclein
fibrils were generated as
previously described.[15] HEK293T cells were
dispensed into black, clear-bottomed 1536-well microplates (Greiner
BioOne, #789092-F) at 5000 cells/well in 5 μL media with 200 nM pHrodo
red-labeled α-synuclein fibrils and incubated at 37 °C, 5% CO2, and 85% humidity overnight (∼16 h). Compounds picked
from the virtual screen were titrated 1:3 with 11 points in DMSO and
transferred to assay plates at a volume of 23 nL/well by an automated
pintool workstation (Wako Automation, San Diego, CA). After 24 h of
incubation, the fluorescence intensity of pHrodo red was measured
by a CLARIOstar Plus plate reader (BMG Labtech). Data were normalized
using the wells with cells containing 200 nM pHrodo red-labeled α-synuclein
fibrils as 100% and wells without cells as 0%.
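The titration and normalization arithmetic in this protocol can be made concrete with a short sketch. The 1:3 ratio, the 11 points, the 23 nL transfer, and the 5 μL well volume are taken from the text; the 10 mM top stock concentration is a hypothetical value, since the stock concentration is not stated, and the function names are illustrative.

```python
def titration_series(top_conc, n_points=11, fold=3.0):
    """Concentrations of a serial dilution, highest first."""
    return [top_conc / fold ** i for i in range(n_points)]


def final_concentration(stock_conc, transfer_nl=23.0, well_ul=5.0):
    """Concentration in the well after pin transfer of transfer_nl of stock."""
    transfer_ul = transfer_nl / 1000.0
    return stock_conc * transfer_ul / (well_ul + transfer_ul)


def percent_activity(signal, high_control, low_control):
    """Scale a raw signal so that high_control = 100% and low_control = 0%."""
    return 100.0 * (signal - low_control) / (high_control - low_control)


# Hypothetical 10 mM (10,000 uM) top stock in DMSO.
stocks_um = titration_series(10000.0)
in_well_um = [final_concentration(c) for c in stocks_um]

print([round(c, 3) for c in in_well_um])
print(percent_activity(signal=5500.0, high_control=10000.0, low_control=1000.0))
```

Under these assumptions the pin transfer dilutes each stock a little more than 200-fold, so the in-well series spans roughly five orders of magnitude less than one top concentration would suggest at first glance.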
Image Processing and Statistical Analyses
Confocal
images were processed using the Zeiss Zen software. To measure fluorescence
intensity, we used the Fiji software. Images were converted to individual
channels, and regions of interest were drawn for measurement. Statistical
analyses were performed using either Excel or GraphPad Prism 9. Data
are presented as means ± SEM, which was calculated by GraphPad
Prism 9. P values were calculated by Student's t test using Excel. Nonlinear curve fitting and IC50 calculations were performed with GraphPad Prism 9 using the inhibitor response
three-variable model or the exponential decay model. Images were prepared
with Adobe Photoshop and assembled in Adobe Illustrator. All experiments
presented were repeated at least twice independently. Data processing
and reporting adhere to community standards.
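For readers without GraphPad Prism, the SEM and the three-variable inhibitor response fit can be approximated as sketched below. This assumes the common three-parameter form Y = Bottom + (Top − Bottom)/(1 + X/IC50) with a fixed Hill slope; the grid search over log10(IC50) is a simple stand-in for Prism's nonlinear regression, and the synthetic data are invented for illustration.

```python
import math
import statistics


def sem(values):
    """Standard error of the mean from the sample standard deviation."""
    return statistics.stdev(values) / math.sqrt(len(values))


def three_param(x, bottom, top, ic50):
    """Three-parameter inhibitor response curve (assumed form)."""
    return bottom + (top - bottom) / (1.0 + x / ic50)


def fit_three_param(xs, ys, log_ic50_grid):
    """Grid search over log10(IC50); for each candidate the curve is linear
    in (bottom, top - bottom), so those are solved by least squares."""
    best = None
    y_mean = sum(ys) / len(ys)
    for log_ic50 in log_ic50_grid:
        ic50 = 10.0 ** log_ic50
        f = [1.0 / (1.0 + x / ic50) for x in xs]
        f_mean = sum(f) / len(f)
        sff = sum((fi - f_mean) ** 2 for fi in f)
        sfy = sum((fi - f_mean) * (yi - y_mean) for fi, yi in zip(f, ys))
        span = sfy / sff                      # top - bottom
        bottom = y_mean - span * f_mean
        sse = sum((bottom + span * fi - yi) ** 2 for fi, yi in zip(f, ys))
        if best is None or sse < best[0]:
            best = (sse, bottom, bottom + span, ic50)
    return best[1], best[2], best[3]


# Synthetic, noise-free 11-point 1:3 titration with a true IC50 of 0.95 uM.
xs = [50.0 / 3 ** i for i in range(11)]
ys = [three_param(x, 0.0, 100.0, 0.95) for x in xs]
grid = [-3 + 0.01 * k for k in range(501)]    # log10(IC50) from -3 to 2

bottom, top, ic50 = fit_three_param(xs, ys, grid)
print(round(bottom, 2), round(top, 2), round(ic50, 3))
print(round(sem([98.0, 101.0, 104.0]), 3))
```

The resolution of the recovered IC50 is limited by the grid spacing; a dedicated optimizer would refine the estimate and provide confidence intervals.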
SARS-CoV-2 PP Assay
HEK293T-ACE2-GFP cells seeded in
white, solid-bottomed 384-well microplates (Greiner BioOne) at 6000
cells/well in 15 μL medium were incubated at 37 °C with
5% CO2 overnight (∼16 h). Compounds were titrated
1:3 with 11 points in DMSO and dispensed into the assay plate at 23
nL/well via pintool. Cells were incubated with compounds for 1 h
at 37 °C with 5% CO2 before 15 μL/well of PPs
were added. The plates were then spinoculated by centrifugation at
1500 rpm (453g) for 45 min and incubated for 48 h
at 37 °C with 5% CO2 to allow cell entry of PPs and the
expression of luciferase. After the incubation, the supernatant was
removed with gentle centrifugation using a Blue Washer (BlueCat Bio).
Then 20 μL/well of Bright-Glo luciferase detection reagent (Promega)
was added to assay plates and incubated for 5 min at room temperature.
The luminescence signal was measured using a PHERAStar plate reader
(BMG Labtech). Data were normalized with wells containing PPs as 100%
and wells containing control ΔEnv PP as 0%.
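The spinoculation step quotes both a rotor speed and a relative centrifugal force (1500 rpm, 453g). The two are linked by the standard relation RCF = 1.118 × 10⁻⁵ × r × rpm², with r in centimeters. The sketch below checks that the quoted pair is consistent with a rotor radius of about 18 cm; that radius is inferred here for illustration and is not stated in the text.

```python
def rcf_from_rpm(rpm, radius_cm):
    """Relative centrifugal force (multiples of g) for a given rotor radius."""
    return 1.118e-5 * radius_cm * rpm ** 2


def radius_from_rcf(rcf, rpm):
    """Rotor radius in cm implied by a speed and a centrifugal force."""
    return rcf / (1.118e-5 * rpm ** 2)


implied_radius_cm = radius_from_rcf(453.0, 1500.0)  # radius implied by the text
rcf = rcf_from_rpm(1500.0, 18.0)                    # assumed 18 cm rotor

print(round(implied_radius_cm, 1), round(rcf))
```

When reproducing the protocol on a different centrifuge, matching the g-force rather than the rpm is the safer choice, since the rotor radius may differ.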
ATP Content Cytotoxicity Assay
HEK293T-ACE2-GFP cells
were seeded in white, solid-bottomed 384-well microplates (Greiner
BioOne) at 6000 cells/well in 15 μL medium and incubated at
37 °C with 5% CO2 overnight (∼16 h). Compounds
were titrated 1:3 in DMSO and dispensed via a pintool at 23 nL/well
to assay plates. Cells were incubated for 1 h at 37 °C with 5% CO2 before 15 μL/well of media was added. The plates were
then incubated for 48 h at 37 °C with 5% CO2. After incubation, 30 μL/well of ATPLite (PerkinElmer) was
added to assay plates and incubated for 15 min at room temperature.
The luminescence signal was measured using a Viewlux plate reader
(PerkinElmer). Data were normalized with wells containing cells as
100% and wells containing media only as 0%.
Results and Discussion
Overall Performance of GCN Model
Unlike traditional
computational drug discovery methods such as structural homology-based
drug search, the GCN classification model utilizes molecular graphs
to extract spatial information. The modeling process operates on the
bonding environment at an atomic or interatomic level within a fully
connected graph, as opposed to utilizing simple descriptors. As a result,
the structural features of drug molecules can be well captured and
built from low-level logic,[38,41] so that important
structural possibilities are not omitted. This method results in a robust performance
with the classification accuracy as high as 99.5% for the training
set (workflow is described in Figure 4).
Figure 4
Workflow of GCN classification model upon endocytosis
screenings.
We also performed a benchmarking analysis
by comparing the classification
performance of GCN with that of several popular machine learning algorithms.
Judging from the calculated AUC values, GCN performs better than the
other methods (Table 1), especially when the ratio of active to inactive
compounds is small. In addition, the confusion matrix of the
validation set shows that GCN excludes
inactive compounds with high accuracy (more details can be found
in Table S1, Supporting Information). At
the same time, in a Y-randomization test using a random counter
training set (the activity labels were randomly shuffled in the training data
set), the AUC value of GCN was largely reduced, indicating
that its detection of active compounds depends on well-resolved structural similarity. Additionally, the new compounds identified from our in-house
libraries show some structural novelty with respect to the active
compounds in our training set, further highlighting the applicability of the model
to practical drug screening compared to other structural
assignment-based approaches.
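The two quantities used in this benchmark, the AUC and the Y-randomization control, can be sketched in a few lines. The AUC is computed here as the probability that a randomly chosen active is scored above a randomly chosen inactive (ties count one half); the labels and scores are invented toy values, not data from this study, and the function names are illustrative.

```python
import random


def roc_auc(labels, scores):
    """AUC as the fraction of active/inactive pairs ranked correctly."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def y_randomize(labels, seed=0):
    """Return a shuffled copy of the activity labels (Y-randomization)."""
    shuffled = list(labels)
    random.Random(seed).shuffle(shuffled)
    return shuffled


# Toy example: 3 actives and 5 inactives with hypothetical model scores.
labels = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2, 0.2, 0.1]

print(roc_auc(labels, scores))
print(y_randomize(labels))
```

In a Y-randomization test the model is retrained on the shuffled labels and evaluated again; an AUC that falls toward 0.5 indicates that the original model was using a real structure-activity signal rather than an artifact of the data set.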
Table 1
Calculated AUC Scores of Machine Learning-Based Classification Methods upon the Applied Data Set
No.   Method(a)             AUC score
1     Random forest         0.585
2     AdaBoost              0.589
3     GaussianNB            0.537
4     LogisticRegression    0.604
5     GradientBoosting      0.575
6     SVM                   0.594
7     GCN (SchNet)          0.683
(a) The technical details can be found on our GitHub page: https://github.com/tcsnfrank0177/Graph-convolutional-network-DrugScreening.
Identification of Inhibitors for HSPG-Mediated Endocytosis
We used the GCN-based model
to screen 170,000 compounds. Approximately
2000 compounds were shortlisted by the virtual screening, which generated
a small library that could be rapidly processed by a conventional
quantitative high-throughput screen (qHTS) (Figure 5a). We then employed pHrodo red-labeled α-synuclein
fibrils as an HSPG cargo in a combination screen because α-synuclein
fibrils share a similar entry mechanism with SARS-CoV-2.[15] Importantly, the fluorescence intensity of cells
treated with pHrodo-labeled α-synuclein fibrils depends only
on the amount of internalized cargo and the endolysosomal pH. By comparison,
the luciferase-based pseudoviral entry assay can be influenced not
only by the level of viral entry but also by other factors that impact
mRNA expression, translation, and luciferase stability. The screen
identified 256 active compounds with the most potent IC50 value of 0.95 μM. We picked the top 10 compounds based on their
potency and structural novelty (Figure 5b) and measured their cytotoxicity by an ATP content
assay. The results showed that for four out of the 10 compounds, the
IC50 for cytotoxicity was at least 10-fold larger than
that for the inhibition of α-synuclein fibril uptake (Figure 5b and c), suggesting
a safety window for the use of these compounds as endocytosis inhibitors.
These newly identified compounds displayed significant structural
novelty when compared to drugs in the training set.[6] This can be verified by the Tanimoto similarity analysis
(see the Supporting Information, Table S2, for
more details). Interestingly, among the 10 confirmed compounds, six
share some common structural characteristics (NCGC00411611-01,
NCGC00411727-01, NCGC00411705-01, NCGC00411733-01, NCGC00411718-01,
NCGC00411588-01), suggesting a possible common mechanism for inhibiting
HSPG-mediated endocytosis.
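Two of the criteria used in this section, the 10-fold safety window and the Tanimoto similarity, reduce to simple ratios. The sketch below assumes that each compound is represented by the set of "on" bits of a binary structural fingerprint; the fingerprint type used for Table S2 is not stated in the text, and the bit sets and IC50 values shown are hypothetical.

```python
def tanimoto(bits_a, bits_b):
    """Tanimoto similarity of two fingerprints given as sets of on-bits."""
    a, b = set(bits_a), set(bits_b)
    union = a | b
    if not union:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(a & b) / len(union)


def safety_window(cytotox_ic50, uptake_ic50):
    """Fold difference between cytotoxicity IC50 and uptake-inhibition IC50."""
    return cytotox_ic50 / uptake_ic50


# Hypothetical fingerprints for a new hit and a training-set drug.
hit_bits = {3, 17, 42, 88, 105}
training_bits = {3, 42, 91, 130, 200, 311}

similarity = tanimoto(hit_bits, training_bits)
window = safety_window(cytotox_ic50=12.0, uptake_ic50=0.95)

print(round(similarity, 3), round(window, 1))
```

A low maximum Tanimoto similarity against every training-set compound is one way to express structural novelty, and a window of at least 10 corresponds to the criterion applied to the four compounds highlighted above.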
Figure 5
Identification of inhibitors for HSPG-mediated
endocytosis. (a)
Workflow of α-synuclein fibrils uptake assay for confirmation
of hits from virtual compound screen. (b) Summary of activities of
the top 10 compounds; IC50 was determined by titration
experiments. (c) Dose–response curves of compound’s
inhibitory effect on α-synuclein fibrils uptake. (d) Measured
fluorescence intensity of internalized α-synuclein fibril-Alexa594 by U2OS cells treated with compound at 2-fold of its IC50. The experimental repeat number is 3.
To rule out false-positive hits due to compound-induced changes
in lysosomal pH, which could reduce the fluorescence of internalized
α-synuclein fibrils, we measured the uptake of α-synuclein
fibrils labeled with a pH-insensitive dye (Alexa594) in
U2OS cells. When cells were treated with the top 10 inhibitors at
concentrations 2-fold higher than their respective IC50 values, we found that all compounds tested could significantly inhibit
the uptake of α-synuclein fibrils compared to control-treated
cells (Figure 5d).
These results suggest that these chemicals are indeed endocytosis
inhibitors that block HSPG-mediated entry of α-synuclein fibrils.
We then treated cells with increasing concentrations of NCGC00411718
and NCGC00159478, which showed the highest inhibition of the entry
of pHrodo-labeled α-synuclein fibrils. Drug-treated cells were
incubated with Alexa594-labeled α-synuclein fibrils
in the presence of the inhibitor for 2 h and imaged by a confocal
microscope. The results suggest that both compounds inhibit α-synuclein
fibril uptake in a dose-dependent manner with IC50 values comparable
to those measured with pHrodo-labeled α-synuclein fibrils (Figure 6a–d).
Figure 6
Identification
of new endocytosis inhibitors targeting HSPG-mediated
endocytosis. (a) NCGC00411718-01 inhibits α-synuclein fibril-Alexa594
uptake by U2OS cells in a dose-dependent manner. (b) NCGC00159478-04
inhibits α-synuclein fibril-Alexa594 uptake by U2OS cells
in a dose-dependent manner. (c, d) Quantification of internalized
α-synuclein fibril-Alexa594 fluorescence intensity with compound
treatment. Error bars indicate SEM. The experimental repeat number
is 2.
Identification of SARS-CoV-2 Entry Inhibitors
To test
whether the newly identified endocytosis inhibitors could inhibit
the entry of SARS-CoV-2, we used a previously established pseudotyped
particle entry assay (Figure 7a). As shown previously,[6] the entry
of the pseudoviral particles into cells results in the expression
of the luciferase reporter. To control for the impact of ACE2-GFP expression
levels on viral entry under drug-treated conditions, we normalized
the luciferase signals by the ACE2-GFP level. We also measured the
cytotoxicity of these compounds in ACE2-GFP-expressing cells using
an ATP-based cell viability assay. We analyzed the top 27 compounds
from the 256 inhibitors identified from the α-synuclein fibril
uptake screen. Among them, 16 showed inhibitory activity
against viral entry, with the most potent IC50 value
of 0.76 μM. It is notable that some toxicity was observed for
these compounds in HEK293T-ACE2-GFP cells after 48 h of treatment. The
viral inhibition and cytotoxicity curves of the top six compounds
are shown in Figure 7b.
Figure 7
Identification of SARS-CoV-2 entry inhibitors. (a) Experimental
scheme for inhibitor testing in HEK293T-ACE2-GFP cells. (b) Dose-responsive
titration of compound’s inhibitory effect on SARS-CoV-2 entry
and cytotoxicity. The experimental repeat number is 3.
NCGC00115755 Inhibited SARS-CoV-2 Pseudotyped Particle Entry by Disrupting Actin Filaments
We previously showed that the
actin network under the plasma membrane is critical for the entry
of HSPG-dependent endocytosis cargos including SARS-CoV-2.[6,15] We therefore asked whether any of the newly identified endocytosis
inhibitors could disrupt the actin cytoskeleton. To this end, we stained U2OS
cells with Alexa488-labeled phalloidin, an actin binding dye. In control-treated
cells, actin filaments were readily detected and often ran in parallel
(Figure 8a). When cells
treated with the top 10 endocytosis inhibitors were stained with Alexa488-labeled
phalloidin, we observed dose-dependent disruption of cortex actin
filaments only in NCGC00115755-02-treated cells by confocal fluorescence
microscopy (Figure 8a); this compound also showed anti-pseudotyped particle activity with an IC50 of 5 μM. Live cell imaging of cells expressing GFP-tagged
Tractin, an actin binding reporter, showed that untreated cells contain,
in addition to stress fibers, many actin nucleation sites near the
plasma membrane, which assemble comet tails (Supporting Information: Video 1, Video 2, Video 3). By contrast, in drug-treated cells,
the number of actin stress fibers was significantly reduced, and
actin comet tails were barely detectable (Supporting Information: Video 1, Video 2, Video 3). Altogether, these findings suggest
that NCGC00115755-02 disrupts actin filament assembly, resulting in
an endocytosis defect.
Figure 8
NCGC00115755-02 targets cellular actin cytoskeleton. (a)
Cells
treated with NCGC00115755-02 at the indicated concentrations were
incubated with Alexa594-labeled α-synuclein fibrils for 2 h.
Cells were stained with Phalloidin-Alexa488 in green to detect actin
filaments and DAPI in blue to reveal the nuclei and then imaged. Note
that cells treated with the drug have reduced levels of internalized
α-synuclein fibrils. NCGC00115755-02 treatment also causes the
disassembly of actin stress fiber and generates large actin aggregates.
(b) Quantification of Alexa594-labeled α-synuclein fluorescence
intensity in panel a. Error bars indicate SEM. The experimental repeat
number is 2.
Conclusion
Artificial
intelligence-based virtual screening technologies have
the potential to efficiently select drug candidates for specific targets
with high accuracy at an affordable cost and, therefore, are important
complements to conventional high-throughput small molecule screening
(HTS). SARS-CoV-2 viruses co-opt a cellular endocytosis pathway to
enter human airway epithelial cells. This key viral entry step has
been subjected to conventional drug repurposing screens and computer
docking-based screens, yielding several viral entry inhibitors.[6,14] In this study, we developed and trained a GCN model using the structural
information from previously identified SARS-CoV-2 entry inhibitors.
When this model was applied to untested chemical libraries, it efficiently
selected compounds with a high probability of showing anti-SARS-CoV-2
activity. This model, when combined with conventional drug screening
assays, provides a powerful strategy that allows rapid identification
of new SARS-CoV-2 entry inhibitors. In principle, this strategy can
be applied to any drug target to quickly expand the existing
inhibitor repertoire of any class. The findings of this study
reveal a promising avenue for accelerated drug development.
Data and Software Availability
Technical details of the developed
package can be found on our
GitHub page: https://github.com/tcsnfrank0177/Graph-convolutional-network-DrugScreening. Programming environment: Python 3.6 or higher is recommended. Supporting
videos are provided in the Supporting Information.