The COVID-19 pandemic has caused a large number of deaths worldwide. The causative agent of this disease is Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2), an RNA virus. The main protease, responsible for the cleavage of the viral polyprotein, is considered one of the most promising targets for treating COVID-19. Earlier reports suggest the use of HIV anti-viral drugs for targeting the main protease of SARS-CoV, which caused SARS in 2002-2003. Hence, a drug repurposing approach may prove to be useful in targeting the main protease of SARS-CoV-2. The high-resolution crystal structure of the main protease of SARS-CoV-2 (PDB ID: 6LU7) was used as the target. The Food and Drug Administration (FDA) approved and SWEETLEAD databases of drug molecules were screened. The apo form of the main protease was simulated for a cumulative 150 ns, and 10 μs of open-source simulation data was also used, to obtain conformations for ensemble docking. The representative structures for docking were selected using RMSD-based clustering and Markov State Modeling analysis. This ensemble docking approach helped in exploring the conformational variation in the drug-binding site of the main protease, leading to the efficient binding of more relevant drug molecules. The drugs obtained as top hits from the ensemble docking possessed anti-bacterial and anti-viral properties. This in silico ensemble docking approach would support the identification of potential candidates for repurposing against COVID-19. Communicated by Ramaswamy H. Sarma.
Keywords:
3CLpro; COVID-19; SARS-CoV-2; cryptic pockets; main protease; repurposing
COVID-19, the pandemic disease caused by SARS-CoV-2, is known to spread quite rapidly. The
first cases of this disease were reported in Wuhan, China, in December 2019. The World Health
Organization (WHO) had reported over 2.8 million affected individuals and about 0.19 million
deaths by the end of April 2020. In a very short period, COVID-19 has spread all over the globe.
SARS-CoV-2 has an RNA genome of around 30,000 nucleotides. This genome codes for
the entire viral proteome (Xu et al., 2005). The coding RNA is divided into three
regions: the nonstructural protein (nsp), structural protein and accessory protein-coding regions
(Andersen et al., 2020; Kim et al., 2020; Wang & Chiou, 2020). The nonstructural
protein region consists of ORF1a and ORF1b, which code for the polyproteins pp1a and pp1ab.
These polyproteins are further cleaved to form the 16 nsp. The structural protein region
codes for the spike (S) glycoprotein, envelope protein, membrane protein and nucleoprotein
(Kim et al., 2020). These proteins are
responsible for viral replication, viral functioning and viral–host interaction. Hence,
therapeutic studies targeting these proteins have gained importance in the drug industry (Li
& De Clercq, 2020). A few of these major proteins, which have been considered as potential
drug targets based on the therapeutics developed earlier against older coronaviruses, include
the viral proteases, RNA-dependent RNA polymerase and the viral surface spike protein (Nadeem
et al., 2020; Prajapat et al., 2020). The need of the hour lies in the rapid development
of therapeutics for combating SARS-CoV-2. Drug repurposing is one such strategy
that is being extensively used worldwide to identify therapeutics against this coronavirus (Asai
et al., 2020; Cherian et al., 2020; Pawar, 2020; Tu et al., 2020). To understand
how similar SARS-CoV-2 is to the earlier coronaviruses, sequence comparison
studies have been performed to examine the variation in the sequence of these potential
target proteins (Bauer et al., 2020; Forster
et al., 2020; Li et al., 2020; Stefanelli et al., 2020; Wang et al., 2020; Wu et al., 2020;
Yadav et al., 2020). These studies would help in
repurposing and repositioning the drugs that have been used earlier in order to target this
novel coronavirus. Drug repurposing studies are being performed through experimental as well
as computational techniques (Adeoye et al., 2020;
Ahmad et al., 2020; Amin & Abbas, 2020; Anwar et al., 2020; Baron et al., 2020;
Beura & Chetti, 2020; Boopathi et al., 2020; Borkotoky & Banerjee, 2020; Chandra et al., 2020; Chen et al., 2020; de Oliveira
et al., 2020; Elfiky, 2020a, 2020b; Elmezayen
et al., 2020; Fan et al., 2020; Khan et al., 2020b;
Kruse, 2020; Liu et al., 2020; Muralidharan et al., 2020; Phadke & Saunik, 2020;
Quimque et al., 2020; Rosa & Santos, 2020; Serafin et al., 2020; Shah et al., 2020).
All three drug targets mentioned above have been extensively studied using drug repurposing.
The main protease, RNA-dependent RNA polymerase and spike protein have been screened against
small molecules from the Food and Drug Administration (FDA) approved database as well as
naturally occurring compounds (Aanouz et al., 2020; Abdelli
et al., 2020; Al-Khafaji et al., 2020; Arya & Dwivedi, 2020; Babadaei, Hasan, Bloukh et al., 2020; Babadaei, Hasan, Vahdani et al., 2020; Basit et al., 2020;
Bhardwaj et al., 2020; Caly et al., 2020; Choudhury, 2020; Das et al., 2020; Elfiky, 2020c, 2020d; Elfiky & Azzam, 2020;
Enmozhi et al., 2020; Gao et al., 2020; Gupta et al., 2020a, 2020b; Gyebi
et al., 2020; Hasan et al., 2020; Hendaus, 2020; Islam
et al., 2020; Joshi et al., 2020; Khan et al., 2020a,
2020c; Kumar et al., 2020a, 2020b, 2020c, 2020d; Lobo-Galo et al., 2020; Mahanta
et al., 2020; Mittal et al., 2020; Pant et al., 2020; Sarma et al., 2020;
Sinha et al., 2020; Sk et al., 2020; Smith & Smith, 2020; Umesh et al., 2020;
Wahedi et al., 2020). The viral main protease,
also known as 3-chymotrypsin-like protease (3CLpro), is formed by
the autocleavage of the polyprotein and is further responsible for the cleavage of various
other nsp (Kim et al., 2020). 3CLpro is
nsp5 amongst the 16 nsp. The high-resolution crystal structure of 3CLpro was
elucidated in February 2020 (Jin et al., 2020).
Before the availability of this structure, modeling studies were performed for the
SARS-CoV-2 3CLpro and it was found to be similar to the SARS-CoV and MERS main
proteases (Phadke & Saunik, 2020). Drug
repurposing studies on this model using in silico approaches
revealed the role of previously used antivirals as potential drugs for COVID-19 (Phadke
& Saunik, 2020). The inhibitor interactions
with 3CLpro have been well explored using electron density maps, and the residues
crucial to initiating the inhibitory effect upon interacting with the drugs have been
identified in these experimental studies (Fearon et al., 2020). There are reports on protein–protein interaction networks, where the entire
viral proteome has been studied to find its interactions with the human proteins (Gordon
et al., 2020). This study reveals the role of
3CLpro in obstructing the inflammatory and interferon pathway in humans by
inhibiting the nuclear transport of epigenetic regulatory proteins (Gordon et al., 2020). There are experimental studies that reveal the
inhibitor binding site of the 3CLpro is divided into various subsites (Zhang
et al., 2020). These studies also suggest that
interaction with the residues in these sites would help in designing SARS-CoV-2 3CLpro inhibitors (Zhang et al., 2020). The crucial residues HIS 41 and CYS 145 form the catalytic dyad, which is
responsible for forming strong interactions with the inhibitor. The inhibitor N3
was observed to form a covalent bond with CYS 145. The residues surrounding this
catalytic dyad form the complete active site of 3CLpro. This active
site consists of subsites S1 and S2 (Figure 1)
(Zhang et al., 2020). The S1 subsite consists of
the residues PHE 140, GLY 143, CYS 145, HIS 163, GLU 166 and HIS 172. The S2 subsite
consists of the residues THR 25, HIS 41, MET 49, MET 165 and GLN 189 (Zhang et al., 2020). A PanDDA analysis of different inhibitor
molecules of 3CLpro also suggests that the inhibitory effect is observed when the
molecule forms strong bonded or nonbonded interactions with these critical residues of the
3CLpro active site (Fearon et al., 2020).
Figure 1.
Inhibitor binding site as seen in 6LU7. N3 inhibitor (represented in stick), Subsite S1
(F140, G143, C145, H163, E166, H172) and Subsite S2 (T25, H41, M49, M165, Q189).
This current article describes the scope of repurposing drugs for COVID-19 considering the
conformational variation in the inhibitory binding site of 3CLpro of SARS-CoV-2.
The dynamic states of the protein that remain unexplored in experimental techniques such as
X-ray crystallography were observed using molecular dynamics simulations. These findings
further helped in identifying better potential candidates for drug repurposing. The
crystal structure with PDB ID 6LU7 was observed to have the covalently bound inhibitor N3 in
its active site (Figure 1) (Jin et al., 2020). This structure has been widely used by the
research community for performing molecular docking and simulation studies (Hall & Ji,
2020; Kandeel & Al-Nazawi, 2020; Komatsu et al., 2020; Ortega et al., 2020). The 3D coordinates for the apo form of 3CLpro were obtained from
the PDB ID 6LU7 (Jin et al., 2020). This
SARS-CoV-2 drug target was screened for probable drug candidates against the FDA approved
and SWEETLEAD drug database (Centre for Drug Evaluation and Research (US), 2004; Novick
et al., 2013). Two approaches were used for drug
repurposing studies against the drug target 3CLpro (Figure 2). The first approach involved docking of the FDA approved and
SWEETLEAD drug databases against the crystal structure 6LU7 (Figure 2(A)). This approach is referred to as ‘Direct docking’ hereafter. The
second approach involved docking of the two databases against an ensemble of
structures obtained from molecular dynamics (MD) simulations of the apo form of
3CLpro (Figure 2(B)). This approach is
referred to as ‘Ensemble docking’ hereafter. MD simulations were performed for a
cumulative 150 ns for the apo 3CLpro. In addition, open-source 10 μs MD simulation data
of the 3CLpro dimer was obtained from the simulations performed on the MDGRAPE-4A
supercomputing cluster located at RIKEN BDR, Japan (Komatsu et al., 2020). The ensemble for ensemble docking was
generated using Root Mean Square Deviation (RMSD) based clustering and Markov State Modelling
(MSM) analysis, which together yielded a total of 16 conformations of 3CLpro.
These 16 conformations were then docked against
the FDA approved and SWEETLEAD drug databases (Figure
2(B)). The top-ranked drugs obtained from both these approaches belonged to the
classes of anti-bacterial and anti-viral drugs. The docking scores obtained from ensemble
docking were better than those obtained through direct docking, and the interaction
energies observed for the ensemble docked compounds were also markedly better than
those seen through direct docking. The conformational variation in the drug-binding site of
3CLpro was evident, as a greater number of target protein residues interacted
with the drug molecule than when the drugs were docked directly onto the crystal
structure. The larger number of interactions between the drug molecule and 3CLpro
suggests that the drug-binding pockets were more accessible in the case of
ensemble docking. The findings obtained through the direct docking and ensemble docking of
3CLpro of SARS-CoV-2 are discussed further in this article. These
observations may prove useful for in silico approaches to
designing/repurposing drugs against COVID-19.
Figure 2.
Two approaches used for the drug repurposing and docking studies, viz. Direct docking
(A) and Ensemble docking (B) (Komatsu et al., 2020).
Methodology
The high-resolution crystal structure of 3CLpro was retrieved from the Protein
Data Bank with PDB ID: 6LU7 (Jin et al., 2020).
This PDB file was cleaned by removing the ligand coordinates. This PDB in the apo form was
further considered for the molecular dynamics simulations and docking studies. The detailed
protocol followed has been explained in Figure 2.
Direct docking was performed on the 3CLpro protein against the FDA approved drug
database and the SWEETLEAD database. This docking was performed using DOCK 6 (Allen et al.,
2015). The receptor preparation was done using
UCSF Chimera and further the active site pocket identification and docking was performed
using DOCK 6 (Figure 2(A)) (Allen et al., 2015; Pettersen et al., 2004). In the case of ensemble docking, the coordinates for
3CLpro were obtained from molecular dynamics simulations (Figure 2(B)). The simulations were performed using the AMBER 16
simulation package (Case et al., 2016). The AMBER
FF14SB force field was used for generation of the parameters. The system was neutralized with
Na+ ions and solvated using the TIP3P water model. The minimization was performed for
20,000 steps using the steepest descent and conjugate gradient methods. The system was
gradually heated to 300 K using the Langevin thermostat. The SHAKE algorithm was employed
to constrain the bonds involving hydrogen atoms. The equilibration was performed for 1 ns in
the NPT ensemble at a temperature of 300 K and a pressure of 1 atm. The production run was
performed for 50 ns using the NPT ensemble. Three parallel runs of 50 ns each were performed
following this MD protocol; hence, a cumulative 150 ns of data was generated. The 10 µs simulation
data on the dimer of 3CLpro was obtained from MDGRAPE-4A at RIKEN BDR, Japan (Komatsu
et al., 2020). However, only monomer data was
used from this 10 µs, as the in-house simulations were performed on the monomer. The
cumulative data of 10.15 µs was subjected to RMSD-based clustering and Markov State
Modelling (MSM) analysis. An RMSD cut-off of 1.7 Å was used for the clustering, using the
dbscan method of the cpptraj module of AmberTools 17 (Ester et al., 1996). A total of 12
representative conformations were obtained through RMSD-based clustering. The MSM analysis
was performed using the PyEMMA software (Scherer et al., 2015). The backbone dihedral angles were used as the collective variable for
performing the MSM analysis. The complete details of the MSM analysis performed in order to
obtain the different states of 3CLpro have been given in the Supporting
Information as SI1. MSM analysis has been widely used for significant sampling of the MD
simulation data (Chodera & Noé, 2014; Chodera
et al., 2007; Jani et al., 2019; Prinz et al., 2011;
Sirur et al., 2016). A similar methodology for
identifying significant states from the simulation data was used in one of the earlier works
reported by our group (Jani et al., 2019). A
total of 4 representative conformations for 3CLpro were obtained from the MSM
analysis. Hence, a total of 16 conformations exploring the ensemble of 3CLpro
were considered for the ensemble docking approach described in Figure 2(B). The FDA approved and SWEETLEAD drug databases were
screened against these 16 ensemble structures (as shown in Supporting Information Figures S1 and S2). The details of
choosing the top-scored ligands from ensemble docking are discussed in the
‘Results and discussion’ section (Supporting
Information Figures S1 and S2). The interaction energies for the docked complexes
were calculated using the Prodigy-LIG server (Vangone et al., 2019). Prodigy-LIG predicts the interaction energy of
protein–ligand complexes using a contact-based prediction method. It takes into account the
inhibition constant obtained from the available protein–ligand complexes. A HADDOCK
refinement method is employed to predict the intermolecular energies based on the number of
atomic contacts formed between the protein and the ligand (Vangone et al., 2019).
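The RMSD-based clustering step described above can be illustrated with a minimal sketch. This is not the cpptraj implementation used in the study: it is a plain NumPy version of density-based (DBSCAN-style) clustering on a pairwise RMSD matrix, run on synthetic coordinates. The 1.7 Å cut-off comes from the text, whereas the `min_points` value, the function names and the choice of the medoid as cluster representative are assumptions made only for illustration.

```python
import numpy as np


def pairwise_rmsd(frames):
    """RMSD between every pair of frames.

    frames has shape (n_frames, n_atoms, 3) and is assumed to be already
    superposed on a common reference, so no fitting is done here.
    """
    diff = frames[:, None, :, :] - frames[None, :, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1).mean(axis=-1))


def dbscan(dist, eps, min_points):
    """Density-based clustering on a precomputed distance matrix.

    Returns one label per frame; -1 marks noise.
    """
    n = dist.shape[0]
    labels = np.full(n, -1)
    visited = np.zeros(n, dtype=bool)
    cluster = 0
    for i in range(n):
        if visited[i]:
            continue
        visited[i] = True
        neighbours = np.flatnonzero(dist[i] <= eps)
        if neighbours.size < min_points:
            continue  # stays noise unless reached later from a core point
        labels[i] = cluster
        queue = list(neighbours)
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # core or border point joins the cluster
            if visited[j]:
                continue
            visited[j] = True
            neighbours_j = np.flatnonzero(dist[j] <= eps)
            if neighbours_j.size >= min_points:
                queue.extend(neighbours_j)  # j is a core point, keep expanding
        cluster += 1
    return labels


def medoids(dist, labels):
    """Pick, for each cluster, the frame with the lowest mean RMSD to its members."""
    reps = {}
    for c in sorted(set(labels.tolist()) - {-1}):
        members = np.flatnonzero(labels == c)
        sub = dist[np.ix_(members, members)]
        reps[c] = int(members[sub.mean(axis=1).argmin()])
    return reps


# Synthetic data: two conformational states, 10 frames each, 20 atoms per frame.
rng = np.random.default_rng(0)
base = rng.normal(size=(20, 3))
state_a = base + rng.normal(scale=0.1, size=(10, 20, 3))
state_b = base + np.array([5.0, 0.0, 0.0]) + rng.normal(scale=0.1, size=(10, 20, 3))
frames = np.concatenate([state_a, state_b])

dist = pairwise_rmsd(frames)
labels = dbscan(dist, eps=1.7, min_points=4)  # min_points is an assumed value
reps = medoids(dist, labels)
print(labels, reps)
```

On real trajectories the frames would first need to be superposed on the reference structure, and the pairwise matrix would typically be computed on a strided subset of frames to keep memory use manageable.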
Results and discussion
Direct docking
The direct docking of 3CLpro against the FDA approved drug database was
performed using DOCK 6 (Allen et al., 2015). The
drugs were screened and ranked based on their grid score. The grid score quantifies the
strength of binding of a small molecule owing to the nonbonded interactions it forms
with the active site of the receptor molecule. Hence, a better grid score indicates
better binding of that small molecule to the receptor. The grid scores are energy values
obtained using the force field equation given below,

$$E = E_{vdW} + E_{es}$$

where $E$ is the grid score, $E_{vdW}$ is the van der Waals contribution and $E_{es}$ is the
electrostatic contribution. Each of these terms is a double summation over ligand atoms $i$
and receptor atoms $j$. A more negative value of $E$ indicates better stability of the
docked pose obtained for that particular molecule. The top ranked drug is the one with
the most negative value of the grid score ($E$). An
$E$ value greater than zero indicates unfavorable binding of
the molecule to the receptor, which may result from steric clashes between the
ligand molecule and the amino acid residues of the receptor molecule. Only 9% of the drugs
were observed to have grid scores above zero while screening the FDA approved database. The
drugs with the top ranked grid scores are discussed below.

The drugs that ranked in the top five were ceftazidime (PubChem CID: 5481173), quetiapine
(PubChem CID: 5002), cabergoline (PubChem CID: 54746), enoxolone (PubChem CID: 10114) and
apremilast (PubChem CID: 11561674) in the order of decreasing rank. These drugs and their
current usage for treating different diseases have been listed in Table 1.
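To make the two contributions to the grid score concrete, the sketch below evaluates a van der Waals term and an electrostatic term as double sums over ligand and receptor atoms. It is only an illustration and not the DOCK 6 code: DOCK 6 precomputes receptor contributions on a grid, while this version sums atom pairs directly. The 12-6 Lennard-Jones form, the Coulomb constant of 332 kcal Å/(mol e²), the distance-dependent dielectric of 4r and all function and parameter names are assumptions made for this example.

```python
import numpy as np

COULOMB_CONSTANT = 332.0  # assumed conversion factor, kcal*Angstrom/(mol*e^2)


def nonbonded_score(lig_xyz, lig_q, lig_rmin, lig_eps,
                    rec_xyz, rec_q, rec_rmin, rec_eps,
                    dielectric_factor=4.0):
    """Return (E, E_vdw, E_es) summed over all ligand-receptor atom pairs.

    rmin values are per-atom van der Waals radii, eps values are well depths.
    A distance-dependent dielectric D = dielectric_factor * r is assumed.
    """
    lig_xyz = np.asarray(lig_xyz, dtype=float)
    rec_xyz = np.asarray(rec_xyz, dtype=float)
    lig_q, rec_q = np.asarray(lig_q, dtype=float), np.asarray(rec_q, dtype=float)
    lig_rmin, rec_rmin = np.asarray(lig_rmin, dtype=float), np.asarray(rec_rmin, dtype=float)
    lig_eps, rec_eps = np.asarray(lig_eps, dtype=float), np.asarray(rec_eps, dtype=float)

    # Distance matrix r_ij between ligand atom i and receptor atom j.
    r = np.linalg.norm(lig_xyz[:, None, :] - rec_xyz[None, :, :], axis=-1)

    # 12-6 Lennard-Jones term with geometric-mean well depth and summed radii.
    eps_ij = np.sqrt(lig_eps[:, None] * rec_eps[None, :])
    r0_ij = lig_rmin[:, None] + rec_rmin[None, :]
    e_vdw = (eps_ij * ((r0_ij / r) ** 12 - 2.0 * (r0_ij / r) ** 6)).sum()

    # Coulomb term with distance-dependent dielectric: 332 q_i q_j / (D r).
    e_es = (COULOMB_CONSTANT * lig_q[:, None] * rec_q[None, :]
            / (dielectric_factor * r * r)).sum()

    return e_vdw + e_es, e_vdw, e_es


# One neutral ligand atom and one neutral receptor atom at the optimal distance.
fit = nonbonded_score([[0.0, 0.0, 0.0]], [0.0], [1.0], [0.1],
                      [[2.0, 0.0, 0.0]], [0.0], [1.0], [0.1])

# Same pair pushed to half the optimal distance: a steric clash.
clash = nonbonded_score([[0.0, 0.0, 0.0]], [0.0], [1.0], [0.1],
                        [[1.0, 0.0, 0.0]], [0.0], [1.0], [0.1])

print(fit[0], clash[0])
```

The second call shows why a score above zero is read as unfavorable binding: once two atoms overlap, the repulsive part of the van der Waals term dominates and the total becomes positive.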
Table 1.
Top five ranked drugs from the FDA approved database obtained through direct docking
of 3CLpro of SARS-CoV-2.
| PubChem CID | Name of the drug | Earlier purpose |
|---|---|---|
| 5481173 | Ceftazidime | Antibacterial, used in pneumonia |
| 5002 | Quetiapine | Bipolar disorder, used as a treatment against schizophrenia |
| 54746 | Cabergoline | Dopamine agonist, used in Parkinson’s disease |
| 10114 | Enoxolone | Consists of glycyrrhetinic acid, a plant derivative, used as anti-allergic, anti-bacterial and anti-viral |
| 11561674 | Apremilast | Psoriasis |
The information on the current usage of these drugs was obtained from PubChem. The grid
scores obtained for these drugs have been shown in Figure
3(A). The top-ranked drug ceftazidime is known to be used as an antibacterial in
respiratory tract infections. Enoxolone, which ranked fourth as per the grid score, has
glycyrrhetinic acid as a subcomponent, which is a known plant derivative and is also known
to possess anti-bacterial and anti-viral properties. The drugs quetiapine and
cabergoline, which ranked second and third, respectively, are known to be effective in
treating neurological diseases such as bipolar disorder and Parkinson’s disease.
Apremilast, which ranked fifth, is used in treating psoriasis. The goal of
docking against the FDA approved drug database was to find previously known drugs that
would be effective in treating the symptoms of the disease under investigation. The
results obtained from direct docking indicate that, out of the top five docked drugs, two
appear to carry antibacterial and antiviral properties, which were also found to be
specific to respiratory tract infections.
Figure 3.
The top five ranked drugs from the FDA approved (A) and SWEETLEAD (B) drug database
obtained through direct docking approach.
The direct docking of 3CLpro was performed against the SWEETLEAD drug database
using DOCK 6. The SWEETLEAD database hosts more than 10 K drug molecules, which include
approved drugs, rejected drugs and molecules isolated from traditional medicinal herbs
(Novick et al., 2013). One of the docking
studies, by Smith and Smith on the SARS-CoV-2 spike protein, mentioned the use of SWEETLEAD
for identifying a few molecules with potential inhibitory activity against the viral spike
protein (Smith & Smith, 2020). The docking
of 3CLpro against this database led to the identification of a few small
molecules which may prove to be potential candidates for drug development against
COVID-19 (Smith & Smith, 2020). In
the case of the SWEETLEAD drug database, only 17% of the drugs were observed to have a grid
score above zero; the top ranked drugs are described further. The drugs that ranked
in the top five were dibekacin (PubChem CID: 470999), micronomicin (PubChem CID: 3037206),
catalposide (PubChem CID: 93039), dihydro-alpha-ergocryptine (PubChem CID: 114948) and
itopride (PubChem CID: 3792) in decreasing order of rank. The details about the
current usage of these drugs for treating various diseases and their respective grid
scores have been given in Table 2.
Table 2.
Top five ranked drugs from the SWEETLEAD database obtained through direct docking of
3CLpro of SARS-CoV-2.
The grid scores of each of these drugs have been shown in Figure 3(B). The drugs that were observed to be ranked in the top two were
dibekacin and micronomicin which belonged to the class of aminoglycoside antibiotics.
Catalposide which ranked third is known to be a plant derivative and is known to be used
as an anti-inflammatory agent. The remaining two drugs dihydro-alpha-ergocryptine and
itopride are used for treating Parkinson’s disease and gastrointestinal ailments,
respectively. Two of the drugs amongst the top five ranked drugs are known to have
antibacterial activity viz. dibekacin and micronomicin.
Ensemble generation
The molecular dynamics simulations of the apo 3CLpro were performed to obtain
an ensemble of 3CLpro conformations capturing the protein dynamics. An
additional open source simulation data of 10 µs was used for ensemble generation (Komatsu
et al., 2020). The conformational variation in
the 3CLpro was measured by calculating the backbone RMSD of the monomers
against the 6LU7. Figure 4 shows the backbone RMSD
of the 3CLpro against 6LU7 for each of the monomers. It was observed that the
RMSD values for 10 μs open source simulation ranged between 1 and 4 Å (Figure 4(A)). In case of one of the monomers the RMSD
ranged around 3.5–4 Å in the last 3 μs, whereas in the initial 6.5 μs it ranged around
1–2 Å. In the case of the other monomer, the initial 2 μs had an RMSD within the range of
1–2 Å. The remaining 8 μs had the RMSD values ranging within 2–3 Å. Figure 4(B) represents the backbone RMSD values for the three
replicates of 50 ns simulations performed for the monomer 3CLpro system. It was
observed that for all the three replicates the RMSD values ranged between 1 and 2 Å
throughout the simulations. The different ranges of RMSD suggest that different
conformations of 3CLpro were explored in these simulations. In order to
generate different representatives of these ensembles, clustering was performed using two
different approaches. The first method employed was RMSD-based clustering. The simulation
data of 10.15 µs was clustered based on the all-atom RMSD of all the residues of the
3CLpro system. The reference used was the 6LU7 structure of
3CLpro. The cpptraj module of AmberTools 17 was used for performing this
clustering using the dbscan method (Ester et al., 1996). The RMSD cut-off used here was
1.7 Å for the in-house as well as the open-source simulation data. A total of three
clusters were obtained from the in-house simulation data and nine clusters were obtained
from the open-source simulation data. The representatives of these 12 clusters were
considered for further docking studies. Figure 5
shows the RMSD values of these cluster representatives against the experimental structure
of 3CLpro, i.e. PDB ID: 6LU7. The structures represented in blue were the ones
that were obtained from the RMSD based clustering. Ten of the representative structures
from the ensemble had an RMSD value below 3 Å. In order to obtain more significant
conformations from the ensemble, a second approach, MSM analysis, was applied to the
entire simulation data. The collective variable (CV) used for the MSM analysis was the backbone
torsion angle. Based on this CV, four significant states were obtained for the entire
simulation data (Supporting Information SI 1). The RMSD values of these four
representative structures obtained from MSM analysis are given in Figure 5. The representative structures shown in
purple were obtained from the MSM analysis (Figure
5). All four representative structures had an RMSD of less than 3 Å, and
three of them showed an RMSD below 2 Å. These 16 structures represented the ensemble
covered by the protein throughout the simulations. The varying RMSD values suggest that the
dynamics of the protein helped in bringing out conformations that differ from the
experimentally derived static conformation of the protein. The flexibility of the protein
was captured in these representative structures, which further helped in the docking of a
few other small molecules. Considering these ensemble structures helped to explore a
wider range of drug molecules that would bind to the target protein, in this case
3CLpro. Earlier studies have described the role of molecular dynamics in
exploring different conformations of the binding site, including cryptic
pockets, in support of computer-aided drug discovery (Kuzmanic et al., 2020). Hence, the identification of different states of
3CLpro through MD simulations helped in visiting different conformations of
the binding site. The information on varying conformations of the binding site may also
lead to the identification of more significant drug molecules, further increasing the
scope of therapeutics through drug repurposing.
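The core idea behind the MSM analysis can be sketched in a few lines. The example below is not the PyEMMA workflow used in the study, which includes featurization, discretization and more careful estimation. It assumes the backbone torsion angles have already been discretized into integer state labels, and it only shows how transition counts at a chosen lag time yield a transition matrix and a stationary distribution. The toy trajectory, the lag of one step and the function names are hypothetical.

```python
import numpy as np


def count_matrix(dtrajs, n_states, lag):
    """Count transitions i -> j separated by `lag` steps in each discrete trajectory."""
    counts = np.zeros((n_states, n_states))
    for dtraj in dtrajs:
        dtraj = np.asarray(dtraj)
        np.add.at(counts, (dtraj[:-lag], dtraj[lag:]), 1.0)
    return counts


def transition_matrix(counts):
    """Row-normalize the counts; assumes every state has at least one outgoing count."""
    return counts / counts.sum(axis=1, keepdims=True)


def stationary_distribution(tmat):
    """Left eigenvector of the transition matrix for the largest eigenvalue."""
    evals, evecs = np.linalg.eig(tmat.T)
    pi = np.real(evecs[:, np.argmax(np.real(evals))])
    return pi / pi.sum()


# Toy discretized trajectory hopping between two states (hypothetical data).
dtrajs = [[0, 0, 1, 1, 0, 0, 1, 1, 0]]

counts = count_matrix(dtrajs, n_states=2, lag=1)
tmat = transition_matrix(counts)
pi = stationary_distribution(tmat)
print(counts)
print(tmat)
print(pi)
```

In practice the stationary populations and the slowest processes of such a model are what guide the choice of a small number of metastable states, from which representative structures can then be drawn.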
Figure 4.
(A) Backbone RMSD of the two monomers against 6LU7 (open source 10 μs simulation
data). (B) Backbone RMSD of the monomers against 6LU7 for three replicates of 50 ns
simulation data.
Figure 5.
Root Mean Square Deviation (RMSD) values against 6LU7 for the 16 ensemble
representatives (C1 to C16) of 3CLpro obtained through MD simulations.
Ensemble docking
A total of 16 ensemble representatives of 3CLpro, as explained in the
subsection ‘Ensemble generation’, were selected from the MD simulation data using
RMSD-based clustering and MSM analysis. Sixteen independent molecular docking runs were
performed, one on each of these 16 ensemble representatives, for screening each of the two
databases separately, using the protocol explained in the ‘Methodology’
section of the article. Hence, a total of 32 independent docking studies were performed
for screening the FDA approved and the SWEETLEAD drug databases against these 16
representative structures of 3CLpro. Supporting Information Figure S1 depicts the grid scores obtained from each of
the 16 independent docking exercises. In Supporting
Information Figure S1, the X-axis represents the
drug molecules from the FDA drug database and the Y-axis
represents the grid scores obtained through DOCK 6. The top ranked drug is at position 1
of the X-axis for each of the C1 to C16 independent docking
exercises. These top ranked drugs are shown in Figure 6(A) of the manuscript and also in Table 3. The top ranked drugs from the SWEETLEAD database were
identified using a similar procedure. Figure 6
gives the names of the top-ranked drugs against these ensemble representatives and their
corresponding grid scores obtained from DOCK 6. Around 3–10% of the drugs from either
database were observed to have a grid score above zero in these 16 ensembles. The top
ranked drugs obtained for each of the 16 ensembles are discussed below. A schematic
representation of the 16 independent docking runs leading to the identification of 16 top
ranked drugs from the FDA approved drug database is given in Supporting Information Figure S2, which depicts the top ranked drug from the
FDA database for every ensemble representative. The docking performed considering the C1
ensemble representative as receptor resulted in hesperidin as the top ranked drug.
Similarly, for every individual docking exercise, the drug that ranked first is shown
in Table 3 and Figure 6(A) of the manuscript. Table 4
and Figure 6(B) show the top ranked drugs obtained
through the individual docking exercises while screening the SWEETLEAD drug database. The
details of these drugs obtained from the FDA approved and SWEETLEAD drug databases,
including their earlier purpose and their chemical structure, are given in Tables 3 and 4, respectively.
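The bookkeeping behind this selection is simple and can be sketched as follows. The example assumes the grid scores from each independent docking run have been collected into a nested dictionary keyed by ensemble representative and ligand. All ligand names and score values below are hypothetical placeholders, not results from the study, and the function names are introduced only for illustration.

```python
def top_ranked_per_conformation(scores):
    """For each ensemble representative, return the (ligand, score) with the lowest grid score."""
    return {conf: min(ligands.items(), key=lambda item: item[1])
            for conf, ligands in scores.items()}


def fraction_unfavourable(ligands):
    """Fraction of ligands whose grid score is above zero for one receptor conformation."""
    return sum(1 for score in ligands.values() if score > 0.0) / len(ligands)


def best_over_ensemble(scores):
    """For each ligand, return the (conformation, score) where it scored best."""
    best = {}
    for conf, ligands in scores.items():
        for ligand, score in ligands.items():
            if ligand not in best or score < best[ligand][1]:
                best[ligand] = (conf, score)
    return best


# Hypothetical grid scores (kcal/mol) for three ligands against two conformations.
scores = {
    "C1": {"ligand_a": -60.0, "ligand_b": -45.5, "ligand_c": 12.0},
    "C2": {"ligand_a": -30.0, "ligand_b": -70.2, "ligand_c": -5.0},
}

top = top_ranked_per_conformation(scores)
unfavourable_c1 = fraction_unfavourable(scores["C1"])
best = best_over_ensemble(scores)
print(top)
print(unfavourable_c1)
print(best)
```

The third helper reflects the motivation for ensemble docking: a ligand that clashes with one receptor conformation, such as `ligand_c` against `C1` here, may still find a favorable pose in another.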
Figure 6.
The top ranked drugs from the FDA approved (A) and SWEETLEAD (B) drug database
obtained through ensemble docking approach.
Table 3.
Top ranked drugs of the FDA approved drug database obtained through 16 independent
molecular docking of 3CLpro ensemble representatives of SARS-CoV-2.
| Selection method | 3CLpro ensemble representative | Name of the drug (PubChem CID) | Earlier purpose |
|---|---|---|---|
| RMSD-based clustering | C1 | Hesperidin (3594) | Bioflavonoid, anti-oxidant, anti-inflammatory |
| RMSD-based clustering | C2 | Etoposide (36462) | Chemotherapy drug, used in lung cancer too |
| RMSD-based clustering | C3 | Pranlukast (4887) | Anti-asthmatic, reduces bronchospasm |
| RMSD-based clustering | C4 | Azelnidipine (65948) | Treats hypertension, calcium channel blocker |
| RMSD-based clustering | C5 | Epicatechin gallate (107905) | Flavonoid, used in treating pre-diabetes |
| RMSD-based clustering | C6 | Brinzolamide (68844) | Ocular hypertension |
| RMSD-based clustering | C7 | Ceftiofur (6328657) | Anti-bacterial, veterinary drug |
| RMSD-based clustering | C8 | Artesunate (6917864) | Treats malaria, in combination therapy with mefloquine |
| RMSD-based clustering | C9 | Ivermectin (6321424) | Treats parasitic infections |
| RMSD-based clustering | C10 | Peimine (131900) | Anti-inflammatory |
| RMSD-based clustering | C11 | Empagliflozin (11949646) | Treats type 2 diabetes |
| RMSD-based clustering | C12 | Agenerase/Amprenavir (65016) | Antiviral, inhibits HIV protease |
| Markov state modelling (MSM) analysis | C13 | Cefazedone (71736) | Antibacterial |
| Markov state modelling (MSM) analysis | C14 | Indinavir (5362440) | Antiviral, inhibits HIV protease |
| Markov state modelling (MSM) analysis | C15 | Ceftin (6321416) | Antibacterial, used against pneumonia |
| Markov state modelling (MSM) analysis | C16 | Ceftizoxime (6533629) | Antibiotic, used against life-threatening bacterial infections |
Table 4.
Top ranked drugs of the SWEETLEAD drug database obtained through 16 independent
molecular docking runs on 3CLpro ensemble representatives of SARS-CoV-2.
3CLpro ensemble representative | Name of the drug (PubChem CID) | Earlier purpose
RMSD-based clustering
C1 | Tobramycin (36294) | Antibiotic, antibacterial activity
C2 | Lanreotide acetate (71349) | Used to treat acromegaly, inhibits growth hormone
C3 | Lenapenem (216262) | Carbapenem antibiotic with bactericidal activity, binds penicillin-binding proteins
C4 | Neomycin (8378) | Aminoglycoside antibiotic
C5 | Riboflavin tetrabutyrate (92140) | One component of multi-vitamin drugs
C6 | Sennosides (5199) | Stimulant laxative
C7 | Gentamicin (3467) | Antibacterial, used against pneumonia
C8 | Terlipressin (72081) | Vasoactive drug, used to manage low blood pressure
C9 | Ribostamycin (33042) | Aminoglycoside-aminocyclitol antibiotic
C10 | Tobramycin (36294) | Antibiotic, antibacterial activity
C11 | Neomycin (8378) | Aminoglycoside antibiotic
C12 | Neomycin (8378) | Aminoglycoside antibiotic
Markov state modelling (MSM) analysis
C13 | Lypressin (644076) | Used against diabetes insipidus
C14 | Amikacin (37768) | Antibiotic, used against multi-drug resistant tuberculosis
C15 | Vasopressin tannate (8230) | Antidiuretic drug, used against diabetes insipidus
C16 | Netilmicin (441306) | Antibiotic, treatment against severe bacterial infections
Figure 6(A) shows the results of screening the FDA approved drug database against the
ensemble representative structures. The drug with the lowest grid score was indinavir
(PubChem CID: 5362440), a known HIV protease inhibitor, which was obtained for the C14
ensemble representative. This was followed by ceftin (PubChem CID: 6321416), a
cephalosporin derivative used as an antibiotic against bacterial infections (C15 ensemble
representative). The third-lowest grid score was observed for ivermectin (PubChem CID:
6321424), which is used in treating head lice and is known to possess anti-parasitic
properties (C9 ensemble representative). An in vitro drug repurposing study supports
ivermectin as a candidate for repurposing against COVID-19 (Caly et al., 2020). In
addition, three more cephalosporin derivatives, viz. ceftiofur (PubChem CID: 6328657),
cefazedone (PubChem CID: 71736) and ceftizoxime (PubChem CID: 6533629), were top ranked
through ensemble docking (for the C7, C13 and C16 representatives, respectively). All
these drugs are known to possess anti-bacterial properties and are used to treat severe
bacterial infections. The top-ranked drug obtained through direct docking, ceftazidime,
is also a cephalosporin derivative (Figure 3(A)). Amprenavir (PubChem CID: 65016), also a
known HIV protease inhibitor, ranked as the top hit for the C12 ensemble representative.
However, its grid score was comparatively higher than those of the top-hit drug molecules
for the other ensemble structures of 3CLpro.
Figure 6(B) lists the top-ranked drugs obtained
on screening the SWEETLEAD drug database using the ensemble docking approach. Among the
16 representative structures used for docking, three structures (C4, C11 and C12) showed
the anti-bacterial drug neomycin (PubChem CID: 8378) as the top-ranked molecule. Neomycin
had the lowest grid score among the top-ranked drugs for all the 3CLpro ensemble
representatives. The drug with the second-lowest grid score was vasopressin tannate
(PubChem CID: 8230), an anti-diuretic drug used to treat diabetes insipidus (C15). The
third-lowest grid score was again obtained for neomycin, for the ensemble representative
C11. Amikacin, which is also known for its antibacterial property, had the fourth-lowest
grid score (top ranked for C14). For ten of the 16 ensemble structures, the top-ranked
molecule was a drug possessing anti-bacterial activity, viz. tobramycin (PubChem CID:
36294), lenapenem (PubChem CID: 216262), neomycin, gentamicin (PubChem CID: 3467),
ribostamycin (PubChem CID: 33042), amikacin (PubChem CID: 37768) or netilmicin (PubChem
CID: 441306). Neomycin and tobramycin appeared as the top-ranked drug for three and two
of the 16 representative structures, respectively.
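The counts quoted in this paragraph can be checked directly against Table 4. The short sketch below tallies how often each drug appears as the top hit and how many of the 16 representatives returned an anti-bacterial drug. The drug-to-representative mapping is copied from Table 4, and the `antibacterial` set follows the seven drugs named in the text; the variable names are illustrative only.

```python
from collections import Counter

# Top-ranked SWEETLEAD drug for each ensemble representative (from Table 4).
sweetlead_top_hits = {
    "C1": "Tobramycin", "C2": "Lanreotide acetate", "C3": "Lenapenem",
    "C4": "Neomycin", "C5": "Riboflavin tetrabutyrate", "C6": "Sennosides",
    "C7": "Gentamicin", "C8": "Terlipressin", "C9": "Ribostamycin",
    "C10": "Tobramycin", "C11": "Neomycin", "C12": "Neomycin",
    "C13": "Lypressin", "C14": "Amikacin", "C15": "Vasopressin tannate",
    "C16": "Netilmicin",
}

# Drugs described in the text as possessing anti-bacterial activity.
antibacterial = {
    "Tobramycin", "Lenapenem", "Neomycin", "Gentamicin",
    "Ribostamycin", "Amikacin", "Netilmicin",
}

# How many representatives each drug tops.
hit_counts = Counter(sweetlead_top_hits.values())

# Number of representatives whose top hit is an anti-bacterial drug.
n_antibacterial = sum(
    1 for drug in sweetlead_top_hits.values() if drug in antibacterial
)

print(hit_counts.most_common(2))
print(n_antibacterial)
```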
Drug–3CLpro interactions
Interaction energies
The interaction energies between the top-ranked docked ligands and 3CLpro
were calculated using the PRODIGY-LIG server (Vangone et al., 2019). This interaction energy
is calculated based on the number of contacts formed by the atoms of the ligand with
the residues of the protein. The lower the interaction energy, the better the binding
between the protein and the ligand. Figure 7(A) depicts the interaction energies for the
top-ranked docked drugs obtained on screening the FDA approved drug database. The
interaction energies for the top-ranked drugs from ensemble docking (yellow) were
considerably better than those of the top five ranked drugs obtained through direct docking. A
difference of 3–6 kcal/mol was observed between the drugs obtained through the two
docking approaches, with ensemble docking showing the better interaction energies. The drug
with the best interaction energy was ivermectin, which was also one of the top-ranked
drugs with respect to the grid score obtained from DOCK 6. Figure
7(B) depicts the interaction energies for the top-hit drugs obtained on
screening the SWEETLEAD drug database. All the drugs obtained through the direct and
ensemble docking approaches had interaction energies in the range of
−4 to −6 kcal/mol.
Figure 7.
Interaction energies for the docked complexes obtained through direct
(blue) and ensemble (yellow) docking of the FDA approved (A) and SWEETLEAD (B) drug
databases.
As the interaction energy calculation depends on the number of contacts formed by the
ligand with the residues of the protein, it may be inferred that, for the FDA
approved drugs, the top-ranked drugs obtained through ensemble docking showed better
interactions with the residues of 3CLpro than those obtained
through direct docking. Among the ensemble-docked FDA approved drugs, ivermectin
likely formed a greater number of contacts with the residues of 3CLpro than
the other drugs. In the case of the SWEETLEAD drugs, all the top-ranked drugs
from either docking approach showed similar interaction energies. This
suggests that most of these drugs formed a similar number of contacts with the residues
of 3CLpro.
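To make the notion of a contact-based score concrete, the sketch below counts ligand–protein atom pairs that lie within a distance cutoff. It is only an illustration of contact counting: it is not the PRODIGY-LIG scoring model, which is more elaborate, and the function name `count_contacts`, the 4.5 Å cutoff and the coordinates are all assumptions chosen for the example.

```python
import math
from typing import Sequence, Tuple

Coord = Tuple[float, float, float]


def count_contacts(
    ligand_atoms: Sequence[Coord],
    protein_atoms: Sequence[Coord],
    cutoff: float = 4.5,  # hypothetical cutoff in angstroms, not a PRODIGY-LIG value
) -> int:
    """Count ligand-protein atom pairs separated by at most `cutoff`."""
    contacts = 0
    for lig in ligand_atoms:
        for prot in protein_atoms:
            if math.dist(lig, prot) <= cutoff:
                contacts += 1
    return contacts


# Toy coordinates (angstroms); placeholders, not taken from any docked pose.
ligand = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
protein = [(3.0, 0.0, 0.0), (10.0, 0.0, 0.0)]

n_contacts = count_contacts(ligand, protein)
print(n_contacts)
```

Under such a scheme, a ligand that sits deeper in the binding pocket accumulates more contacts, which is consistent with the reasoning in the text that a better interaction energy reflects a larger number of contacts.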
Crucial residues: direct docking
The active site of the main protease contains a few polar residues, viz.
histidine, asparagine, glutamate and glutamine. These residues were observed to be
involved in hydrogen bonding and hydrophobic interactions with the drug molecules.
LigPlot+ and PLIP were used to calculate the various interactions between the drug
molecules and 3CLpro (Laskowski & Swindells, 2011; Salentin et al., 2015). Supporting
Information Figure S3 shows the residues of 3CLpro present in the vicinity of
the FDA approved ligand molecules and the residues that are involved in hydrogen
bonding. Ceftazidime, which showed the lowest grid score, formed hydrogen bonds
with GLY 143 and HIS 164 (Figure 8). It also formed hydrophobic interactions with GLU
166 and π–π interactions with HIS 41 (Figure 8).
Quetiapine formed hydrogen bonds with GLY 143 and THR 26, hydrophobic
interactions with ASP 187 and GLN 189, and π–π interactions with HIS 41. Cabergoline
formed no hydrogen bonds; however, it showed hydrophobic interactions with PHE
140, GLU 166 and GLN 189, and a π–π interaction with HIS 41.
Enoxolone formed hydrogen bonds with GLU 166 and GLN 192 and
hydrophobic interactions with THR 25, ASN 142, MET 165 and GLN 189.
Apremilast formed hydrogen bonds with THR 26, GLY 143, ASN
142 and SER 144 and hydrophobic interactions with ASN 142 and MET 165.
HIS 41, GLY 143, ASN 142 and GLU 166 were observed to interact with three of
the top five ranked drug molecules, and π–π interactions with HIS
41 were observed for the top three molecules. The region around CYS 145 of the main
protease is known to interact with human proteins, viz. histone deacetylase 2 (HDAC2)
and tRNA methyltransferase 1 (TRMT1) (Gordon et al., 2020). Both these proteins are known
epigenetic regulators, and their nuclear localization is reported to be blocked by this
viral protease (Gordon et al., 2020). The docking studies performed here revealed that the
residues ASN 142 and GLY 143, which neighbor CYS 145, were involved in
interacting with the drug molecules. This may suggest that these residues could play a
crucial role in inhibiting the protein–protein interaction between the main protease and
the human epigenetic regulatory proteins.
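Interaction profilers typically assign hydrogen bonds from geometry. The sketch below shows one common form of such a test, using a donor–acceptor distance limit and a donor–hydrogen–acceptor angle limit. The thresholds (3.5 Å and 120°) and the function names are illustrative assumptions; they are not presented as the defaults of LigPlot+ or PLIP, and the coordinates are toy values.

```python
import math
from typing import Tuple

Coord = Tuple[float, float, float]


def angle_deg(a: Coord, b: Coord, c: Coord) -> float:
    """Angle in degrees at point b formed by the points a-b-c."""
    v1 = [a[i] - b[i] for i in range(3)]
    v2 = [c[i] - b[i] for i in range(3)]
    dot = sum(x * y for x, y in zip(v1, v2))
    norm1 = math.sqrt(sum(x * x for x in v1))
    norm2 = math.sqrt(sum(x * x for x in v2))
    # Clamp to [-1, 1] to guard against floating-point drift.
    cos_value = max(-1.0, min(1.0, dot / (norm1 * norm2)))
    return math.degrees(math.acos(cos_value))


def is_hydrogen_bond(
    donor: Coord,
    hydrogen: Coord,
    acceptor: Coord,
    max_dist: float = 3.5,    # assumed donor-acceptor distance limit (angstroms)
    min_angle: float = 120.0, # assumed donor-H-acceptor angle limit (degrees)
) -> bool:
    """Geometric hydrogen-bond test on donor, hydrogen and acceptor positions."""
    close_enough = math.dist(donor, acceptor) <= max_dist
    linear_enough = angle_deg(donor, hydrogen, acceptor) >= min_angle
    return close_enough and linear_enough


# Toy geometries: a linear, short contact and a sharply bent one.
linear_case = is_hydrogen_bond((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.9, 0.0, 0.0))
bent_case = is_hydrogen_bond((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 2.0, 0.0))
print(linear_case, bent_case)
```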
Figure 8.
Ceftazidime forming hydrogen bonds with G143 and H164, π-π and hydrophobic
interactions with H41 and E166, respectively.
Supporting Information Figure S4 shows the
residues of 3CLpro present in the vicinity of the SWEETLEAD drug molecules
and the residues that are involved in hydrogen bonding. The top-ranked drug, dibekacin,
formed hydrogen bonds with THR 25, THR 26, SER 46, CYS 145, GLU 166
and GLN 189 (Figure 9). Micronomicin, which
ranked second in terms of the grid score, formed hydrogen bonds with THR
24, SER 46 and ASN 142. The next drug among the top five was catalposide, which
showed hydrogen bonding interactions with THR 24, THR 26, GLY 143 and GLU 166. It also
showed π–π interactions with HIS 41 and hydrophobic interactions with MET 165 and GLN
189. Dihydro-alpha-ergocryptine, which ranked fourth in terms of the grid score,
formed hydrogen bonds with ASN 142 and GLY 143 and
hydrophobic interactions with GLU 166. Itopride, which ranked last among the top five
drug molecules, did not show any significant interactions with the residues of the
protein.
Figure 9.
Dibekacin forming hydrogen bonds with T25, T26, S46, C145, E166 and Q189.
Crucial residues: ensemble docking
Experimental studies performed to elucidate the crystal structure of 3CLpro
suggest that the inhibitor binding site of this protein is divided into subsites (Zhang
et al., 2020). The S1 subsite consists of PHE
140, ASN 142, GLU 166, HIS 163 and HIS 172, whereas the S2 subsite consists of the
hydrophobic pocket formed by the residues HIS 41, MET 49, TYR 54 and MET 165. CYS
145 forms a covalent bond with the inhibitor N3 (Figure 1) (Jin et al., 2020; Zhang
et al., 2020). Most of these residues, which play a crucial role in interacting with the
inhibitor, showed similar interactions with the drugs obtained through direct and
ensemble docking. However, obtaining the same drug as the top-ranked hit for different
ensemble representative states, with varying interactions, suggests conformational
variability in the inhibitor binding site of 3CLpro.
The residues interacting with the top-ranked drugs from the FDA approved and the
SWEETLEAD databases for all the 16 representative clusters are shown in Supporting Information Figures S5 and S6,
respectively. The ensemble docking of the FDA approved drug database gave
indinavir, ceftin and ivermectin as the top-ranked drugs for three of the 3CLpro
representatives. These three drugs had the lowest grid scores in comparison to
the top-ranked drugs from the remaining 13 3CLpro representatives.
Indinavir, ceftin and ivermectin showed interactions with THR 24, LEU 27,
VAL 42, THR 45, ARG 60, LYS 61, ASN 142, GLU 166 and GLN 189 (Figure 10). Supporting
Information Figure S5 depicts the hydrogen bonding interactions of all the top-ranked
drug molecules for the 16 representative structures. Apart from the
residues mentioned above, HIS 41, PHE 140, HIS 163 and GLN 192 also formed
hydrogen bonds with the other top-ranked drug molecules. The
ensemble docking of the SWEETLEAD drug database gave neomycin, vasopressin
tannate and amikacin as the top-ranked drugs for five of the 3CLpro representatives.
These three drugs had the lowest grid scores in comparison to the
top-ranked drugs for the remaining 11 3CLpro representatives.
Neomycin was the top-ranked drug for three ensemble structures
and, in comparison to the other top-ranked drugs, had the lowest and third-lowest
grid scores. The conformational variability in the ensemble structures was visible in
the number of interactions formed by neomycin in the three different
3CLpro states that were captured (Figure
11). GLU 166 formed strong hydrogen bonds with
atoms of neomycin in all three states: 3 (Figure 11(A)), 1 (Figure
11(B)) and 2 (Figure 11(C)) hydrogen
bonds in the three representative structures of the ensemble. The other residues
involved in hydrogen bonding were THR 24, SER 46, HIS 164, MET 165, PRO 168 and GLN 189.
Vasopressin tannate had the second-lowest grid score among the
top-ranked drug molecules. It formed hydrogen bonds with THR 24,
THR 25, HIS 41 and SER 46 of 3CLpro. Amikacin, the drug
with the fourth-lowest grid score, formed hydrogen bonds with CYS 145, ASN 142
and MET 165.
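The subsite definitions given in this section can be used to annotate which part of the binding site a drug touches. The sketch below encodes the S1 and S2 residue sets and the covalent N3 attachment residue exactly as listed in the text, then labels the three residues reported to hydrogen bond with amikacin. The dictionary layout, the label strings and the function name `annotate_subsites` are illustrative choices, not part of the original analysis.

```python
from typing import Dict, List

# Subsite membership as listed in the text (Zhang et al., 2020; Jin et al., 2020).
SUBSITES = {
    "S1": {"PHE 140", "ASN 142", "GLU 166", "HIS 163", "HIS 172"},
    "S2": {"HIS 41", "MET 49", "TYR 54", "MET 165"},
    "covalent site (N3)": {"CYS 145"},
}


def annotate_subsites(residues: List[str]) -> Dict[str, str]:
    """Map each residue to its subsite label, or 'other' if it is in none."""
    annotation = {}
    for residue in residues:
        label = "other"
        for name, members in SUBSITES.items():
            if residue in members:
                label = name
                break
        annotation[residue] = label
    return annotation


# Residues reported in the text as hydrogen bonding with amikacin.
amikacin_contacts = ["CYS 145", "ASN 142", "MET 165"]
amikacin_annotation = annotate_subsites(amikacin_contacts)
print(amikacin_annotation)
```

Applied to the neomycin residues listed above, the same lookup would label GLU 166 as S1 and MET 165 as S2, while residues such as THR 24 and SER 46 fall outside the subsites defined here.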
Figure 10.
Hydrogen bonding interactions of indinavir (A), ceftin (B) and ivermectin (C) with
the residues of 3CLpro.
Figure 11.
Hydrogen bonding of neomycin with residues of 3CLpro captured in three
different states during ensemble generation.
Conclusion
The high throughput docking and ensemble docking studies of the 3C-like protease reveal a few
potential drugs that can be considered for repurposing. Docking against the FDA approved
and SWEETLEAD drug databases helped to shortlist a few antibacterial and antiviral drugs that
may be used as candidates for repurposing studies against the 3C-like protease. Indinavir,
ivermectin, cephalosporin derivatives, neomycin and amprenavir were a few of the drugs that
may prove to be effective against the symptoms seen in COVID-19. The earlier purpose of
indinavir and amprenavir is inhibition of the HIV protease. Similarly, ivermectin,
cephalosporin derivatives and neomycin are used to treat parasitic and
bacterial infections, especially respiratory tract infections. In support of these
findings, these drugs also showed better docking scores than the
other drugs. The ensemble docking approach helped to explore the conformational variability
of the inhibitor binding site of the 3C-like protease. The conformations captured through
effective sampling methods such as Markov State Modelling analysis reveal a more accessible
region for inhibitors to bind to the 3C-like protease. Drugs with anti-viral and
anti-bacterial properties that were not ranked at the top by direct docking were identified
through ensemble docking. The grid scores for the drugs docked against the ensemble
structures of 3CLpro were better than those obtained by direct docking, which points to
the conformational flexibility of 3CLpro in accommodating potential
drug molecules. The drug–residue interactions also complement the role of crucial residues
that were earlier defined by the electron-density studies of the 3C-like protease and its
inhibitors (Fearon et al., 2020). The ensemble
docking approach coupled with a strong sampling technique would help to explore the more
accessible conformations of the drug target, which would further help in designing a better
drug as an inhibitor.