Sheng Guo1, Hang Xie2, Yu Lei1, Bin Liu3, Li Zhang4, Yechun Xu5, Zhili Zuo6. 1. State Key Laboratory of Phytochemistry and Plant Resources in West China, Kunming Institute of Botany, Chinese Academy of Sciences, Kunming, Yunnan, 650201, China; School of Chemical Engineering, Sichuan University of Science & Engineering, 180 Xueyuan Street, Huixing Road, Zigong, Sichuan 643000, China. 2. CAS Key Laboratory of Receptor Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai 201203, China. 3. State Key Laboratory of Phytochemistry and Plant Resources in West China, Kunming Institute of Botany, Chinese Academy of Sciences, Kunming, Yunnan, 650201, China. 4. School of Chemical Engineering, Sichuan University of Science & Engineering, 180 Xueyuan Street, Huixing Road, Zigong, Sichuan 643000, China. 5. CAS Key Laboratory of Receptor Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai 201203, China. Electronic address: zuozhili@mail.kib.ac.cn. 6. State Key Laboratory of Phytochemistry and Plant Resources in West China, Kunming Institute of Botany, Chinese Academy of Sciences, Kunming, Yunnan, 650201, China. Electronic address: ycxu@simm.ac.cn.
Abstract
SARS-CoV-2 is the pathogen that caused the global COVID-19 outbreak in 2020. Promising progress has been made in developing vaccines and antiviral drugs. Antivirals medicines are necessary complements of vaccines for post-infection treatment. The main protease (Mpro) is an extremely important protease in the reproduction process of coronaviruses which cleaves pp1ab over more than 11 cleavage sites. In this work, two active main protease inhibitors were found via docking-based virtual screening and bioassay. The IC50 of compound VS10 was 0.20 μM, and the IC50 of compound VS12 was 1.89 μM. The finding in this work can be helpful to understand the interactions of main protease and inhibitors. The active candidates could be potential lead compounds for future drug design.
SARS-CoV-2 is the pathogen that caused the global COVID-19 outbreak in 2020. Promising progress has been made in developing vaccines and antiviral drugs. Antivirals medicines are necessary complements of vaccines for post-infection treatment. The main protease (Mpro) is an extremely important protease in the reproduction process of coronaviruses which cleaves pp1ab over more than 11 cleavage sites. In this work, two active main protease inhibitors were found via docking-based virtual screening and bioassay. The IC50 of compound VS10 was 0.20 μM, and the IC50 of compound VS12 was 1.89 μM. The finding in this work can be helpful to understand the interactions of main protease and inhibitors. The active candidates could be potential lead compounds for future drug design.
At the end of 2019, a coronavirus that swept the world was named SARS-CoV-2 by the International Committee on Taxonomy. The disease caused was named COVID-19 by the World Health Organization [1]. SARS-CoV-2 has a strong infectious ability that seriously threatening human life worldwide. This kind of virus can cause diseases in humans and infect mammals [2], [3], [4], and the infected people or animals may become carriers of respiratory, intestinal, liver, and nervous system diseases [5]. Prior to SARS-CoV-2, six types of coronaviruses can infect humans, including two highly lethal coronaviruses, namely severe acute respiratory syndrome coronavirus (SARS-CoV) and Middle East respiratory syndrome coronavirus (MERS-CoV), four types of coronaviruses that can cause mild upper respiratory diseases, named HCoV-OC43, HCoV-229E, HCoV-NL63 and HCoV-HKU1 [6], [7].SARS-CoV-2 is a single positive-stranded RNA virus belonging to the genus Coronavirus β [8]. The complete sequencing of the SARS-CoV-2 genome has untranslated regions (UTR) at both ends and at least 6 complete open reading frame genes (ORF) [9], [10]. The first ORF (ORF 1a / b) directly translates two polyproteins: polyprotein 1a (pp 1a) and polyprotein 1ab (pp 1ab). These polyproteins are processed by the main protease (Mpro), also known as 3C-like protease (3CLpro), and one or two papain-like proteases (PLP) to become 16 non-structural proteins (nsps) [11]. These nsps are involved in the production of subgenomic RNA, which encodes four major structural proteins, namely surface spike glycoprotein (S), envelope protein (E), membrane protein (M) and Nucleocapsid protein (N) [12], [13]. Then proteins are collected with new RNA genome assembly in the endoplasmic reticulum (ER) and Golgi-apparatus [14].Mpro plays a vital role in the replication cycle of the coronavirus, because the Mpro operates at more than 11 cleavage sites on the pp1ab [15]. The recognition sequence is Leu-Gln for most of 11 sites [16]. Inhibiting the activity of Mpro would block viral replication and would essentially block viral replication [17]. There are no known homologs of Mpro in humans with identical cleavage specificity. Hence, its inhibition is unlikely to show side effects, making it an attractive target for COVID-19 drugs. In previous studies, the Mpro inhibitors have been discovered, including natural and derived bio-active compounds [14], major metabolites from spices [15], bioactive molecules from tea plant [17], herbal plants [18] and acridinedione analogs [19]. However, there is still no effective small molecular medicines available in clinic currently.In this work, we tried to identify the inhibitors of Mpro by docking-based virtual screening and the biochemical evaluation against the Specs database. Then, we explored and compared the interaction modes between compounds obtained and known Mpro inhibitors. This work tried to provide a rapid discovery of Mpro inhibitors which could be developed as drug lead compounds against the SARS-Cov-2.
Materials and methods
In this work, docking-based virtual screening and biochemical evaluation were carried out to discover potential Mpro inhibitors. The three-dimensional crystal structure of Mpro of SARS-CoV-2 has been extracted from PDB database (PDB code: 6LU7; resolution: 2.16 Å) [20]. The working flow for this work is shown in Fig. 1
.
Fig. 1
The flowchart of discovery for Mpro inhibitors.
The flowchart of discovery for Mpro inhibitors.
Preparation of protein
Discovery Studio 4.0 (DS 4.0) was employed to prepare protein by adding missing residues, hydrogen atoms as well as removing water molecules and spectator ions [21]. Then, the structure was minimized and optimized using Maestro 12.3 software (www.schrodinger.com) with OPLS3e force field. The physical condition of pH was set as 7.0. This step optimized the structure, to relieve any strain and fine-tune the placement of various groups. Hydrogen atoms are always optimized fully, which allows relaxation of the H-bond network. Heavy atoms were optimized with converge of 0.3 Å. Allowing movement of the heavy atoms can relieve structural strain, but will result in some deviation from the initial crystal structure.
Preparation of small molecules
The 212,736 molecules from the Specs database (http://www.specs.net) were filtered by Lipinski and Veber rules, and 75,671 molecules were retained [22]. A PAINS (Pan-Assay Interference Compounds) filtering was employed to remove false positive compounds using FAFDrugs4 (https://fafdrugs4.rpbs.univ-paris-diderot.fr/) [23]. The ADMET evaluation was then carried out in DS 4.0 including aqueous solubility, blood brain barrier penetration (BBB), cytochrome P450 (CYP450) 2D6 inhibition, hepatotoxicity, human intestinal absorption (HIA) and plasma protein binding. The intestinal absorption model includes 95% and 99% confidence ellipses in the ADMET_PSA_2D, ADMET_AlogP98 plane [24], [25]. There were 69,946 molecules retained via PAINS filtering and then 65,090 molecules retained via ADMET evaluation. The prepared ligand database was used to generate 3D conformational set for each molecule using CAESAR in DS 4.0 (http://www.accelrys.com). A maximum of 255 conformations within 20 kcal/mol in energy from the global minimum was generated. These structures were used for the next docking simulation.
Docking by LibDock
The Dock Ligands (LibDock) protocol is developed by Diller and Merz [26]. The binding site was defined by the position of original ligand. The protein site features used in LibDock is referred to as Hot Spots (polar and apolar). A polar Hotspot is preferred by a polar ligand atom, for example, a hydrogen bond donor or acceptor. An apolar Hot Spot is preferred by an apolar atom, for example, a carbon atom. The receptor Hot Spot file is calculated prior to the docking procedure. The rigid ligand poses were put into the active site and Hot Spots were matched as triplets. The poses were pruned and a final optimization step was performed before the poses were scored.
Docking by GOLD
GOLD (https://www.ccdc.cam.ac.uk) was selected for precise docking simulation of ligands from previous step. GOLD can make high database enrichments and identify the correct binding mode reliably.The retained candidates by LibDock were further screened using the GOLD docking program [27], [28]. At the same time, 8 published Mpro inhibitors were selected as templates to do structure similarity screening in Specs [11], [29], exhibited in Table 2.
Table 2
Published Mpro Inhibitors.
Number
Name
structure
IC50(μM)
1
Ebselen [29]
0.67±0.09
2
Disumfiram [29]
9.35±0.18
3
Tideglusib [29]
1.55±0.30
4
Carfmofur [29]
1.82±0.06
5
Shikonin [29]
15.75±8.22
6
PX-12 [29]
21.39±7.06
7
11a [11]
0.053±0.005
8
11b [11]
0.040±0.002
There were 3 templates and 3 scoring functions available to evaluate the quality of the docking. The binding pocket was defined within 7 Å from the co-crystallised ligand. The co-crystallised ligand was docked back to the binding pocket to find the most suitable score function. The combination of goldscore_p450_csd and CHEMPLP gave the highest score (117.86) and low RMSD (2.5498 Å) as illustrated in Table 1
. They were thus selected for docking simulation. The value of GA runs was set to 50 times.
Table 1
The Score and RMSD of Different Score Functions.
Fitness&Search Options
CHEMPLP
GoldScore
ChemScore
Templates
Score
RMSD(Å)
Score
RMSD(Å)
Score
RMSD(Å)
chemscore_Kinase
101.48
2.4339
91.14
8.3199
28.15
4.3580
chemscore_p450_csd
92.04
12.0136
93.25
9.6106
32.13
2.2065
goldscore_p450_csd
117.86
2.5498
93.97
1.9502
29.77
4.4503
The Score and RMSD of Different Score Functions.Published Mpro Inhibitors.
Cluster analysis
Due to some molecules share the similar skeleton, Cluster Ligands was used to reduce repeatability and increase the hit rate. The remaining molecules were purchased for the biochemical evaluation.
The protein expression and purification of SARS-CoV-2 Mpro
The protein expression and purification of SARS-CoV-2 Mpro has previously been described [11], [30]. In brief, the cDNA of full-length SARS-CoV-2 Mpro (GenBank: MN908947.3) was cloned into the PGEX6p-1 vector. To obtain SARS-CoV-2 Mpro with authentic N and C terminals, four amino acids (AVLQ) were inserted between the GST tag and the full length SARS-CoV-2 Mpro, while eight amino acids (GPHHHHHH) were added to the C-terminus of SARS-CoV-2 Mpro. The plasmid was then transformed into BL21 (DE3) cells for protein expression. The N-terminal GST tag and four amino acids (AVLQ) were self-cleavable. The expressed protein with an authentic N-terminus was purified by a Ni-NTA column (GE Healthcare) and transformed into the cleavage buffer (150 mM NaCl, 25 mM Tris, pH 7.5) containing human rhinovirus 3C protease to remove the additional residues. The resulting protein sample was further passed through size-exclusion chromatography (Superdex200, GE Healthcare). The eluted protein samples were stored in a solution (10 mM Tris, pH 7.5) for the enzymatic inhibition assay.
The inhibition assay of SARS-CoV-2 Mpro
The inhibition assay of SARS-CoV-2 Mpro has previously been described [30]. In brief, the recombinant SARS-CoV-2 Mpro at a concentration of 30 nM was mixed with serial dilutions of each compound in 80 µL assay buffer (50 mM Tris-HCl, pH 7.3, 1 mM EDTA) and incubated for 10 min. Inhibitory activities of VS10 against SARS-CoV-2 Mpro were measured at different concentrations of 0.01, 0.02, 0.03, 0.06, 0.13, 0.25, 0.50, 1.00, 2.00, 4.00, 8.00, and 16.00 µM. Inhibitory activities of VS12 against SARS-CoV-2 Mpro were measured at different concentrations of 0.03, 0.06, 0.13, 0.25, 0.50, 1.00, 2.00, 4.00, 8.00, 16.00, 32.00, and 64.00 µM. For the other 35 molecules listed in Table S1 (supplementary material), we measured the inhibitory activity at a concentration of 10 µM. The reaction was initiated by adding 40 µL fluorogenic substrate (MCA-AVLQSGFR-Lys(Dnp)-Lys-NH2) at a final concentration of 20 µM. After that, the fluorescence signal at 320 nm (excitation)/405 nm (emission) was measured immediately every 35 s for 3.5 min with a Bio-Tek Synergy4 plate reader. The initial velocities of reactions with compounds added at various concentrations compared to the reaction added with DMSO were calculated and used to generate inhibition profiles.For each compound, three independent experiments were performed for the determination of IC50 value. The IC50 values were expressed as the mean ± SD and determined via nonlinear regression analysis using GraphPad Prism software 8.0 (GraphPad Software, Inc., San Diego, CA, USA).
Results and discussion
Virtual screening and molecular docking
75,671 molecules were remained after Lipinski and Veber rules screening. After that, 69,946 molecules were obtained via PAINS filtering. Furthermore, we choose the intersection with a 95% confidence interval to retain 65,090 candidates by ADMET screening (Fig. 2
). Then, those compounds were subjected to molecular docking using LibDock. The potential candidates after LibDock docking were selected for further study using GOLD. The Default docking mode by Gold was used to acquire top 2000 candidates including 898 compounds from similarity screening, and then the very flexible mode was used to acquire top 200 candidates. And then, 200 candidates were retained with the very flexible mode. The remaining compounds were clustered according to the structure similarity using DS 4.0. Finally, 37 molecules were selected as a small subset which represents 200 compounds.
Fig. 2
2D plot showing relationship between polar surface area and calculated AlogP98 values of the selected molecules.
2D plot showing relationship between polar surface area and calculated AlogP98 values of the selected molecules.
Biochemical evaluation
The final selected 37 molecules were tested by biochemical evaluation, from which 2 compounds showed relative high biological activity (Fig. 3
and Table 3
). The IC50 of compound VS10 is 0.20 μM, and the IC50 of compound VS12 is 1.89 μM.
Fig. 3
The IC50 of compounds VS10 and VS12. Dose-response curves for IC50 values were determined by nonlinear regression. All data are shown as mean ± SD. Three independent experiments were performed.
Table 3
Biochemical Evaluation of Two Lead Compounds.
Number
Structure
IC50(μM)
VS10
0.20 ± 0.03
VS12
1.89 ± 0.22
The IC50 of compounds VS10 and VS12. Dose-response curves for IC50 values were determined by nonlinear regression. All data are shown as mean ± SD. Three independent experiments were performed.Biochemical Evaluation of Two Lead Compounds.
Interaction modes analysis
We analyzed the interaction between these two small molecules and protein via DS 4.0, Pymol 2.2.3 and online server (https://projects.biotec.tu-dresden.de/plip-web/plip/index) [31].Shown as Fig. 4
(A), the CO of compound VS10 formed four strong hydrogen bonds with GLY143, SER144 and CYS145. In addition, the SH of CYS145 also formed two weak hydrogen bonds with benzene ring and heterocycle. MET165 and GLU166 formed a hydrogen bond with two benzene rings, respectively. As presented in Fig. 4 (C), the compound VS12 formed four hydrogen bonds and three hydrophobic interactions with Mpro. The CO respectively formed a hydrogen bond with GLU166 and GLN189, the ring of thiadiazole formed hydrophobic interactions with HIS41 and MET49. The benzene ring formed a hydrophobic interaction with MET165 as well. The NH2 of VS12 formed a hydrogen bond with TYR54. The more important is that the S on VS12 formed a hydrogen bond with CYS145, which may be the location of covalent bonding. It can be seen from the interactions between these two small molecules with the protein. HIS41, CYS145, MET165 and GLU166 are the key amino acids for inhibitor binding.
Fig. 4
Binding modes between ligands and Mpro. A, C shows interactions between VS10, VS12 and protein amino acids in 3D, the green lines represent hydrogen bonds, the magentas lines represent hydrophobic interactions. B, D shows interactions between VS10, VS12 and protein amino acids in 2D by DS 4.0. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)
Binding modes between ligands and Mpro. A, C shows interactions between VS10, VS12 and protein amino acids in 3D, the green lines represent hydrogen bonds, the magentas lines represent hydrophobic interactions. B, D shows interactions between VS10, VS12 and protein amino acids in 2D by DS 4.0. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)
Interaction analysis
N3 inhibitor simultaneously acts on S1, S2, S4 and S1′ subsites, as well as 11a and 11b. These small molecules have more interactions with Mpro, which makes them more stable together and shows a better inhibitory effect. VS10 acts on S1, S4 and S1′ subsites, while VS12 acts on S1, S2, S4 and S1′ subsites. In contrast to the numerous stabilizing interactions of N3, VS10 and VS12 interacts mainly with CYS145 through van der Waals and hydrogen bonding interactions, as well as carmofur [32]. Ebselen interacts through hydrogen bonding interactions mainly with GLU166 and van der Waals interactions with MET49 and MET165 in the hydrophobic S2 binding area [32]. VS10 is very similar to ebselen, but IC50 is more than three times stronger. Considering the wide binding area of Mpro, in our work, VS10 forms more hydrogen bonds and hydrophobic interaction contrasts to the ebselen, so, VS10 shows more effective inhibition. MS/MS analysis has revealed that ebselen, PX-12, and carmofur were able to covalently modify the catalytic cysteine CYS145 of SARS-CoV-2 Mpro [29]. Ebselen is known to form a covalent bond with CYS145. To explore whether the compound VS10 and VS12 are covalent inhibitors, or react with one or more cysteines. Two independent experiments were carried out to determine IC50 values of VS10 and VS12 with different incubation time (Table 4
, Fig. 5
, and Fig. 6
). The results showed that the inhibitory activity of VS10 remained unchanged with the prolongation of incubation time (Table 4 and Fig. 5). However, the inhibitory activity of VS12 against SARS-CoV-2 Mpro increased with the prolongation of incubation time (Table 4 and Fig. 6). Contrary to the stable IC50 of VS10, the IC50 of VS12 decreases as the incubation time increases. Therefore, we suspected that VS12 reacted with one or more cysteines and was the covalent inhibitor of SARS-CoV-2 Mpro. No matter what kind of interaction, the docking results show that the interaction sites are close to CYS145 (Fig. 4).
Table 4
The inhibitory activities of VS10 and VS12 against SARS-CoV-2 Mpro with different incubation time.
Incubation time (min)
VS10 – IC50 (μM)
VS12 – IC50 (μM)
0
0.17 ± 0.02
–
2.5
0.20 ± 0.02
–
5
0.19 ± 0.01
2.41 ± 0.02
10
0.19 ± 0.02
1.52 ± 0.04
20
0.18 ± 0.05
0.84 ± 0.07
60
0.25 ± 0.04
0.64 ± 0.08
Fig. 5
The IC50 curves of VS10 against SARS-CoV-2 Mpro with different incubation time of 0, 2.5, 5, 10, 20, and 60 min.
Fig. 6
The IC50 curves of VS12 against SARS-CoV-2 Mpro with different incubation time of 5, 10, 20, and 60 min.
The inhibitory activities of VS10 and VS12 against SARS-CoV-2 Mpro with different incubation time.The IC50 curves of VS10 against SARS-CoV-2 Mpro with different incubation time of 0, 2.5, 5, 10, 20, and 60 min.The IC50 curves of VS12 against SARS-CoV-2 Mpro with different incubation time of 5, 10, 20, and 60 min.
Conclusions
To combat COVID-19 pandemic, researchers around the globe are racing to come up with effective countermeasures. Promising progress has been made in developing vaccines and antiviral drugs. Antivirals are necessary complements of vaccines and are needs for post-infection treatment. In this study, a virtual screening protocol which includes drug-like molecules screening, PAINS and ADMET molecules filtering, docking-based drug design and cluster analysis, was developed. The final remaining top 37 molecules are further evaluated by biochemical evaluation and 2 molecules have shown good inhibitory activity. The binding modes of the two molecules were analyzed, which has shown that the drug leads identified can bind to the binding pocket of SARS-CoV-2 Mpro. HIS41, CYS145, MET165, and GLU166 are found to be the important amino acids for inhibitor binding. In this study, the virtual screening and high-throughput screening proved to be efficient strategies to discover antiviral leads against SARS-CoV-2 virus. The methods presented here can be greatly helpful in the rapid discovery of drug leads from a large compound library against new emerging infectious diseases which currently lack specific medicines.
Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Authors: Franco Lombardo; Prashant V Desai; Rieko Arimoto; Kelly E Desino; Holger Fischer; Christopher E Keefer; Carl Petersson; Susanne Winiwarter; Fabio Broccatelli Journal: J Med Chem Date: 2017-06-27 Impact factor: 7.446