Arafat Rahman Oany1, Mamun Mia2, Tahmina Pervin3, Md Junaid4, S M Zahid Hosen5, Mohammad Ali Moni6. 1. Department of Biotechnology and Genetic Engineering, Faculty of Life Science, Mawlana Bhashani Science and Technology University, Tangail-1902, Bangladesh; Aristopharma Limited, Bangladesh. Electronic address: arafatr@outlook.com. 2. Department of Biotechnology and Genetic Engineering, Faculty of Life Science, Mawlana Bhashani Science and Technology University, Tangail-1902, Bangladesh; Department of Medical Biotechnology, Bangladesh University of Health Sciences, Dhaka, Bangladesh. 3. Biotechnology and Genetic Engineering Discipline, Life Science School, Khulna University, Khulna, Bangladesh. 4. Molecular Modeling Drug-design and Discovery Laboratory, Pharmacology Research Division, BCSIR Laboratories Chattogram, Bangladesh Council of Scientific and Industrial Research, Chattogram, Bangladesh. 5. Molecular Modeling Drug-design and Discovery Laboratory, Pharmacology Research Division, BCSIR Laboratories Chattogram, Bangladesh Council of Scientific and Industrial Research, Chattogram, Bangladesh; Pancreatic Research Group, South Western Sydney Clinical School, and Ingham Institute for Applied Medical Research, Faculty of Medicine, University of New South Wales, Sydney, Australia. 6. WHO Collaborating Centre on eHealth, UNSW Digital Health, School of Public Health and Community Medicine, Faculty of Medicine, UNSW Sydney, Australia. Electronic address: m.moni@unsw.edu.au.
Abstract
To date, the global COVID-19 pandemic has been associated with 11.8 million cases and over 545 481 deaths. In this study, we employed virtual screening approaches and selected 415 lead-like compounds from 103 million chemical substances in the PubChem database, based on existing drugs, as potential candidates for inhibiting S protein-mediated viral attachment. Thereafter, based on drug-likeness and Lipinski's rules, 44 lead-like compounds were docked within the active site pocket of the viral-host attachment site of the S protein. The corresponding ligand properties and the absorption, distribution, metabolism, excretion, and toxicity (ADMET) profile were measured. Furthermore, four novel inhibitors were designed and assessed computationally for efficacy. Comparative analysis showed that the compounds screened in this study gave better results than the proposed mother compounds, VE607 and SSAA09E2. The four designed novel lead compounds gave more favorable results without deviating from any of Lipinski's rules. They also showed higher bioavailability and drug-likeness scores (0.56 and up to 1.81, respectively) than VE607 and SSAA09E2. All the screened compounds and novel compounds showed promising ADMET properties. Among them, the compound AMTM-02 was the best candidate, with a docking score of -7.5 kcal/mol. Furthermore, the binding study was verified by a 100 ns molecular dynamics simulation that assessed the stability of the complex. The proposed screened compounds and the novel compounds may support the development of a therapeutic drug to treat SARS-CoV-2 effectively in vitro and in vivo.
Over the last two decades, major outbreaks of coronaviruses (CoVs) associated with mild to severe human respiratory tract infections have been a bitter global health experience. The first global outbreak was caused by severe acute respiratory syndrome coronavirus (SARS-CoV) in 2002 and covered five continents, with over 8000 infected cases and a mortality rate of 10% [1], [2], [3], [4], [5], [6], [7]. In 2012, the second outbreak, caused by Middle East respiratory syndrome coronavirus (MERS-CoV), occurred in Saudi Arabia, with a high mortality rate of 35% [8, 9]. The third human pathogenic coronavirus, 2019 novel coronavirus (2019-nCoV), was first identified at the end of 2019 in Wuhan, China, in patients with atypical pneumonia [10] and subsequently spread to the whole world within 3-4 months. 2019-nCoV has exceeded SARS-CoV and MERS-CoV in transmission rate [11], with a case fatality rate of 2.3% [12]. The fatality rate is higher in elderly patients and patients with comorbidities [13]. According to the World Health Organization (WHO), the virus and the disease are named COVID-19 virus and Coronavirus Disease 2019 (COVID-19), respectively [14], while the International Committee on Taxonomy of Viruses (ICTV) renamed the virus SARS-CoV-2 [15]. According to the WHO 'situation report 171' (10 July 2020), a total of 11 874 226 confirmed cases of SARS-CoV-2 were reported, including 545 481 deaths worldwide, from 213 countries and/or territories [16]. As the global situation is worsening day by day and there are no appropriate treatment strategies available, there is an urgent demand to develop an effective antiviral/therapeutic drug.
SARS-CoV-2 belongs to the virus family Coronaviridae [17] and shares 79.5% genome identity with SARS-CoV [18]. The four structural proteins, spike (S), envelope (E), membrane (M), and nucleocapsid (N), are encoded by the SARS-CoV-2 genome in a similar fashion to that of SARS-CoV [19]. The transmembrane trimeric S protein comprises two functional subunits, S1 and S2 [20]: S1 harbors the binding activity to the receptor of host cells and S2 is responsible for host-viral cell fusion [21]. Entry of SARS-CoV-2 into host cells is almost identical to the binding mode of SARS-CoV, and the S protein receptor-binding domain (RBD) [22] maintains a high affinity to the human receptor angiotensin-converting enzyme 2 (ACE2) [23, 24]. Cryo-EM structure analysis showed that residues 319 to 591 of S, the 2019-nCoV RBD-SD1 fragment, have a stable folding conformation when bound to the ACE2 receptor [25]. In contrast to SARS-CoV, SARS-CoV-2 has a novel feature at the S1/S2 boundary of the S protein: an unexpected furin cleavage site generated through the insertion of 12 nucleotides [23, 25]. In the S1 subunit of SARS-CoV-2, a total of six RBD amino acids (L455, F486, Q493, S494, N501, and Y505) have a critical role in binding to ACE2 receptors [24]. Recent structural analysis revealed that residues critical for ACE2 receptor binding are highly conserved in the 2019-nCoV S RBD compared with the SARS-CoV S RBD [22]. Therefore, blocking the 2019-nCoV S protein from binding to human ACE2 receptors is the most promising target for developing a novel therapeutic drug in the current pandemic. A few studies have evaluated novel therapeutic drugs for SARS-CoV that target S protein-mediated entry into the host cell. Kao et al. reported VE607 as a novel compound that blocks S protein-ACE2-mediated SARS-CoV entry [26]. Structure-based in silico analysis showed that N-(2-aminoethyl)-1 aziridine-ethanamine is an inhibitor of SARS-CoV S protein-mediated cell fusion [27], and SSAA09E2 exhibits a unique mechanism of action that inhibits the early interactions of SARS-S with the receptor for SARS-CoV [28].
In the present study, novel inhibitors of viral attachment have been designed through virtual screening in the quest to combat this deadly virus.
Materials and Methods
Sequence retrieval
The National Center for Biotechnology Information (NCBI) database was searched for the S glycoprotein of SARS-CoV-2 [29]. Initially, 14 sequences were selected for the study and stored in FASTA format for further analysis.
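As an illustration of how sequences stored in FASTA format can be read back for later steps, the following is a minimal sketch rather than the authors' workflow. The function name `read_fasta` and the two toy records are hypothetical placeholders and do not contain real S protein data.

```python
from io import StringIO


def read_fasta(handle):
    """Parse FASTA text into a dict mapping each header line to its sequence."""
    records = {}
    header = None
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            header = line[1:]  # drop the leading '>'
            records[header] = []
        elif header is not None:
            records[header].append(line)  # sequence may span several lines
    return {name: "".join(parts) for name, parts in records.items()}


# Hypothetical two-record input (placeholder residues, not real S protein data).
example = StringIO(">seq_A example\nMFVF\nLVLL\n>seq_B example\nMFVFLV\n")
sequences = read_fasta(example)
```

In practice the same function could be given an open file handle instead of the in-memory `StringIO` object used here.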
Phylogeny and multiple sequence analysis
The BioEdit v7.2.3 sequence alignment editor [30] was used to identify the conserved region among the sequences through multiple-sequence alignment (MSA) with ClustalW [31]. The Jalview v2 tool [32] was used to retrieve the alignment and the CLC Sequence Viewer v7.0.2 (http://www.clcbio.com) was used to analyse the divergence among the different strains of S protein.
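To make the idea of a conserved region concrete, the sketch below scores each column of an alignment by the fraction of sequences that share its most common symbol. This is a simplified stand-in, not the conservation measure used by BioEdit, ClustalW, or Jalview; the function name and the toy alignment are hypothetical, and gaps are counted like any other symbol.

```python
from collections import Counter


def column_conservation(aligned):
    """Return, per alignment column, the fraction of sequences sharing the most common symbol."""
    lengths = {len(seq) for seq in aligned}
    if len(lengths) != 1:
        raise ValueError("aligned sequences must be non-empty and of equal length")
    scores = []
    for column in zip(*aligned):
        most_common_count = Counter(column).most_common(1)[0][1]
        scores.append(most_common_count / len(aligned))
    return scores


# Hypothetical toy alignment (not taken from the study's sequences).
toy_alignment = ["NLCPFG", "NLCPFG", "NLCPYG", "NL-PFG"]
scores = column_conservation(toy_alignment)
```

Columns with a score of 1.0 are fully conserved in this toy example; lower values flag positions where the sequences diverge.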
Domain and motif analysis
To gain insight into the domain structure of the S protein, the Pfam program of the Sanger Institute [33] was used, and ScanProsite of the Swiss Institute of Bioinformatics was used to identify motifs [34]. The protein with Accession No. QHR63290 was used for this analysis and for all subsequent steps.
Homology modelling and model validation
Homology models of the target were built with MODELLER v9 [35], and the top-scoring template was selected for model construction. The predicted models were then validated with the PROCHECK [36] tools of the SWISS-MODEL Workspace [37]. To further check the quality of the predicted model, it was also superposed on the template structure.
Electrostatic potential analysis and energy minimization
The electrostatic potential of the predicted model was calculated with the APBS server [38] after converting the .pdb file to .pqr format, and was visualized with the PyMOL molecular graphics system [39]. Energy minimization was also performed to further validate the predicted models, using the GROMOS 96 force field [40].
Active site analysis and virtual screening
The active sites of the final proposed model were predicted by the CASTp server, which identifies pockets using the pocket algorithm of alpha shape theory [41]. Screening was initially based on the existing literature to gain insight into potential inhibitors; the PubChem database [42] was then screened for similar hits. All ligands selected for the docking studies were converted into .pdbqt format using the Open Babel toolbox [43]. AutoDock Vina [44] was used for the screening. The docking grid was centered at X: 228.75, Y: 190.82, and Z: 304.15, with dimensions (Angstrom) of X: 26.31, Y: 26.32, and Z: 20.18.
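The grid parameters above map directly onto the keys of an AutoDock Vina configuration file. The sketch below only assembles such a file as text; the receptor and ligand file names are hypothetical, and the exhaustiveness value is an assumption (Vina's default of 8), since the source does not report it.

```python
def vina_config(receptor, ligand, center, size, exhaustiveness=8):
    """Build the text of an AutoDock Vina configuration file."""
    lines = [
        f"receptor = {receptor}",
        f"ligand = {ligand}",
        f"center_x = {center[0]}",
        f"center_y = {center[1]}",
        f"center_z = {center[2]}",
        f"size_x = {size[0]}",
        f"size_y = {size[1]}",
        f"size_z = {size[2]}",
        f"exhaustiveness = {exhaustiveness}",  # assumed; not stated in the source
    ]
    return "\n".join(lines) + "\n"


# Grid center and dimensions as reported in the text; file names are placeholders.
config_text = vina_config(
    receptor="spike_model.pdbqt",
    ligand="ligand.pdbqt",
    center=(228.75, 190.82, 304.15),
    size=(26.31, 26.32, 20.18),
)
```

Writing `config_text` to a file and passing it to Vina with `--config` would be one way to reuse the same grid for every ligand in the screen.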
Ligand properties and ADMET analysis
Structural and behavioral properties of the ligands were analysed to cross-validate them as suitable therapeutic candidates. A wide range of properties was analysed, including Lipinski's rule [45], drug-likeness, molecular weight, and bioavailability. These parameters were assessed through various online servers and tools, including Molinspiration (http://www.molinspiration.com), the Drug-likeness tool [46], and OSIRIS Property Explorer [47]. ADMET properties were assessed for the selected compounds using ADME SARfari [48], admetSAR [49], the Swiss database [50], and QikProp [51].
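Lipinski's rule can be checked with a few comparisons. The sketch below uses the commonly cited thresholds (molecular weight at most 500, logP at most 5, at most 5 hydrogen bond donors, at most 10 hydrogen bond acceptors) and is not the implementation of any of the servers named above, which may use different descriptors and so report different counts. The first call uses the values listed for AMTM 02 in Table 4; the second compound is hypothetical.

```python
def lipinski_violations(mol_weight, logp, h_donors, h_acceptors):
    """Count violations of Lipinski's rule of five using the commonly cited thresholds."""
    checks = [
        mol_weight > 500,   # molecular weight above 500
        logp > 5,           # logP above 5
        h_donors > 5,       # more than 5 hydrogen bond donors
        h_acceptors > 10,   # more than 10 hydrogen bond acceptors
    ]
    return sum(checks)


# Values as listed for AMTM 02 in Table 4.
amtm02_violations = lipinski_violations(422.52, 3.44, 2, 4)

# Hypothetical compound that is too heavy and too lipophilic.
hypothetical_violations = lipinski_violations(620.0, 5.6, 2, 4)
```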
Novel lead design
Novel leads were designed based on the screening results with the aid of PubChem Sketcher V2.4 [52]. The designed structures were then exported in canonical SMILES format, and three-dimensional structures were generated with the Open Babel toolbox [43]. Ligand properties, molecular docking, and ADMET analysis of the novel compounds were also assessed as described earlier.
Molecular dynamics simulations
To identify changes in structural conformation, interaction pattern, and binding stability of the ligand-protein complex, YASARA Dynamics software [53] was employed for all molecular dynamics (MD) simulations, using AMBER14 as the force field [54]. The best protein-ligand complex from the molecular docking computations was selected for further analysis. A simulation cell was generated in a clean and optimized system, with a box size of 90 × 90 × 90 Å³ and the protein-ligand complex placed inside it. The transferable intermolecular potential 3 points (TIP3P) water model was applied, and Na/Cl ions were added in place of water molecules to give a total NaCl concentration of 0.9%; the system contained 132 889 solvent molecules at a density of 0.997 g/mL, a temperature of 298 K, and a pH of 7.4 [55, 56]. For better clarity, rendering of water molecules was omitted. Energy minimization of each system was then performed by a simulated annealing method using the steepest gradient approach (5000 cycles). For long-range electrostatic interactions, the Particle Mesh Ewald (PME) method was employed with a cut-off value of 8 Å. Subsequently, using a time step of 2.5 fs, the production MD simulation was run at the YASARA force-field level for 100 ns at constant pressure and temperature, employing a Berendsen thermostat [57, 58]. Snapshots of the MD trajectory were saved every 25 ps. This trajectory was then analysed to estimate structural changes and stability through the root mean square deviation (RMSD) and root mean square fluctuation (RMSF), using built-in YASARA Structure macros [59].
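For readers unfamiliar with the RMSD metric, the sketch below computes it for two sets of matched atomic coordinates as the square root of the mean squared atom-to-atom distance. It is a simplified illustration and not the YASARA macro: it assumes the frame has already been superposed on the reference, and the coordinates are hypothetical. The last line derives the number of saved snapshots implied by a 100 ns run sampled every 25 ps.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square deviation between two matched lists of (x, y, z) coordinates."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Hypothetical two-atom example: the frame is the reference shifted 1 Å along z.
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
frame = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
value = rmsd(reference, frame)

# 100 ns expressed in ps, divided by the 25 ps saving interval.
n_snapshots = 100_000 // 25
```

RMSF follows the same pattern but averages each atom's deviation from its mean position over all frames instead of averaging over atoms within one frame.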
Results
Phylogenetic analysis of all the strains of the S protein was performed with CLC Sequence Viewer v7.0.2 and is depicted in Figure 1 with appropriate branch distances. The tree was generated with 1000 bootstrap replicates. The multiple sequence alignment generated with the Jalview v2 tool is shown in Figure 2. This figure shows only the region of interest, the RBD, with conservation information across the sequences.
Figure 1
Phylogenetic tree analysis of the retrieved sequences of spike glycoprotein (S).
Figure 2
Multiple sequence alignment of the selected proteins. Receptor-binding domain (RBD) is shown with appropriate conservation at the bottom (yellow).
The major domain and motif organization of the S protein, based on the sequence information from Pfam and ScanProsite, was constructed and is depicted in Figure 3. Accession No. QHR63290, with a sequence length of 1273 amino acid residues, was used to generate the construct.
Figure 3
Schematic representation of the major domain and motif structure of the spike glycoprotein (S). The S1 subunit contains NTD (14–305 aa), RBD (319–541 aa), and RBM (437–508 aa). The S2 subunit contains HR1 (912–984 aa), and HR2 (1163–1213 aa).
Homology modelling and validation
A homology model of the targeted protein was constructed with MODELLER and is illustrated in Figure 4a. The superposition between the template and model is shown in Figure 4b. The Ramachandran plot for the model validation is depicted in Figure 4c. The three-dimensional models of the RBD and receptor-binding motif (RBM), along with specific amino acid residues, are shown in Figures 4d and 4e, respectively.
Figure 4
Homology model, structural evaluation, and domain organization. The three-dimensional model of the targeted protein (surface view) is shown in 4a. In 4b, the superposition between the predicted model (cyan) and the template 6VSB_A (blue) is presented, with an RMSD of 0.202. The Ramachandran plot validation of the predicted model, where 95.06% of amino acid residues are in the most favored region, is depicted in 4c. The receptor-binding domain (RBD) (red sphere) (4d) and the receptor-binding motif (RBM) (cyan sphere) (4e) are the main catalytic site for binding to the host surface protein. The amino acid residues of the RBM are shown in 4e (zoomed view).
Electrostatic potentiality and energy minimization
The electrostatic potential of the modeled protein was analysed to identify the energy distribution of the protein and is shown in Figure 5a. The GROMOS 96 force field was applied to the modeled protein to refine the model. The force-field energies of the modeled protein before and after minimization were -44342.816 and -57854.49 kJ/mol, respectively. The energy distribution of the protein after minimization is shown in Figure 5b.
Figure 5
Electrostatic potentiality of the targeted protein. Energy distributions are shown before minimization in 5a and after minimization in 5b. In the energy distribution, blue color (positive charges) indicates higher energy and red color (negative charges) indicates lower energy potentiality. White color indicates energy neutral status.
Active site exploration and virtual screening
The CASTp server predicted several active sites of the target protein with different volume scores, and the two largest volumes were selected as the final active sites (Figure 6a and 6b). The molecular surface areas of the active sites were 31662.48 and 311.61, respectively. The PubChem database, with 103 million depositor-provided chemical substances, was initially explored for viral attachment inhibitors of the S protein. Based on the potential inhibitors VE607 and SSAA09E2, a total of 415 lead-like compounds were initially screened (Table S1). Thereafter, based on drug-likeness and Lipinski's rules, 46 lead-like compounds were selected for docking, including VE607 and SSAA09E2 (Figure S1). For the binding energies, docking poses with RMSD values of 0.0 were selected, and the interacting amino acid residues are illustrated in Figure 7. The docking analysis results are shown in Table 1.
Figure 6
Active sites of the predicted protein. The two largest active-site volumes are depicted here. The largest cavity of the active site is shown in 6a (blue sphere) and the second-largest cavity in 6b (red sphere).
Figure 7
Molecular docking interaction analysis. All the ligands are docked into the active site pocket of the protein and different ligands are shown by different colors (7a). Interacting amino acid residues of the protein with different ligands (including VE607 and SSAA09E2) (7b). The interaction descriptions are shown graphically in the bottom of 7b.
Table 1
Docking analysis of the 44 screened compounds
| Compound No. | PubChem CID | RMSD | Binding energy (docking), kcal/mol |
|---|---|---|---|
| 1 | 10646010 | 0.0 | -6.0 |
| 2 | 10692930 | 0.0 | -6.5 |
| 3 | 10740323 | 0.0 | -6.6 |
| 4 | 10764806 | 0.0 | -6.4 |
| 5 | 10790629 | 0.0 | -6.8 |
| 6 | 11306025 | 0.0 | -6.4 |
| 7 | 11477299 | 0.0 | -6.6 |
| 8 | 11751804 | 0.0 | -6.5 |
| 9 | 13490885 | 0.0 | -6.2 |
| 10 | 13490887 | 0.0 | -6.2 |
| 11 | 13490888 | 0.0 | -6.7 |
| 12 | 13747420 | 0.0 | -6.1 |
| 13 | 13747422 | 0.0 | -6.6 |
| 14 | 13747424 | 0.0 | -6.1 |
| 15 | 21261758 | 0.0 | -6.5 |
| 16 | 21261794 | 0.0 | -6.7 |
| 17 | 21261795 | 0.0 | -6.1 |
| 18 | 21261835 | 0.0 | -6.8 |
| 19 | 21261838 | 0.0 | -6.4 |
| 20 | 21261911 | 0.0 | -6.7 |
| 21 | 22132924 | 0.0 | -6.7 |
| 22 | 22132935 | 0.0 | -6.5 |
| 23 | 29977666 | 0.0 | -6.4 |
| 24 | 44394970 | 0.0 | -6.5 |
| 25 | 53755437 | 0.0 | -6.1 |
| 26 | 54477020 | 0.0 | -6.6 |
| 27 | 57826284 | 0.0 | -6.4 |
| 28 | 57826285 | 0.0 | -6.9 |
| 29 | 57826286 | 0.0 | -6.8 |
| 30 | 57835221 | 0.0 | -6.3 |
| 31 | 57835225 | 0.0 | -6.7 |
| 32 | 59981692 | 0.0 | -6.5 |
| 33 | 67047984 | 0.0 | -6.2 |
| 34 | 68116761 | 0.0 | -6.5 |
| 35 | 70242364 | 0.0 | -6.1 |
| 36 | 70402151 | 0.0 | -6.8 |
| 37 | 70438214 | 0.0 | -6.7 |
| 38 | 71379517 | 0.0 | -6.3 |
| 39 | 77856050 | 0.0 | -6.6 |
| 40 | 97289632 | 0.0 | -6.5 |
| 41 | 100951909 | 0.0 | -6.5 |
| 42 | 100951910 | 0.0 | -6.0 |
| 43 | 131708552 | 0.0 | -6.4 |
| 44 | 142969895 | 0.0 | -6.8 |
Ligand properties and ADMET data analysis
Different properties of the screened compounds, including drug-likeness, Lipinski's rules, aqueous solubility, and bioavailability, were assessed to identify the best therapeutic candidates (Table 2). All the ADMET properties were also assessed for the selected ligands and cross-checked with different servers and tools to confirm the validity of the results. These properties are described in Table 3.
Table 2
Ligand properties of the 44 screened compounds
| PubChem CID | Molecular formula | Molecular weight | Polar area | XLogP | Hydrogen bond donors | Hydrogen bond acceptors | Rotatable bonds | Complexity | Heavy atom count | Lipinski's rule violations | Aqueous solubility (LogS) | Drug-likeness model score | Bioavailability score | Binding energy (docking), kcal/mol |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 10646010 | C25H32N2O4 | 424.5 | 78.9 | 4.71 | 2 | 5 | 9 | 578 | 31 | 0 | -4.47 | 0.85 | 0.56 | -6.0 |
| 10692930 | C24H30N2O4 | 410.5 | 78.9 | 4.05 | 2 | 5 | 8 | 563 | 30 | 0 | -4.06 | 0.81 | 0.56 | -6.5 |
| 10740323 | C24H30N2O4 | 410.5 | 89.9 | 4.87 | 3 | 5 | 8 | 563 | 30 | 0 | -4.30 | 1.17 | 0.56 | -6.6 |
| 10764806 | C25H32N2O4 | 424.5 | 78.9 | 4.52 | 2 | 5 | 9 | 578 | 31 | 0 | -4.31 | 0.81 | 0.56 | -6.4 |
| 10790629 | C29H32N2O4 | 472.6 | 78.9 | 5 | 2 | 5 | 9 | 674 | 35 | 0 | -4.53 | 0.99 | 0.56 | -6.8 |
| 11306025 | C28H30N2O5 | 474.5 | 88.1 | 4.14 | 2 | 6 | 8 | 691 | 35 | 0 | -4.29 | 1.11 | 0.56 | -6.4 |
| 11477299 | C30H34N2O4 | 486.6 | 78.9 | 3.8 | 2 | 5 | 9 | 703 | 36 | 0 | -5.2 | 1.35 | 0.56 | -6.6 |
| 11751804 | C29H32N2O4 | 472.6 | 78.9 | 3.61 | 2 | 5 | 8 | 688 | 35 | 0 | -4.78 | 1.29 | 0.56 | -6.5 |
| 13490885 | C24H30N2O4 | 410.5 | 67.9 | 4.3 | 1 | 5 | 8 | 563 | 30 | 0 | -4.3 | 0.51 | 0.55 | -6.2 |
| 13490887 | C23H28N2O4 | 396.5 | 78.9 | 3.8 | 2 | 5 | 7 | 549 | 29 | 0 | -3.95 | 0.72 | 0.56 | -6.2 |
| 13490888 | C28H30N2O4 | 458.5 | 78.9 | 4.81 | 2 | 5 | 8 | 659 | 34 | 0 | -4.47 | 0.93 | 0.56 | -6.7 |
| 13747420 | C24H28N2O6 | 440.5 | 116 | 2.60 | 3 | 7 | 9 | 650 | 32 | 0 | -3.02 | 0.67 | 0.56 | -6.1 |
| 13747422 | C24H30N2O5 | 426.5 | 99.1 | 3.21 | 3 | 6 | 9 | 580 | 31 | 0 | -3.45 | 0.75 | 0.56 | -6.6 |
| 13747424 | C24H28N2O6 | 440.5 | 116 | 3.34 | 3 | 7 | 9 | 650 | 32 | 0 | -3.43 | 0.9 | 0.56 | -6.1 |
| 21261758 | C23H27ClN2O4 | 430.9 | 78.9 | 4.39 | 2 | 5 | 8 | 569 | 30 | 0 | -4.59 | 0.67 | 0.56 | -6.5 |
| 21261794 | C29H31N2O4- | 471.6 | 81.7 | 4.93 | 1 | 5 | 8 | 668 | 35 | 0 | -4.72 | 0.78 | 0.56 | -6.7 |
| 21261795 | C26H30N2O4 | 434.5 | 78.9 | 4.26 | 2 | 5 | 9 | 666 | 32 | 0 | -4.23 | 0.74 | 0.56 | -6.1 |
| 21261835 | C28H30N2O4 | 458.5 | 78.9 | 4.7 | 2 | 5 | 9 | 659 | 34 | 0 | -4.42 | 0.88 | 0.56 | -6.8 |
| 21261838 | C23H28N2O4 | 396.5 | 78.9 | 3.73 | 2 | 5 | 8 | 535 | 29 | 0 | -3.73 | 0.67 | 0.56 | -6.4 |
| 21261911 | C29H30N2O5 | 486.6 | 95.9 | 4.46 | 2 | 6 | 9 | 746 | 36 | 0 | -4.31 | 1.39 | 0.56 | -6.7 |
| 22132924 | C25H32N2O4 | 424.5 | 89.9 | 3.82 | 3 | 5 | 8 | 590 | 31 | 0 | -4.33 | 1.1 | 0.56 | -6.7 |
| 22132935 | C27H33N2O4- | 449.6 | 81.7 | 2.68 | 1 | 5 | 8 | 666 | 33 | 0 | -2.84 | 0.29 | 0.56 | -6.5 |
| 29977666 | C22H28N2O4 | 384.5 | 102 | 3.36 | 3 | 5 | 9 | 511 | 28 | 0 | -3.5 | 0.71 | 0.56 | -6.4 |
| 44394970 | C26H34N2O4 | 438.6 | 78.9 | 4.95 | 2 | 5 | 9 | 605 | 32 | 0 | -4.39 | 0.76 | 0.56 | -6.5 |
| 53755437 | C24H28N2O5 | 424.5 | 95.9 | 3.66 | 2 | 6 | 8 | 626 | 31 | 0 | -3.81 | 1.4 | 0.56 | -6.1 |
| 54477020 | C29H34N2O3 | 458.6 | 61.8 | 4.74 | 2 | 4 | 9 | 602 | 34 | 0 | -4.55 | 0.47 | 0.55 | -6.6 |
| 57826284 | C22H28N2O4 | 384.5 | 102 | 3.36 | 3 | 5 | 9 | 511 | 28 | 0 | 3.5 | 0.71 | 0.56 | -6.4 |
| 57826285 | C27H34N2O4 | 450.6 | 70.1 | 3.85 | 1 | 5 | 7 | 683 | 33 | 0 | -4.67 | 1.42 | 0.56 | -6.9 |
| 57826286 | C27H34N2O4 | 450.6 | 70.1 | 3.85 | 1 | 5 | 7 | 683 | 33 | 0 | -4.67 | 1.42 | 0.56 | -6.8 |
| 57835221 | C28H30N2O5 | 474.5 | 88.1 | 4.14 | 2 | 6 | 8 | 691 | 35 | 0 | -4.29 | 1.11 | 0.56 | -6.3 |
| 57835225 | C28H30N2O5 | 474.5 | 88.1 | 4.14 | 2 | 6 | 8 | 691 | 35 | 0 | -4.29 | 1.11 | 0.56 | -6.7 |
| 59981692 | C24H30N2O4 | 410.5 | 78.9 | 4.05 | 2 | 5 | 8 | 563 | 30 | 0 | -4.06 | 0.81 | 0.56 | -6.5 |
| 67047984 | C25H30N2O4 | 422.5 | 78.9 | 4.34 | 2 | 5 | 9 | 604 | 31 | 0 | -4.25 | 0.68 | 0.56 | -6.2 |
| 68116761 | C26H32N2O4 | 436.5 | 78.9 | 4.74 | 2 | 5 | 9 | 631 | 32 | 0 | -4.41 | 0.81 | 0.56 | -6.5 |
| 70242364 | C20H22N2O5 | 370.4 | 88.1 | 1.52 | 2 | 6 | 5 | 529 | 27 | 0 | -2.08 | 0.28 | 0.55 | -6.1 |
| 70402151 | C26H34N2O4 | 438.6 | 78.9 | 4.77 | 2 | 5 | 9 | 605 | 32 | 0 | -4.41 | 0.82 | 0.56 | -6.8 |
| 70438214 | C26H32N2O4 | 436.5 | 78.9 | 4.96 | 2 | 5 | 9 | 631 | 32 | 0 | -4.43 | 0.79 | 0.56 | -6.7 |
| 71379517 | C17H17NO4 | 299.32 | 75.6 | 2.56 | 2 | 4 | 6 | 379 | 22 | 0 | -2.98 | 0.6 | 0.56 | -6.3 |
| 77856050 | C26H32N2O4 | 436.5 | 78.9 | 4.96 | 2 | 5 | 9 | 631 | 32 | 0 | -4.43 | 0.79 | 0.56 | -6.6 |
| 97289632 | C22H28N2O4 | 384.5 | 102 | 3.36 | 3 | 5 | 9 | 511 | 28 | 0 | -3.5 | 0.71 | 0.56 | -6.5 |
| 100951909 | C25H32N2O4 | 424.5 | 78.9 | 4.71 | 2 | 5 | 9 | 578 | 31 | 0 | -4.47 | 0.85 | 0.56 | -6.5 |
| 100951910 | C25H32N2O4 | 424.5 | 78.9 | 4.71 | 2 | 5 | 9 | 578 | 31 | 0 | -4.47 | 0.85 | 0.56 | -6.0 |
| 131708552 | C22H28N2O4 | 389.5 | 102 | 3.36 | 3 | 5 | 9 | 511 | 28 | 0 | -3.5 | 0.71 | 0.56 | -6.4 |
| 142969895 | C27H38N2O4 | 454.6 | 92.9 | 3.44 | 2 | 5 | 9 | 472 | 33 | 0 | -5.17 | 1.1 | 0.56 | -6.8 |
Table 3
ADMET profile of the 44 screened compounds
Parameter groups: Absorption, Distribution, Metabolism, Excretion, Toxicity.
| PubChem CID | Blood-brain barrier | Water solubility | Caco-2 permeability | Intestinal absorption (human) | Skin permeability (cm/s) | P-glycoprotein substrate | P-glycoprotein I inhibitor | BBB permeability | CNS permeability | CYP450 1A2 inhibitor | CYP2C19 inhibitor | CYP2C9 inhibitor | CYP2C9 substrate | CYP3A4 substrate | Renal OCT2 inhibitor | AMES toxicity | Carcinogens | Acute oral toxicity |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 10646010 | No | -6.69 | 0.5553 | High | -5.78 | Yes | No | No | No | No | No | No | No | Yes | No | No | No | 0.6984 |
| 10692930 | No | -5.21 | 0.5963 | High | -6.06 | Yes | No | No | No | No | No | No | No | Yes | No | No | No | 0.7143 |
| 10740323 | No | -6.00 | 0.6306 | High | -5.53 | Yes | No | No | No | No | No | No | No | Yes | No | No | No | 0.6251 |
| 10764806 | No | -5.76 | 0.5947 | High | -5.77 | Yes | No | No | No | No | No | No | No | Yes | No | No | No | 0.7179 |
| 10790629 | No | -6.56 | 0.6038 | High | -5.52 | Yes | No | No | No | No | No | No | No | Yes | No | No | No | 0.7199 |
| 11306025 | No | -5.49 | 0.5370 | High | -6.40 | Yes | No | No | No | No | No | No | No | Yes | No | No | No | 0.6980 |
| 11477299 | No | -8.74 | 0.5947 | High | -5.35 | Yes | No | No | No | No | No | No | No | Yes | No | No | No | 0.7179 |
| 11751804 | No | -8.35 | 0.5694 | High | -5.52 | Yes | No | No | No | No | No | No | No | Yes | No | No | No | 0.7057 |
| 13490885 | Yes | -6.60 | 0.5753 | High | -6.09 | Yes | No | Yes | Yes | No | No | No | No | Yes | No | No | No | 0.7305 |
| 13490887 | Yes | -5.91 | 0.5665 | High | -6.23 | Yes | No | Yes | Yes | No | No | No | No | Yes | No | No | No | 0.6928 |
| 13490888 | No | -7.98 | 0.5813 | High | -5.70 | Yes | No | No | No | No | No | No | No | Yes | No | No | No | 0.7076 |
| 13747420 | No | -5.26 | 0.6441 | High | -6.67 | Yes | No | No | No | No | No | No | No | Yes | No | No | No | 0.7315 |
| 13747422 | No | -5.73 | 0.6479 | High | -6.91 | Yes | No | No | No | No | No | No | No | Yes | No | No | No | 0.7142 |
| 13747424 | No | -5.26 | 0.6717 | High | -6.28 | Yes | No | No | No | No | No | No | No | Yes | No | No | No | 0.6978 |
| 21261758 | No | -6.87 | 0.5555 | High | -6.02 | Yes | No | No | No | No | No | No | No | Yes | No | No | No | 0.6826 |
| 21261794 | No | -8.37 | 0.5464 | High | -5.51 | Yes | No | No | No | No | No | No | No | Yes | No | No | No | 0.7254 |
| 21261795 | No | -6.38 | 0.6026 | High | -6.06 | Yes | No | No | No | No | No | No | No | Yes | No | No | No | 0.7045 |
| 21261835 | No | -8.10 | 0.6000 | High | -5.69 | Yes | No | No | No | No | No | No | No | Yes | No | No | No | 0.7390 |
| 21261838 | Yes | -6.28 | 0.6179 | High | -6.26 | Yes | No | Yes | Yes | No | No | No | No | Yes | No | No | No | 0.7037 |
| 21261911 | No | -7.90 | 0.6125 | High | -6.02 | Yes | No | Yes | Yes | No | No | No | No | Yes | No | No | No | 0.7037 |
| 22132924 | No | -6.02 | 0.6486 | High | -5.31 | Yes | No | No | No | No | No | No | No | Yes | No | No | No | 0.6451 |
| 22132935 | No | -6.74 | 0.5191 | High | -5.43 | Yes | No | No | No | No | No | No | No | Yes | No | No | No | 0.7085 |
| 29977666 | No | -6.08 | 0.6461 | High | -5.74 | Yes | No | No | No | No | No | No | No | Yes | No | No | No | 0.6413 |
| 44394970 | No | -6.71 | 0.5687 | High | -5.55 | Yes | No | No | No | No | No | No | No | Yes | No | No | No | 0.7273 |
| 53755437 | No | -5.83 | 0.6255 | High | -6.45 | Yes | No | No | No | No | No | No | No | Yes | No | No | No | 0.6614 |
| 54477020 | Yes | -8.84 | 0.5094 | High | -5.72 | Yes | No | Yes | Yes | No | No | No | No | Yes | No | No | No | 0.6749 |
| 57826284 | No | -6.08 | 0.6461 | High | -5.74 | Yes | No | No | No | No | No | No | No | Yes | No | No | No | 0.6413 |
| 57826285 | Yes | -6.15 | 0.5666 | High | -5.29 | Yes | Yes | Yes | Yes | No | No | No | No | Yes | No | No | No | 0.7656 |
| 57826286 | Yes | -6.15 | 0.5666 | High | -5.29 | Yes | Yes | Yes | Yes | No | No | No | No | Yes | No | No | No | 0.7656 |
| 57835221 | No | -7.81 | 0.5370 | High | -6.40 | Yes | No | No | No | No | No | No | No | Yes | No | No | No | 0.6980 |
| 57835225 | No | -7.81 | 0.5370 | High | -6.40 | Yes | No | No | No | No | No | No | No | Yes | No | No | No | 0.6980 |
| 59981692 | No | -6.30 | 0.5963 | High | -6.06 | Yes | No | No | No | No | No | No | No | Yes | No | No | No | 0.7143 |
| 67047984 | No | -6.35 | 0.5643 | High | -5.82 | Yes | Yes | No | No | No | No | No | No | Yes | No | No | No | 0.7072 |
| 68116761 | No | -6.49 | 0.6040 | High | -5.70 | Yes | No | No | No | No | No | No | No | Yes | No | No | No | 0.6547 |
| 70242364 | No | -4.71 | 0.5736 | High | -9.13 | Yes | No | No | No | No | No | No | No | Yes | No | No | No | 0.6228 |
| 70402151 | No | -6.71 | 0.5943 | High | -5.55 | Yes | Yes | No | No | No | No | No | No | Yes | No | No | No | 0.7231 |
| 70438214 | No | -6.36 | 0.5613 | High | -5.74 | Yes | Yes | No | No | No | No | No | No | Yes | No | No | No | 0.7128 |
| 71379517 | Yes | -5.21 | 0.5874 | High | -6.31 | No | No | Yes | Yes | No | No | Yes | No | Yes | No | No | No | 0.7297 |
| 77856050 | No | -6.36 | 0.5613 | High | -5.74 | Yes | Yes | No | No | No | No | No | No | Yes | No | No | No | 0.7128 |
| 97289632 | No | -6.08 | 0.6461 | High | -5.74 | Yes | No | No | No | No | No | No | No | Yes | No | No | No | 0.6413 |
| 100951909 | No | -6.69 | 0.5553 | High | -5.78 | Yes | No | No | No | No | No | No | No | Yes | No | No | No | 0.6984 |
| 100951910 | No | -6.69 | 0.5553 | High | -5.78 | Yes | No | No | No | No | No | No | No | Yes | No | No | No | 0.6984 |
| 131708552 | No | -6.08 | 0.6461 | High | -5.74 | Yes | No | No | No | No | No | No | No | Yes | No | No | No | 0.6413 |
| 142969895 | No | -4.76 | 0.5731 | High | -5.08 | Yes | No | No | No | No | No | No | No | Yes | No | No | No | 0.7289 |
Based on the screening results, novel leads were designed with PubChem Sketcher V2.4 and are shown in Figure 8a. The ligand properties of the novel compounds are shown in Table 4. The interaction analysis from the molecular docking is illustrated in Figure 8b, and the view within the active site pocket is in Figure 8c. The ADMET profile is described in Table 5.
Figure 8
Designed novel lead and molecular docking interaction analysis. Newly designed novel leads are shown in 8a. All the novel leads are docked into the active site pocket of the protein and different leads are shown by different colors (zoomed view) (8b). Interacting amino acid residues of the proteins with different ligands (8c). The interaction descriptions are shown graphically in the bottom of 8c.
Table 4
Ligand properties of novel compounds along with mother compounds VE607 and SSAA09E2
| Lead ID | Molecular formula | Molecular weight | Polar area | XLogP | Hydrogen bond donors | Hydrogen bond acceptors | Rotatable bonds | Heavy atom count | Lipinski's rule violations | Aqueous solubility (LogS) | Drug-likeness model score | Bioavailability score | Binding energy (docking), kcal/mol |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| AMTM 01 | C28H30N2O5 | 474.55 | 110.10 | 2.88 | 4 | 5 | 8 | 35 | 0 | -5.80 | 1.78 | 0.56 | -7.0 |
| AMTM 02 | C25H30N2O4 | 422.52 | 81.08 | 3.44 | 2 | 4 | 6 | 31 | 0 | -5.60 | 1.79 | 0.56 | -7.5 |
| AMTM 03 | C27H28N2O5 | 460.52 | 99.10 | 2.42 | 3 | 5 | 8 | 34 | 0 | -5.18 | 1.51 | 0.56 | -7.0 |
| AMTM 04 | C27H28N2O7 | 492.52 | 139.56 | 1.53 | 5 | 7 | 8 | 36 | 0 | -4.18 | 1.81 | 0.56 | -6.7 |
| VE607 | C22H36N2O4 | 392.53 | 65.40 | 1.33 | 2 | 6 | 10 | 28 | 1 | -3.28 | 0.80 | 0.55 | -6.1 |
| SSAA09E2 | C16H20N4O2 | 300.16 | 53.72 | 0.75 | 1 | 4 | 5 | 22 | 0 | -4.43 | 0.97 | 0.55 | -6.7 |
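The comparison between the novel leads and the mother compounds can be reproduced from the binding energies in Table 4. The sketch below ranks the six compounds from most to least negative docking energy and computes the gap between the best lead and the better of the two mother compounds. The variable names are illustrative; only the numbers come from the table.

```python
# Binding energies (kcal/mol) as listed in Table 4.
binding_energy = {
    "AMTM 01": -7.0,
    "AMTM 02": -7.5,
    "AMTM 03": -7.0,
    "AMTM 04": -6.7,
    "VE607": -6.1,
    "SSAA09E2": -6.7,
}

# More negative docking energies indicate stronger predicted binding.
ranked = sorted(binding_energy.items(), key=lambda item: item[1])
best_name, best_score = ranked[0]

# Better (more negative) of the two mother compounds.
reference_best = min(binding_energy["VE607"], binding_energy["SSAA09E2"])

# Difference between the best lead and the best mother compound, in kcal/mol.
improvement = round(best_score - reference_best, 1)
```

Docking scores of this kind are predictions, so a difference of this size is better read as a ranking hint than as a measured affinity gain.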
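Table 5
ADMET profile of novel compounds along with mother compounds VE607 and SSAA09E2
| Category | Property | AMTM 01 | AMTM 02 | AMTM 03 | AMTM 04 | VE607 | SSAA09E2 |
|---|---|---|---|---|---|---|---|
| Absorption | Blood-brain barrier | No | No | No | No | Yes | No |
| Absorption | Water solubility | -7.07 | -5.07 | -7.12 | -5.71 | -3.82 | -4.32 |
| Absorption | Caco-2 permeability | 0.6671 | 0.6292 | 0.6353 | 0.6403 | 0.6554 | 0.5294 |
| Absorption | Intestinal absorption (human) | High | High | High | High | High | High |
| Absorption | Skin permeability (cm/s) | -5.62 | -5.21 | -6.16 | -6.97 | -7.00 | -7.13 |
| Absorption | P-glycoprotein substrate | Yes | Yes | Yes | Yes | Yes | Yes |
| Absorption | P-glycoprotein I inhibitor | No | No | No | No | Yes | Yes |
| Distribution | BBB permeability | No | No | No | No | Yes | No |
| Distribution | CNS permeability | No | No | No | No | Yes | No |
| Metabolism | CYP450 1A2 inhibitor | No | No | No | No | No | No |
| Metabolism | CYP2C19 inhibitor | No | Yes | No | No | No | No |
| Metabolism | CYP2C9 inhibitor | No | No | No | No | No | No |
| Metabolism | CYP2C9 substrate | No | No | No | No | No | No |
| Metabolism | CYP3A4 substrate | Yes | Yes | Yes | Yes | No | No |
| Excretion | Renal OCT2 inhibitor | No | No | No | No | No | No |
| Toxicity | AMES toxicity | No | No | No | No | No | No |
| Toxicity | Carcinogens | No | No | No | No | No | No |
| Toxicity | Acute oral toxicity | 0.6329 | 0.6958 | 0.6134 | 0.5953 | 0.6960 | 0.7217 |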
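One way to read Table 5 is to list, for each compound, the predicted properties that a screening campaign might treat as liabilities. The sketch below is a hypothetical illustration that uses a few of the Yes/No predictions from Table 5. The choice of which properties count as flags (`FLAGGED`) and the helper name `liabilities` are assumptions made here, not criteria stated by the authors.

```python
# Properties treated as potential liabilities in this sketch (an assumption).
FLAGGED = (
    "P-glycoprotein I inhibitor",
    "BBB permeability",
    "CNS permeability",
    "CYP2C19 inhibitor",
)

# Yes/No predictions copied from Table 5 (True means "Yes"),
# listed in the same order as FLAGGED.
ADMET = {
    "AMTM 01": (False, False, False, False),
    "AMTM 02": (False, False, False, True),
    "AMTM 03": (False, False, False, False),
    "AMTM 04": (False, False, False, False),
    "VE607": (True, True, True, False),
    "SSAA09E2": (True, False, False, False),
}


def liabilities(compound: str) -> list:
    """Return the flagged properties predicted as 'Yes' for one compound."""
    return [prop for prop, hit in zip(FLAGGED, ADMET[compound]) if hit]


if __name__ == "__main__":
    for name in ADMET:
        print(name, liabilities(name) or "no flagged properties")
```

Under these assumed flags, VE607 and SSAA09E2 both carry the P-glycoprotein I inhibitor prediction, and VE607 additionally carries the BBB and CNS permeability predictions.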
Molecular dynamics simulation analysis
To understand the stability, flexibility, structural behavior, and binding mechanism of the best protein-ligand complex relative to the apo form of the protein, 100 ns molecular dynamics simulations were conducted. Here, we considered our top designed novel compound, AMTM-02, in complex with the S protein.
The atomic RMSDs of the protein-ligand complex and the apo form of the protein were calculated and plotted as a function of time in Figure 9a, where the apo form and the protein-ligand complex are shown in red and blue, respectively. According to Figure 9a, both the apo form and the protein-ligand complex fluctuate around a constant value. The apo form starts with an RMSD of 0.592 Å at 0 ns, and its RMSD gradually increases up to 29.65 ns, where it reaches its highest peak of 8.141 Å. From 29.65 to 39.575 ns it shows an average RMSD of 7.01 Å, after which the RMSD gradually decreases to 4.9 Å at 43.6 ns. From 43.6 ns until the end of the 100 ns simulation, the RMSD continues to fluctuate, with an average value of 5.5 Å.
Figure 9
Molecular dynamics simulations of the targeted protein and the protein–ligand (AMTM-02) complex. (9a) Time series of the RMSD of the protein backbone atoms (C, Cα, and N). (9b) Structural changes of the protein from root mean square fluctuation (RMSF) analysis. In both panels, red and blue lines denote Apo_QHR63290 and the QHR63290-ligand complex, respectively. The looped structures of the fluctuating residues are shown in the extended view.
On the other hand, the protein-ligand complex starts with an RMSD of 0.548 Å and, early in the simulation (around 7 ns), reaches its highest RMSD of 7.992 Å. From 15 ns to 25 ns it remained stable at an average RMSD of about 6 Å; the RMSD then dropped to 4 Å and stabilized again at about 6 Å from 43 ns to 61 ns. After 63 ns, it fluctuated markedly between 7.94 Å and 6 Å, and this continued until the end of the simulation.
To understand the flexible regions of the protein and the movement of each residue, the RMSF of each residue of the protein in the complex was examined alongside that of the apo protein; a lower RMSF value indicates less variability. The apo form showed lower RMSF values ranging from 1 Å to 6 Å (average 2 Å), whereas the protein-ligand complex showed higher RMSF values ranging from 8 Å to 11 Å, as shown in Figure 9b.
The protein-ligand complex showed slightly higher fluctuations than the apo form at residues 450-500.
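For readers unfamiliar with the two metrics, the sketch below shows how RMSD and RMSF are defined, using plain Python on a toy two-atom trajectory. It is not the authors' analysis pipeline: the function names and the toy coordinates are made up for illustration, and the frames are assumed to be already superposed on the reference structure (MD analysis tools normally perform that fitting step first).

```python
import math


def rmsd(frame, reference):
    """RMSD between two conformations given as lists of (x, y, z) tuples.

    Assumes both structures list the same atoms in the same order and
    have already been superposed; no fitting is done here.
    """
    if len(frame) != len(reference):
        raise ValueError("frame and reference must have the same number of atoms")
    squared = sum(
        (a - b) ** 2
        for atom, ref_atom in zip(frame, reference)
        for a, b in zip(atom, ref_atom)
    )
    return math.sqrt(squared / len(reference))


def rmsf(trajectory):
    """Per-atom RMSF: fluctuation of each atom about its time-averaged position."""
    n_frames = len(trajectory)
    n_atoms = len(trajectory[0])
    values = []
    for i in range(n_atoms):
        mean = [sum(frame[i][k] for frame in trajectory) / n_frames for k in range(3)]
        msd = sum(
            (frame[i][k] - mean[k]) ** 2 for frame in trajectory for k in range(3)
        ) / n_frames
        values.append(math.sqrt(msd))
    return values


# Toy trajectory (made-up coordinates): atom 0 is static, atom 1 moves along y.
TOY_TRAJECTORY = [
    [(0.0, 0.0, 0.0), (1.0, 1.0, 0.0)],
    [(0.0, 0.0, 0.0), (1.0, -1.0, 0.0)],
]

if __name__ == "__main__":
    print("RMSD of frame 1 vs frame 0:", rmsd(TOY_TRAJECTORY[1], TOY_TRAJECTORY[0]))
    print("Per-atom RMSF:", rmsf(TOY_TRAJECTORY))
```

RMSD gives one value per frame and summarizes how far the whole structure has drifted from the reference, whereas RMSF gives one value per atom (or residue) and highlights which regions are most mobile over the run.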
Discussion
With the global progression of SARS-CoV-2, the death rate is surprisingly high. Scientific communities are working hard to develop vaccines or drugs to combat this issue [19], [60], [61], [62], [63]. Computational biology approaches are very promising for treating many severe diseases by prioritizing candidates for in vitro validation [64], [65], [66], [67], [68].
The S glycoprotein is one of the most significant targets for designing therapeutics against CoV and is thought to be more prone to mutations [69]. In the current study, 14 S proteins were retrieved, and the phylogenetic study showed they were closely related (Figure 1). Furthermore, the MSA study revealed that the RBD sequences from different strains were mostly conserved (Figure 2), thus supporting the choice of target for therapeutic development. The RBM within the RBD (Figure 3) is crucial for binding to the host cell receptor and is typically a tyrosine-rich region [70].
With 95.06% of amino acid residues in the most favorable region, validation of the homology model through the Ramachandran plot indicates a comparatively good model. Additionally, the superposition RMSD value (0.202) between the predicted structure and the template supports the validity of the structure. The three-dimensional representation of the RBD and RBM also gives insight into the attachment site of ACE2 (Figure 4). The electrostatic potential of the protein revealed the charge distribution across its surface, and the positive charges (blue) in the RBM region indicate the hydrophobicity of that region (Figure 5).
Active site analysis of the targeted protein is an important part of drug design approaches. As we are looking for attachment inhibitors, we selected the active site with the second-largest volume, which has a surface area of 311.61 (Figure 7b).
An extensive literature review showed that VE607 and SSAA09E2 were the most reported attachment inhibitors of the SARS-CoV S protein. Therefore, 415 lead-like compounds based on these two inhibitors were screened from the PubChem database. Comparative analysis showed that most (44) of the docked compounds were potent inhibitors of the viral attachment of SARS-CoV-2 (Table 2, Table 3 and Figure 8).
Finally, in this study, novel inhibitors were designed based on previous findings, and four novel compounds (AMTM 01 to 04) were found to have promising lead-like properties compared with VE607 and SSAA09E2.
Considering the ligand properties, VE607 violates one of Lipinski's rules of five as it has 10 rotatable bonds. The ADMET properties indicate that VE607 may permeate the blood-brain barrier (BBB), which might be toxic to the central nervous system (CNS). In addition, both VE607 and SSAA09E2 have P-glycoprotein inhibitor properties, which may lead to drug accumulation-related toxicity.
The binding energies with the protein for VE607 and SSAA09E2 were -6.1 and -6.7 kcal/mol, respectively. For the novel compounds, the binding energy ranged from -6.7 to -7.5 kcal/mol (Table 4). The most striking feature of the novel compounds was their drug-likeness. AMTM 02 and AMTM 04 had the highest drug-likeness scores, 1.79 and 1.81, respectively, whereas the scores for VE607 and SSAA09E2 were 0.80 and 0.97, respectively. For oral bioavailability, all the novel compounds had a score of 0.56, and VE607 and SSAA09E2 had a score of 0.55.
Finally, the MD simulation study of the targeted protein and the novel designed compound AMTM 02 further strengthened our prediction by validating the stability of the complex, as shown by the RMSD values (Figure 9a). The RMSF values of the apo protein and the complex were also examined, and a few higher fluctuations were observed in the loop regions of the protein (Figure 9b).
The proposed novel compounds may contribute meaningfully to drug discovery but still need subsequent assessment in vitro and in vivo.
Conclusion
The biggest current global challenge is to escape from the SARS-CoV-2 pandemic, which has arisen in the absence of effective therapeutic drugs or vaccines. The novel inhibitors designed in the present study for the S protein of SARS-CoV-2 not only showed better drug-likeness properties but also had the best binding affinities for the target protein, thus potentially blocking viral attachment to host cells. However, these analyses require in vitro and in vivo validation before a drug against SARS-CoV-2 can be formulated.