Literature DB >> 32340562

A molecular modeling approach to identify effective antiviral phytochemicals against the main protease of SARS-CoV-2.

Rajib Islam1, Md Rimon Parves1, Archi Sundar Paul1, Nizam Uddin1,2, Md Sajjadur Rahman1,2, Abdulla Al Mamun3, Md Nayeem Hossain1, Md Ackas Ali1, Mohammad A Halim1,4.   

Abstract

The main protease of SARS-CoV-2 is one of the important targets to design and develop antiviral drugs. In this study, we have selected 40 antiviral phytochemicals to find out the best candidates which can act as potent inhibitors against the main protease. Molecular docking is performed using AutoDock Vina and GOLD suite to determine the binding affinities and interactions between the phytochemicals and the main protease. The selected candidates strongly interact with the key n class="Chemical">Cys145 and His41 residues. To validate the docking interactions, 100 ns molecular dynamics (MD) simulations on the five top-ranked inhibitors including hypericin, cyanidin 3-glucoside, baicalin, glabridin, and α-ketoamide-11r are performed. Principal component analysis (PCA) on the MD simulation discloses that baicalin, cyanidin 3-glucoside, and α-ketoamide-11r have structural similarity with the apo-form of the main protease. These findings are also strongly supported by root-mean-square deviation (RMSD), root-mean-square fluctuation (RMSF), radius of gyration (Rg), and solvent accessible surface area (SASA) investigations. PCA is also used to find out the quantitative structure-activity relationship (QSAR) for pattern recognition of the best ligands. Multiple linear regression (MLR) of QSAR reveals the R2 value of 0.842 for the training set and 0.753 for the test set. Our proposed MLR model can predict the favorable binding energy compared with the binding energy detected from molecular docking. ADMET analysis demonstrates that these candidates appear to be safer inhibitors. Our comprehensive computational and statistical analysis show that these selected phytochemicals can be used as potential inhibitors against the SARS-CoV-2.Communicated by Ramaswamy H. Sarma.

Entities:  

Keywords:  Antiviral phytochemicals; COVID-19; SARS-CoV-2; molecular docking; molecular dynamics

Mesh:

Substances:

Year:  2020        PMID: 32340562      PMCID: PMC7232885          DOI: 10.1080/07391102.2020.1761883

Source DB:  PubMed          Journal:  J Biomol Struct Dyn        ISSN: 0739-1102


Introduction

In December, the first epidemic of 2019 novel coronavirus (n class="Species">SARS-CoV-2) took place in Wuhan city, China (Hasan et al., 2020; Lu et al., 2020; Wu et al., 2020). Since the outbreak, it has been rapidly infecting people around the world and turned into a pandemic. The worldwide number of corona virus cases reached to 2,621,449 with a death toll of 182,991 as of April 22, 2020 (Worldometers.info, 2020). Although it was started from China, the number of active cases in United States, Spain, Italy, France, Germany, and UK have surpassed the cases identified in China (82,788). It is projected that the total COVID-19 deaths will reach to around 81,114 in US only and approximately a total of 60,000 deaths estimated in Spain, Italy, and France by August 4, 2020 (Team, I. C.-19 Health Service Utilization Forecasting & Murray, 2020). The Main protease (Mpro) also known as 3-C like protease (3CLpro) received great attention because of its important role in post-translational processing of replicase polyproteins (Boopathi et al., 2020; Elmezayen et al., 2020; Khan, Jha, et al., 2020; Wang et al., 2016). The enzymatic activity of this protein leads to processing of viral polyproteins. The ∼306 amino acid long main protease has high structural and sequence similarity to that of n class="Species">SARS-CoV 3CLpro (Liu & Wang, 2020). The SARS-CoV-2 Main protease (Mpro) monomer consists of N-terminal domain-I, N-terminal domain-II, and C-terminal domain-III (Mirza & Froeyen, 2020). The active site of enzyme contains a catalytic dyad having Cys145 and His41 (Bacha et al., 2004; Khan, Zia, et al., 2020). HIV drugs including lopinavir and ritonavir have been explored recently against MERS-CoV (Sheahan et al., 2020). In addition, peptidomimetic α-ketoamides were designed and synthesized to test their performance against the main proteases of betacoronaviruses, alphacoronaviruses, and the 3CLpro of enteroviruses. In a recent follow-up study, Zhang et al. resolved the crystal structure of Mpro complex of SARS-CoV-2 with modified α-ketoamide inhibitors (Zhang et al., 2020). After the outbreak, several types of drugs alone or with combination have been using in many countries (Colson et al., 2020; Gautret et al., 2020; Wang et al., 2020). However, still there is no definite antiviral drug to fight against the deadly virus. Herb species and fruits have been serving patients as sources of n class="Species">herbal medicine for a long time through the human history. They contain a wide variety of phytochemicals, such as flavonoids, alkaloids, glucosides, and polyphenolic compounds. These phytochemicals offer a wide range of therapeutic properties and novel scaffolds to design new drugs (Aanouz et al., 2020; Gupta et al., 2020). Among them, some phytochemicals showed high antiviral activity against a number of viral infections. For example, baicalin is an experimentally proved antiviral agent against several viruses, e.g. SARS-CoV-1 (Chen et al., 2004), SARS-CoV-2 (Liu et al., 2020), H1N1-pdm09 (Nayak et al., 2014), and Chikungunya (Oo et al., 2018). Therefore, an efficient approach is to test the efficacy of antiviral phytochemicals against the 2019 novel SARS Coronavirus (Jassim & n class="Chemical">Naji, 2003; Li et al., 2020; Naithani et al., 2008; Tong, 2009). Herein, we have selected 40 known phytochemicals isolated from natural herbs and fruits. These phytochemicals have already showed antiviral activity against different viruses. The aim of this study is to explore and identify the binding affinities and interactions of these antiviral phytochemicals against the main protease of SARS-CoV-2 using computational and statistical tools. Molecular docking, molecular dynamics, principal component analysis (PCA), and quantitative structure-activity relationships (QSAR) are performed to assess the performance of these phytochemicals. In addition, absorption, distribution, metabolism, and excretion properties of the best candidates are evaluated.

Methods and computational details

Phytochemicals optimization, protein preparation, and molecular docking

Total forty phytochemicals were selected considering their proved antiviral activities (Table S1). Optimization of the phytochemicals and calculation of vibrational frequency were performed using Gaussian 09 software package (Frisch et al., 2009). The structure of the phytochemicals was optimized at semi-empirical PM6 method (Stewart, 2007). The crystal structure of the main protease was taken from the RSCB Protein Data Bank (PDB ID: 6LU7). Then the crystal structure of the protease was optimized and checked by Swiss-PDB viewer software packages (version 4.1.0) based on their least energy. Some significant factors, such as in class="Gene">mproper bond order, side chain geometry, and missing hydrogen, were observed in the crystal structure of the protease. PyMol (version 1.1) software package was used to erase all the hetero atoms, water molecules, and inhibitor present in the structure (DeLano, 2002). Finally, the non-covalent interaction of phytochemicals-protease was calculated using Autodock Vina software package for the docking analysis (Trott & Olson, 2009). Using this method, binding affinities of ligand-protease were determined and reported in kcal/mol unit. The grid box in Autodock Vina was generated targeting the active site of the main protease, where the center was at X: −11.76, Y: 15.17, Z: 69.19 and the dimensions of the grid box were, X: 25.22, Y: 29.33 and Z: 29.22 (unit of the dimensions, Å). To get more insights of these results, GOLD (Jones et al., 1997) docking program was also employed.

Molecular dynamics

The molecular dynamics (MD) simulation was performed on the best five selected phytochemicals obtained from molecular docking study, which helped to get more insight into the protein and docked complexes in biological condition. In this study, MD simulation was conducted by YASARA Dynamics software (Krieger et al., 2012). The AMBER14 force field was employed for this study to describe the macromolecular system (Dickson et al., 2014). n class="Chemical">Water and Na+/Cl- ions were also added to the system. Periodic boundary condition was incorporated to perform the simulation, where the cell size was 20 Å larger than the protease size in all cases. By employing steepest gradient approach (5000 cycles), the initial energy minimization for each system was performed. MD simulations were performed using the PME method to designate long-range electrostatic interactions at a cut off distance of 8 Å, and defining physiological conditions at 310 K, pH 7.4, 0.9% NaCl (Krieger et al., 2006). The simulation temperature was controlled using the Berendsen thermostat, where the pressure kept constant throughout the simulation. A multiple time step algorithm was employed, where the simulation time step was selected as 1.25 fs (Krieger & Vriend, 2015). Finally, MD simulation was performed for 100 ns long and snapshots were saved at every 100 ps into MD trajectory for further analysis. Root-mean-square deviation (RMSD), root-mean-square fluctuation (RMSF), radius of gyration (Rg), solvent accessible surface area (SASA), and total number of H-bond count were calculated from the MD simulations similar to other studies (Elfiky & Azzam, 2020; Enayatkhani et al., 2020; Muralidharan et al., 2020; Pant et al., 2020).

Pharmacokinetic parameters

Toxicity and some pharmacokinetic parameters were predicted using admetn class="Species">SAR online (Cheng et al., 2012). In early stage of drug discovery, pharmacokinetic study, and data analysis assist scientists to identify safe and effective drug candidates.

Principal component analysis (PCA) on MD simulation data

The principal component analysis is a widely used unsupervised data reduction method. Here, in this project, the goal of applying PCA method is to highlights the similarity and dissimilarity in the collected structural and energy profile of MD trajectory data (Martens & Naes, 1992; Wold et al., 1987). Using this technique, any structural quality change during MD can be characterized by comparing different drug-protein complexes. The following equation highlights the important components of a PCA model: where, X matrix is expressed as a product of two new matrices, i.e. Tk and Pk, Tk serves as the matrix of scores that represents how samples relate to each other, Pk represents the matrix of loadings which contain information about how variables relate to each other, k is the number of factors included in the model, and E is the matrix of residuals. These residuals contain the unmodeled variances. Complexes of the main protease with the selected five phytochemicals may have differences with the main protease, i.e. apo-protein, during MD simulations in terms of structural and energy profile. These differences can be detected by the PCA algorithm (Islam et al., 2019). All calculations were performed on R platform using in-house developed codes (R Core Team, 2019), and plots were generated using the package ggplot2 (Wickham, 2009). Data were preprocessed using autoscale function before applying PCA algorithm (Martens & n class="Chemical">Naes, 1992). The final 50 ns of MD trajectory data were used for analyzing the PCA.

Quantitative structure-activity relationships (QSAR) of phytochemicals

Forty potential ligands were randomly divided into a training set with 25 ligands and test set containing 15 ligands. The test set was utilized as the validation samples. TPSA (topological polar surface area, Å2), molecular weight, XLogPs-AA, HBD, ROTB count, no. of H, C, O, Cl, n class="Chemical">N, and F atoms, single bonds (SB) count, double bonds (DB) count, and no. of benzene rings of the drug candidates were the considered as variables. These variables with calculated binding energies were used to correlate with structure-activity relationship using multiple linear regression (MLR) (Fakayode et al., 2009; Liu et al., 2017; Mark & Workman, 2007). Multiple liner regression was performed using XLSTAT (Adinsoft, 2010).

Results

Molecular docking

By using Autodock Vina, molecular docking is performed to find out the best candidates among the 40 phytochemicals based on their binding scores. Binding affinities of the phytochemicals are distributed within the range of −3.0 to −4.0, −4.1 to -5.0, −5.1 to −6.0, −6.1 to −7.0, −7.1 to −8.0, −8.1 to −9.0, and −10.0 to −11.0 kcal/mol (Figure 1). Selected compounds are screened primarily using AutoDock vina scoring function to find out the best candidates, then the GOLD suit was employed to understand their fitness. The ChemPLP n class="Disease">fitness score is used in GOLD docking, which is the default scoring function of GOLD software program. In GOLD scoring system, higher fitness score indicates the better docking interaction between ligand and protein. The binding affinity and fitness score of all phytochemicals are shown in Table 1. Based on the best binding affinities, hypericin, cyanidin 3-glucoside, baicalin, and glabridin are selected for further analysis. In this study, α-ketoamide-11r is considered as a control ligand because it is recently reported as a good inhibitor against main protease (Zhang et al., 2020), which shows binding affinity of -7.8 kcal/mol. Hypericin and pseudohypericin show the highest binding affinity of -10.7 kcal/mol. As both of them are structurally similar, only hyperici is selected for further study.
Figure 1.

Frequency distribution of 40 phytochemicals over the range of docking scores.

Table 1.

Docking results of all phytochemicals with main protease of SARS-CoV-2 (AutoDock Vina scores are in kcal/mol and GOLD scores are dimensionless).

Ligand nameAutoDock VinaGOLD
Hypericin−10.780.15
Pseudohypericin−10.785.31
Cyanidin 3-Glucoside−8.481.71
Baicalin−8.159.19
Glabridin−8.163.68
Glycyrrhizin−7.960.37
α-Ketoamide-11r−7.893.07
Isobavachalcone−7.878.59
Apigenin−7.761.85
Betulinic Acid−7.650.96
Oleuropein−7.678.78
Quercetin−7.666.11
Luteolin−7.560.33
Oleanolic Acid−7.549.4
Psoralidin−7.562.31
Sageone−7.562.02
Ursolic Acid−7.546.43
Cystoketal−7.465.35
Emodin−7.356.4
Dithymoquinone−7.242.44
Rosmarinic Acid−7.270.63
Liquiritigenin−7.158.53
Curcumin−6.970.18
Cinanserin−6.766.07
Safficinolide−6.652.89
Lapachol−6.355
Hibiscus Acid−5.936.75
Gingerol−5.462.7
Hydroxytyrosol−5.344.08
Zingerone−5.348.64
Carvacrol−5.243.9
Cinnamic−5.244.24
Methyl Cinnamate−5.142.05
Thymohydroquinone−547.77
Thymoquinone−542.44
Thymol−4.945.1
Cinnamaldehyde−4.639.1
Ajoene−4.248.47
Allicin−3.337.59
Diallyl Trisulfide−3.341.64
Frequency distribution of 40 phytochemicals over the range of docking scores. Docking results of all phytochemicals with main protease of SARS-CoV-2 (AutoDock Vina scores are in kcal/mol and GOLD scores are dimensionless).

Molecular interaction of the selected phytochemicals with the main protease

Analysis of the non-covalent interactions of the best five phytochemicals with the main protease reveals that the selected compounds interact with either both (Cys145 and n class="Chemical">His41) or at least one catalytic residue detected by Autodock Vina, as shown in Table 2 and Figure 2 (interaction detected by GOLD is summarized in Table S2). The α-ketoamide-11r is stabilized by nine hydrogen bonds and two hydrophobic bonds while interacting with the receptor protein. It also forms hydrogen bonds with catalytic residue Cys145 and hydrophobic interaction with His41. Baicalin interacts through six hydrogen bonds, one hydrophobic interaction, and one pi-sulfur interaction with the catalytic residue Cys145. Cyanidin 3-glucoside forms six hydrogen bonding interactions and three hydrophobic interactions in which one hydrophobic interaction is observed with the catalytic residue Cys145. Glabridin gets stabilized by one electrostatic, five hydrophobic interactions, and no hydrogen bonding interaction is observed. Hypericin forms four hydrogen bonding interactions and five hydrophobic interactions in which one pi-alkyl interaction is observed with catalytic residue Cys145.
Table 2.

Nonbonding interactions of selected five phytochemicals with main protease of SARS-CoV-2 (pose predicted by AutoDock Vina) where, CH = Conventional Hydrogen bond, H = hydrogen bond, C = carbon hydrogen bond, A = alkyl.

Interacting residueDistanceBond categoryBond Type
α-Ketoamide-11r
ASN1422.49HCH
GLY1432.58HCH
GLY1432.48HCH
SER1442.09HCH
SER1442.13HCH
CYS1452.68HCH
PHE1402.73HCH
HIS1642.89HCH
GLY1432.54HCH
HIS412.87HydrophobicPi-Sigma
MET494.92HydrophobicAlkyl
Baicalin
PRO1682.96HCH
GLU1662.19HCH
GLU1662.53HCH
SER1443.06HCH
GLU1663.01HC
GLU1662.08HC
CYS1455.27OtherPi-Sulfur
MET495.18HydrophobicPi-Alkyl
Cyanidin 3-Glucoside
GLN1893.02HCH
LEU1412.52HCH
THR262.84HCH
ASP1872.75HCH
GLU1662.46HCH
GLY1432.84HC
MET494.93HydrophobicPi-Alkyl
MET494.31HydrophobicPi-Alkyl
CYS1455.17HydrophobicPi-Alkyl
Glabridin
GLU1664.10ElectrostaticPi-Anion
MET493.76HydrophobicAlkyl
MET494.86HydrophobicAlkyl
MET1654.79HydrophobicPi-Alkyl
HIS414.64HydrophobicPi-Alkyl
HIS413.91HydrophobicPi-Alkyl
Hypericin
GLU1662.41HCH
LEU1412.83HCH
ASN1422.95HC
GLU1662.99HPi-Donor H
GLU1662.69HydrophobicPi-Sigma
GLN1892.50HydrophobicPi-Sigma
MET1654.32HydrophobicAlkyl
MET1654.35HydrophobicPi-Alkyl
CYS1455.05HydrophobicPi-Alkyl
Figure 2.

Nonbonding interactions of five selected phytochemicals with the main protease of SARS-CoV-2 (pose predicted by AutoDock Vina).

Nonbonding interactions of five selected phytochemicals with the main protease of n class="Species">SARS-CoV-2 (pose predicted by AutoDock Vina). Nonbonding interactions of selected five phytochemicals with main protease of n class="Species">SARS-CoV-2 (pose predicted by AutoDock Vina) where, CH = Conventional Hydrogen bond, H = hydrogen bond, C = carbon hydrogen bond, A = alkyl.

Molecular dynamics simulation

The RMSD of alpha carbon atoms of all systems are analyzed to detect their stability. It is observed from Figure 3a that α-ketoamide-n class="Chemical">11r complex exhibits the lowest RMSD than other complexes. Even RMSD of the apo-protein is slightly higher than the α-ketoamide-11r, which indicates the greater stability of α-ketoamide-11r. The RMSD of baicalin-protein complex reaches to ∼1.63 Å from 10 to 50 ns, however, this value significantly increases after 50 ns and reaches to 2.24 Å. While assessing the RMSD of cyanidin 3-glucoside complex, the steady increase of RMSD is observed after 21 ns (average RMSD >3.0 Å). Nonetheless, this value is decreased eventually, which indicates that cyanidin 3-glucoside may change the protein conformation. Unlike apo-protein and α-ketoamide-11r complex, RMSD of the glabridin complex is mostly stable. But the complex is found exhibiting its increased RMSD from 45.6 to 78 ns (average RMSD >2.74 Å), and subsequently the complex gets stable. Particularly, the hypericin complex shows consistent fluctuation from 11 to 100 ns. This complex also shows the higher deviation of fluctuations throughout the trajectory.
Figure 3.

Analysis of RMSD, RMSF, Rg, SASA, and total number of hydrogen bond of apo-protein and selected five phytochemical complexes with protein at 100 ns MD simulations. (a) Root-mean-square deviation (RMSD) of the Cα atoms, (b) RMSF values of the alpha carbon over the entire simulation, where the ordinate is RMSF (Å) and the abscissa is residue, (c) Radius of gyration (Rg) over the entire simulation, where the ordinate is Rg (Å) and the abscissa is time (ns), (d) Solvent accessible surface area (SASA), where the ordinate is SASA (Å2) and the abscissa is time (ns), and (E) Total number of H-bond count throughout the simulation.

Analysis of RMSD, RMSF, Rg, SASA, and total number of n class="Chemical">hydrogen bond of apo-protein and selected five phytochemical complexes with protein at 100 ns MD simulations. (a) Root-mean-square deviation (RMSD) of the Cα atoms, (b) RMSF values of the alpha carbon over the entire simulation, where the ordinate is RMSF (Å) and the abscissa is residue, (c) Radius of gyration (Rg) over the entire simulation, where the ordinate is Rg (Å) and the abscissa is time (ns), (d) Solvent accessible surface area (SASA), where the ordinate is SASA (Å2) and the abscissa is time (ns), and (E) Total number of H-bond count throughout the simulation. Since RMSF (root means square fluctuation) helps to understand the region of protein that is being fluctuated during the simulation, the flexibility of each residue is calculated to get better insight on what extent the binding of ligand affects the protein flexibility. It can be understood from Figure 3b that the binding of hypericin makes the protein most flexible in all areas in contrast to apo-protein and the other complexes. n class="Chemical">Hypericin is found to induce local flexibility at Met49 (S2 pocket), Asn51, Pro52, Tyr154, Asp248, Arg279, and Phe294. The apo-protein structure is found to have the lowest RMSF, which indicates that even in unliganded state, the protein is not very much flexible. Besides, the RMSF values of other complexes including α-ketoamide-11r, baicalin, cyanidin 3-glucoside, and glabridin are mostly similar in all areas. Overall, the residues such as Ile136, Lys137, Gly138, Ser139, Phe140 (S1 pocket), Leu141, Asn142, Asp153, Typ154, Arg222, Ser301, Val303, Thr304, Phe305, and Gln306 are found flexible for both of apo-protein and ligand-bound complexes. Higher SASA value indicates the expansion of protein volume and a low fluctuation is expected over the simulation time. Binding of any small molecule might change n class="Chemical">SASA and sometimes could greatly affect the protein structure. The SASA values of alpha ketoamide-11r are found lowest in most of the frames compared to apo-protein. Thus, it can be suggested that the binding of α-ketoamide-11r potentially could reduce protein expansion. In addition, the baicalin complex shows normal fluctuations as compared to the apo-protein and α-ketoamide-n class="Chemical">11r complex. But it shows highest SASA from 80.8 to 100 ns revealing the conformational states with higher protein expansion. Interestingly, cyanidin 3-glucoside complex shows quite similar fluctuations from 60 to 100 ns as exhibited by α-ketoamide-11r complex. It signifies its SASA mediated behavior as it is also observed by α-ketoamide-11r complex. Although the SASA of glabridin complex is observed in the median till 46 ns compared to other complexes as described earlier, the fluctuations are quite irregular from 66 to 78 ns. The average SASA value of all systems are 14160.7, 13985.1, 14209.3, 14039.9, 13958.4, and 13955.1 for apo-protein, α-ketoamide-11r, baicalin, cyanidin 3-glucoside, glabridin, and hypericin, respectively. The first and second lowest average SASA is found for hypericin and glabridin complexes. The radius of gyration (Rg) represents the compactness of a structure. The lower degree of fluctuation with its consistency throughout the simulation indicates the greater compactness and n class="Disease">rigidity of a system. The Rg of apo-protein is found almost stable in terms of consistency of fluctuations throughout the simulation. Besides, the Rg of α-ketoamide-11r is increased from 22 to 95.6 ns (average Rg ∼ 22.82). The greater change of Rg might be the result of protein folding, or unique conformational changes. The baicalin complex shows increased Rg from 78 to 100 ns (average Rg ∼ 22.50), and its last parts of the trend are similar to α-ketoamide-11r. In case of cyanidin 3-glucoside complex, the lowest Rg value is observed from 80 to 100 ns, which indicates greater rigidness in contrast to the other complexes. Besides, the glabridin complex seems to have lowest Rg corresponding to its highest rigidity from 0 to 40 ns and 63 ns to 77 ns. The Rg value of hypericin is much higher after 20 ns (average Rg ∼ 23.16) indicating its slackness of packing compared to all other complexes. On the basis of average Rg, the glabridin complex has the lowest Rg, which are calculated as 22.24, while for apo-protein, α-ketoamide-11r, baicalin, cyanidin 3-glucoside, and hypericin, the Rg values are determined as 22.35, 22.74, 22.34, 22.32, and 23.03, respectively. Therefore, the order of compactness and rigidness should be glabridin > cyanidin 3-glucoside > baicalin > apo-protein > α- ketoamide-11r > hypericin. The number of intermolecular hydrogen bonds in the ligand-protein complex are determined, since it is known to contribute conformational stability. The average number of n class="Chemical">hydrogen bonds are 513, 504, 518, 526, 508, and 505 for apo-protein, α-ketoamide-11r, baicalin, cyanidin 3-glucoside, glabridin, and hypericin complexes, respectively. The highest number of hydrogen bonds is observed for cyanidin 3-glucoside complex, whilst the lowest number of hydrogen bonds is observed in α-ketoamide-11r complex over the 100 ns simulation period. The baicalin complex possesses a greater number of hydrogen bonds compared to apo-protein, which shows almost similar trend to that of glabridin complex.

Pharmacokinetic properties

Total seven pharmacokinetic parameters including carcinogenicity, n class="Species">rat acute toxicity (LD50, mol/kg), p-glycoprotein inhibitor, blood-brain barrier, glycoprotein substrate, organic cation transporter, and human intestinal absorption are tested for the selected best phytochemicals. The results are summarized in Table 3. The results show that the phytochemicals are safer to use. Outcomes are explained in more detail in the discussion section.
Table 3.

Pharmacokinetic parameters of the best phytochemicals.

DrugsCarcinogenicityRat Acute Toxicity (LD50, (mol/kg)P-glycoprotein InhibitorBlood-brain barrierHuman intestinal absorptionRenal organic cation transporterP-glycoprotein Substrate
Cyanidin 3-GlucosideNon carcinogenic2.6483Non inhibitorPositiveNegativeNon inhibitorSubstrate
HypericinNon carcinogenic2.6870Non inhibitorNegativePositiveNon inhibitorSubstrate
BaicalinNon carcinogenic2.7357Non inhibitorNegativePositiveNon inhibitorSubstrate
α-Ketoamide-11rNon carcinogenic2.3318Non inhibitorNegativePositiveNon inhibitorSubstrate
GlabridinNon carcinogenic2.9435Non inhibitorPositivePositiveNon inhibitorSubstrate
Pharmacokinetic parameters of the best phytochemicals.

Principal component analysis (PCA) of molecular dynamics

Principal component analysis (PCA) is used to analyze the structural and energy data obtained from MD simulation on protein-phytochemical complexes and apo-protein. Bond distances, bond angles, dihedral angles, planarity, Van der Waals and electrostatic energies were included as variables. The PCA score plot (Figure 4a) reveals different clusters formation. Among these clusters, α-ketoamide-11r-protein complex (red), n class="Chemical">cyanidin 3-glucoside-protein complex (blue), and the apo-protein (black) are overlapped. Loading plot of the PCA (Figure 4b) reveals that the dihedral angle energies show positive correlation with these three groups. Baicalin-protein complex (green) demonstrates similar patterns to α-ketoamide-11r and cyanidin 3-glucoside complexes in the score plot, although the energy distribution of baicalin is broader compared to the other two complexes. The fluctuating nature of baicalin-protein complex in the MD simulation could be the reason for its wider distribution in the score plot. However, it can be considered as a potential candidate. The hypericin-protein complex exhibits significant difference compared to the other drug-protein complexes by forming a distinct cluster (magenta). This is reasonable as hypericin shows the highest deviation during complex formation compared to the other candidates. The glabridin-protein complex is also formed a distinct cluster (cyan) in the score plot.
Figure 4.

(a) The score plot presented six data clusters in different color, where each dot represented one time point. The clustering is attributable as: apo-protein (black), α-ketoamide-11r complex (red), baicalin (green), cyanidin 3-glucoside complex (blue), glabridin complex (cyan), hypericin complex (magenta), (b) Loading plot from principal components analysis of the energy and structural data.

(a) The score plot presented six data clusters in different color, where each dot represented one time point. The clustering is attributable as: apo-protein (black), α-ketoamide-11r complex (red), n class="Chemical">baicalin (green), cyanidin 3-glucoside complex (blue), glabridin complex (cyan), hypericin complex (magenta), (b) Loading plot from principal components analysis of the energy and structural data.

Quantitative structure-activity relationships (QSAR)

QSAR is a vastly used tool in bioinformatics, drug discovery for the pharmaceutical industry, clinical research, agrochemical, and petrochemical sectors for modeling and predictive pattern analysis (Alam & Khan, 2017; Fakayode et al., 2014; Funar-Timofei et al., 2017). Herein multiple linear regression (MLR) has been used for further analysis. For instance, TPSA (Å2), molecular weight, XLogP3, H-bond donor count, and H-bond acceptor count of the ligands are the most significant variables on n class="Gene">QSAR contributors to the MLR regression (Table S3). PCA is used for pattern recognition of potential ligands, which is represented in this study by PCA score plot (Figure 5c). The first principal component (PC1) shows 57% of the variability in ligand QSAR and 21% of the variation of ligand binding energy. Second principal component (PC2) exhibits 22% in QSAR variability of ligands and the variation of the ligands binding energy is 11%. An interesting grouping of the ligands is observed by the score plot. The ligands with a similar functional group are gathered together side by side on the score plot. For example, ligands (L6, L7, L12, L15, L16, L22, and L25) containing -OH, -COOH, and C = O functional group attached to benzene ring are clustered on the first and second quadrants of the score plot. Only L3 (α-ketoamide-11r) with containing -CONH- and C = O functional groups is placed on the upper-right corner of the score plot.
Figure 5.

(a) Graphical representation observations vs. standardized residues by MLR (training set), (b) Graphical representation observations vs. standardized residues by MLR (test set), (c) Score plot of PCA analysis for QSAR of ligands.

(a) Graphical representation observations vs. standardized residues by MLR (training set), (b) Graphical representation observations vs. standardized residues by MLR (test set), (c) Score plot of PCA analysis for QSAR of ligands. In contrast, ligands containing isoflavane backbone in (L4, L8, n class="Gene">L17, L18, and L19) are clustered on the third and fourth quadrants of the score plot. The observed grouping of ligands on the score plot is highly significant. In addition, this class of ligands has cyclic ether, -OH, -COOH groups, which may have a significant influence on drug reactivity, chemical behavior, therapeutic effect, and potency. L11 and L13 contain common anthraquinone derivative with fused ring and -OH functional group. This isoflavane and anthraquinone backbone pattern could be a key for further investigations to explore new drugs for COVID-19. The overarching goal of any MLR model is to predict binding energy of the future drug candidates. The MLR model is developed and therefore is used to predict the binding energy of the test set for validation of drug candidates. The results demonstrate almost the same binding energies obtained from the binding energies from molecular docking. Binding energy (kcal/mol) = −3.51 + 4.27E-02*TPSA (Å2)-6.42E-03*MW(g/mol)-0.25*XLogP3-AA-0.48*n class="Gene">HBD-0.64*HBA. Here the R2 is 0.842 for the training set and 0.753 for the test set. Figure 5a,b show the prediction results versus original binding results of the training set and test set. In Figure 5a, L3 and L7 show lowest residuals. Possible reason could be the presence of the -CONH- and C = O groups in L3, and the carboxylic group and -C = C- present in L7. On the other hand, in the test set L27 (Figure 5b) reveals the lowest standardized residuals may be the presence of hetero N, S, and O atoms. L33 exhibits a high standardized residual may be due to the large asymmetric molecule and their conformational change. The lowest standardized residual indicates that the predicted binding energies have close agreement to the binding energies obtained by molecular docking. The predicted QSAR binding energies and the docking binding energies obtained from the ligands with similar SAR typically have similar chemical reactivity, pharmacological activity, behavior, and efficacy. But the results obtained by the QSAR should be used with caution as the binding energies are not experimental. The model can be further used for rapid screening of COVID-19 candidates for drug discovery in pharmaceuticals and pharmacology research.

Discussion

Computer-aided drug design (CADD) has become one of the essential approaches in modern drug discovery as it can significantly minimize the cost and labor involved in the drug discovery process. It accelerates the drug development by allowing the scientists to narrow down the biological and synthetic testing efforts. Moreover, molecular docking, molecular dynamics, n class="Gene">QSAR, and ADMET tools have become some of the key parts in the CADD because of their reliable predictions (Talele et al., 2010). Molecular docking predicts the prevailing binding modes between a ligand and a protein by proposing the hypothesis of how the ligands inhibit the protein and thus it ranks candidate ligands (Morris & Lim-Wilby, 2008). In this study, 40 antiviral phytochemical agents are docked against the main protease of SARS-CoV-2. Among them, five candidates are then selected according to their high binding affinity. It is observed that n class="Chemical">hypericin, cyanidin 3-glucoside, baicalin, glabridin, and α-ketoamide-11r could be used as potential inhibitors for the main protease. Hypericin shows the highest binding affinity of −10.7 kcal/mol and forms a pi-alkyl interaction with the catalytic binding residue Cys145. Cyanidin 3-glucoside, baicalin, glabridin, and α-ketoamide-11r also exhibit high binding affinity of −8.4, 8.1, −8.1, and −7.8 kcal/mol, respectively. All the top scored phytochemicals demonstrate strong noncovalent interactions with the binding site residues. More specifically, the selected five phytochemicals form non-covalent interactions with both the two (Cys145 and His41) or at least one of the catalytic residues, and thereby can act as inhibitors of the main protease. QSAR study reveals that the topological polar surface area (TPSA, Å2), molecular weight, XLogP3, H-bond donor count, and H-bond acceptor count of the ligands are the most significant variables. MLR regression model validated their role in the binding affinity and non-covalent interactions of the ligands with the main protease. It is also predicted from the principal component analysis (PCA) of QSAR that L3 (α-ketoamide-11r) shows the lowest residuals because of the presence of the -CONH- and C = O groups in its chemical structure. Furthermore, QSAR analysis of all the phytochemicals show almost similar binding affinity predicted by the binding affinity of the molecular docking in Autodock Vina. Molecular dynamics simulation of α-ketoamide-11r shows the lowest RMSD value than the apo-protein and the other complexes, which indicates its greater stability. n class="Chemical">SASA of α-ketoamide-11r also confirms that it unfolds and stabilizes the main protease. Cyanidin 3-glucoside has lower Rg and SASA values, which indicates that it can make the protease more compact and rigid. Moreover, the lowest RMSF value of the apo-protein indicates its compactness even without forming a complex with a ligand. Except hypericin, all the selected phytochemicals demonstrate similar patterns in RMSF. Hypericin is the most flexible in all areas in contrast to apo-protein and other complexes. Baicalin also shows lower RMSD value and followed the similar trend. Principal component analysis reveals the structural similarity between α-ketoamide-11r-protein and cyanidin 3-glucoside-protein complexes, which is strongly supported by the RMSD, RMSF, Rg, and SASA analysis. PCA analysis unveils that baicalin could be a good candidate to stabilize the main protease as supported by the RMSD analysis. Recent drug discovery depends on drugs which show good pharmacokinetic properties. To be a promising drug candidate, pharmacokinetic parameters must be optimized so that these can pass standard clinical trial criteria. The best five phytochemicals are found to be noncarcinogenic as evidence obtained from carcinogenicity and n class="Species">rat acute toxicity. Cyanidin 3-glucoside and glabridin might cross the blood-brain barrier. All selected phytochemicals might interact with p-glycoprotein which is a member of ABC transporter family and they will not inhibit organic cation transporter. These parameters provide information about the secretion of drugs through urine. Only cyanidin 3-glucoside could show negative result in terms of human intestinal absorption. These pieces of information may provide necessary data for designing promising and effective inhibitors of the main protease of SARS-CoV-2 in future (Cheng et al., 2012; Motohashi & Inui, 2013; Mukhametov & Raevsky, 2017).

Conclusion

In summary, molecular docking, molecular dynamics, QSAR modeling, PCA, and ADMET tools are successfully employed to determine the best phytochemicals against the main protease of n class="Species">SARS-CoV-2. Among the studied 40 phytochemicals, hypericin, cyanidin 3-glucoside, baicalin, glabridin, and α-ketoamide-11r show the highest binding affinity and strong interactions with both or at least one of the catalytic residues (Cys145 and His41) of the main protease. These compounds show many non-covalent interactions, such as hydrogen bonding, hydrophobic, and electrostatic interactions. MD results show that in the physiological environment, baicalin, cyanidin 3-glucoside, and α-ketoamide-11r are the most stable ligands and they are making a greater number of interactions through hydrogen bonds with the main protease. Pharmacokinetic and ADMET analysis indicate their efficacy as drug molecules. PCA of QSAR results show that the ligands, e.g. L6, L7, L12, L15, L16, L22, and L25 containing -OH, -COOH, and C = O functional groups attached to their benzene ring are grouped together in the first and second quadrants of the score plot. Only α-ketoamide-11r (L3) with -CONH- and C = O functional groups take position on the upper right corner of the score plot. The ligands, e.g. L4, L8, L17, L18, L19, L11, and L13 containing isoflavane backbone, anthraquinone with cyclic ether, fused ring, -OH, and -COOH are placed in the third and fourth quadrants of the score plot. Therefore, PCA analysis successfully divides the studied phytochemicals into two groups. The MLR model calculates the value of R2, which is 0.842 for the training set and 0.753 for the test set. It means that our proposed model can predict the binding energies compared to the values obtained from molecular docking. It can be concluded that most of the selected phytochemicals show promise and can be used to design effective antiviral drugs against the SARS-CoV-2.
  46 in total

1.  ADP-ribose and analogues bound to the deMARylating macrodomain from the bat coronavirus HKU4.

Authors:  Robert G Hammond; Norbert Schormann; Robert Lyle McPherson; Anthony K L Leung; Champion C S Deivanayagam; Margaret A Johnson
Journal:  Proc Natl Acad Sci U S A       Date:  2021-01-12       Impact factor: 11.205

Review 2.  Inhibition of the main protease of SARS-CoV-2 (Mpro) by repurposing/designing drug-like substances and utilizing nature's toolbox of bioactive compounds.

Authors:  Io Antonopoulou; Eleftheria Sapountzaki; Ulrika Rova; Paul Christakopoulos
Journal:  Comput Struct Biotechnol J       Date:  2022-03-14       Impact factor: 7.271

3.  Natural flavonoids effectively block the CD81 receptor of hepatocytes and inhibit HCV infection: a computational drug development approach.

Authors:  Dipta Dey; Partha Biswas; Priyanka Paul; Shafi Mahmud; Tanzila Ismail Ema; Arysha Alif Khan; Shahlaa Zernaz Ahmed; Mohammad Mehedi Hasan; Abu Saim Mohammad Saikat; Babry Fatema; Shabana Bibi; Md Ataur Rahman; Bonglee Kim
Journal:  Mol Divers       Date:  2022-07-12       Impact factor: 3.364

Review 4.  Methodology-Centered Review of Molecular Modeling, Simulation, and Prediction of SARS-CoV-2.

Authors:  Kaifu Gao; Rui Wang; Jiahui Chen; Limei Cheng; Jaclyn Frishcosy; Yuta Huzumi; Yuchi Qiu; Tom Schluckbier; Xiaoqi Wei; Guo-Wei Wei
Journal:  Chem Rev       Date:  2022-05-20       Impact factor: 72.087

5.  Point-specific interactions of isovitexin with the neighboring amino acid residues of the hACE2 receptor as a targeted therapeutic agent in suppressing the SARS-CoV-2 influx mechanism.

Authors:  Nourin Ferdausi; Samarth Islam; Fahmida Hoque Rimti; Syeda Tasnim Quayum; Efat Muhammad Arshad; Aashian Ibnat; Tamnia Islam; Adittya Arefin; Tanzila Ismail Ema; Partha Biswas; Dipta Dey; Salauddin Al Azad
Journal:  J Adv Vet Anim Res       Date:  2022-06-26

6.  Andrographolide induces anti-SARS-CoV-2 response through host-directed mechanism: an in silico study.

Authors:  Bhabani Shankar Das; Nabarun Chandra Das; Shasank Sekhar Swain; Suprabhat Mukherjee; Debapriya Bhattacharya
Journal:  Future Virol       Date:  2022-07-04       Impact factor: 3.015

7.  Multi-omics data integration and network-based analysis drives a multiplex drug repurposing approach to a shortlist of candidate drugs against COVID-19.

Authors:  Marios Tomazou; Marilena M Bourdakou; George Minadakis; Margarita Zachariou; Anastasis Oulas; Evangelos Karatzas; Eleni M Loizidou; Andrea C Kakouri; Christiana C Christodoulou; Kyriaki Savva; Maria Zanti; Anna Onisiforou; Sotiroula Afxenti; Jan Richter; Christina G Christodoulou; Theodoros Kyprianou; George Kolios; Nikolas Dietis; George M Spyrou
Journal:  Brief Bioinform       Date:  2021-11-05       Impact factor: 11.622

Review 8.  An Updated Review of Computer-Aided Drug Design and Its Application to COVID-19.

Authors:  Arun Bahadur Gurung; Mohammad Ajmal Ali; Joongku Lee; Mohammad Abul Farah; Khalid Mashay Al-Anazi
Journal:  Biomed Res Int       Date:  2021-06-24       Impact factor: 3.411

9.  Seq12, Seq12m, and Seq13m, peptide analogues of the spike glycoprotein shows antiviral properties against SARS-CoV-2: An in silico study through molecular docking, molecular dynamics simulation, and MM-PB/GBSA calculations.

Authors:  Kunal Dutta; Ammar D Elmezayen; Anas Al-Obaidi; Wei Zhu; Olga V Morozova; Sergey Shityakov; Ibrahim Khalifa
Journal:  J Mol Struct       Date:  2021-07-16       Impact factor: 3.196

10.  In silico analysis and identification of antiviral coumarin derivatives against 3-chymotrypsin-like main protease of the novel coronavirus SARS-CoV-2.

Authors:  Rahman Abdizadeh; Farzin Hadizadeh; Tooba Abdizadeh
Journal:  Mol Divers       Date:  2021-07-02       Impact factor: 3.364

View more

北京卡尤迪生物科技股份有限公司 © 2022-2023.