Akanksha Rajput, Anamika Thakur, Adhip Mukhopadhyay, Sakshi Kamboj, Amber Rastogi, Sakshi Gautam, Harvinder Jassal, Manoj Kumar.
Abstract
The world is facing the COVID-19 pandemic caused by Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2). Other viruses of the Coronaviridae family caused earlier epidemics, and approved antiviral drugs against these viruses are lacking. We therefore developed robust computational methods to predict repurposed drug candidates using machine learning techniques, namely Support Vector Machine, Random Forest, k-Nearest Neighbour, Artificial Neural Network, and Deep Learning. We used experimentally validated drugs/chemicals with anti-corona activity and their inhibition efficiencies (IC50/EC50) from the 'DrugRepV' repository. The unique entries for SARS-CoV-2 (142), SARS (221), MERS (123), and overall Coronaviruses (414) were subdivided into training/testing and independent validation datasets, followed by the extraction of chemical/structural descriptors and fingerprints (17968). The highly relevant features were filtered using the recursive feature selection algorithm. The selected chemical descriptors were used to develop prediction models with Pearson's correlation coefficients ranging from 0.60 to 0.90 on training/testing. The robustness of the predictive models was further checked using external independent validation datasets, decoy datasets, applicability domain, and chemical analyses. The developed models were then used to scan DrugBank and predict promising repurposed drug candidates against coronaviruses. The top predicted molecules for SARS-CoV-2 were further validated by molecular docking against the spike protein complex with ACE receptor. We found potential repurposed drugs with high binding affinity, namely Verteporfin, Alatrofloxacin, Metergoline, Rescinnamine, Leuprolide, and Telotristat ethyl. These computational methods would assist in antiviral drug discovery against SARS-CoV-2 and other Coronaviruses.
Keywords: AI; COVID-19; Chemical descriptors; Coronaviruses; Drug repurposing; Machine learning; SARS-CoV-2
Year: 2021 PMID: 34055238 PMCID: PMC8141697 DOI: 10.1016/j.csbj.2021.05.037
Source DB: PubMed Journal: Comput Struct Biotechnol J ISSN: 2001-0370 Impact factor: 7.271
Performance of the Support Vector Machine (SVM), Random Forest (RF), k-Nearest Neighbour (KNN), and Artificial Neural Network (ANN) models for Severe Acute Respiratory Syndrome Virus (SARS), Severe Acute Respiratory Syndrome CoronaVirus 2 (SARS-CoV-2), Middle East Respiratory Syndrome Virus (MERS), and overall Coronaviruses on the training/testing dataset during 10-fold cross-validation.
| Virus | Algorithm | Model Parameters | Dataset | MAE | RMSE | R2 | PCC |
|---|---|---|---|---|---|---|---|
| SARS | SVM | gamma:0.001 C:50 | T198 | 0.21 | 0.42 | 0.82 | 0.92 |
| SARS | RF | n:100 depth:10 split:5 leaf:1 | T198 | 0.49 | 0.74 | 0.54 | 0.76 |
| SARS | KNN | k:9 | T198 | 0.50 | 0.69 | 0.53 | 0.76 |
| SARS | ANN | activation:tanh solver:sgd learning:adaptive | T198 | 0.83 | 0.92 | 0.14 | 0.73 |
| SARS-CoV-2 | SVM | gamma:0.005 C:50 | T127 | 0.37 | 0.58 | 0.60 | 0.84 |
| SARS-CoV-2 | RF | n:500 depth:12 split:2 leaf:1 | T127 | 0.84 | 0.86 | 0.15 | 0.50 |
| SARS-CoV-2 | KNN | k:11 | T127 | 0.86 | 1.01 | 0.04 | 0.50 |
| SARS-CoV-2 | ANN | activation:tanh solver:sgd learning:constant | T127 | 2.46 | 1.80 | 0.39 | 0.62 |
| MERS | SVM | gamma:0.0005 C:100 | T110 | 0.08 | 0.30 | 0.78 | 0.92 |
| MERS | RF | n:400 depth:8 split:2 leaf:4 | T110 | 0.37 | 0.53 | 0.16 | 0.60 |
| MERS | KNN | k:5 | T110 | 0.30 | 0.56 | 0.29 | 0.65 |
| MERS | ANN | activation:relu solver:sgd | T110 | 1.04 | 0.69 | 0.16 | 0.49 |
| Overall Coronaviruses | SVM | gamma:0.0005 C:500 | T372 | 0.81 | 0.84 | 0.51 | 0.73 |
| Overall Coronaviruses | RF | n:400 depth:None split:10 leaf:4 | T372 | 1.19 | 1.08 | 0.31 | 0.58 |
| Overall Coronaviruses | KNN | k:5 | T372 | 1.23 | 1.10 | 0.28 | 0.57 |
| Overall Coronaviruses | ANN | activation:tanh solver:sgd learning:constant | T372 | 0.95 | 0.94 | 0.43 | 0.68 |
MAE, Mean Absolute Error; RMSE, Root Mean Square Error; R2, Coefficient of Determination; PCC, Pearson’s Correlation Coefficient.
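The four metrics in the footnote can be computed from paired actual and predicted pIC50 values. The following is a minimal sketch using only the Python standard library; the function name `regression_metrics` is hypothetical, and it assumes R2 is defined as 1 − SS_res/SS_tot, which the source does not state explicitly.

```python
import math


def regression_metrics(actual, predicted):
    """Return MAE, RMSE, R2 and PCC for paired actual/predicted values.

    Assumption: R2 is the coefficient of determination 1 - SS_res / SS_tot.
    """
    n = len(actual)
    if n != len(predicted) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")

    errors = [p - a for a, p in zip(actual, predicted)]
    ss_res = sum(e * e for e in errors)  # residual sum of squares

    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    ss_tot = sum((a - mean_a) ** 2 for a in actual)      # spread of actual values
    ss_pred = sum((p - mean_p) ** 2 for p in predicted)  # spread of predictions
    if ss_tot == 0 or ss_pred == 0:
        raise ValueError("R2 and PCC are undefined for constant inputs")
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))

    return {
        "MAE": sum(abs(e) for e in errors) / n,
        "RMSE": math.sqrt(ss_res / n),
        "R2": 1.0 - ss_res / ss_tot,
        "PCC": cov / math.sqrt(ss_tot * ss_pred),
    }


# Illustrative values only, not data from the study.
metrics = regression_metrics([5.0, 6.0, 7.0, 8.0], [5.5, 6.0, 6.5, 8.0])
```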
Performance of the Support Vector Machine (SVM), Random Forest (RF), k-Nearest Neighbour (KNN), and Artificial Neural Network (ANN) models for Severe Acute Respiratory Syndrome Virus (SARS), Severe Acute Respiratory Syndrome CoronaVirus 2 (SARS-CoV-2), Middle East Respiratory Syndrome Virus (MERS), and overall Coronaviruses on the independent validation dataset during 10-fold cross-validation.
| Virus | Algorithm | Model Parameters | Dataset | MAE | RMSE | R2 | PCC |
|---|---|---|---|---|---|---|---|
| SARS | SVM | gamma:0.001 C:50 | V23 | 0.20 | 0.44 | 0.77 | 0.90 |
| SARS | RF | n:100 depth:10 split:5 leaf:1 | V23 | 0.47 | 0.69 | 0.65 | 0.82 |
| SARS | KNN | k:9 | V23 | 0.47 | 0.69 | 0.60 | 0.79 |
| SARS | ANN | activation:tanh solver:sgd learning:adaptive | V23 | 0.26 | 0.51 | 0.81 | 0.92 |
| SARS-CoV-2 | SVM | gamma:0.005 C:50 | V15 | 0.21 | 0.46 | 0.81 | 0.92 |
| SARS-CoV-2 | RF | n:500 depth:12 split:2 leaf:1 | V15 | 0.90 | 0.95 | 0.14 | 0.50 |
| SARS-CoV-2 | KNN | k:11 | V15 | 0.52 | 0.72 | 0.35 | 0.67 |
| SARS-CoV-2 | ANN | activation:tanh solver:sgd learning:constant | V15 | 2.64 | 1.62 | 0.66 | 0.68 |
| MERS | SVM | gamma:0.0005 C:100 | V13 | 0.47 | 0.68 | 0.69 | 0.92 |
| MERS | RF | n:400 depth:8 split:2 leaf:4 | V13 | 0.74 | 0.86 | 0.32 | 0.74 |
| MERS | KNN | k:5 | V13 | 1.16 | 1.08 | 0.24 | 0.69 |
| MERS | ANN | activation:relu solver:sgd | V13 | 0.75 | 0.87 | 0.39 | 0.50 |
| Overall Coronaviruses | SVM | gamma:0.0005 C:500 | V42 | 0.78 | 0.88 | 0.53 | 0.75 |
| Overall Coronaviruses | RF | n:400 depth:None split:10 leaf:4 | V42 | 1.03 | 1.02 | 0.20 | 0.49 |
| Overall Coronaviruses | KNN | k:5 | V42 | 1.00 | 1.00 | 0.22 | 0.58 |
| Overall Coronaviruses | ANN | activation:tanh solver:sgd learning:constant | V42 | 1.02 | 1.01 | 0.39 | 0.67 |
MAE, Mean Absolute Error; RMSE, Root Mean Square Error; R2, Coefficient of Determination; PCC, Pearson’s Correlation Coefficient.
Fig. 1. The robustness of the Support Vector Machine models for Severe Acute Respiratory Syndrome (SARS), Middle East Respiratory Syndrome (MERS), Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2), and overall Coronaviruses was checked using a) the Williams plot of standardized residuals against leverage and b) the plot of actual against predicted pIC50.
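A Williams plot places each compound by its leverage (x-axis) and standardized residual (y-axis); compounds beyond a leverage threshold or a residual cutoff are commonly treated as outside the applicability domain. The sketch below is a simplified illustration, not the authors' procedure: it assumes a single descriptor with an intercept (the study used many descriptors, which requires the full hat matrix), the commonly used warning leverage h* = 3(p + 1)/n, and a ±3 residual cutoff. The function names are hypothetical.

```python
import math


def williams_points(x, actual, predicted):
    """Leverage and standardized residual per compound for ONE descriptor.

    Assumptions (not from the source): simple linear model with intercept,
    so leverage h_i = 1/n + (x_i - mean)^2 / Sxx, and residuals are scaled
    by their standard deviation with n - 2 degrees of freedom.
    """
    n = len(x)
    n_params = 2  # one descriptor coefficient plus the intercept
    if not (n == len(actual) == len(predicted)) or n <= n_params:
        raise ValueError("need equal-length inputs with more than 2 compounds")

    mean_x = sum(x) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("descriptor is constant; leverage is undefined")
    leverages = [1.0 / n + (xi - mean_x) ** 2 / sxx for xi in x]

    residuals = [a - p for a, p in zip(actual, predicted)]
    sigma = math.sqrt(sum(r * r for r in residuals) / (n - n_params))
    if sigma == 0:
        raise ValueError("all residuals are zero; cannot standardize")
    std_residuals = [r / sigma for r in residuals]

    h_star = 3.0 * n_params / n  # conventional warning leverage
    return leverages, std_residuals, h_star


def outside_domain(leverages, std_residuals, h_star, cutoff=3.0):
    """Flag compounds with high leverage or a large standardized residual."""
    return [h > h_star or abs(r) > cutoff
            for h, r in zip(leverages, std_residuals)]
```

With these conventions, a compound whose descriptor value lies far from the rest of the training set is flagged by leverage even when its prediction error is small.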
Fig. 2. The scatter plot shows the correlation between the actual pIC50 and the predicted pIC50 of the decoy dataset for Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2), Severe Acute Respiratory Syndrome (SARS), Middle East Respiratory Syndrome (MERS), and overall coronaviruses.
Fig. 3. Chemical analysis of the Severe Acute Respiratory Syndrome CoronaVirus 2 (SARS-CoV-2) inhibitors. a) Hierarchical clustering of the SARS-CoV-2 inhibitors, depicted as circular plots. b) Three-dimensional multidimensional scaling plot of the SARS-CoV-2 inhibitors. c) Chemical network showing the status of the top-10 predicted repurposed drugs against Coronaviruses (SARS, SARS-CoV-2, and MERS). Blue marks predicted repurposed drugs unique to a single virus, green marks drugs common to SARS-CoV-2 and MERS, orange marks drugs common to SARS and SARS-CoV-2, and pink marks drugs common to SARS and MERS. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)
Top hits among the predicted repurposed drug candidates against Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2), with DrugBank ID, drug name, primary indication, predicted pIC50, and testing status.
| DrugBank ID | Drug Name | Primary indication | Predicted pIC50 | Status |
|---|---|---|---|---|
| DB00007 | Leuprolide | Prostate cancer; Central precocious puberty (CPP) | 9.093 | Not Yet tested |
| DB00014 | Goserelin | Prostate cancer | 8.641 | Not Yet tested |
| DB00050 | Cetrorelix | Premature LH surge | 8.342 | Not Yet tested |
| DB00148 | Creatine | Dietary shortage or imbalance | 8.594 | Not Relevant |
| DB00206 | Reserpine | Hypertension | 8.728 | Clinical trial - Observational |
| DB00234 | Reboxetine | Clinical depression | 9.308 | Not Yet tested |
| DB00248 | Cabergoline | Hyperprolactinemic disorders and Parkinsonian Syndrome | 8.370 | Not Yet tested |
| DB00266 | Dicoumarol | Coagulation disorders | 8.357 | Not Yet tested |
| DB00278 | Argatroban | Coagulation disorders | 9.357 | Clinical trial - Interventional |
| DB00289 | Atomoxetine | Attention deficit hyperactivity disorder (ADHD) | 8.563 | Not Yet tested |
| DB00331 | Metformin | Diabetes | 8.498 | Clinical trial - Interventional |
| DB00381 | Amlodipine | Hypertension | 8.363 | Clinical trial - Interventional |
| DB00460 | Verteporfin | Subfoveal choroidal neovascularization | 9.556 | Not Yet tested |
| DB00470 | Dronabinol | Anorexia | 8.604 | Computational |
| DB00476 | Duloxetine | Depressive Disorder | 8.736 | Not Yet tested |
| DB00486 | Nabilone | Nausea and vomiting | 8.535 | Not Yet tested |
| DB00536 | Guanidine | Muscle weakness; Myasthenic syndrome of Eaton-Lambert | 8.640 | Not Yet tested |
| DB00579 | Mazindol | Obesity | 8.705 | Not Yet tested |
| DB00589 | Lisuride | Parkinson's Disease | 8.422 | Computational |
| DB00590 | Doxazosin | Benign prostatic hypertrophy | 8.668 | Clinical trial - Observational |
| DB00641 | Simvastatin | Cardiovascular agents | 8.404 | Clinical trial - Interventional |
| DB00644 | Gonadorelin | Gonadotropes of the anterior pituitary | 8.581 | Not Yet tested |
| DB00666 | Nafarelin | Central precocious puberty | 8.621 | Computational |
| DB00682 | Warfarin | Coagulation disorders | 8.924 | Clinical trial - Observational |
| DB00685 | Trovafloxacin | For treatment of infections caused by microorganisms | 9.041 | Computational |
| DB00706 | Tamsulosin | Benign prostatic hyperplasia | 8.890 | Not Yet tested |
| DB00738 | Pentamidine | Pneumonia | 8.592 | Not Yet tested |
| DB00768 | Olopatadine | Allergic conjunctivitis | 8.341 | Not Yet tested |
| DB00776 | Oxcarbazepine | Partial seizures | 8.402 | Not Yet tested |
| DB00778 | Roxithromycin | Respiratory tract; Urinary and soft tissue infections | 8.459 | Not Relevant |
| DB00807 | Proparacaine | Ophthalmic anesthetic | 8.810 | Clinical trial - Observational |
| DB00887 | Bumetanide | Edema associated with congestive heart failure, hepatic and renal disease | 8.548 | Clinical trial - Observational |
| DB00892 | Oxybuprocaine | Used to temporarily numb the front surface of the eye | 8.945 | Not Yet tested |
| DB00914 | Phenformin | Type 2 diabetes mellitus | 8.443 | Computational |
| DB00938 | Salmeterol | Asthma; Chronic obstructive pulmonary disease | 8.976 | Not Yet tested |
| DB00955 | Netilmicin | Bacteremia; Septicaemia; Respiratory tract infections | 8.314 | Not Yet tested |
| DB01018 | Guanfacine | Attention deficit hyperactivity disorder (ADHD) | 9.152 | Clinical trial - Observational |
| DB01079 | Tegaserod | Irritable bowel syndrome | 8.521 | Not Yet tested |
| DB01082 | Streptomycin | Tuberculosis | 8.887 | Computational |
| DB01089 | Deserpidine | Hypertension | 8.555 | Not Yet tested |
| DB01110 | Miconazole | Fungal infections | 8.626 | Not Relevant |
| DB01131 | Proguanil | Malaria | 8.600 | Computational |
| DB01180 | Rescinnamine | Hypertension | 8.921 | Not Yet tested |
| DB01283 | Lumiracoxib | Osteoarthritis | 8.464 | Not Yet tested |
| DB01418 | Acenocoumarol | Thromboembolic disease | 8.800 | Clinical trial - Observational |
| DB01764 | Dalfopristin | Bacterial infections | 8.595 | Not Yet tested |
| DB03615 | Ribostamycin | NA | 8.395 | Not Yet tested |
| DB04840 | Debrisoquine | Hypertension | 8.713 | Not Yet tested |
| DB04864 | Huperzine A | Alzheimer's disease | 8.852 | Not Yet tested |
| DB04868 | Nilotinib | Leukemia | 8.442 | Experimental |
| DB04931 | Afamelanotide | Phototoxicity | 8.492 | Not Yet tested |
| DB06145 | Spiramycin | Bacterial infections | 8.634 | Computational |
| DB06614 | Peramivir | Influenza A/B virus | 9.018 | Computational |
| DB06616 | Bosutinib | Chronic myelogenous leukemia (CML) | 8.489 | Experimental |
| DB06636 | Isavuconazonium | Aspergillosis; Mucormycosis | 8.313 | Clinical trial - Interventional |
| DB06663 | Pasireotide | Cushing’s disease | 8.480 | Not Yet tested |
| DB06784 | Gallium citrate Ga-67 | Hodgkin's disease, lymphoma, and bronchogenic carcinoma | 8.419 | Not Yet tested |
| DB08912 | Dabrafenib | Melanoma | 8.788 | Computational |
| DB08916 | Afatinib | Metastatic non-small cell lung cancer | 8.391 | Not Yet tested |
| DB08943 | Isoconazole | NA | 8.577 | Not Yet tested |
| DB08995 | Diosmin | NA | 8.394 | Clinical trial - Interventional |
| DB09084 | Benzydamine | Analgesic and anti-inflammatory treatment | 8.720 | Not Yet tested |
| DB09125 | Potassium citrate | Renal tubular acidosis | 8.394 | Not Yet tested |
| DB09157 | Carbon dioxide | Insufflation gas for minimal invasive surgery | 8.619 | Not Relevant |
| DB09335 | Alatrofloxacin | NA | 8.862 | Not Yet tested |
| DB11512 | Dihydrostreptomycin | NA | 8.830 | Not Yet tested |
| DB11574 | Elbasvir | HCV genotypes 1 or 4 | 8.724 | Computational |
| DB11753 | Rifamycin | Traveller's Diarrhea | 8.359 | Computational |
| DB11827 | Ertugliflozin | Type 2 diabetes | 8.522 | Not Yet tested |
| DB11828 | Neratinib | Breast cancer | 8.401 | Not Yet tested |
| DB12095 | Telotristat ethyl | To reduce serotonin levels | 9.135 | Not Yet tested |
| DB12364 | Betrixaban | Venous thromboembolism (VTE) | 9.116 | Computational |
| DB12500 | Fedratinib | Myelofibrosis | 8.438 | Not Yet tested |
| DB12615 | Plazomicin | Complicated Urinary Tract Infections (cUTI) | 8.348 | Not Yet tested |
| DB13100 | Biguanide | NA | 9.221 | Not Yet tested |
| DB13211 | Guanoxan | NA | 9.694 | Not Yet tested |
| DB13520 | Metergoline | NA | 8.704 | Not Yet tested |
| DB13680 | Naftazone | NA | 8.342 | Not Yet tested |
| DB14575 | Eslicarbazepine | NA | 8.318 | Not Yet tested |
| DB14753 | Hydroxystilbamidine | Nonprogressive blastomycosis of the skin and other mycoses | 8.314 | Not Yet tested |
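To read the predicted pIC50 values above as concentrations, the conversion can be sketched as follows. This assumes the standard definition pIC50 = −log10(IC50 in mol/L); the source does not spell out the unit convention, and the function names are hypothetical.

```python
import math


def pic50_to_ic50_nm(pic50):
    """Convert pIC50 to IC50 in nanomolar, assuming pIC50 = -log10(IC50 [M])."""
    return 10 ** (9 - pic50)  # 1 M = 1e9 nM


def ic50_nm_to_pic50(ic50_nm):
    """Convert IC50 in nanomolar back to pIC50 under the same assumption."""
    if ic50_nm <= 0:
        raise ValueError("IC50 must be positive")
    return 9 - math.log10(ic50_nm)


# Predicted pIC50 of Verteporfin from the table (9.556): roughly 0.28 nM.
verteporfin_ic50_nm = pic50_to_ic50_nm(9.556)
```

Under this convention, each unit increase in pIC50 corresponds to a tenfold lower IC50, so the table's range of about 8.3 to 9.7 spans predicted IC50 values from a few nanomolar down to sub-nanomolar.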
Docking results for the top predicted ligands: binding affinity, Root Mean Square Deviation (RMSD) value (Å), interacting residues, bond length (Å), type of interaction, and interacting domain of the Spike protein. NTD, N-Terminal Domain; CTD, C-Terminal Domain; RBD, Receptor Binding Domain.
| DrugBank ID | Ligand | Affinity (kcal/mol) | RMSD (Å) | Interacting residues | Bond length(Å) | Interactions | Interacting domain |
|---|---|---|---|---|---|---|---|
| DB00460 | Verteporfin | −9.5 | 0 | SER-77 | 2.50 | Hydrogen Bond | NTD / CTD (RBD) |
| DB09335 | Alatrofloxacin | −9.1 | 0 | HIS-345 | 3.98 | Hydrogen Bond | CTD (RBD) |
| DB13520 | Metergoline | −8.8 | 0 | LEU-95 | 3.43 | Hydrogen Bond | NTD / CTD (RBD) |
| DB01180 | Rescinnamine | −8.5 | 0 | PHE-40 | 4.98, 5.52 | Hydrogen Bond | NTD / CTD (RBD) |
| DB00014 | Goserelin | −8.5 | 0 | ASP-350 | 2.10 | NA | CTD (RBD) |
| DB00007 | Leuprolide | −8.2 | 0 | ARG-273 | 1.40, 2.00, 2.50, 2.88, 3.37 | Hydrogen Bond | NTD / CTD (RBD) |
| DB12095 | Telotristat ethyl | −8 | 0 | TRP-69 | 5.06 | Hydrogen Bond | NTD / CTD (RBD) |
| DB11512 | Dihydrostreptomycin | −7.6 | 0 | GLN-102 | 2.81 | Hydrogen Bond | NTD / CTD (RBD) |
| DB00706 | Tamsulosin | −7.3 | 0 | SER-43 | 2.23, 2.29 | Hydrogen Bond | NTD / CTD (RBD) |
| DB04840 | Debrisoquine | −7.3 | 0 | LEU-95 | 3.43 | Hydrogen Bond | NTD / CTD (RBD) |
| DB00579 | Mazindol | −7.2 | 0 | LEU-95 | 5.12 | Hydrogen Bond | NTD / CTD (RBD) |
| DB04864 | Huperzine A | −7.1 | 0 | PHE-40 | 4.95 | Hydrogen Bond | NTD / CTD (RBD) |
| DB09084 | Benzydamine | −7.1 | 0 | ASP-382 | 8.30 | Hydrogen Bond | CTD (RBD) |
| DB13211 | Guanoxan | −7 | 0 | LEU-95 | 3.70 | Hydrogen Bond | NTD / CTD (RBD) |
| DB00476 | Duloxetine | −6.8 | 0 | LEU-95 | 3.44, 4.66 | Hydrogen Bond | NTD / CTD (RBD) |
Fig. 4. The ligands a) Verteporfin, b) Alatrofloxacin, c) Metergoline, d) Rescinnamine, e) Leuprolide, and f) Telotristat ethyl binding the SARS-CoV-2 S-protein. (SARS-CoV-2 S-protein shown as a grey ribbon diagram and the ligand molecule as green spheres.) (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)
Fig. 5. Two-dimensional representation of molecular interactions of a) Verteporfin, b) Alatrofloxacin, c) Metergoline, d) Rescinnamine, e) Leuprolide, and f) Telotristat ethyl with the S-protein of SARS-CoV-2.
Fig. 6. The overall methodology used in the study. The inhibitors of the Coronaviruses (SARS, SARS-CoV-2, and MERS) were extracted from the literature. The dataset was split into training/testing and independent validation sets using a randomization approach. The descriptors were calculated using the PaDEL software, followed by the selection of relevant features. The prediction models were developed using machine learning algorithms, namely Support Vector Machine, Random Forest, k-Nearest Neighbor, Artificial Neural Network, and Deep Neural Network.
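The random split described in the methodology can be sketched as below. The reported dataset sizes (for example T127/V15 from 142 SARS-CoV-2 entries) are consistent with holding out about one tenth of the entries, rounded up; that the authors used exactly this rule, as well as the seed and the function name `split_dataset`, are assumptions made for illustration.

```python
import random


def split_dataset(entries, seed=42):
    """Randomly split entries into (training/testing, independent validation).

    Assumption: one tenth of the entries, rounded up, is held out for
    independent validation; the seed is arbitrary and only fixes the shuffle.
    """
    items = list(entries)
    random.Random(seed).shuffle(items)   # reproducible random order
    n_val = -(-len(items) // 10)         # ceiling of len / 10 in integer math
    return items[n_val:], items[:n_val]


# Entry counts from the abstract; the integers stand in for real compounds.
sizes = {}
for virus, n_entries in [("SARS-CoV-2", 142), ("SARS", 221),
                         ("MERS", 123), ("Overall", 414)]:
    train, validation = split_dataset(range(n_entries))
    sizes[virus] = (len(train), len(validation))
```

Under this rule the split sizes come out as 127/15, 198/23, 110/13, and 372/42, matching the T and V dataset labels in the performance tables.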