Kanglin Hsieh1, Yinyin Wang2, Luyao Chen1, Zhongming Zhao3, Sean Savitz4, Xiaoqian Jiang1, Jing Tang2, Yejin Kim5.
Abstract
Since the outbreak of the novel coronavirus disease (COVID-19) in 2019, the pandemic has continued for more than one year. A vast amount of drug research has been conducted, but few drugs have received FDA approval. Our objective is to prioritize repurposable drugs using a pipeline that systematically integrates the interactions between COVID-19 and drugs, deep graph neural networks, and in vitro/population-based validations. We first collected all available drugs (n = 3635) related to COVID-19 patient treatment through CTDbase. We built a COVID-19 knowledge graph based on the interactions among virus baits, host genes, pathways, drugs, and phenotypes. A deep graph neural network approach was used to derive each candidate drug's representation based on the biological interactions. We prioritized the candidate drugs using clinical trial history, and then validated them with their genetic profiles, in vitro experimental efficacy, and population-based treatment effect. We highlight the top 22 drugs, including Azithromycin, Atorvastatin, Aspirin, Acetaminophen, and Albuterol. We further pinpointed drug combinations that may synergistically target COVID-19. In summary, we demonstrated that the integration of extensive interactions, deep neural networks, and multiple sources of evidence can facilitate the rapid identification of candidate drugs for COVID-19 treatment.
Year: 2021 PMID: 34848761 PMCID: PMC8632883 DOI: 10.1038/s41598-021-02353-5
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.996
Figure 1. Study workflow. (a) We collected 27 SARS-CoV-2 baits, 322 host genes interacting with baits, 1783 host genes on 609 pathways, 3635 drugs, 4427 drug targets, and 1285 phenotypes, and their corresponding interactions from a curated list of COVID-19 literature in CTDbase. (b) We built the COVID-19 knowledge graph with nodes (baits, host genes, drugs, targets, pathways, and phenotypes) and edges (virus–host protein–protein interaction, gene–gene in pathways, drug–target, gene–phenotype, and drug–phenotype interactions). (c) We derived the node embeddings using a multi-relational variational graph autoencoder[20]. We transferred the extensive representation in DRKG using transfer learning. (d) We built a drug ranking model with the drug embeddings as features and clinical trials as silver-standard labels. (e) The drug ranking was validated using the drugs' gene profiles, in vitro drug screening efficacy[8], and large-scale electronic health records. (f) We presented validated drugs with their genetic, mechanistic, and epidemiological evidence. (g) Using the highly ranked drug candidates, we searched for drug combinations that satisfy complementary exposure patterns[17].
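The multi-relational graph in panel (b) can be pictured as a set of typed (head, relation, tail) triples. The sketch below is only an illustration of that data structure: the node identifiers, relation names, and the helper functions `build_graph` and `count_node_types` are hypothetical and are not taken from the study's data or code.

```python
from collections import defaultdict

# Hypothetical toy triples (head, relation, tail). The identifiers and
# relation names are illustrative assumptions, not data from the study.
TRIPLES = [
    ("bait:B1", "virus_host_ppi", "gene:G1"),
    ("gene:G1", "gene_gene_pathway", "gene:G2"),
    ("drug:D1", "drug_target", "gene:G2"),
    ("gene:G2", "gene_phenotype", "phenotype:P1"),
    ("drug:D1", "drug_phenotype", "phenotype:P1"),
]


def build_graph(triples):
    """Return {relation: {node: set(neighbors)}}, treating edges as undirected."""
    graph = defaultdict(lambda: defaultdict(set))
    for head, relation, tail in triples:
        graph[relation][head].add(tail)
        graph[relation][tail].add(head)
    return graph


def count_node_types(triples):
    """Count distinct nodes per type, where the type is the prefix before ':'."""
    nodes_by_type = defaultdict(set)
    for head, _, tail in triples:
        for node in (head, tail):
            nodes_by_type[node.split(":", 1)[0]].add(node)
    return {node_type: len(nodes) for node_type, nodes in nodes_by_type.items()}


graph = build_graph(TRIPLES)
type_counts = count_node_types(TRIPLES)
```

Keeping one adjacency map per relation type preserves the edge semantics that a multi-relational embedding model would need as input.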
Figure 2. (a) COVID-19 knowledge graph t-SNE plot. Two nodes with similar embeddings are located close together in the t-SNE plot. We highlighted drugs undergoing clinical trials (as of July 23, 2020) to give a glimpse of the promising repurposable drugs around the trial drugs. SARS-CoV-2 baits are the green hexagons in the upper left. Genes (gray triangles) lie in the middle, between baits and drugs. Drugs (black circles) are mixed with genes. Drugs undergoing clinical trials (purple circles) are located close together. Phenotypes (light brown diamonds) are located close to relevant genes and drugs. An interactive plot for a closer look is available in Fig. S2. (b–e) We validated the drug ranking using four different external validation sources: (b) differentially expressed genes in SARS-CoV-2-infected human lung cells (GSE153970); (c) GSEA score between the infected human lung cell transcriptome and the drug-induced transcriptome; (d) in vitro efficacy (e.g., % inhibition in viral entry and cytopathic effect assays[8]); and (e) treatment effects in EHRs. Figure (c) was created by Plotly[22] (https://plotly.com/).
Accuracy of predicting drugs under COVID-19 clinical trials.
| Embedding methods | Evaluation metrics | Ranking models | | | | |
|---|---|---|---|---|---|---|
| | | Logistic regression | Support vector machines | XGBoost | Random forest | Neural network ranking |
| COVID-19 knowledge graph embedding | AUROC | 0.6800 | 0.6915 | 0.7019 | 0.6161 | 0.7628 |
| | AUPRC | 0.0604 | 0.1149 | 0.0836 | 0.0940 | 0.1272 |
| General biomedical knowledge graph embedding from DRKG[ | AUROC | 0.7855 | 0.8332 | 0.8500 | 0.7372 | 0.8512 |
| | AUPRC | 0.1183 | 0.1848 | 0.1439 | 0.0790 | 0.1624 |
| COVID-19 knowledge graph embedding + general embedding (proposed) | AUROC | 0.8973 | 0.7697 | 0.8934 | 0.7814 | 0.8992 |
| | AUPRC | 0.1965 | 0.1629 | 0.1701 | 0.0916 | 0.2503 |
The predictors were the drug embeddings, and the labels indicated whether a drug was under clinical trials. Logistic regression, support vector machines, XGBoost, and random forest were off-the-shelf models. The neural network is a customized model (Methods). AUROC: area under the receiver operating characteristic curve; AUPRC: area under the precision-recall curve.
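For readers unfamiliar with the two metrics, the following is a minimal pure-Python sketch of how AUROC and AUPRC can be computed from binary labels and ranking scores. It assumes AUPRC is estimated by average precision, which is one common estimator and may differ from the one used in the study; the function names and toy inputs are hypothetical.

```python
def auroc(labels, scores):
    """Probability that a random positive outscores a random negative (ties count 0.5)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative label")
    wins = sum((p > n) + 0.5 * (p == n) for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))


def average_precision(labels, scores):
    """Mean of the precision values measured at the rank of each positive item."""
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    hits = 0
    precision_sum = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            precision_sum += hits / rank
    if hits == 0:
        raise ValueError("need at least one positive label")
    return precision_sum / hits


# Hypothetical toy example: 1 = drug under clinical trial, 0 = not.
toy_labels = [1, 0, 1, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.3, 0.1]
toy_auroc = auroc(toy_labels, toy_scores)
toy_auprc = average_precision(toy_labels, toy_scores)
```

When positives are rare, as with trial drugs among thousands of candidates, AUPRC values are typically much lower than AUROC values, which is consistent with the gap between the two rows of each embedding method in the table.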
External validation of the candidate drugs using in vitro drug screening results and EHRs.
| Validation type | Source | # overlap drugs | # true positives (TP) | # false positives (FP) | # false negatives (FN) | # true negatives (TN) | Precision TP/(TP + FP) | Recall TP/(TP + FN) |
|---|---|---|---|---|---|---|---|---|
| Gene profiles | GSEA scores | 580 | 55 | 128 | 128 | 269 | 0.3006 | 0.3006 |
| | ACE2 enzymatic activity[ | 497 | 25 | 69 | 120 | 283 | 0.2660 | 0.1724 |
| | Spike-ACE2 protein–protein interaction[ | 497 | 6 | 22 | 139 | 330 | 0.2143 | 0.0414 |
| | Cytopathic effect (NCATS)[ | 497 | 26 | 33 | 119 | 319 | 0.4407 | 0.1793 |
| | Cytopathic effect (ReFRAME)[ | 13 | 5 | 8 | N/A | N/A | 0.3846 | N/A |
| Population based | EHRs | 138 | 6 | 4 | 52 | 76 | 0.6 | 0.1035 |
N/A not available. False-negative or true-negative values could not be obtained because the cytopathic effect (ReFRAME) study only reports positive drugs[7]. Caution is needed in interpreting the accuracy because the number of overlapping drugs is limited in some studies and, thus, the statistical power is limited.
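The two ratio columns can be recomputed from the counts in the table. The sketch below uses the standard definitions, precision = TP/(TP + FP) and recall = TP/(TP + FN), and copies the TP, FP, and FN counts from three rows of the table; the function names are illustrative.

```python
def precision(tp, fp):
    """Fraction of predicted positives that are true positives."""
    return tp / (tp + fp)


def recall(tp, fn):
    """Fraction of actual positives that were recovered."""
    return tp / (tp + fn)


# (TP, FP, FN) counts copied from the validation table.
COUNTS = {
    "ACE2 enzymatic activity": (25, 69, 120),
    "Cytopathic effect (NCATS)": (26, 33, 119),
    "EHRs": (6, 4, 52),
}

metrics = {
    source: (round(precision(tp, fp), 4), round(recall(tp, fn), 4))
    for source, (tp, fp, fn) in COUNTS.items()
}

for source, (p, r) in metrics.items():
    print(f"{source}: precision={p}, recall={r}")
```

For the ReFRAME row, only TP/(TP + FP) can be computed because the false-negative count is not available.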
Figure 3. The interaction among virus baits, host preys, and drug targets. (a) Single drugs, (b) drug combinations. SARS-CoV-2 baits = green hexagons, genes = gray triangles, drugs = black circles. The potentially repurposable drugs directly or indirectly target host genes that have PPIs with the virus baits. Both figures were created with Cytoscape[44] (https://cytoscape.org/).
Top 22 promising drugs with supporting evidence and literature.
| Drug name | Treated for | Targets | GSEA score | In vitro efficacy | Treatment effects in EHRs | Clinical trials | Supporting literature |
|---|---|---|---|---|---|---|---|
| Azithromycin | Anti-infection | 23S ribosome of bacteria | + | + | + | + | [ |
| Hydroxychloroquine | Immunosuppressive drug, Anti-parasite | TLR-7, TLR-9, ACE2 | NA | + | + | + | [ |
| Atorvastatin | Lipid-lowering | HMG-CoA inhibitor | + | NA | + | + | [ |
| Acetaminophen | Pain, fever | PGE-3, COX-1, COX-2 | NA | + | + | + | NA |
| Aspirin | Pain, fever | COX-1, COX-2 | − | − | + | + | NA |
| Albuterol | Anti-asthma | beta-2-agonist | NA | − | + | − | NA |
| Melatonin | Sleep-wake cycle | Melatonin receptor | + | + | − | + | [ |
| Sirolimus | Immunomodulatory | mTOR | + | − | NA | + | [ |
| Nifedipine | Anti-hypertension | Calcium channel | + | + | − | + | [ |
| Ribavirin | Anti-HCV | IMP-synthesis | NA | + | NA | + | [ |
| Chloroquine | Immunosuppressive drug, Anti-parasite | TNF, TLR-9, ACE2 | NA | + | NA | + | [ |
| Lopinavir | Anti-HIV | HIV-protease | NA | + | NA | + | [ |
| Teicoplanin | Anti-infection | peptidoglycan | NA | + | NA | + | [ |
| Remdesivir | Ebola, COVID-19 | RNA polymerase | NA | + | − | + | [ |
| Ivermectin | Anti-parasite | Glycine receptor subunit alpha-3 | NA | + | NA | + | [ |
| Amlodipine | Anti-hypertension | Calcium channel | + | + | − | + | [ |
| Celecoxib | Anti-inflammatory | COX-2 | + | + | NA | + | [ |
| Isotretinoin | Anti-cancer | Vitamin A derivative | + | + | NA | + | [ |
| Chlorpromazine | Antipsychotic | D1/D2 receptor | + | + | NA | + | [ |
| Itraconazole | Anti-fungus | Lanosterol 14-alpha demethylase | + | + | NA | + | [ |
| Progesterone | Hormone replacement | Progesterone receptor | + | + | NA | + | [ |
| Tenofovir | Anti-HIV | Reverse transcriptase | + | NA | NA | + | [ |
+: positive evidence; −: negative evidence; NA: not investigated. In vitro efficacy is marked positive if at least one of the four in vitro experiments showed positive efficacy. The full list is in Table S4.
Drug combinations that satisfy the complementary exposure pattern from the top 30 drugs[43].
| Drug A | Drug B | # COVID-19 genes that Drug A hits | # COVID-19 genes that Drug B hits | # COVID-19 genes that either Drug A or B hit |
|---|---|---|---|---|
| Etoposide | Sirolimus | 2 | 22 | 24 |
| Mefloquine | Sirolimus | 1 | 22 | 23 |
| Losartan | Ribavirin | 12 | 6 | 18 |
| Hydroxychloroquine | Melatonin | 4 | 10 | 14 |
| Etoposide | Losartan | 2 | 12 | 14 |
| Acetaminophen | Chloroquine | 3 | 11 | 14 |
| Losartan | Mefloquine | 12 | 1 | 13 |
| Chloroquine | Lopinavir | 11 | 2 | 13 |
| Chloroquine | Atorvastatin | 11 | 2 | 13 |
| Acetaminophen | Melatonin | 3 | 10 | 13 |
COVID-19 genes were defined as the host genes that have PPIs with SARS-CoV-2 baits. The full list is in Table S5.
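In every row of the table, the count for "either Drug A or B" equals the sum of the two individual counts, which means the two drugs hit non-overlapping sets of COVID-19 genes. The sketch below illustrates that idea under a simplifying assumption: a pair is treated as complementary when both drugs hit at least one COVID-19 gene and their hit sets are disjoint. This is not the study's full criterion for the complementary exposure pattern, and the gene and drug names are hypothetical.

```python
from itertools import combinations

# Hypothetical toy data; names are illustrative assumptions only.
COVID_GENES = {"G1", "G2", "G3", "G4", "G5"}
DRUG_TARGETS = {
    "drugA": {"G1", "G2", "X9"},  # X9 is not a COVID-19 gene
    "drugB": {"G3"},
    "drugC": {"G2", "G4", "G5"},
}


def complementary_pairs(drug_targets, disease_genes):
    """List drug pairs whose disease-gene hits are non-empty and disjoint.

    Each result is (drug_a, drug_b, hits_a, hits_b, hits_union), sorted by
    the size of the union in descending order.
    """
    hits = {drug: targets & disease_genes for drug, targets in drug_targets.items()}
    pairs = []
    for a, b in combinations(sorted(hits), 2):
        if hits[a] and hits[b] and not (hits[a] & hits[b]):
            pairs.append((a, b, len(hits[a]), len(hits[b]), len(hits[a] | hits[b])))
    return sorted(pairs, key=lambda pair: -pair[4])


pairs = complementary_pairs(DRUG_TARGETS, COVID_GENES)
```

In this toy example, drugA and drugC are excluded because both hit G2, while the remaining pairs are ranked by how many COVID-19 genes they cover together.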