Kang-Lin Hsieh, Yinyin Wang, Luyao Chen, Zhongming Zhao, Sean Savitz, Xiaoqian Jiang, Jing Tang, Yejin Kim.
Abstract
Amid the pandemic of coronavirus disease 2019 (COVID-19), caused by SARS-CoV-2, a vast amount of drug research for prevention and treatment has been conducted quickly, but these efforts have been unsuccessful thus far. Our objective is to prioritize repurposable drugs using a drug repurposing pipeline that systematically integrates multiple SARS-CoV-2 and drug interactions, deep graph neural networks, and in-vitro/population-based validations. We first collected all the available drugs (n = 3,635) involved in COVID-19 patient treatment through CTDbase. We built a SARS-CoV-2 knowledge graph based on the interactions among virus baits, host genes, pathways, drugs, and phenotypes. A deep graph neural network approach was used to derive each candidate drug's representation based on the biological interactions. We prioritized the candidate drugs using clinical trial history, and then validated them with their genetic profiles, in vitro experimental efficacy, and electronic health records. We highlight the top 22 drugs, including Azithromycin, Atorvastatin, Aspirin, Acetaminophen, and Albuterol. We further pinpointed drug combinations that may synergistically target COVID-19. In summary, we demonstrated that the integration of extensive interactions, deep neural networks, and rigorous validation can facilitate the rapid identification of candidate drugs for COVID-19 treatment. This paper has been uploaded to arXiv: https://arxiv.org/abs/2009.10931.
Year: 2020 PMID: 33330858 PMCID: PMC7743080 DOI: 10.21203/rs.3.rs-114758/v1
Source DB: PubMed Journal: Res Sq
Figure 1. Study workflow. (a) We collected 27 SARS-CoV-2 baits, 322 host genes interacting with baits, 1,783 host genes on 609 pathways, 3,635 drugs, 4,427 drug targets, and 1,285 phenotypes, and their corresponding interactions from a curated list of COVID-19 literature in CTDbase (24). (b) We built the SARS-CoV-2 knowledge graph with nodes (baits, host genes, drugs, targets, pathways, and phenotypes) and edges (virus-host protein-protein interaction, gene-gene in pathways, drug-target, gene-phenotype, drug-phenotype interaction). (c) We derived the node embeddings using the multi-relational and variational graph autoencoder (25, 26). We transferred extensive representation in DRKG using transfer learning. (d) We built a drug ranking model with the drug embeddings as features and clinical trials as silver-standard labels. (e) The drug ranking was validated using the drugs' gene profiles, in vitro drug screening efficacy (6), and large-scale electronic health records. (f) We presented validated drugs with their genetic, mechanistic, and epidemiological evidence. (g) Using the highly ranked drug candidates, we searched for drug combinations that satisfy complementary exposure patterns (13).
Figure 2. (a) SARS-CoV-2 knowledge graph t-SNE plot. Two nodes with similar embeddings are located close together in the t-SNE plot. We highlighted drugs undergoing clinical trials (as of July 23, 2020) to reveal promising repurposable drugs near the trial drugs. SARS-CoV-2 baits are the green hexagons in the upper left. Genes, the gray triangles, lie in the middle between baits and drugs. Drugs, the black circles, are mixed with genes. Drugs undergoing clinical trials, the purple circles, are located close together. Phenotypes, the light brown diamonds, are located close to relevant genes and drugs. We validated the drug ranking using four external validation sources: (b) differentially expressed genes in SARS-CoV-2-infected human lung cells (GSE153970), where potential drugs can treat COVID-19 by inhibiting up-regulated genes or activating down-regulated genes; (c) GSEA score between the infected human lung cell transcriptome and the drug-induced transcriptome; (d) in-vitro efficacy (e.g., % inhibition in viral entry and cytopathic effect assays (6)); and (e) treatment effect in EHRs (Optum® de-identified EHR database (2007-2020)).
Accuracy of predicting drugs under COVID-19 clinical trials. The predictors were the drug embeddings, and the labels indicated whether a drug was under clinical trials. Logistic Regression, Support Vector Machines, XGBoost, and Random Forest were off-the-shelf models. The neural network is a customized model (Methods). AUROC = area under the receiver operating characteristic curve. AUPRC = area under the precision-recall curve.
| Embedding | Metric | Logistic Regression | Support Vector Machines | XGBoost | Random Forest | Neural network |
|---|---|---|---|---|---|---|
| SARS-CoV-2 knowledge graph embedding | AUROC | 0.6800 | 0.6915 | 0.7019 | 0.6161 | 0.7628 |
| | AUPRC | 0.0604 | 0.1149 | 0.0836 | 0.0940 | 0.1272 |
| General biomedical knowledge graph embedding from DRKG | AUROC | 0.7855 | 0.8332 | 0.8500 | 0.7372 | 0.8512 |
| | AUPRC | 0.1183 | 0.1848 | 0.1439 | 0.0790 | 0.1624 |
| | AUROC | 0.8973 | 0.7697 | 0.8934 | 0.7814 | 0.8992 |
| | AUPRC | 0.1965 | 0.1629 | 0.1701 | 0.0916 | 0.2503 |
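To make the two metrics in the table concrete, the following is a minimal pure-Python sketch of how AUROC and a step-wise estimate of AUPRC (average precision) can be computed from a ranking. It is not the authors' evaluation code: the function names, the toy labels, and the toy scores are illustrative assumptions, and ties in the average-precision ranking are broken arbitrarily.

```python
from typing import Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a random positive is scored above a random negative.

    Ties between a positive and a negative count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Mean of the precision measured at the rank of each positive item."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = 0
    total = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            total += hits / rank
    if hits == 0:
        raise ValueError("need at least one positive label")
    return total / hits


# Hypothetical toy data: 1 = drug under clinical trial, 0 = not under trial.
labels = [1, 0, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2, 0.2, 0.1]
print(f"AUROC={auroc(labels, scores):.4f}  AP={average_precision(labels, scores):.4f}")
```

When positives are rare, as trial drugs are among all candidate drugs, a random ranking has an expected AUPRC close to the fraction of positives, so AUPRC values are best read against that baseline rather than against 0.5.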
External validation of the candidate drugs using in vitro drug screening results and EHRs. N/A = not available. False-positive and true-negative values could not be obtained for the cytopathic effect (ReFRAME) study because it only reports positive drugs (5). Caution is needed in interpreting the accuracy because the number of overlapping drugs is small in some studies and, thus, the statistical power is limited.
| Validation | Source | # overlapping drugs | # true positives | # false negatives | # false positives | # true negatives | Recall | Precision |
|---|---|---|---|---|---|---|---|---|
| Gene profiles | GSEA scores | 580 | 55 | 128 | 128 | 269 | 0.3006 | 0.3006 |
| In-vitro drug screening results | ACE2 enzymatic activity | 497 | 25 | 69 | 120 | 283 | 0.2660 | 0.1724 |
| | Spike-ACE2 protein-protein interaction | 497 | 6 | 22 | 139 | 330 | 0.2143 | 0.0414 |
| | Cytopathic effect (NCATS) | 497 | 26 | 33 | 119 | 319 | 0.4407 | 0.1793 |
| | Cytopathic effect (ReFRAME) | 13 | 5 | 8 | N/A | N/A | 0.3846 | N/A |
| Population based | EHRs | 138 | 6 | 4 | 52 | 76 | 0.6 | 0.1035 |
Demographics and comorbidities of hospitalized COVID-19 patients before and after propensity score matching (PSM).
| | Before matching: Recovered | Before matching: Deceased | After matching: Recovered | After matching: Deceased |
|---|---|---|---|---|
| Number of patients | 15,078 | 3,200 | 2,774 | 2,827 |
| Age | | | | |
| Mean | 60.10 | 73.78 | 73.64 | 73.24 |
| Standard deviation | 17.63 | 12.81 | 12.95 | 12.86 |
| Sex | | | | |
| Male | 7,765 | 1,887 | 1,601 | 1,630 |
| Female | 7,309 | 1,313 | 1,172 | 1,197 |
| Race | | | | |
| Caucasians | 7,336 | 2,031 | 1,728 | 1,790 |
| African Americans | 4,052 | 544 | 511 | 490 |
| Asian Americans | 470 | 113 | 97 | 102 |
| Others | 3,220 | 512 | 438 | 445 |
| Admission conditions | | | | |
| Temperature (°C) | 36.93 | 37.16 | 37.07 | 37.00 |
| SpO2 (%) | 94.21 | 91.39 | 92.32 | 92.56 |
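One common way to read a before/after matching table is the standardized mean difference (SMD) of each covariate between groups. The source does not state which balance diagnostic the authors used, so the sketch below is an assumption: it applies the usual pooled-standard-deviation formula to the age rows of the table, with a hypothetical function name.

```python
import math


def standardized_mean_difference(mean_a: float, sd_a: float,
                                 mean_b: float, sd_b: float) -> float:
    """Difference in means divided by the pooled standard deviation."""
    pooled_sd = math.sqrt((sd_a ** 2 + sd_b ** 2) / 2)
    return (mean_a - mean_b) / pooled_sd


# Age (mean, SD) copied from the table: deceased vs. recovered patients.
smd_before = standardized_mean_difference(73.78, 12.81, 60.10, 17.63)
smd_after = standardized_mean_difference(73.24, 12.86, 73.64, 12.95)

print(f"Age SMD before matching: {smd_before:.3f}")
print(f"Age SMD after matching:  {smd_after:.3f}")
```

Under this diagnostic, the age imbalance is large before matching and falls below 0.1 in absolute value after matching, a threshold often used as a rule of thumb for acceptable balance.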
Figure 3. The interaction among virus baits, host preys, and drug targets for (a) single drugs and (b) drug combinations. SARS-CoV-2 baits = green hexagons. Genes = gray triangles. Drugs = black circles. The potentially repurposable drugs directly or indirectly target host genes that have PPIs with the virus baits.
Top 22 promising drugs with supporting evidence and literature. +: positive evidence, −: negative evidence, NA: not investigated. In-vitro efficacy is marked positive if at least one of the four in-vitro experiments showed positive efficacy. The full list is in Table S3.
| Drug name | Treated for | Targets | GSEA | In-vitro | EHRs | Clinical trials | Supporting literature |
|---|---|---|---|---|---|---|---|
| Azithromycin | Anti-infection | 23S rRNA of the bacterial ribosome | + | + | + | + | ( |
| Hydroxychloroquine | Immunosuppressive drug, Anti-parasite | TLR-7, TLR-9, ACE2 | NA | + | + | + | ( |
| Atorvastatin | Lipid-lowering | HMG-CoA reductase | + | NA | + | + | ( |
| Acetaminophen | Pain, fever | PGE-3, COX-1, COX-2 | NA | + | + | + | NA |
| Aspirin | Pain, fever | COX-1, COX-2 | − | − | + | + | NA |
| Albuterol | Anti-asthma | beta-2-agonist | NA | − | + | − | NA |
| Melatonin | Sleep-wake cycle | Melatonin receptor | + | + | − | + | ( |
| Sirolimus | Immunomodulatory | mTOR | + | − | NA | + | ( |
| Nifedipine | Anti-hypertension | Calcium channel | + | + | − | + | ( |
| Ribavirin | Anti-HCV | IMP-synthesis | NA | + | NA | + | ( |
| Chloroquine | Immunosuppressive drug, Anti-parasite | TNF, TLR-9, ACE2 | NA | + | NA | + | ( |
| Lopinavir | Anti-HIV | HIV-protease | NA | + | NA | + | ( |
| Teicoplanin | Anti-infection | peptidoglycan | NA | + | NA | + | ( |
| Remdesivir | Ebola, COVID-19 | RNA polymerase | NA | + | − | + | ( |
| Ivermectin | Anti-parasite | Glycine receptor subunit alpha-3 | NA | + | NA | + | ( |
| Amlodipine | Anti-hypertension | Calcium channel | + | + | − | + | ( |
| Celecoxib | Anti-inflammatory | COX-2 | + | + | NA | + | ( |
| Isotretinoin | Anti-cancer | Vitamin A derivative | + | + | NA | + | ( |
| Chlorpromazine | Antipsychotic | D1/D2 receptor | + | + | NA | + | ( |
| Itraconazole | Antifungal | Lanosterol 14-alpha demethylase | + | + | NA | + | ( |
| Progesterone | Hormone replacement | Progesterone receptor | + | + | NA | + | ( |
| Tenofovir | Anti-HIV | Reverse transcriptase | + | NA | NA | + | ( |
Drug combinations from the top 30 drugs that satisfy the complementary exposure pattern (72). COVID-19 genes were defined as the host genes that have PPIs with SARS-CoV-2 baits. The full list is in Table S4.
| Drug A | Drug B | # COVID-19 genes targeted by Drug A | # COVID-19 genes targeted by Drug B | # COVID-19 genes targeted by the combination |
|---|---|---|---|---|
| Etoposide | Sirolimus | 2 | 22 | 24 |
| Mefloquine | Sirolimus | 1 | 22 | 23 |
| Losartan | Ribavirin | 12 | 6 | 18 |
| Hydroxychloroquine | Melatonin | 4 | 10 | 14 |
| Etoposide | Losartan | 2 | 12 | 14 |
| Acetaminophen | Chloroquine | 3 | 11 | 14 |
| Losartan | Mefloquine | 12 | 1 | 13 |
| Chloroquine | Lopinavir | 11 | 2 | 13 |
| Chloroquine | Atorvastatin | 11 | 2 | 13 |
| Acetaminophen | Melatonin | 3 | 10 | 13 |
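In every row of the table, the third count equals the sum of the first two, which is what one would expect if the two drugs hit disjoint sets of COVID-19 genes. The sketch below illustrates that reading with plain set operations. It is a simplified proxy, not the network-based complementary exposure analysis cited in the caption, and the function name and gene identifiers are hypothetical placeholders.

```python
from typing import Dict, Set, Union


def combination_coverage(targets_a: Set[str], targets_b: Set[str],
                         covid_genes: Set[str]) -> Dict[str, Union[int, bool]]:
    """Count COVID-19 genes hit by each drug and by the pair.

    A pair is flagged as complementary here when both drugs hit at least one
    COVID-19 gene and the hit sets do not overlap (a simplifying assumption).
    """
    hits_a = targets_a & covid_genes
    hits_b = targets_b & covid_genes
    return {
        "genes_a": len(hits_a),
        "genes_b": len(hits_b),
        "genes_combined": len(hits_a | hits_b),
        "complementary": bool(hits_a) and bool(hits_b) and not (hits_a & hits_b),
    }


# Hypothetical identifiers; real inputs would be drug-target and bait-host gene sets.
covid_genes = {"G1", "G2", "G3", "G4"}
drug_a_targets = {"G1", "G2", "X"}
drug_b_targets = {"G3", "Y"}

result = combination_coverage(drug_a_targets, drug_b_targets, covid_genes)
print(result)
```

With disjoint hit sets, `genes_combined` equals `genes_a + genes_b`, matching the additive pattern seen in the table rows.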