Vincent K C Yan1, Xiaodong Li2, Xuxiao Ye1, Min Ou2, Ruibang Luo2, Qingpeng Zhang3, Bo Tang4, Benjamin J Cowling5, Ivan Hung6, Chung Wah Siu7, Ian C K Wong1,8,9,10,11, Reynold C K Cheng2, Esther W Chan1,8,9,10.
Abstract
Identifying effective drug treatments for COVID-19 is essential to reduce morbidity and mortality. Although a number of existing drugs have been proposed as potential COVID-19 treatments, effective data platforms and algorithms to prioritize drug candidates for evaluation, and the application of knowledge graphs to drug repurposing, have not been adequately explored. A COVID-19 knowledge graph is developed by integrating 14 public bioinformatic databases containing information on drugs, genes, proteins, viruses, diseases, symptoms, and their linkages. An algorithm is developed to extract hidden linkages connecting drugs and COVID-19 from the knowledge graph, and to generate and rank proposed drug candidates for repurposing as treatments for COVID-19 by integrating three scores for each drug: motif scores, knowledge graph PageRank scores, and knowledge graph embedding scores. The knowledge graph contains over 48 000 nodes and 1 337 000 edges, including 13 563 molecules in the DrugBank database. From the 5624 molecules identified by the motif-discovery algorithms, ranking results show that 112 drug molecules had the top 2% scores, of which 50 existing drugs with other indications approved by health administrations are reported. The proposed drug candidates serve to generate hypotheses for future evaluation in clinical trials and observational studies.
Keywords: COVID‐19; drug repurposing; knowledge graph; motif scores; ranking
Year: 2021 PMID: 34179346 PMCID: PMC8212091 DOI: 10.1002/adtp.202100055
Source DB: PubMed Journal: Adv Ther (Weinh) ISSN: 2366-3987
Figure 1. Structure of the COVID‐19 knowledge graph. Visual schematic of the COVID‐19 knowledge graph in this study. A knowledge graph is a multi‐relational graph composed of entities (nodes) and relations (edges). Each node represents a specific protein, gene, drug, virus, disease or symptom, whereas each edge represents a known existing linkage between any two nodes. Data on linkages from different data sources were processed into the corresponding nodes and edges.
Data sources used for inferring edges in the COVID‐19 knowledge graph
| Edges | Data sources | Size |
|---|---|---|
| Drug – Virus Protein | OpenKG | 20 |
| Drug – Disease | HPO, DrugBank | 2335 |
| Drug – Symptom | HPO, DrugBank | 11 730 |
| Drug – Host Protein | DrugBank, NCBI | 13 749 |
| Disease – Symptom | HPO | 187 342 |
| Host Gene – Host Protein | NCBI, Literature | 12 931 |
| Host Gene – Disease | Disgenet | 93 044 |
| Host Gene – Symptom | HPO | 830 344 |
| Host Protein – Host Protein | Uniprot, Biogrid | 169 222 |
| Virus Protein – Virus Protein | Biogrid | 47 |
| Virus Protein – Host Protein | OpenKG | 8292 |
| Virus – Virus | NCBI | 6791 |
| Virus – Disease | OpenKG, HPO | 23 |
| Virus – Symptom | OpenKG, HPO | 70 |
| Virus – Host Protein | Literature | 130 |
| Virus – Virus Protein | OpenKG | 525 |
| Virus – Virus Gene | OpenKG | 525 |
| Virus Gene – Virus Protein | OpenKG | 525 |
a) Size refers to the number of edges (each representing a specific type of linkage) in the knowledge graph that were inferred from the corresponding data sources. Details of the data sources are described in the Supporting Information.
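As a rough illustration of how typed edge lists like those in the table could be merged into one multi‐relational graph, the following minimal sketch builds a node‐type map, an undirected adjacency structure, and per‐type edge counts. The record layout, the function name `build_graph`, and the placeholder nodes `drug_A` and `symptom_1` are assumptions made for illustration; only the SARS‐CoV‐2, NR3C1 and POU1F1 names come from the Figure 2 caption, and the study's actual data processing is described in its Supporting Information.

```python
from collections import Counter, defaultdict

# Toy edge records: (node_a, type_a, node_b, type_b).
# "drug_A" and "symptom_1" are hypothetical placeholders, not real data.
TOY_EDGES = [
    ("SARS-CoV-2", "Virus", "NR3C1", "Host Protein"),
    ("SARS-CoV-2", "Virus", "POU1F1", "Host Protein"),
    ("drug_A", "Drug", "NR3C1", "Host Protein"),
    ("NR3C1", "Host Protein", "symptom_1", "Symptom"),
    ("POU1F1", "Host Protein", "symptom_1", "Symptom"),
]


def build_graph(edges):
    """Return (node_type, neighbors, edge_counts) for a list of typed edges."""
    node_type = {}                 # node name -> entity type
    neighbors = defaultdict(set)   # node name -> set of linked node names
    edge_counts = Counter()        # (type_a, type_b) -> number of edges
    for a, type_a, b, type_b in edges:
        node_type[a] = type_a
        node_type[b] = type_b
        neighbors[a].add(b)        # edges are treated as undirected here
        neighbors[b].add(a)
        edge_counts[(type_a, type_b)] += 1
    return node_type, neighbors, edge_counts


node_type, neighbors, edge_counts = build_graph(TOY_EDGES)
```

The `edge_counts` tally plays the same role as the Size column above: one count per edge type.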
Figure 3. Performance of the knowledge graph drug repurposing algorithm used in this study.
Figure 2. Example of motif‐clique “virus‐protein‐symptom”. The motif‐clique shown consists of 2 human proteins (green circles: NR3C1 and POU1F1) that are both targeted by a virus (orange circle: SARS‐CoV‐2) and share linkages with 34 symptoms (purple circles: annotated by symptom ID from HPO). This is one of the motif‐cliques extracted from the knowledge graph using motif‐discovery algorithms and corresponds to a motif of interest prespecified by the user (in this case, the “virus‐protein‐symptom” motif).
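The structure described in this caption can be sketched as a simple set computation: collect the host proteins targeted by a virus, then keep only the symptoms linked to every one of them. This is a simplified check, not the motif‐clique discovery algorithm used in the study; the function name and the symptom identifiers `s1` to `s4` are hypothetical.

```python
def virus_protein_symptom_clique(virus, virus_targets, protein_symptoms):
    """Return (proteins targeted by `virus`, symptoms shared by all of them)."""
    proteins = sorted(virus_targets.get(virus, set()))
    if not proteins:
        return proteins, set()
    # A symptom belongs to the clique only if every targeted protein links to it.
    shared = set.intersection(
        *(set(protein_symptoms.get(p, set())) for p in proteins)
    )
    return proteins, shared


# Toy data: protein names are from the caption; symptom IDs are placeholders.
virus_targets = {"SARS-CoV-2": {"NR3C1", "POU1F1"}}
protein_symptoms = {
    "NR3C1": {"s1", "s2", "s3"},
    "POU1F1": {"s2", "s3", "s4"},
}

proteins, shared = virus_protein_symptom_clique(
    "SARS-CoV-2", virus_targets, protein_symptoms
)
```

In the toy data, only the symptoms linked to both proteins survive the intersection, mirroring the 34 shared symptoms in the figure.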
List of drug candidates for COVID‐19 repurposing proposed by knowledge graph
| Drugs | Drug class | Motif score | PageRank score | Embedding score |
|---|---|---|---|---|
| Ritonavir | Antiretroviral agent, protease inhibitor | 95.80 | 100.00 | 96.51 |
| Lopinavir | Antiretroviral agent, protease inhibitor | 95.79 | 99.99 | 96.71 |
| Pitavastatin | Lipid‐modifying agent, statin | 95.64 | 99.98 | 92.64 |
| Eszopiclone | Hypnotic | 26.98 | 99.97 | 96.73 |
| Zopiclone | Hypnotic | 89.98 | 99.97 | 91.84 |
| Perampanel | Anticonvulsant, AMPA glutamate receptor antagonist | 30.59 | 99.96 | 90.66 |
| Praziquantel | Anthelmintic agent | 91.16 | 99.95 | 96.64 |
| Colistin | Antibiotic | 93.29 | 99.94 | 99.44 |
| Bictegravir | Antiviral agent, integrase inhibitor | 15.43 | 99.93 | 95.56 |
| Nelfinavir | Antiretroviral agent, protease inhibitor | 89.46 | 99.92 | 93.36 |
| Prulifloxacin | Antibiotic, fluoroquinolone | 14.65 | 99.92 | 96.76 |
| Cyclosporine | Immunosuppressant, calcineurin inhibitor | 8.15 | 99.91 | 99.85 |
| Fostamatinib | Spleen tyrosine kinase inhibitor | 97.24 | 99.90 | 81.93 |
| Moexipril | Antihypertensive agent, angiotensin‐converting enzyme inhibitor | 94.24 | 99.89 | 90.33 |
| Pirfenidone | Antifibrotic agent | 59.72 | 99.85 | 89.50 |
| Isosorbide | Antianginal agent, vasodilator | 26.44 | 99.81 | 52.64 |
| Bosutinib | Antineoplastic agent, tyrosine kinase inhibitor | 49.20 | 99.80 | 48.74 |
| Dasatinib | Antineoplastic agent, tyrosine kinase inhibitor | 96.60 | 99.73 | 97.25 |
| Docetaxel | Antineoplastic agent, taxane | 89.56 | 99.68 | 97.55 |
| Lovastatin | Lipid‐modifying agent, statin | 95.73 | 99.65 | 96.45 |
| Simvastatin | Lipid‐modifying agent, statin | 95.71 | 99.65 | 98.72 |
| Atorvastatin | Lipid‐modifying agent, statin | 95.74 | 99.64 | 91.08 |
| Flucytosine | Antifungal agent | 95.69 | 99.60 | 63.87 |
| Cerivastatin | Lipid‐modifying agent, statin | 95.70 | 99.58 | 93.28 |
| Fluvastatin | Lipid‐modifying agent, statin | 95.69 | 99.57 | 93.80 |
| Oxamniquine | Anthelmintic agent | 95.65 | 99.55 | 81.91 |
| Pravastatin | Lipid‐modifying agent, statin | 95.68 | 99.54 | 96.54 |
| Rosuvastatin | Lipid‐modifying agent, statin | 95.72 | 99.54 | 94.77 |
| Miconazole | Antifungal agent, imidazole | 90.72 | 99.49 | 96.37 |
| Ibuprofen | Nonsteroidal anti‐inflammatory drug | 98.40 | 99.48 | 80.73 |
| Ponatinib | Antineoplastic agent, tyrosine kinase inhibitor | 30.44 | 99.47 | 90.64 |
| Estradiol | Hormonal agent, estrogen | 93.46 | 99.41 | 99.68 |
| Cannabidiol | Anticonvulsant, cannabinoid | 29.12 | 99.39 | 85.54 |
| Pentobarbital | Anticonvulsant, barbiturate | 51.68 | 99.37 | 43.95 |
| Amitriptyline | Antidepressant, tricyclic antidepressant | 99.44 | 99.36 | 97.29 |
| Progesterone | Hormonal agent, progestin | 97.29 | 99.34 | 99.34 |
| Temazepam | Hypnotic, benzodiazepine | 88.50 | 99.27 | 92.92 |
| Triazolam | Hypnotic, benzodiazepine | 92.50 | 99.26 | 96.92 |
| Zonisamide | Anticonvulsant | 92.40 | 99.24 | 28.34 |
| Regorafenib | Antineoplastic agent, tyrosine kinase inhibitor | 30.48 | 99.22 | 93.37 |
| Spironolactone | Antihypertensive, aldosterone receptor antagonist | 97.19 | 99.20 | 98.92 |
| Rifampicin | Antibiotic | 91.26 | 99.18 | 98.60 |
| Dexamethasone | Anti‐inflammatory agent, corticosteroid | 97.14 | 99.17 | 99.97 |
| Tamoxifen | Hormonal agent, selective estrogen receptor modulator | 94.37 | 99.13 | 98.96 |
| Mifepristone | Hormonal agent, antiprogestin | 97.23 | 99.12 | 95.30 |
| Clonazepam | Anticonvulsant, benzodiazepine | 91.08 | 99.11 | 99.39 |
| Eribulin | Antineoplastic agent, microtubule inhibitor | 30.69 | 99.07 | 88.32 |
| Paclitaxel | Antineoplastic agent, taxane | 52.66 | 99.02 | 85.58 |
| Diazepam | Anticonvulsant, benzodiazepine | 40.36 | 98.29 | 25.30 |
| Bezafibrate | Lipid‐modifying agent, fibrate | 34.65 | 98.06 | 81.88 |
a) The proposed list of drug candidates comprises 50 existing oral and intravenous drugs with other FDA/EMA‐approved indications that had top 2% PageRank scores among all ranked molecules.
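The top 2% selection mentioned in this footnote can be sketched as follows. The sketch assumes that scores are placed on a 0 to 100 scale by rank percentile, which is suggested by the table values but not confirmed here; the function names, the rounding rule, and the `mol_*` molecules are hypothetical.

```python
def percentile_scores(raw):
    """Map raw scores to 0-100 rank percentiles (highest raw score -> 100)."""
    ordered = sorted(raw, key=raw.get)  # ascending by raw score
    n = len(ordered)
    return {name: 100.0 * (i + 1) / n for i, name in enumerate(ordered)}


def top_fraction(scores, fraction=0.02):
    """Return the names with the highest scores, keeping `fraction` of them."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    keep = max(1, round(len(ranked) * fraction))  # assumed rounding rule
    return ranked[:keep]


# Hypothetical raw scores for 100 placeholder molecules.
raw = {f"mol_{i}": float(i) for i in range(100)}
scores = percentile_scores(raw)
top = top_fraction(scores, fraction=0.02)
```

Under this rounding rule, 2% of the 5624 molecules reported in the abstract gives 112 molecules, which matches the number stated there.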
Figure 4. Motifs‐of‐interest for drug repurposing used in this study.
A motif, essentially a connected graph of a few nodes and edges, is a fundamental building block of large and complex knowledge graphs. Motifs‐of‐interest are defined depending on the use case (e.g., drug repurposing in our study). After defining the relevant motifs‐of‐interest, motif‐clique discovery algorithms are used to extract subgraphs that match the motifs of interest. Note that each type of node appears only once in each motif for better efficiency.
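To make the idea of matching a motif‐of‐interest concrete, the sketch below enumerates chains of linked nodes whose types follow a given sequence, such as Drug, Host Protein, Virus. It is a plain path matcher on a toy graph, not the motif‐clique discovery algorithm used in the study; the function name `find_motif_instances` and the node `drug_A` are hypothetical.

```python
def find_motif_instances(node_type, neighbors, motif):
    """Yield tuples of linked nodes whose types follow the `motif` sequence."""

    def extend(path):
        if len(path) == len(motif):
            yield tuple(path)
            return
        wanted = motif[len(path)]  # type required at the next position
        for nxt in sorted(neighbors.get(path[-1], ())):
            if node_type.get(nxt) == wanted and nxt not in path:
                yield from extend(path + [nxt])

    for start in sorted(node_type):
        if node_type[start] == motif[0]:
            yield from extend([start])


# Toy graph: "drug_A" is a placeholder; the other names are from Figure 2.
node_type = {
    "drug_A": "Drug",
    "NR3C1": "Host Protein",
    "POU1F1": "Host Protein",
    "SARS-CoV-2": "Virus",
}
neighbors = {
    "drug_A": {"NR3C1"},
    "NR3C1": {"drug_A", "SARS-CoV-2"},
    "POU1F1": {"SARS-CoV-2"},
    "SARS-CoV-2": {"NR3C1", "POU1F1"},
}

instances = list(
    find_motif_instances(node_type, neighbors, ("Drug", "Host Protein", "Virus"))
)
```

Because each node type appears only once in the motif, every position in the chain can be filtered by type alone, which keeps the search small.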
Figure 5. Accuracy of linear models (LR and LSVM) and non‐linear models (SVMs except LSVM, and all NNs) used for integrating motif, PageRank and embedding scores. Models are ordered by increasing complexity from left to right.