Daniel N Sosa, Alexander Derry, Margaret Guo, Eric Wei, Connor Brinton, Russ B Altman.
Abstract
Millions of Americans are affected by rare diseases, many of which have poor survival rates. However, the small market size of individual rare diseases, combined with the time and capital requirements of pharmaceutical R&D, has hindered the development of new drugs for these cases. A promising alternative is drug repurposing, whereby existing FDA-approved drugs might be used to treat diseases different from their original indications. To generate drug repurposing hypotheses in a systematic and comprehensive fashion, it is essential to integrate information from across the literature of pharmacology, genetics, and pathology. To this end, we leverage a newly developed knowledge graph, the Global Network of Biomedical Relationships (GNBR). GNBR is a large, heterogeneous knowledge graph comprising drug, disease, and gene (or protein) entities linked by a small set of semantic themes derived from the abstracts of biomedical literature. We apply a knowledge graph embedding method that explicitly models the uncertainty associated with literature-derived relationships and uses link prediction to generate drug repurposing hypotheses. This approach achieves high performance on a gold-standard test set of known drug indications (AUROC = 0.89) and is capable of generating novel repurposing hypotheses, which we independently validate using external literature sources and protein interaction networks. Finally, we demonstrate the ability of our model to produce explanations of its predictions.
Year: 2020 PMID: 31797619 PMCID: PMC6937428
Source DB: PubMed Journal: Pac Symp Biocomput ISSN: 2335-6928
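The abstract reports performance as an AUROC on a gold-standard test set of known drug indications. As a rough illustration of what that metric measures, the sketch below computes AUROC directly from its pairwise definition: the probability that a randomly chosen true drug-disease pair receives a higher score than a randomly chosen false pair. The function name `auroc` and the toy labels and scores are assumptions made for illustration only; they are not the authors' code or data.

```python
from typing import Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Pairwise AUROC: fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example (hypothetical values): 1 = known indication, 0 = not an indication.
toy_labels = [1, 0, 1, 0]
toy_scores = [0.9, 0.8, 0.7, 0.1]
toy_auroc = auroc(toy_labels, toy_scores)
```

This quadratic-time version is chosen for readability; for a realistically sized test set a rank-based implementation from a statistics library would be the more practical choice.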
Fig. 1: Summary of all themes in GNBR, organized by category along with their reference codes.
Fig. 2: Treatment prediction on gold-standard test set for different submodels, including ROC (left) and PR (right) curves.
Fig. 3: 2D UMAP projection of embedded pairs compared to “Treatment”
Fig. 4: (a) Pairwise cosine similarity (ranging from 1.00 to −0.23) between embeddings for each theme, clustered by hierarchical clustering. (b) Precision at various recall levels for drug-disease score predicted by each theme.
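The pairwise cosine similarity between theme embeddings described in the Fig. 4(a) caption can be sketched as follows. This is a minimal illustration only: the theme names (`theme_a`, `theme_b`, `theme_c`) and the two-dimensional vectors are hypothetical placeholders, not GNBR theme codes or learned embeddings, and the hierarchical clustering step mentioned in the caption is not shown.

```python
import math
from typing import Dict, List, Tuple


def cosine_similarity(u: List[float], v: List[float]) -> float:
    """Cosine of the angle between two vectors of equal length."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


def pairwise_cosine(
    embeddings: Dict[str, List[float]]
) -> Dict[Tuple[str, str], float]:
    """Similarity for every ordered pair of named embeddings."""
    names = sorted(embeddings)
    return {
        (a, b): cosine_similarity(embeddings[a], embeddings[b])
        for a in names
        for b in names
    }


# Hypothetical theme embeddings, chosen so the similarities are easy to check.
toy_embeddings = {
    "theme_a": [1.0, 0.0],
    "theme_b": [0.0, 1.0],
    "theme_c": [-1.0, 0.0],
}
similarity = pairwise_cosine(toy_embeddings)
```

Values near 1 indicate themes whose embeddings point in nearly the same direction, values near 0 indicate unrelated directions, and negative values indicate opposing directions, which is how the range quoted in the caption (1.00 to −0.23) can be read.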
Table 1: Summary of the top 30 drug repurposing candidates. “Score”: the predicted confidence generated by our model; “Proximal in PPI Network?”: indication of significant proximity between drug- and disease-associated genes (Section 3.3); “Potential Mediators”: top three genes implicated in path analysis (Section 3.4); “Assessment”: manual designation of treatment (Tx) viability; “PMID”: literature reference supporting interpretation.
| Drug | Disease | Score | Proximal in PPI Network? | Potential Mediators | Assessment | PMID |
|---|---|---|---|---|---|---|
| cortisone | myelodysplastic syndrome | 1.336 | | CD34, p53, EPO | published Tx | 23483702 |
| everolimus | sarcoidosis | 1.196 | ✓ | ACE, LYZ, IL-18 | potentially feasible Tx | 28216612 |
| rifampicin | mesothelioma | 1.168 | | hGF, THBD, p53 | comorbidity Tx | 21150470 |
| citalopram | myeloma | 1.140 | | IL-6, BDNF, ABCB1 | comorbidity Tx | 17002797 |
| streptomycin | meningiomas | 1.125 | N/A | MMP9, p53, VEGF | comorbidity Tx | 23374258 |
| cimetidine | Prader-Willi Syndrome | 1.107 | | GH, BDNF, GBP-28 | symptom management | 29685165 |
| hydroxychloroquine | familial Mediterranean fever | 1.079 | | SAA, IL-18, TNF | potentially feasible Tx | 15720245 |
| capsaicin | non-Hodgkin’s lymphoma | 1.055 | | IL-6, IL-2, WT1 | potentially feasible Tx | 12208886 |
| trifluoperazine | Wilms tumor | 1.053 | ✓ | CTNB1, p53, PD-L1 | potentially feasible Tx | 31058089 |
| amantadine | carcinoid syndrome | 1.052 | N/A | GH, mTOR, MLN | unknown/no effect | — |
| lidocaine | biliary atresia | 1.050 | ✓ | LFA-1, CD4, HAMP | symptom management | 21531533 |
| ketoconazole | acromegaly | 1.047 | | GH, INS, IGF1 | unknown/no effect | — |
| acetazolamide | amyotrophic lateral sclerosis | 1.043 | ✓ | OPTN, GM-CSF, TGF | possible contraindication | 23754387 |
| famotidine | leishmaniasis | 1.036 | N/A | IL-4, TNF | published Tx | 28491373, 27600041 |
| idarubicin | osteosarcoma | 1.034 | ✓ | ABCB1, p53, VEGF | potentially feasible Tx | 20979639 |
| hydroxyurea | MALT lymphoma | 1.032 | N/A | BCL10, MYD88, MYC | potentially feasible Tx | 25904378 |
| citalopram | thymoma | 1.026 | N/A | IL-2, EGFR, PD-L1 | potentially feasible Tx | 28356024 |
| acetazolamide | systemic sclerosis | 1.024 | ✓ | ET-1, VEGF, IL-17 | symptom management | 23541012 |
| cortisone | carcinoid syndrome | 1.021 | | GH, HES1, GOT1 | unknown/no effect | — |
| cortisone | trigeminal neuralgia | 1.017 | N/A | ACTH, VIP | published Tx | 16762570 |
| danazol | adrenocortical carcinoma | 1.011 | ✓ | p53, IGF-2, AGT2 | potentially feasible Tx | 25932386 |
| budesonide | biliary atresia | 1.009 | ✓ | CD4, PCNA, IL-18 | published Tx | 25847799 |
| chloramphenicol | mesothelioma | 1.002 | ✓ | CAT, hGF, p53 | potentially feasible Tx | 24939899 |
| hydroxyurea | familial Mediterranean fever | 1.002 | | FMF, SAA, IL-18 | unknown/no effect | — |
| metoclopramide | giant cell arteritis | 1.000 | ✓ | IL-6, CRP, YKL-40 | symptom management | 21926152 |
| dapsone | sarcoidosis | 0.996 | ✓ | IL-18, CD4, AAT | published Tx | 12588536, 11176663 |
| vinblastine | lymphoproliferative disorders | 0.995 | N/A | BCL6, AID, BCL-2 | published Tx | 17243127 |
| prednisone | biliary atresia | 0.995 | ✓ | LFA-1, CD4, hGF | published Tx | 26590818 |
| acetazolamide | porphyria cutanea tarda | 0.992 | ✓ | INS, EPO | symptom management | 15464657 |
| dextromethorphan | carcinoid syndrome | 0.989 | | GRIN1, HES1, mTOR | unknown/no effect | — |
Fig. 5: (a) Examples of each drug-disease path motif. Edges are labeled with their highest-supported themes and corresponding support scores. (b) Distribution of motifs across the six interpretation categories in Table 1 as determined by the occurrence of each motif across the top 100 ranked paths per drug-disease prediction.