Eryk Kropiwnicki, John E Evangelista, Daniel J Stein, Daniel J B Clarke, Alexander Lachmann, Maxim V Kuleshov, Minji Jeon, Kathleen M Jagodnik, Avi Ma'ayan.
Abstract
Understanding the underlying molecular and structural similarities between seemingly heterogeneous sets of drugs can aid in identifying drug repurposing opportunities and assist in the discovery of novel properties of preclinical small molecules. A wealth of information about drug and small molecule structure, targets, indications and side effects; induced gene expression signatures; and other attributes is publicly available through web-based tools, databases and repositories. By processing, abstracting and aggregating information from these resources into drug set libraries, knowledge about novel properties of drugs and small molecules can be systematically imputed with machine learning. In addition, drug set libraries can be used as the underlying database for drug set enrichment analysis. Here, we present Drugmonizome, a database with a search engine for querying annotated sets of drugs and small molecules and for performing drug set enrichment analysis. Utilizing the data within Drugmonizome, we also developed Drugmonizome-ML, which enables users to construct customized machine learning pipelines using the drug set libraries from Drugmonizome. To demonstrate the utility of Drugmonizome, drug sets from 12 independent SARS-CoV-2 in vitro screens were subjected to consensus enrichment analysis. Despite the low overlap among these 12 screens, we identified common biological processes critical for blocking viral replication. To demonstrate Drugmonizome-ML, we constructed a machine learning pipeline to predict whether approved and preclinical drugs may induce peripheral neuropathy as a potential side effect. Overall, the Drugmonizome and Drugmonizome-ML resources provide rich and diverse knowledge about drugs and small molecules for direct systems pharmacology applications. Database URL: https://maayanlab.cloud/drugmonizome/
Year: 2021 PMID: 33787872 PMCID: PMC8011435 DOI: 10.1093/database/baab017
Source DB: PubMed Journal: Database (Oxford) ISSN: 1758-0463 Impact factor: 4.462
List of drug set libraries served by Drugmonizome
| Resource | Dataset | Drugs | Attributes | Average drugs per term |
|---|---|---|---|---|
| Geneshot | Tagger Predicted Genes | 3938 | 13 882 | 55.60 |
| Geneshot | Enrichr Predicted Genes | 3938 | 11 845 | 62.03 |
| Geneshot | AutoRIF Predicted Genes | 3938 | 11 695 | 66.03 |
| Geneshot | GeneRIF Predicted Genes | 3938 | 9193 | 78.65 |
| Geneshot | Coexpression Predicted Genes | 3938 | 9087 | 78.95 |
| STITCH | Targets_500 | 7303 | 9063 | 89.05 |
| L1000FWD | Downregulated Genes | 4884 | 7622 | 139.10 |
| L1000FWD | Upregulated Genes | 4884 | 7611 | 142.88 |
| Geneshot | Literature Associated Genes | 3938 | 7503 | 37.80 |
| PharmGKB | Predicted Side Effects | 1435 | 7137 | 70.72 |
| CREEDS | Upregulated Genes | 71 | 2535 | 11.67 |
| CREEDS | Downregulated Genes | 72 | 2532 | 11.76 |
| SIDER | Side Effects | 1635 | 2078 | 74.60 |
| L1000FWD | Upregulated GO Biological Processes | 4195 | 1228 | 58.03 |
| L1000FWD | Downregulated GO Biological Processes | 4013 | 1068 | 51.05 |
| L1000FWD | Predicted Side Effects | 4852 | 1013 | 99.34 |
| SIDER | Indications | 1546 | 867 | 21.66 |
| PubChem | PubChem Fingerprints | 13 379 | 669 | 2594.72 |
| DrugBank | Drug Targets | 4467 | 611 | 17.42 |
| PharmGKB | Single Nucleotide Polymorphisms | 483 | 554 | 10.02 |
| DrugCentral | Genes | 1555 | 540 | 19.16 |
| DrugRepurposingHub | Genes | 1720 | 375 | 15.57 |
| ATC | ATC Codes | 2233 | 308 | 9.91 |
| KINOMEscan | Kinases | 54 | 301 | 9.33 |
| L1000FWD | Upregulated KEGG Pathways | 3662 | 245 | 120.58 |
| L1000FWD | Downregulated KEGG Pathways | 3309 | 236 | 87.29 |
| L1000FWD | Upregulated GO Molecular Function | 2427 | 183 | 56.77 |
| RDKit | MACCS Fingerprints | 14 308 | 163 | 4080.18 |
| L1000FWD | Downregulated GO Molecular Function | 2158 | 158 | 48.56 |
| L1000FWD | Downregulated GO Cellular Component | 3246 | 157 | 100.82 |
| DrugRepurposingHub | Mechanisms of Action | 1854 | 154 | 13.37 |
| L1000FWD | Upregulated GO Cellular Component | 3366 | 153 | 101.87 |
| DrugBank | Enzymes | 1473 | 72 | 59.73 |
| DrugBank | Transporters | 832 | 51 | 46.80 |
| DrugBank | Carriers | 458 | 14 | 44.78 |
Figure 1. Counts of unique drug–term associations for each library. Terms are colored by their term type groupings.
Figure 2. The Drugmonizome signature search workflow. A set of drugs is submitted for enrichment analysis across all the Drugmonizome drug set libraries. The enrichment results are provided in tables that enable further exploration of the overlapping drugs.
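The enrichment step in this workflow can be illustrated with a minimal sketch. It assumes a one-sided Fisher's exact test (the hypergeometric upper tail), which is a common choice for set enrichment but is not stated in this record as the method Drugmonizome uses. The function names, the toy library and the background size of 20 drugs are all hypothetical.

```python
from math import comb


def hypergeom_tail(overlap, query_size, term_size, universe_size):
    """P(X >= overlap) when query_size drugs are drawn without replacement
    from universe_size drugs, term_size of which carry the term."""
    total = comb(universe_size, query_size)
    upper = min(query_size, term_size)
    favorable = sum(
        comb(term_size, k) * comb(universe_size - term_size, query_size - k)
        for k in range(overlap, upper + 1)
    )
    return favorable / total


def enrich(query, library, universe_size):
    """Score every term that shares at least one drug with the query.

    Returns (term, p_value, overlapping_drugs) tuples, smallest p-value first.
    """
    query = set(query)
    rows = []
    for term, drugs in library.items():
        hits = query & set(drugs)
        if not hits:
            continue  # terms with no overlap are not reported
        p = hypergeom_tail(len(hits), len(query), len(drugs), universe_size)
        rows.append((term, p, sorted(hits)))
    return sorted(rows, key=lambda row: row[1])


# Hypothetical toy library: term -> drugs annotated with that term.
library = {
    "term_a": ["d1", "d2", "d3"],
    "term_b": ["d4", "d5", "d6", "d7"],
}
results = enrich(["d1", "d2", "d8"], library, universe_size=20)
for term, p, hits in results:
    print(term, round(p, 4), hits)
```

With real libraries, thousands of terms are tested at once, so the raw p-values would normally be corrected for multiple testing before interpretation.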
Figure 3. UpSet plot detailing the overlap among drug hits across 12 independent published in vitro drug screen studies.
Figure 4. Top 20 enriched GO Biological Process terms for the 12 in vitro SARS-CoV-2 drug screens. Enriched terms are ranked by the sum of the −log(P-value) of the term across all screens. Enrichment was computed on the consensus downregulated (A) and upregulated (B) genes for each drug in each set, based on data from L1000FWD (29).
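The ranking rule in this caption (sum of −log(P-value) per term across screens) can be sketched as follows. The caption does not give the logarithm base, so base 10 is assumed here; treating a term that is missing from a screen as contributing zero is also an assumption. The function name, screen names and p-values are hypothetical.

```python
from collections import defaultdict
from math import log10


def consensus_rank(screens, top_n=20):
    """Rank terms by the sum of -log10(p) over all screens.

    screens maps a screen name to {term: p_value}; p-values must be > 0.
    Terms absent from a screen add nothing to the score (an assumption).
    """
    scores = defaultdict(float)
    for term_pvalues in screens.values():
        for term, p in term_pvalues.items():
            scores[term] += -log10(p)
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return ranked[:top_n]


# Hypothetical per-screen enrichment results.
screens = {
    "screen_1": {"term_x": 0.01, "term_y": 0.1},
    "screen_2": {"term_x": 0.001, "term_z": 0.01},
}
for term, score in consensus_rank(screens):
    print(term, round(score, 2))
```

Summing log-transformed p-values rewards terms that recur across screens even when no single screen ranks them first, which suits screens with little overlap in their individual hits.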
Figure 5. Drugmonizome-ML classifier for prioritizing drugs that may induce peripheral neuropathy. (A) Input feature space with Uniform Manifold Approximation and Projection (UMAP) dimensionality reduction. Each point represents one of 19 898 compounds with 3026 features per compound. Compounds with the known side effect of peripheral neuropathy are highlighted in yellow. (B) Receiver operating characteristic (ROC) curves and (C) precision-recall curves (PRC) across cross-validation splits after hyperparameter optimization for each classifier to predict peripheral neuropathy. Each curve shows the mean and standard deviation across 10-fold cross-validation for each classifier.
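The cross-validation summary in panels B and C can be illustrated with a small sketch. It does not reproduce the Drugmonizome-ML pipeline: it only computes the area under the ROC curve for each fold, using the rank-based (Mann-Whitney) formulation, and then reports the mean and sample standard deviation across folds. The function names and the two toy folds are hypothetical, and the use of the sample rather than the population standard deviation is an assumption.

```python
from statistics import mean, stdev


def auroc(labels, scores):
    """Area under the ROC curve: the fraction of positive/negative pairs
    in which the positive scores higher, counting ties as half."""
    pos = [s for y, s in zip(labels, scores) if y]
    neg = [s for y, s in zip(labels, scores) if not y]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def summarize_folds(folds):
    """Return (mean, sample standard deviation) of per-fold AUROC values."""
    aucs = [auroc(labels, scores) for labels, scores in folds]
    return mean(aucs), stdev(aucs)


# Hypothetical held-out labels and predicted probabilities for two folds.
folds = [
    ([1, 0, 1, 0], [0.9, 0.2, 0.6, 0.7]),
    ([0, 1, 0, 1], [0.1, 0.8, 0.3, 0.9]),
]
fold_mean, fold_sd = summarize_folds(folds)
print(round(fold_mean, 3), round(fold_sd, 3))
```

For a rare label such as a single side effect, the precision-recall curve in panel C is usually the more informative view, because ROC-based scores can look strong even when most top-ranked predictions are false positives.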
Top 15 drugs predicted by the ET model that are known to be associated with peripheral neuropathy from SIDER
| InChIKey | Name | Known | Prediction probability |
|---|---|---|---|
| JURKNVYFZMSNLP-UHFFFAOYSA-N | Cyclobenzaprine (BRD-K42348709) | TRUE | 0.8592 |
| KRMDCWKBEZIMAB-UHFFFAOYSA-N | Amitriptyline (BRD-K53737926) | TRUE | 0.8311 |
| MJIHNNLFOKEZEW-UHFFFAOYSA-N | Lansoprazole (BRD-A49172652) | TRUE | 0.7613 |
| ZZVUWRFHKOJYTH-UHFFFAOYSA-N | Diphenhydramine (BRD-K47278471) | TRUE | 0.7153 |
| ZKMNUMMKYBVTFN-HNNXBMFYSA-N | Ropivacaine (BRD-K50938786) | TRUE | 0.582 |
| BCGWQEUPMDMJNV-UHFFFAOYSA-N | Imipramine (BRD-K38436528) | TRUE | 0.5591 |
| WUBBRNOQWQTFEX-UHFFFAOYSA-N | Aminosalicylic acid (BRD-K80267133) | TRUE | 0.4977 |
| YREYEVIYCVEVJK-UHFFFAOYSA-N | Rabeprazole (BRD-A39390670) | TRUE | 0.457 |
| PHTUQLWOUWZIMZ-GZTJUZNOSA-N | Dosulepin (BRD-K54759182) | TRUE | 0.3622 |
| XRECTZIEBJDKEO-UHFFFAOYSA-N | Flucytosine (BRD-K82143716) | TRUE | 0.3463 |
| ODQWQRRAPPTVAG-BOPFTXTBSA-N | Doxepin (BRD-K37694030) | TRUE | 0.3403 |
| UGJMXCAKCUNAIE-UHFFFAOYSA-N | Gabapentin (BRD-K62737565) | TRUE | 0.333 |
| KBOPZPXVLCULAV-UHFFFAOYSA-N | Mesalazine (BRD-K28849549) | TRUE | 0.3244 |
| GBXSMTUPTTWBMN-XIRDDKMYSA-N | Enalapril (BRD-K57545991) | TRUE | 0.3153 |
| HCYAFALTSJYZDH-UHFFFAOYSA-N | Desipramine (BRD-K60762818) | TRUE | 0.3102 |
Top 15 drugs predicted by the ET model that are not known to be associated with peripheral neuropathy
| InChIKey | Name | Known | Prediction probability |
|---|---|---|---|
| NRUKOCRGYNPUPR-OQMCATNJSA-N | PLX-4720 (BRD-K16478699) | FALSE | 0.9757 |
| NRUKOCRGYNPUPR-OQMCATNJSA-N | Teniposide (BRD-A35588707) | FALSE | 0.9396 |
| STQGQHZAVUOBTE-INJOJONLSA-N | Daunorubicin (BRD-K91966436) | FALSE | 0.8372 |
| VSJKWCGYPAHWDS-FQEVSTJZSA-N | Camptothecin (BRD-K37890730) | FALSE | 0.7782 |
| FPIPGXGPPPQFEQ-OVSJKPMPSA-N | Retinol (BRD-K22429181) | FALSE | 0.7499 |
| LTMKESNXUBQKBP-UHFFFAOYSA-N | Lapatinib (BRD-M07438658) | FALSE | 0.7442 |
| HHJUWIANJFBDHT-KOTLKJBCSA-N | Vindesine (BRD-K59753975) | FALSE | 0.7429 |
| XECQQDXTQRYYBH-UHFFFAOYSA-N | Norcyclobenzaprine (BRD-K63165456) | FALSE | 0.6919 |
| FPIPGXGPPPQFEQ-UHFFFAOYSA-N | Tretinoin (BRD-K64634304) | FALSE | 0.6753 |
| XUBOMFCQGDBHNK-UHFFFAOYSA-N | Gatifloxacin (BRD-A74980173) | FALSE | 0.6338 |
| AJLFOPYRIVGYMJ-INTXDZFKSA-N | Mevastatin (BRD-K94441233) | FALSE | 0.6235 |
| KPQZUUQMTUIKBP-UHFFFAOYSA-N | Secnidazole (BRD-A70083328) | FALSE | 0.5208 |
| METKIMKYRPQLGS-LBPRGKRZSA-N | Atenolol (BRD-K44993696) | FALSE | 0.4875 |
| KGUMXGDKXYTTEY-FRCNGJHJSA-N | 4-Hydroxyretinoic acid (BRD-A96799240) | FALSE | 0.4861 |
| BUJAGSGYPOAWEI-UHFFFAOYSA-N | Tocainide (BRD-A92670106) | FALSE | 0.4753 |