Yi Cong1, Misaki Shintani1, Fuga Imanari1, Naoki Osada1, Toshinori Endo1.
Abstract
Drug repurposing has broad importance in planetary health for therapeutics innovation in infectious diseases as well as common or rare chronic human diseases. It has also proved important in developing interventions against the COVID-19 pandemic. We propose a new approach for drug repurposing involving two-stage prediction and machine learning. First, diseases are clustered by gene expression on the premise that similar patterns of altered gene expression imply critical pathways shared by different disease conditions. Next, drug efficacy is assessed by the reversibility of abnormal gene expression, and the results are clustered to identify repurposing targets. To cluster similar diseases, gene expression data from 262 cases of 31 diseases and 268 controls were analyzed by Uniform Manifold Approximation and Projection for Dimension Reduction (UMAP), followed by k-means clustering with the number of clusters optimized by silhouette analysis. For evaluation, we examined disease-specific gene expression data for inclusion body myositis (IBM), polymyositis (PM), and dermatomyositis (DM), and used the LINCS L1000 characteristic direction signatures search engine (L1000CDS2) to obtain lists of small-molecule compounds that reversed the expression patterns of these specifically altered genes as candidates for drug repurposing. Finally, the functions of the affected genes were analyzed by Gene Set Enrichment Analysis to examine consistency with expected drug efficacy. Consequently, we found disease-specific gene expression and, importantly, identified 20 drugs such as BMS-387032, phorbol-12-myristate-13-acetate, mitoxantrone, alvocidib, and vorinostat as candidates for repurposing. These were previously noted to be effective against two of the three diseases, and have a high probability of being effective against the other. That is, inclusion body myositis and DM. The two-stage prediction approach to drug repurposing presented here offers innovation to inform future drug discovery and clinical trials in a variety of human diseases.
Keywords: big data; bioinformatics; drug development; drug repurposing; drug research and OMICS; machine learning
Year: 2022 PMID: 35666246 PMCID: PMC9245788 DOI: 10.1089/omi.2022.0026
Source DB: PubMed Journal: OMICS ISSN: 1536-2310
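The abstract scores drug candidates by how well they reverse a disease's abnormal expression pattern. The sketch below illustrates that idea with a plain cosine similarity between a disease signature and each drug signature, where a strongly negative score suggests reversal. This is an assumption-laden illustration and not the L1000CDS2 implementation: the function names, the gene and drug names, and the toy log2 fold-change values are all hypothetical.

```python
import math

def cosine_similarity(u, v):
    """Cosine similarity between two equal-length numeric vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("signatures must be non-zero")
    return dot / (norm_u * norm_v)

def rank_reversers(disease_sig, drug_sigs):
    """Return (drug, score) pairs, most negative score (strongest reversal) first.

    disease_sig: dict gene -> log2 fold change (disease vs control)
    drug_sigs:   dict drug -> dict gene -> log2 fold change (treated vs untreated)
    """
    ranked = []
    for drug, sig in drug_sigs.items():
        genes = sorted(set(disease_sig) & set(sig))  # compare shared genes only
        if not genes:
            continue
        score = cosine_similarity([disease_sig[g] for g in genes],
                                  [sig[g] for g in genes])
        ranked.append((drug, score))
    return sorted(ranked, key=lambda pair: pair[1])

# Hypothetical toy signatures; gene and drug names are placeholders.
disease = {"GENE_A": 2.0, "GENE_B": -1.5, "GENE_C": 1.0}
drugs = {
    "drug_reverser": {"GENE_A": -1.8, "GENE_B": 1.2, "GENE_C": -0.9},
    "drug_mimic": {"GENE_A": 1.0, "GENE_B": -1.0, "GENE_C": 0.5},
}
ranking = rank_reversers(disease, drugs)
```

Here `drug_reverser` moves every gene in the opposite direction to the disease signature, so it ranks ahead of `drug_mimic`, which moves them in the same direction.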
Datasets for Cases
| GEO ID | Disease | Tissue |
|---|---|---|
| GSE475 | Chronic obstructive pulmonary disease | Diaphragm muscle |
| GSE593 | Uterine fibroid | Myometrial |
| GSE1297 | Alzheimer's disease | Hippocampal |
| GSE128470 | Dermatomyositis | Muscle |
| GSE1751 | Huntington's disease | Blood |
| GSE1789 | Down syndrome | Heart |
| GSE2712 | Clear cell sarcoma of the kidney | Kidney |
| GSE3365 | Crohn's disease | PBMC |
| GSE3365 | Ulcerative colitis | PBMC |
| GSE5090 | Polycystic ovary syndrome | Omental adipose tissue |
| GSE5667 | Atopic dermatitis | Skin |
| GSE5808 | Acute measles | Peripheral blood |
| GSE7429 | Osteoporosis | Circulating B cell in blood |
| GSE9750 | Cervical cancer | Cervical epithelium |
| GSE9877 | Sickle cell disease | BOEC |
| GSE13785 | Exercise-induced bronchoconstriction | Airways cell |
| GSE15568 | Cystic fibrosis | Rectal mucosal epithelia |
| GSE25724 | Type 2 diabetes | Islet |
| GSE47018 | Polycythemia vera | CD34+ cell |
| GSE55235 | Rheumatoid arthritis | Synovial |
| GSE75415 | Pediatric adrenocortical tumor | Adrenal gland |
| GSE110223 | Colorectal cancer | Colon |
| GSE115810 | Endometrial cancer | Endometrium |
| GSE124646 | Breast cancer | Breast |
| GSE128470 | Polymyositis | Muscle |
| GSE128470 | Inclusion body myositis | Muscle |
| GSE6613 | Parkinson's disease | Blood |
| GSE35487 | IgA nephropathy | Kidney tubular epithelial cell |
| GSE41649 | Allergic asthma | Bronchial |
| GSE43290 | Meningioma | Meningeal |
| GSE55235 | Osteoarthritis | Synovial |
BOEC, blood outgrowth endothelial cells; GEO, Gene Expression Omnibus; IgA, immunoglobulin A; PBMC, peripheral blood mononuclear cells.
FIG. 1. Clustering of disease data. (a) UMAP analysis of disease-specific gene expression. Different colors represent different diseases, and positions reflect their degree of association. (b) Determination of the k-value by silhouette analysis. The horizontal axis indicates the number of clusters (k-value), and the vertical axis indicates how well each cluster is separated from its adjacent cluster at that k-value. The peak of the curve gives the optimal k-value (19 in this case). (c) Clustering by the k-means method. Clusters are shown in distinct colors; the number of clusters was determined by silhouette analysis. UMAP, Uniform Manifold Approximation and Projection for Dimension Reduction.
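The choice of k in panel (b) can be sketched as follows: run k-means for a range of k-values, compute the mean silhouette coefficient for each, and keep the k at the peak. This is a minimal standard-library sketch under stated assumptions, not the authors' pipeline. The 2-D points are synthetic stand-ins for embedding coordinates, the deterministic farthest-first seeding is chosen only to make the example reproducible, and all function names are hypothetical.

```python
import math
import random

def farthest_first(points, k):
    """Deterministic seeding (an assumption of this sketch): start at
    points[0], then repeatedly add the point farthest from chosen centers."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(math.dist(p, c) for c in centers)))
    return centers

def kmeans(points, k, iters=50):
    """Plain Lloyd's k-means; returns one cluster label per point."""
    centers = farthest_first(points, k)
    labels = None
    for _ in range(iters):
        new_labels = [min(range(k), key=lambda c: math.dist(p, centers[c])) for p in points]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:  # keep the old center if a cluster empties
                centers[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels

def silhouette(points, labels):
    """Mean silhouette coefficient over all points (higher is better)."""
    clusters = {}
    for p, l in zip(points, labels):
        clusters.setdefault(l, []).append(p)
    if len(clusters) < 2:
        return -1.0  # not defined for a single cluster
    scores = []
    for p, l in zip(points, labels):
        own = clusters[l]
        if len(own) == 1:
            scores.append(0.0)  # usual convention for singleton clusters
            continue
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)  # mean intra-cluster distance
        b = min(sum(math.dist(p, q) for q in other) / len(other)
                for m, other in clusters.items() if m != l)     # nearest other cluster
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)

# Synthetic data: three well-separated 2-D blobs of 15 points each.
rng = random.Random(42)
blob_centers = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
points = [(cx + rng.gauss(0, 0.5), cy + rng.gauss(0, 0.5))
          for cx, cy in blob_centers for _ in range(15)]

scores = {k: silhouette(points, kmeans(points, k)) for k in range(2, 7)}
best_k = max(scores, key=scores.get)
```

Because the toy blobs are far apart relative to their spread, the silhouette curve should peak at k = 3. On real expression data the peak is usually much less sharp, so the whole curve is worth inspecting rather than only its maximum.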
FIG. 2. Discovery of differentially expressed genes. (a) The differentially expressed genes for IBM. (b) The differentially expressed genes for PM. (c) The differentially expressed genes for DM. The vertical axis reflects statistical significance (−log10[p-value]) and the horizontal axis reflects the magnitude of the change (log2[fold-change]). Red and blue colors indicate positive and negative directions of significant expression changes, respectively, and gray indicates changes below the significance level. DM, dermatomyositis; IBM, inclusion body myositis; PM, polymyositis.
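The two axes of these plots can be computed per gene as in the sketch below, which also assigns each gene to the up, down, or non-significant group. The cut-offs used here (|log2 fold-change| of at least 1 and p-value below 0.05) are assumptions for illustration, since the caption does not state the thresholds, and the function name and input values are hypothetical.

```python
import math

def volcano_point(mean_case, mean_control, p_value, lfc_cut=1.0, p_cut=0.05):
    """Return (log2 fold change, -log10 p, category) for one gene.

    mean_case and mean_control are positive linear-scale expression means.
    """
    log2_fc = math.log2(mean_case / mean_control)   # horizontal axis
    neg_log10_p = -math.log10(p_value)              # vertical axis
    if p_value < p_cut and log2_fc >= lfc_cut:
        category = "up"      # significant positive change
    elif p_value < p_cut and log2_fc <= -lfc_cut:
        category = "down"    # significant negative change
    else:
        category = "ns"      # below the significance level
    return log2_fc, neg_log10_p, category

# Hypothetical genes: (case mean, control mean, p-value).
up_gene = volcano_point(8.0, 2.0, 0.001)
down_gene = volcano_point(1.0, 4.0, 0.01)
weak_gene = volcano_point(3.0, 2.0, 0.2)
```

A gene with a large fold change but a p-value above the cut-off still falls in the `"ns"` group, which is why both axes are needed to read the plot.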
FIG. 3. Number of small-molecule compounds for each disease. Numbers in overlapping regions are compounds shared between the disorders.
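The region counts in such a diagram can be derived from three compound lists with ordinary set operations, as in this small sketch. The compound identifiers are placeholders and do not correspond to the compounds reported in the study; the function name is hypothetical.

```python
from itertools import combinations

def venn_regions(sets_by_name):
    """Map each combination of set names to the items found in exactly
    those sets and in no others (one entry per Venn region)."""
    names = sorted(sets_by_name)
    regions = {}
    for r in range(1, len(names) + 1):
        for combo in combinations(names, r):
            inside = set.intersection(*(sets_by_name[n] for n in combo))
            outside = set().union(*(sets_by_name[n] for n in names if n not in combo))
            regions[combo] = inside - outside
    return regions

# Placeholder compound lists per disease (not real results).
compounds = {
    "IBM": {"c1", "c2", "c3", "c5"},
    "PM": {"c1", "c2", "c4"},
    "DM": {"c1", "c3", "c4", "c6"},
}
regions = venn_regions(compounds)
counts = {combo: len(items) for combo, items in regions.items()}
```

In this toy input, `regions[("DM", "IBM", "PM")]` holds the compounds shared by all three diseases, and the two-name keys hold compounds shared by exactly two of them.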
Breakdown of Common Drugs
Dark and light yellow shades represent compounds shared by three and only two diseases, respectively.
DM, dermatomyositis; IBM, inclusion body myositis; PM, polymyositis; PMA, phorbol-12-myristate-13-acetate.