Xiguang Qi, Mingzhe Shen, Peihao Fan, Xiaojiang Guo, Tianqi Wang, Ning Feng, Manling Zhang, Robert A Sweet, Levent Kirisci, Lirong Wang.
Abstract
A gene expression signature (GES) is a group of genes that shows a unique expression profile as a result of perturbations of the transcriptional machinery by drugs, genetic modification or diseases. Comparisons between GES profiles have been used to investigate the relationships between drugs, their targets and diseases, with quite a few successful cases reported. In particular, in the study of GES-guided drug-disease associations, researchers believe that if a GES induced by a drug is opposite to a GES induced by a disease, the drug may have potential as a treatment for that disease. In this study, we data-mined the crowd extracted expression of differential signatures (CREEDS) database to evaluate the similarity between GES profiles from drugs and their indicated diseases. Our study aims to explore the application domains of GES-guided drug-disease associations by analyzing the similarity of GES profiles for known drug-disease pairs, thereby identifying subgroups of drugs and diseases that are suitable for GES-guided drug repositioning approaches. Our results supported our hypothesis that the GES-guided drug-disease association method is better suited to some subgroups or pathways, such as drugs and diseases associated with the immune system, diseases of the nervous system, non-chemotherapy drugs or the mTOR signaling pathway.
Keywords: RNA expression regulation; drug repositioning approaches; gene expression signature
Year: 2020 PMID: 32560162 PMCID: PMC7357095 DOI: 10.3390/molecules25122776
Source DB: PubMed Journal: Molecules ISSN: 1420-3049 Impact factor: 4.411
The Gene Expression Omnibus (GEO) series with crowd extracted expression of differential signatures (CREEDS) IDs excluded.
| GEO Series | CREEDS IDs | Excluded CREEDS IDs |
|---|---|---|
| GSE10432 | drug:2772, dz:297 | dz:297 |
| GSE7036 | drug:3292, dz:181 | drug:3292 |
| GSE6264 | drug:3064, dz:582 | drug:3064 |
| GSE38713 | drug:3289, drug:3194, drug:3195, dz:810 | drug:3289, drug:3194, drug:3195 |
| GSE31773 | drug:2485, dz:712, dz:713, dz:714, dz:715 | drug:2485 |
| GSE11393 | drug:3401, drug:3196, dz:773, dz:267 | drug:3401, drug:3196 |
| GSE8157 | drug:2796, dz:880 | drug:2796 |
| GSE13887 | drug:3181, dz:450 | drug:3181 |
| GSE11223 | drug:3294, drug:3287, dz:590, dz:591, dz:593, dz:589, dz:588, dz:587, dz:586, dz:585 | drug:3294, drug:3287 |
| GSE7762 | drug:3288 | drug:3288 |
| GSE3248 | dz:724 | dz:724 |
Figure 1. The proportion of data sourced from the crowd extracted expression of differential signatures (CREEDS) database. Numbers of gene signatures are shown in parentheses. “Drug and Disease Signatures Included in the Final Analysis”: The proportion of drug or disease gene signatures included in the final analysis. “Drug and Disease Signatures Extracted from Non-Human Assays”: The proportion of drug or disease gene signatures extracted from non-human assays. “Signatures with Information Mis-Specified”: The proportion of gene signatures with information errors. “Signatures from Same Assays but Labelled as Both”: The proportion of gene signatures excluded because both the drug and the disease signature were sourced from the same assay. “Drug and Disease Signatures Because of Indication Not Found”: The proportion of gene signatures excluded because no FDA-labelled indication relationship was found for the drug or disease (including drugs not approved by the FDA).
Figure 2. The subgroup proportions of the 167 unique indicated drug–disease pairs in different categories. (a) Disease classification. NEO: neoplasms, DMSCT: diseases of the musculoskeletal system or connective tissue, DS: diseases of the skin, CIPD: certain infectious or parasitic diseases, DIS: diseases of the immune system, ENMD: endocrine, nutritional or metabolic diseases, DBBO: diseases of the blood or blood-forming organs, DRS: diseases of the respiratory system, DNS: diseases of the nervous system, DDS: diseases of the digestive system, DCS: diseases of the circulatory system. (b) Drug target. GLUR: glucocorticoid receptor, DNAtopo: DNA/topoisomerase-human, TYRK: tyrosine kinase, DNAclak: DNA cross-linking/alkylation, CYC: cyclooxygenase, DNAlig: DNA/ligase, TOPOI: topoisomerase-non-human, INTR: interferon receptor, MICROT: microtubules, NUCS: nucleotide synthesis, TNF: tumor necrosis factor. (c) TF (transcription factor) level. “Directly”: drugs with TFs as their main therapeutic targets. “Not-directly”: drugs whose main therapeutic targets are human DNA structures or human proteins but not TFs. “Non-Human”: drugs whose main therapeutic targets are non-human proteins or structures (for example, from viruses or bacteria). (d) Chemotherapy. “YES” or “NO” indicates whether the drug is a chemotherapy drug. (e) ATC classification. CORTI: corticosteroids for systemic use, plain, OAA: other antineoplastic agents, CYTOANTIB: cytotoxic antibiotics and related substances, ANTIME: antimetabolites, IMMSUP: immunosuppressants, NSAAP: anti-inflammatory and antirheumatic products, non-steroids, HAARA: hormone antagonists and related agents, QUINA: quinolone antibacterial, IMMSTI: immunostimulants, PAAAONP: plant alkaloids and other natural products, ALKA: alkylating agents.
Figure 3. The distribution of the signed Jaccard index in the indication group and the control group.
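The signed Jaccard index named in this caption can be illustrated with a minimal Python sketch. The formula used here (same-direction overlaps minus opposite-direction overlaps, divided by two) is the commonly cited definition and is an assumption, since this record does not spell it out; the function names and the toy gene labels are hypothetical and are not data from the study.

```python
def jaccard(a, b):
    """Plain Jaccard index of two gene sets; 0.0 when both are empty."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def signed_jaccard(up1, down1, up2, down2):
    """Signed Jaccard index of two signatures, in the range [-1, 1].

    Assumed definition: overlaps between same-direction gene sets count
    positively, overlaps between opposite-direction sets count negatively.
    """
    same = jaccard(up1, up2) + jaccard(down1, down2)
    opposite = jaccard(up1, down2) + jaccard(down1, up2)
    return (same - opposite) / 2


# Hypothetical toy signatures: the "drug" reverses the "disease" exactly.
drug_up, drug_down = {"A", "B"}, {"C", "D"}
disease_up, disease_down = {"C", "D"}, {"A", "B"}

reversal = signed_jaccard(drug_up, drug_down, disease_up, disease_down)
identical = signed_jaccard(drug_up, drug_down, drug_up, drug_down)
print(reversal, identical)
```

Under this definition a negative score means the drug signature tends to reverse the disease signature, which is the direction the repositioning hypothesis looks for.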
Figure 4. The average signed Jaccard index score of unique indicated drug–disease pairs split by different categories of subgroups. ** indicates FDR Q < 0.01, * indicates FDR Q < 0.05. (a) ATC classification. ADRI: adrenergics, inhalants, AAPS: anti-acne preparations for systemic use, EIBGLD: blood glucose-lowering drugs, excluding insulins, DAA: direct acting antivirals, ESTR: estrogens, INS: insulins and analogues, LMA: lipid modifying agents, plain, ODP: other dermatological preparations, TET: tetracyclines, VITAD: vitamins A and D, including combinations of the two. For CORTI, OAA, CYTOANTIB, ANTIME, IMMSUP, NSAAP, HAARA, QUINA, IMMSTI, PAAAONP and ALKA, see the Figure 2 legend. (b) Chemotherapy. “YES” or “NO” indicates whether the drug is a chemotherapy drug. (c) Disease classification. See Figure 2 for abbreviations. (d) Target. 16S: 16S ribosomal RNA, ACRT: aminoimidazole carboxamide ribonucleotide transformylase, AMPAPK: AMP-activated protein kinase, ADGR: androgen receptor, BETAR: beta adrenergic receptor, CD20: CD20 antigen, CYP: cytochromes P450, DAAD: delta-aminolevulinic acid dehydratase, DNMT: DNA/methyltransferase, DNApo: DNA/polymerase, ESR: estrogen receptor, HMG-CoAR: HMG-CoA reductase, I5MD: inosine-5’-monophosphate dehydrogenase, INSR: insulin receptor, mTOR: kinase mTOR, PPAR: peroxisome proliferator-activated receptors, PSB: proteasome subunit beta, RAR: retinoic acid receptor, B-raf: serine/threonine-protein kinase B-raf, THYS: thymidylate synthase, D3: vitamin D3 receptor. For GLUR, DNAtopo, TYRK, DNAclak, CYC, DNAlig, TOPOI, INTR, MICROT, NUCS and TNF, see the Figure 2 legend. (e) TF (transcription factor) level. “Directly”: drugs with TFs as their main therapeutic targets. “Not-directly”: drugs whose main therapeutic targets are human DNA structures or human proteins but not TFs. “Non-Human”: drugs whose main therapeutic targets are non-human proteins or structures (for example, from viruses or bacteria).
Results of the generalized linear model (GLM) least squares mean partition F tests by subgroup.
| Classification Category | Subgroups | Average SJI of Indicated Pairs ± SD | N | Average SJI of Control Pairs ± SD | N | Q value |
|---|---|---|---|---|---|---|
| Disease classification | Diseases of the blood or blood-forming organs | −0.02368 ± 0.03746 | 6 | 0.00075 ± 0.02470 | 138 | 0.01322 |
| | Diseases of the nervous system | −0.03264 ± 0.03648 | 4 | −0.00054 ± 0.01528 | 92 | 0.00704 |
| Drug target classification | Interferon receptor | −0.02314 ± 0.03866 | 5 | 0.00916 ± 0.02849 | 115 | 0.00110 |
| | Kinase mTOR | −0.05846 ± ---------- | 1 | 0.00353 ± 0.01580 | 23 | 0.01755 |
| Chemotherapy classification | Chemotherapy drugs | 0.00048 ± 0.00894 | 47 | −0.00022 ± 0.01221 | 1049 | 0.99509 |
| | Non-chemotherapy drugs | −0.00556 ± 0.02026 | 120 | −0.00086 ± 0.01872 | 2760 | 0.03937 |
| ATC classification | Immunostimulants | −0.02314 ± 0.03866 | 5 | 0.00916 ± 0.02849 | 115 | 0.00110 |
| | Other dermatological preparations | −0.05846 ± ---------- | 1 | −0.00353 ± 0.01580 | 23 | 0.01755 |
| Transcription factor level | Directly | −0.00433 ± 0.02310 | 60 | 0.00070 ± 0.01671 | 1378 | 0.22309 |
| | Not-directly | −0.00344 ± 0.01443 | 98 | −0.00116 ± 0.01785 | 2224 | 0.99509 |
| | Non-Human | −0.00533 ± 0.01574 | 9 | −0.00057 ± 0.01627 | 207 | 0.79080 |
Important subgroups, or subgroups with a false discovery rate (FDR) q-value lower than 0.05, from GLM least squares mean partition F tests for signed Jaccard index differences between drug-indicated disease pairs and random drug–disease pairs. “----------” indicates a subgroup with only one unique drug–disease pair, for which no standard deviation can be computed.
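As an illustration of how FDR q-values of this kind can be obtained from raw p-values, here is a minimal sketch of the Benjamini-Hochberg adjustment. This record does not state which FDR procedure the authors used, so the choice of Benjamini-Hochberg is an assumption, and the p-values in the example are hypothetical rather than taken from the table.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values (q-values) in input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so q-values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        q_values[i] = running_min
    return q_values


# Hypothetical p-values for four subgroup tests.
p = [0.01, 0.04, 0.03, 0.20]
q = benjamini_hochberg(p)
print([round(x, 4) for x in q])
```

A subgroup would then be flagged when its q-value falls below the 0.05 threshold used in the table.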
Top 5% of genes ranked by relative expression probability (G).
| Gene | G | Gene | G | Gene | G | Gene | G |
|---|---|---|---|---|---|---|---|
| MX1 | −46.87% | FTL | −25.22% | USP18 | −19.56% | DUSP6 | −16.90% |
| IFIT3 | −41.45% | RPL24 | −25.18% | CERS2 | −19.38% | TPT1 | −16.66% |
| NME1 | −40.50% | ERP29 | −23.86% | RPLP0 | −19.36% | RSAD2 | −16.59% |
| RPL3 | −39.19% | RSL24D1 | −23.86% | KLRB1 | −19.28% | ADAR | −16.48% |
| RPS5 | −37.61% | PTMA | −23.65% | ADM | −19.23% | DDX58 | −16.44% |
| RPL6 | −36.57% | HLA-DRA | −22.88% | PLSCR1 | −19.23% | APOBEC3A | −16.40% |
| MT1HL1 | −35.52% | IFIT1 | −22.22% | RPLP0P6 | −19.14% | PPIB | −16.17% |
| MT2A | −34.80% | MX2 | −22.22% | RPS3A | −19.07% | RGS2 | −16.09% |
| RPSA | −33.55% | LDHB | −22.12% | TRIM22 | −19.00% | IRF7 | −16.08% |
| TGFBI | −33.47% | DYNLT1 | −21.90% | DDX21 | −18.66% | PSMA6 | −16.00% |
| MT1X | −32.30% | ALDH1A1 | −21.64% | GCH1 | −18.64% | RPL9 | −15.94% |
| HERC5 | −32.15% | HSPA1A | −21.53% | GAPDH | −18.55% | OAS1 | −15.91% |
| FAU | −31.82% | SLC25A5 | −21.53% | OAS3 | −18.48% | RPL31 | −15.74% |
| PLS3 | −29.66% | IFIT2 | −21.38% | RPS25 | −18.40% | PTTG1IP | −15.74% |
| HLA-A | −29.15% | RPS4X | −21.28% | NDUFB11 | −18.40% | BIRC2 | −15.74% |
| RPL22 | −28.88% | EIF3E | −20.88% | SNHG6 | −18.15% | MYD88 | −15.67% |
| FBL | −28.52% | HMGN2 | −20.88% | PSAT1 | −18.06% | RPS14P3 | −15.64% |
| RPS8 | −27.57% | FTH1P5 | −20.80% | IER2 | −18.02% | FTH1 | −15.62% |
| ISG15 | −26.91% | YWHAZ | −20.72% | UXT | −17.65% | C4orf46 | −15.45% |
| EEF1B2 | −26.88% | PFDN5 | −20.57% | PARP12 | −17.58% | PPT1 | −15.42% |
| PHB2 | −26.48% | TMA7 | −20.20% | MAFB | −17.40% | YBX1 | −15.33% |
| MT1H | −26.29% | CCT7 | −20.12% | LYZ | −17.25% | | |
| RPL8 | −26.11% | OASL | −19.89% | NARS | −17.15% | | |
| ATF4 | −25.36% | SNHG5 | −19.64% | AKR1B1 | −17.02% | | |
Top 10 significant biological pathways for the genes with high relative expression probability.
| Ingenuity Canonical Pathways | -log(p-value) | Ratio | Genes Overlapped with Datasets |
|---|---|---|---|
| EIF2 Signaling | 16.50 | 8.02% (17/212) | ATF4, EIF3E, FAU, RPL22, RPL24, RPL3, RPL31, RPL6, RPL8, RPL9, RPLP0, RPS25, RPS3A, RPS4X, RPS5, RPS8, RPSA |
| Activation of IRF by Cytosolic Pattern Recognition Receptors | 6.60 | 9.84% | ADAR, DDX58, IFIT2, IRF7, ISG15, PPIB |
| Regulation of eIF4 and p70S6K Signaling | 6.48 | 5.23% | EIF3E, FAU, RPS25, RPS3A, RPS4X, RPS5, RPS8, RPSA |
| Interferon Signaling | 6.34 | 13.90% | IFIT1, IFIT3, ISG15, MX1, OAS1 |
| mTOR Signaling | 5.57 | 3.96% | EIF3E, FAU, RPS25, RPS3A, RPS4X, RPS5, RPS8, RPSA |
| NRF2-mediated Oxidative Stress Response | 3.80 | 3.23% | ATF4, CCT7, ERP29, FTH1, FTL, PPIB |
| Role of Pattern Recognition Receptors in Recognition of Bacteria and Viruses | 3.39 | 3.47% | DDX58, IRF7, MYD88, OAS1, OAS3 |
| Neuroinflammation Signaling Pathway | 2.78 | 2.06% | ATF4, BIRC2, HLA-A, HLA-DRA, IRF7, MYD88 |
| SPINK1 General Cancer Pathway | 2.63 | 4.92% | MT1H, MT1X, MT2A |
| Systemic Lupus Erythematosus in B Cell Signaling Pathway | 2.23 | 1.89% | IFIT2, IFIT3, IRF7, ISG15, MYD88 |
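The two numeric columns can be read with a short worked example based on the EIF2 Signaling row. The ratio is the number of overlapping genes divided by the pathway size, which the row itself shows as 17/212. Converting the -log(p-value) column back to a p-value assumes a base-10 logarithm, which this record does not state explicitly; the function names are hypothetical.

```python
def overlap_ratio_percent(overlap, pathway_size):
    """Share of a pathway's genes found in the dataset, as a percentage."""
    return round(100 * overlap / pathway_size, 2)


def p_from_neg_log10(neg_log_p):
    """Recover a p-value from -log10(p); the base-10 convention is assumed."""
    return 10 ** (-neg_log_p)


# EIF2 Signaling row: 17 overlapping genes out of 212, -log(p-value) = 16.50.
eif2_ratio = overlap_ratio_percent(17, 212)
eif2_p = p_from_neg_log10(16.50)
print(eif2_ratio, eif2_p)
```

The computed ratio matches the 8.02% printed in the table, and under the base-10 assumption the p-value is on the order of 1e-17.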
Top 10 pathways and their function labels.
| Ingenuity Canonical Pathways | Function | Reference |
|---|---|---|
| EIF2 Signaling | Immune Responses | [ |
| Activation of IRF by Cytosolic Pattern Recognition Receptors | Regulate Interferon | [ |
| Regulation of eIF4 and p70S6K Signaling | Inflammatory | [ |
| Interferon Signaling | Immune Responses | [ |
| mTOR Signaling | Immune Responses | [ |
| NRF2-mediated Oxidative Stress Response | Antioxidant Response | [ |
| Role of Pattern Recognition Receptors in Recognition of Bacteria and Viruses | Regulate Interferon | [ |
| Neuroinflammation Signaling Pathway | Inflammatory | [ |
| SPINK1 General Cancer Pathway | Cancer Diagnosis | [ |
| Systemic Lupus Erythematosus in B Cell Signaling Pathway | Inflammatory | [ |
Figure 5. The flow chart of the drug and disease gene signature data inclusion process. Numbers of gene signatures left after each step are shown in parentheses (number of drug signatures/number of disease signatures). 1.1. and 1.2. All manual gene signatures were retrieved from the CREEDS database. 2. Remove all signatures from assays not labelled as human. 3. Remove all drug signatures not from FDA-approved drugs. 4. Remove signatures with information errors and signatures labelled as both a drug treatment and a disease. 5. Pair each remaining drug signature with each disease signature. 6. Remove signatures of drugs or diseases with no FDA-labelled indication relationship. 7. Divide the pairs into an indicated group and a control group according to the indication relationships on the FDA drug labels. 8. Calculate the signed Jaccard index for each remaining drug–disease pair.
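The filtering and pairing steps in this flow chart can be sketched in a few lines of Python. This is an illustrative outline only: the record fields (`kind`, `organism`, `fda_approved`), the identifiers and the indication lookup are hypothetical and do not reflect the CREEDS schema or the authors' code, and steps 4 and 6 are omitted for brevity.

```python
from itertools import product

# Hypothetical signature records (field names are illustrative only).
signatures = [
    {"id": "drug:demo1", "kind": "drug", "name": "drug_x",
     "organism": "human", "fda_approved": True},
    {"id": "drug:demo2", "kind": "drug", "name": "drug_y",
     "organism": "mouse", "fda_approved": True},
    {"id": "drug:demo3", "kind": "drug", "name": "drug_z",
     "organism": "human", "fda_approved": False},
    {"id": "dz:demo1", "kind": "disease", "name": "disease_a",
     "organism": "human"},
    {"id": "dz:demo2", "kind": "disease", "name": "disease_b",
     "organism": "human"},
]

# Hypothetical lookup of labelled indications: (drug name, disease name).
indications = {("drug_x", "disease_a")}


def keep(sig):
    """Steps 2 and 3: human assays only, and approved drugs only."""
    if sig["organism"] != "human":
        return False
    if sig["kind"] == "drug" and not sig["fda_approved"]:
        return False
    return True


kept = [s for s in signatures if keep(s)]
drugs = [s for s in kept if s["kind"] == "drug"]
diseases = [s for s in kept if s["kind"] == "disease"]

# Step 5: pair every remaining drug signature with every disease signature.
pairs = list(product(drugs, diseases))

# Step 7: split pairs by whether the indication lookup contains them.
indicated = [(d["id"], z["id"]) for d, z in pairs
             if (d["name"], z["name"]) in indications]
control = [(d["id"], z["id"]) for d, z in pairs
           if (d["name"], z["name"]) not in indications]
print(indicated, control)
```

Each pair in both groups would then be scored with the signed Jaccard index (step 8) so that the indicated group can be compared against the control group.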