Banabithi Bose, Matthew Moravec, Serdar Bozdag.
Abstract
DNA copy number aberrated regions in cancer are known to harbor cancer driver genes and short non-coding RNA molecules, i.e., microRNAs. In this study, we integrated multi-omics datasets, including copy number aberration, DNA methylation, gene expression and microRNA expression, to identify signature microRNA-gene associations from frequently aberrated DNA regions across pan-cancer using a LASSO-based regression approach. We studied 7294 patient samples from eighteen cancer types in The Cancer Genome Atlas (TCGA) database and identified several cancer-specific and common microRNA-gene interactions enriched in experimentally validated microRNA-target interactions. We highlighted several oncogenic and tumor suppressor microRNAs that were either cancer-specific or common to several cancer types. Our method substantially outperformed five state-of-the-art methods in selecting microRNA-gene interactions significantly enriched in known interactions in multiple cancer types. Several microRNAs and genes were found to be associated with tumor survival and progression. Selected target genes were significantly enriched in cancer-related pathways, cancer hallmark terms and Gene Ontology (GO) terms. Furthermore, potential subtype-specific gene signatures were discovered in multiple cancer types.
Year: 2022 PMID: 35260634 PMCID: PMC8904490 DOI: 10.1038/s41598-022-07628-z
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Figure 1. Overview of the algorithmic steps in the miRDriver computational pipeline: GISTIC step, Differential Expression step, REGULATOR step and LASSO step, with R functions running on pan-cancer data.
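To make the idea behind the LASSO step concrete, the sketch below fits an L1-penalized regression by coordinate descent and keeps the predictors with nonzero coefficients. It is only an illustration under stated assumptions: miRDriver itself is implemented as R functions, and the function names, the toy data and the penalty value here are hypothetical rather than taken from the pipeline.

```python
def soft_threshold(value, penalty):
    """Shrink value toward zero by penalty; return 0.0 inside [-penalty, penalty]."""
    if value > penalty:
        return value - penalty
    if value < -penalty:
        return value + penalty
    return 0.0


def lasso_coordinate_descent(X, y, penalty, n_iter=100):
    """Minimize (1/2n) * ||y - X b||^2 + penalty * ||b||_1 by cyclic coordinate descent.

    X is a list of sample rows (e.g. miRNA expression values, hypothetical here);
    y is the response (e.g. expression of one gene). No intercept is fitted,
    so inputs are assumed to be centered.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    residual = list(y)  # equals y - X @ beta while beta is all zeros
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue  # constant-zero predictor carries no information
            # Correlation of predictor j with the partial residual (predictor j removed)
            rho = sum(X[i][j] * (residual[i] + X[i][j] * beta[j]) for i in range(n)) / n
            new_beta = soft_threshold(rho, penalty) / col_sq[j]
            delta = new_beta - beta[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= X[i][j] * delta
                beta[j] = new_beta
    return beta


def selected_predictors(names, beta):
    """Return the names whose coefficient was not shrunk to zero."""
    return [name for name, b in zip(names, beta) if b != 0.0]


# Toy data (hypothetical): three mutually orthogonal predictors, response driven by the first.
toy_X = [[1, 1, 1], [-1, 1, -1], [1, -1, -1], [-1, -1, 1]]
toy_y = [2, -2, 2, -2]
toy_beta = lasso_coordinate_descent(toy_X, toy_y, penalty=0.5)
toy_selected = selected_predictors(["miR-A", "miR-B", "miR-C"], toy_beta)
```

In this toy setting only the first predictor survives the penalty, which mirrors how a LASSO-based step yields a sparse set of candidate miRNA regulators per gene.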
TCGA cancer types in the study with cohort sizes in different data modalities and results of miRDriver.
| TCGA cancer | Abbreviation | RSeqa | CNAb | 450Kc | 27Kd | miRNAe | GISTIC regions | DE trans genesf | Interactionsg | Fisherj |
|---|---|---|---|---|---|---|---|---|---|---|
| Adrenocortical carcinoma | ACC | 79 | 180 | 80 | NA | 79 | 59 | 4683 | 308 (253–33) | 3.38e−13 |
| Bladder Urothelial Carcinoma | BLCA | 411 | 810 | 437 | NA | 429 | 126 | 5466 | 578 (416–125) | 4.28e−09 |
| Breast invasive carcinoma | BRCA | 1102 | 1103 | 895 | 343 | 1165 | 66 | 10,494 | 1852 (776–187) | 17.9e−11 |
| Cervical squamous cell carcinoma and endocervical adenocarcinoma | CESC | 304 | 586 | 312 | NA | 311 | 91 | 4515 | 558 (349–86) | 1.15e−05 |
| Lymphoid Neoplasm Diffuse Large B-cell Lymphoma | DLBC | 48 | 98 | 48 | NA | 47 | 42 | 3697 | 384 (288–31) | 6.06e−16 |
| Esophageal carcinoma | ESCA | 161 | 373 | 202 | NA | 195 | 119 | 4961 | 738 (521–92) | 4.35e−10 |
| Head and Neck squamous cell carcinoma | HNSC | 500 | 1090 | 580 | NA | 565 | 105 | 2591 | 326 (205–75) | 2.47e−08 |
| Kidney renal clear cell carcinoma | KIRC | 534 | 1067 | 483 | NA | 570 | 100 | 3995 | 586 (501–29) | 1.45e−06 |
| Acute Myeloid Leukemia | LAML | 151 | 397 | 194 | 418 | 188 | 46 | 3593 | 590 (431–21) | 1.15e−04 |
| Brain Lower Grade Glioma | LGG | 511 | 1021 | 534 | NA | 528 | 87 | 1653 | 226 (151–45) | 1.26e−08 |
| Liver hepatocellular carcinoma | LIHC | 371 | 767 | 430 | NA | 421 | 107 | 3593 | 316 (224–71) | 1.37e−10 |
| Lung adenocarcinoma | LUAD | 524 | 1110 | 507 | 150 | 555 | 131 | 4602 | 1172 (747–142) | 1.8e−4 |
| Lung squamous cell carcinoma | LUSC | 501 | 1038 | 412 | 161 | 511 | 131 | 2735 | 449 (266–105) | 1.91e−05 |
| Ovarian serous cystadenocarcinoma | OV | 374 | 573 | 10 | 613 | 486 | 64 | 3117 | 1347 (548–147) | 0.03 |
| Pancreatic adenocarcinoma | PAAD | 177 | 368 | 195 | NA | 182 | 75 | 2918 | 530 (371–55) | 2.81e−12 |
| Prostate adenocarcinoma | PRAD | 498 | 1038 | 553 | NA | 544 | 95 | 4016 | 266 (239–43) | 9.51e−03 |
| Thyroid carcinoma | THCA | 502 | 1025 | 571 | NA | 569 | 75 | 1138 | 204 (204–2) | 1.58e−04 |
| Uterine Corpus Endometrial Carcinoma | UCEC | 547 | 1098 | 485 | 118 | 556 | 174 | 6106 | 1118 (688–152) | 5.73e−09 |
NA not available.
Cohort sizes in aGene expression; bCopy number aberration; c450K DNA methylation; d27K DNA methylation; emiRNA expression datasets. fNo. of DE trans genes used in miRDriver's LASSO step; gNo. of selected interactions with no. of selected DE trans genes and no. of selected miRNAs in the parenthesis; jp-value of two-sided Fisher's exact test for enrichment of oncogenic miRNAs in each cancer type.
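The two-sided Fisher's exact test mentioned in the footnote can be illustrated with a short pure-Python sketch. The 2×2 layout assumed here (oncogenic vs. non-oncogenic miRNAs, selected vs. not selected) and the function name are assumptions for illustration; the source does not specify how the contingency table was built, and the counts in the usage note are not from the study.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    The p-value is the total probability, under the hypergeometric null with
    fixed margins, of all tables that are no more likely than the observed one.
    """
    row1 = a + b
    col1 = a + c
    total = a + b + c + d

    def table_probability(x):
        # Probability that the top-left cell equals x given the fixed margins
        return comb(col1, x) * comb(total - col1, row1 - x) / comb(total, row1)

    observed = table_probability(a)
    lowest = max(0, row1 + col1 - total)
    highest = min(row1, col1)
    # Small relative tolerance so that ties with the observed table are included
    return sum(
        table_probability(x)
        for x in range(lowest, highest + 1)
        if table_probability(x) <= observed * (1 + 1e-9)
    )
```

For example, `fisher_exact_two_sided(3, 1, 1, 3)` evaluates the table [[3, 1], [1, 3]], for which the tables at least as extreme as the observed one account for 34 of the 70 equally weighted arrangements.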
Target enrichment.
| Cancer type | Eligible miRNAsa | Significant miRNAsb (%) |
|---|---|---|
| ACC | 4 | 0 |
| BLCA | 6 | 67 |
| BRCA | 59 | 63 |
| CESC | 8 | 88 |
| DLBC | 6 | 83 |
| ESCA | 5 | 60 |
| HNSC | 3 | 67 |
| KIRC | 3 | 67 |
| LAML | 7 | 43 |
| LGG | 2 | 100 |
| LIHC | 7 | 43 |
| LUAD | 4 | 50 |
| LUSC | 3 | 100 |
| OV | 27 | 89 |
| PAAD | 7 | 57 |
| PRAD | 1 | 100 |
| THCA | 1 | 0 |
| UCEC | 11 | 55 |
For fourteen different cancer types, at least 50% of the "Eligible miRNAs" had significantly enriched computed targets in the ground truth data (p-value < 0.05).
aNo. of "Eligible miRNAs" for hypergeometric test for the enrichment of known targets; bpercentage of miRNAs with hypergeometric p-values < 0.05.
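The hypergeometric enrichment test named in the footnote can be sketched as an upper-tail probability. This is a minimal illustration, not the authors' code: the function and parameter names are hypothetical, and the choice of gene universe (`population`) is an assumption the source does not spell out.

```python
from math import comb


def hypergeometric_enrichment_p(overlap, population, known, drawn):
    """P(X >= overlap) when `drawn` genes are sampled without replacement from
    `population` genes, of which `known` are validated targets of the miRNA.

    overlap    : computed targets that are also known targets
    population : size of the gene universe (assumed background)
    known      : number of known targets in the universe
    drawn      : number of computed targets for the miRNA
    """
    upper = min(known, drawn)
    favourable = sum(
        comb(known, i) * comb(population - known, drawn - i)
        for i in range(overlap, upper + 1)
    )
    return favourable / comb(population, drawn)


def percent_significant(p_values, alpha=0.05):
    """Percentage of miRNAs whose enrichment p-value is below alpha."""
    if not p_values:
        return 0.0
    return 100.0 * sum(p < alpha for p in p_values) / len(p_values)
```

As a small check of the arithmetic, with a universe of 10 genes, 4 known targets and 3 computed targets, observing at least 2 known targets has probability (36 + 4) / 120 = 1/3.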
Comparison of miRDriver with other methods. For each comparable method, we identified the miRNAs computed by both miRDriver and that method (the overlap).
Within each overlap, we checked whether miRDriver had more (miRDriver won), fewer (miRDriver lost), or the same number (draw) of "Significant miRNAs" (i.e., miRNAs with target enrichment test p-value < 0.05) as the other method. miRDriver had more "Significant miRNAs" than all other methods for most of the cancer types.
Green: miRDriver won; Red: miRDriver lost; Black: draw.
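The win, loss and draw rule described above amounts to comparing two counts per cancer type. The sketch below shows one way to tally it; the function names and the counts in the test data are hypothetical and are not results from the study.

```python
from collections import Counter


def compare_counts(mirdriver_significant, other_significant):
    """Classify one comparison from miRDriver's point of view."""
    if mirdriver_significant > other_significant:
        return "won"
    if mirdriver_significant < other_significant:
        return "lost"
    return "draw"


def tally_outcomes(counts_by_cancer):
    """counts_by_cancer maps a cancer type to a pair
    (miRDriver significant miRNAs, other method's significant miRNAs),
    both counted within the overlap of the two methods."""
    return Counter(
        compare_counts(mirdriver, other)
        for mirdriver, other in counts_by_cancer.values()
    )
```

Calling `tally_outcomes` once per competing method gives the per-method summary that the color coding conveys.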
Comparison results of miRDriver with five other methods in ovarian cancer.
| Method | Input miRNAs | Input genes | Computed miRNAs | Selected genes | Eligible miRNAsa | Overlapping eligible miRNAsb | Method's computed miRNAs in overlap | miRDriver's significant miRNAs in overlap |
|---|---|---|---|---|---|---|---|---|
| miRDriver | 198 | 2114 | 147 | 354 | 27 | NA | NA | NA |
| ARACNe | 198 | 2114 | 196 | 791 | 59 | 27 | 1 | 24 |
| ProMise | 198 | 2114 | 57 | 1938 | 34 | 22 | 0 | 17 |
| hiddenICP | 198 | 2114 | 198 | 2100 | 47 | 21 | 0 | 16 |
| idaFast | 50 | 1500 | 50 | 1194 | 32 | 22 | 0 | 17 |
| jointIDA | 50 | 1500 | 50 | 1294 | 32 | 22 | 0 | 17 |
aEligible miRNAs had at least one known target in the ground truth data; bOverlapping eligible miRNAs were with respect to miRDriver. For miRDriver, the number of significant miRNAs in every overlap with other methods was much higher. NA means not applicable.
Enriched pathways and GO terms in pan-cancer.
| REACTOMEa | KEGGb | GO termsc |
|---|---|---|
| Immune System; Metabolism; Innate Immune System; Hemostasis; Transport of small molecules; Developmental Biology; Class A/1 (Rhodopsin-like receptors); G alpha (i) signalling events; Neuronal System | Metabolism of Xenobiotics by Cytochrome P450; Steroid Hormone Biosynthesis; Retinol Metabolism; Drug Metabolism Cytochrome P450; Cytokine-Cytokine Receptor Interaction; Systemic Lupus Erythematosus | Ion gated channel activity; Gated channel activity; Cation channel activity; Substrate-specific channel activity; Passive transmembrane transporter activity; Extracellular matrix; Ion channel activity; Nucleosome; DNA packaging complex; Nuclear nucleosome; Protein-DNA complex; Hormone activity |
The pathways that appeared in more than four cancer types are in bold.
aREACTOME pathways, bKEGG pathways and cGO terms that were found to be enriched in at least two cancer types.
Enriched cancer hallmark terms in pan-cancer for computed target genes.
| Cancer type | Cancer Hallmark Terms | |
|---|---|---|
| ACC | Epithelial Mesenchymal Transition | 0.013 |
| BRCA | Estrogen Response Late | 0.003 |
| BRCA | Estrogen Response Early | 0.017 |
| CESC | KRAS Signaling DN | 0.022 |
| CESC | HEDGEHOG Signaling | 0.031 |
| DLBC | KRAS Signaling DN | 0.013 |
| ESCA | Myogenesis | 0.005 |
| ESCA | Coagulation | 0.007 |
| HNSC | Myogenesis | 0.009 |
| KIRC | E2F Targets | 0.000 |
| KIRC | G2M Checkpoint | 0.000 |
| LAML | KRAS Signaling UP | 0.002 |
| LUAD | KRAS Signaling DN | 0.007 |
| PRAD | Myogenesis | 0.017 |
Twenty-two common miRNAs computed by miRDriver in multiple cancer types.
Enriched GO terms with the cancer-related citations in the targets of the common miRNAs in Table 7.
| GO-ID | Description | Adjusted p-value |
|---|---|---|
| GO:0006335 | DNA replication-dependent nucleosome assembly | 5.25e−07 |
| GO:0006342 | Chromatin silencing | 1.17e−03 |
| GO:0006323 | DNA packaging | 5.29e−05 |
| GO:0045814 | Negative regulation of gene expression, epigenetic | 2.0894e−03 |
| GO:0060964 | Regulation of gene silencing by miRNA | 2.362e−02 |
| GO:0060147 | Regulation of post-transcriptional gene silencing | 2.767e−02 |
| GO:0048018 | Receptor ligand activity | 3.377e−03 |
miRNA-gene interactions computed by miRDriver in multiple cancer types. Cancer type column shows in which cancer types the interactions are present.
| Gene | miRNA | Cancer type |
|---|---|---|
| RSPO3 | miR-22 | LAML,LUAD |
| PAX5 | miR-5699 | BLCA,OV |
| LINC01833 | miR-1226 | BRCA,LGG |
| LINC01697 | miR-5703 | HNSC,UCEC |
| HIST1H4L | miR-3613 | BLCA,LUAD |
| LINC02489 | miR-375 | CESC,OV |
| NR0B1 | miR-346 | HNSC,KIRC |
| GABRG2 | miR-744 | PAAD,UCEC |
| PLAC8 | miR-6510 | CESC,HNSC |
| BPIFC | miR-4469 | LUSC,UCEC |
| RTL3 | miR-26b | CESC,UCEC |
| SLC17A2 | miR-5699 | LUSC,PAAD |
Hallmark term-related target enrichment in cancer driver genes.
| Hallmark | miRNAsa | Targetsb | Overlapc | p-valued |
|---|---|---|---|---|
| Complement | 5 | 42 | 3 | 0.018 |
| E2F Targets | 2 | 85 | 4 | 0.026 |
| MTORC1 Signaling | 1 | 12 | 2 | 0.011 |
| Myogenesis | 12 | 44 | 3 | 0.020 |
| P53 Pathway | 1 | 12 | 2 | 0.011 |
| TNFA Signaling via NFKB | 2 | 17 | 2 | 0.021 |
| Pancreas Beta Cells | 2 | 48 | 3 | 0.026 |
aNo. of miRNAs in cancer hallmark term; bNo. of targets in the term; cNo. of overlapping targets in the cancer driver genes; dHypergeometric p-value of the overlap.
miRNA targets with negative LASSO coefficient in different cancer types.
| Cancer type | Target | miRNA |
|---|---|---|
| KIRC | COL1A1 | miR-4728 |
| KIRC | CYSLTR2 | miR-346 |
| KIRC | CYSLTR2 | miR-4728 |
| KIRC | ETV4 | miR-4728 |
| CESC | ISX | miR-5001 |
| UCEC | ISX | miR-2276 |
| UCEC | ISX | miR-4733 |
| UCEC | ISX | miR-6842 |
| PAAD | KCNJ5 | miR-5699 |
| KIRC | NTRK1 | miR-4728 |
| HNSC | OLIG2 | miR-5699 |
Figure 2. UMAP plots and confusion matrices summarizing the classification and clustering of the cancer samples. (A, B) UMAP plots with high-degree target genes in BRCA with baseline and k-means clustering labels, respectively; (C, D) UMAP plots with PAM50 genes in BRCA with baseline and k-means clustering labels, respectively; (E, F) Confusion matrices of subtype classification in BRCA with F1 scores with respect to the baseline labels, using high-degree target genes and PAM50 genes, respectively. Accuracy and F1 score were close in both cases; (G) UMAP plot with all target genes using transcriptome-based baseline labels in LGG; (H) UMAP plot with high-degree target genes using expression-based baseline labels in LUSC; (I) UMAP plot with high-degree target genes using mRNA-based clusters[81] as a baseline in PAAD.
Figure 3. Adjusted Kaplan–Meier plots with adjusted log-rank test p-values for 18 common miRNAs in high and low expression groups: (A) let-7a-3 in OV with OS; (B) let-7b in PAAD with OS; (C) miR-149 in ACC with PFI; (D) miR-210 in BRCA with OS; (E) miR-31 in KIRC with OS; (F) miR-3187 in HNSC with OS; (G) miR-3664 in PRAD with OS; (H) miR-4777 in LUAD with DFI; (I) miR-4786 in LIHC with OS; (J) miR-3136 in BLCA with PFI; (K) miR-34b in ESCA with PFI; (L) miR-3667 in LUSC with PFI; (M) miR-4662a in UCEC with PFI; (N) miR-548k in PRAD with PFI; (O) miR-6510 in PAAD with PFI; (P) miR-4762 in LUSC with DFI; (Q) miR-486 in HNSC with DFI; (R) miR-675 in ACC with PFI.
Cancer-specific miRDriver miRNAs with citation frequency.
| Cancer type | miRNAa | Citationb | Cancer type | miRNAa | Citationb |
|---|---|---|---|---|---|
| LIHC | miR-1288 | 2 | BLCA | miR-3677 | 4 |
| HNSC | miR-134 | 56 | LUSC | miR-3934 | 1 |
| KIRC | miR-194-1 | 1 | BLCA | miR-4791 | 1 |
| UCEC | miR-195 | 197 | BLCA | miR-5003 | 1 |
| KIRC | miR-215 | 69 | UCEC | miR-552 | 12 |
| LIHC | miR-3170 | 1 | HNSC | miR-561 | 6 |
| LUAD | miR-3651 | 1 | PAAD | miR-6875 | 3 |
aThese miRNAs were prognostically significant in survival analysis; bOncoScore citation frequency.
Figure 4. Boxplots of absolute values of the natural logarithm of hazard ratios in high-degree and low-degree genes with the r-value of the Mann–Whitney test: (A) LUSC with OS, (B) BLCA with DSS, (C) ESCA with DFI, (D) HNSC with OS, (E) LGG with OS, (F) PAAD with OS. These plots show that computed high-degree genes had higher |ln(Hazard Ratio)| (r-value < 0) than low-degree genes when predicting disease survival and prognosis in cancer patients.
Cancer types with negative r-values from the aMann-Whitney test between low-degree and high-degree gene groups; bHighly cited high-degree genes in these cancer types in cancer-related literature.
| Cancer type | ar-value | bHigh degree genes |
|---|---|---|
| BLCA | −0.76 | BTNL3, HNF1A-AS1, MIR1205, NAA11, NOL4, OR10H5, PDZD3 |
| ESCA | −0.26 | ANKRD26P3, C17orf64, CCDC60, FAM81B, LIN28A, MYLKP1 |
| HNSC | −1.24 | BTBD17, DNM3OS, KLHL33, SMCO1 |
| KIRC | −0.03 | HOTTIP |
| LGG | −0.83 | C20orf85, C7orf65 |
| PAAD | −0.79 | ARHGAP36, C1QTNF1-AS1, TMPRSS15 |
Availability of clinical variables in TCGA.
Green: Available; Black: Unavailable.