Antonio Colaprico, Catharina Olsen, Matthew H Bailey, Gabriel J Odom, Thilde Terkelsen, Tiago C Silva, André V Olsen, Laura Cantini, Andrei Zinovyev, Emmanuel Barillot, Houtan Noushmehr, Gloria Bertoli, Isabella Castiglioni, Claudia Cava, Gianluca Bontempi, Xi Steven Chen, Elena Papaleo.
Abstract
Cancer driver gene alterations influence cancer development, occurring in oncogenes, tumor suppressors, and dual role genes. Discovering dual role cancer genes is difficult because of their elusive context-dependent behavior. We define oncogenic mediators as genes controlling biological processes. With them, we classify cancer driver genes, unveiling their roles in cancer mechanisms. To this end, we present Moonlight, a tool that incorporates multiple -omics data to identify critical cancer driver genes. With Moonlight, we analyze 8000+ tumor samples from 18 cancer types, discovering 3310 oncogenic mediators, 151 having dual roles. By incorporating additional data (amplification, mutation, DNA methylation, chromatin accessibility), we reveal 1000+ cancer driver genes, corroborating known molecular mechanisms. Additionally, we confirm critical cancer driver genes by analysing cell-line datasets. We discover inactivation of tumor suppressors in intron regions and that tissue type and subtype indicate dual role status. These findings help explain tumor heterogeneity and could guide therapeutic decisions.
Year: 2020 PMID: 31900418 PMCID: PMC6941958 DOI: 10.1038/s41467-019-13803-0
Source DB: PubMed Journal: Nat Commun ISSN: 2041-1723 Impact factor: 14.919
Fig. 1 Moonlight data integration and functionalities.
a Data used for discovery of oncogenic mediators and controlling mechanisms of cancer driver genes. b Moonlight pipeline for discovery of tumor suppressors, oncogenes, and dual-role genes.
Fig. 2 Moonlight application within breast-cancer case study.
a Barplot from Functional Enrichment Analysis showing the BPs significantly enriched with |Moonlight Process Z-score| ≥ 1 and FDR ≤ 0.01; increased levels are reported in yellow, decreased in purple, and green shows the -logFDR/10. A negative Moonlight Process Z-score indicates that the process's activity is decreased, while a positive Moonlight Process Z-score indicates that the process's activity is increased. Values in parentheses indicate the number of genes in common between the genes annotated in the biological process and the genes used as input for the functional enrichment. b Heatmap showing the top 50 predicted tumor suppressors and oncogenes in breast cancer and their associated biological processes. Hierarchical clustering was performed on the Euclidean distance matrix. Biological processes with increased (decreased) Moonlight Gene Z-score are marked in red (blue). The number of samples reporting the mutation of specific genes ranges from white to dark purple. Hypermethylated (hypomethylated) DMRs are shown in blue (yellow). Genes with poor Kaplan–Meier survival prognosis are marked in pink. Chromatin accessibility in the promoter region ranges from white (closed) to orange (open). The upper panel shows boxplots of cell-line expression levels. c Barplot reporting the number of tumor-suppressor genes (blue) or oncogenes (red) predicted in pan-cancer analysis using expert knowledge paired with PRA using two selected biological processes, such as apoptosis and cell proliferation. d Heatmap showing the top 50 dual-role genes (by Moonlight Gene Z-score) within cancer types; oncogenes (OCGs) are shown in red and tumor-suppressor genes (TSGs) in blue. TCGA study abbreviations are available at https://gdc.cancer.gov/resources-tcga-users/tcga-code-tables/tcga-study-abbreviations. e Circos plots for molecular subtypes of Moonlight genes predicted using expert knowledge paired with PRA using two selected BPs, such as apoptosis and cell proliferation.
From the outer to the inner layer: color labels indicate the breast-cancer subtype, with the number of OCGs and TSGs for that molecular subtype in parentheses; OCGs (green) and TSGs (yellow); purple and orange for mutations (inframe deletion, inframe insertion, missense); gene–gene edges between two cancer molecular subtypes indicate OCG in both (green), TSG in both (yellow), or dual-role genes (red).
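The significance filter described for panel a (|Moonlight Process Z-score| ≥ 1 and FDR ≤ 0.01, with the sign of the Z-score giving the direction) can be sketched as follows. This is a minimal illustration, not Moonlight's own code: the `ProcessResult` type, the `significant_processes` function, and all numeric values are hypothetical; only the thresholds and the sign convention come from the caption.

```python
from dataclasses import dataclass


@dataclass
class ProcessResult:
    """One biological process with its enrichment statistics (hypothetical container)."""
    name: str
    z_score: float  # Moonlight Process Z-score
    fdr: float      # false discovery rate


def significant_processes(results, z_min=1.0, fdr_max=0.01):
    """Keep processes with |Z| >= z_min and FDR <= fdr_max, labelled by direction."""
    kept = []
    for r in results:
        if abs(r.z_score) >= z_min and r.fdr <= fdr_max:
            # Positive Z-score: activity increased; negative: activity decreased.
            direction = "increased" if r.z_score > 0 else "decreased"
            kept.append((r.name, direction))
    return kept


# Made-up values for illustration only; they are not results from the paper.
example = [
    ProcessResult("apoptosis", -2.3, 0.001),
    ProcessResult("cell proliferation", 3.1, 0.0005),
    ProcessResult("hypothetical process A", 0.4, 0.001),  # fails the Z-score cutoff
    ProcessResult("hypothetical process B", 2.0, 0.2),    # fails the FDR cutoff
]

selected = significant_processes(example)
```

With these made-up inputs, only the first two entries pass both cutoffs.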
Fig. 3 Chromatin accessibility landscape of oncogenic mediators.
a Boxplot showing log2 (chromatin peaks in promoters) for tumor suppressors and oncogenes detected in the Pan-Cancer study, b boxplot showing log2 (chromatin peaks in introns), and c breast cancer log2 (chromatin peaks count).
Fig. 4 Copy number and mutational landscape of oncogenic mediators.
a Copy-number changes in breast cancer (amplification of oncogenes in red and deletion of tumor suppressors in blue) identified according to criteria described in the Methods section. The orange line represents the significance threshold (FDR = 0.25). The complete list of chromosome location peaks associated with cancer driver genes in the Pan-Cancer study is included in Supplementary Data 8. b Boxplot showing log2 (intron mutation counts), c log2 (missense mutation counts) for tumor suppressors and oncogenes detected in the Pan-Cancer study, and d breast cancer log2 (mutation type count).
Fig. 5 Moonlight dual-role genes that could differentially influence prognosis by cancer type or subtype.
Clinical implication: Kaplan–Meier survival curves (a, b) show that ANKRD23 acts as a tumor suppressor in BLCA (a) and as an oncogene in KIRC (b).
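The notion of a dual-role gene, one predicted as a tumor suppressor in one cancer type and as an oncogene in another, can be expressed as a small grouping step. The sketch below assumes predictions arrive as `(gene, cancer_type, role)` tuples; that input format, the function name `find_dual_role_genes`, and the `GENE_X` entry are hypothetical, while the ANKRD23 roles are taken from the caption above.

```python
from collections import defaultdict


def find_dual_role_genes(predictions):
    """Return genes predicted as OCG in at least one cancer type and TSG in another.

    predictions: iterable of (gene, cancer_type, role) with role in {"OCG", "TSG"}.
    Returns a dict mapping each dual-role gene to {cancer_type: role}.
    """
    roles_by_gene = defaultdict(dict)
    for gene, cancer_type, role in predictions:
        roles_by_gene[gene][cancer_type] = role
    return {
        gene: by_type
        for gene, by_type in roles_by_gene.items()
        if {"OCG", "TSG"} <= set(by_type.values())
    }


predictions = [
    ("ANKRD23", "BLCA", "TSG"),  # from the Fig. 5 caption
    ("ANKRD23", "KIRC", "OCG"),  # from the Fig. 5 caption
    ("GENE_X", "BRCA", "OCG"),   # hypothetical gene, same role in both types
    ("GENE_X", "LUAD", "OCG"),
]

dual = find_dual_role_genes(predictions)
```

Here `dual` contains only ANKRD23, because the hypothetical `GENE_X` has the same role in both cancer types.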
Fig. 6 Moonlight a Pan-Cancer study: dual-role genes and machine-learning approach.
a Circos plot showing an integrative analysis of 14 TCGA cancer types using the ML approach. Labels around the plot specify the cancer type; the numbers of OCGs and TSGs for that cancer type are in parentheses. An edge is drawn in the center of the figure whenever the same gene is predicted in two different cancer types. Segment and edge colors correspond to cancer type: green (yellow) segments correspond to the number of OCGs (TSGs) predicted in that cancer type, and red edges represent dual-role genes. b Performance evaluation of Moonlight in terms of log loss for tumor suppressors and oncogenes predicted in 14 cancer types. Performance of 20/20+ and OncodriveRole in terms of log loss and AUC. c Heatmap showing Moonlight Gene Z-score for upstream regulators for rectum adenocarcinomas. Row colors indicate TSGs (yellow) and OCGs (green).
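Panel b reports performance as log loss. For reference, the standard binary log loss is $-\frac{1}{N}\sum_i \left[y_i \log p_i + (1-y_i)\log(1-p_i)\right]$, where $y_i$ is the true label and $p_i$ the predicted probability. The sketch below implements that textbook definition; the caption does not state how the authors computed it, and the encoding (1 for OCG, 0 for TSG) and the clipping constant `eps` are assumptions made for illustration.

```python
import math


def log_loss(labels, probabilities, eps=1e-15):
    """Standard binary log loss (lower is better).

    labels: 0/1 values (assumed encoding: 1 = OCG, 0 = TSG).
    probabilities: predicted probability of label 1 for each item.
    """
    if len(labels) != len(probabilities) or len(labels) == 0:
        raise ValueError("labels and probabilities must be non-empty and equal in length")
    total = 0.0
    for y, p in zip(labels, probabilities):
        # Clip to avoid log(0) for over-confident predictions.
        p = min(max(p, eps), 1.0 - eps)
        total += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return -total / len(labels)


# An uninformative classifier (p = 0.5 everywhere) scores ln(2).
baseline = log_loss([1, 0], [0.5, 0.5])
```

A confident and correct classifier scores below this baseline, while confident wrong predictions are penalized heavily.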
Fig. 7 Moonlight intratumor heterogeneity, cell line, and drug analysis.
a Heatmap showing each compound (perturbagen) in rows from the Connectivity Map that share gene targets predicted as OCG (salmon) or TSG (teal) in columns. A red square indicates the presence of a relationship between compound and target. b Heatmap showing each compound (perturbagen) in columns from the Connectivity Map that shares mechanisms of action (rows), sorted by descending number of compounds with shared mechanisms of action. c Heatmap showing the top 50 TSG and OCG (by Moonlight Gene Z-score) predicted in breast cancer as mediators of apoptosis and proliferation (columns) and expression profiles of 50 breast-cancer cell lines from the Genomics of Drug Sensitivity in Cancer (GDSC) database (rows).
Comparison of tools used to predict cancer driver genes.
| Method | Data type | Description |
|---|---|---|
| 20/20 | Mutation data | ≥20% truncating mutations is TSG; >20% missense mutations in recurrent positions is OCG |
| OncodriveRole | Mutation and copy-number alteration data | Machine-learning approach using 30 features related to the pattern of alterations across tumors |
| ActiveDriver | Mutation data | Detecting cancer drivers based on unexpected mutation sites in phosphorylation regions |
| e-Driver | Mutation data | Identification of proteins with somatic missense mutations using domain based mutation analysis |
| MutSig2CV | Mutation and gene expression data | Identification of significantly mutated genes incorporating expression levels and replication times of DNA |
| DriverNet | Mutation, copy-number alteration, and gene expression data | Method that uses interaction networks to identify mutated genes associated with the gene expression alterations of their known interacting genes |
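The 20/20 row of the table above is a simple ratio rule, so it can be written out directly. The sketch below encodes only the thresholds as stated in the table (≥20% truncating mutations for TSG, >20% missense mutations in recurrent positions for OCG). The function name, the use of raw counts as input, and the handling of genes that meet both criteria or have no mutations are assumptions, since the table does not cover those cases.

```python
def classify_20_20(total_mutations, truncating, recurrent_missense):
    """Classify a gene using the thresholds stated in the comparison table.

    Returns "TSG", "OCG", "ambiguous" (both criteria met; tie handling is an
    assumption of this sketch), or "unclassified".
    """
    if total_mutations <= 0:
        return "unclassified"
    # Integer arithmetic avoids floating-point surprises at the 20% boundary.
    is_tsg = 5 * truncating >= total_mutations         # >= 20% truncating
    is_ocg = 5 * recurrent_missense > total_mutations  # > 20% recurrent missense
    if is_tsg and is_ocg:
        return "ambiguous"
    if is_tsg:
        return "TSG"
    if is_ocg:
        return "OCG"
    return "unclassified"


# Hypothetical mutation counts, for illustration only.
calls = [
    classify_20_20(100, 20, 5),   # exactly 20% truncating
    classify_20_20(100, 5, 21),   # 21% recurrent missense
    classify_20_20(100, 5, 20),   # exactly 20% recurrent missense is not enough
]
```

Note the asymmetry at the boundary: exactly 20% truncating mutations qualifies as TSG, while exactly 20% recurrent missense mutations does not qualify as OCG, following the inequality signs in the table.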
Summary of TCGA RNA-seq samples and differentially expressed genes (DEGs) from the tumor vs. normal analysis in 18 cancer types.
| TCGA cancer type | Primary solid tumor (TP) samples | Solid tissue normal (NT) samples | DEGs |
|---|---|---|---|
| BLCA | 408 | 19 | 2937 |
| BRCA | 1097 | 114 | 3390 |
| CHOL | 36 | 9 | 5015 |
| COAD | 286 | 41 | 3788 |
| ESCA | 184 | 11 | 2525 |
| GBM | 156 | 5 | 4828 |
| HNSC | 520 | 44 | 2973 |
| KICH | 66 | 25 | 4355 |
| KIRC | 533 | 72 | 3618 |
| KIRP | 290 | 32 | 3748 |
| LIHC | 371 | 50 | 3043 |
| LUAD | 515 | 59 | 3498 |
| LUSC | 503 | 51 | 4984 |
| PRAD | 497 | 52 | 1860 |
| READ | 94 | 10 | 3628 |
| STAD | 415 | 35 | 2622 |
| THCA | 505 | 59 | 1994 |
| UCEC | 176 | 24 | 4183 |