Ahrim Youn1,2, Kyung In Kim2, Raul Rabadan3,4, Benjamin Tycko5, Yufeng Shen3,4,6, Shuang Wang7.
Abstract
BACKGROUND: Recent large-scale cancer sequencing studies have discovered many novel cancer driver genes (CDGs) in human cancers. Some studies also suggest that CDG mutations contribute to cancer-associated epigenomic and transcriptomic alterations across many cancer types. Here we aim to improve our understanding of the connections between CDG mutations and altered cancer cell epigenomes and transcriptomes at the pan-cancer level, and of how these connections contribute to the known association between the epigenome and the transcriptome.
Keywords: DNA methylation; Pan-cancer analysis; TCGA; chromatin remodeling; expression driver gene; gene expression; methylation driver gene; somatic mutation
Year: 2018 PMID: 30400878 PMCID: PMC6218985 DOI: 10.1186/s12920-018-0425-z
Source DB: PubMed Journal: BMC Med Genomics ISSN: 1755-8794 Impact factor: 3.063
Fig. 1 Rationale underlying the pan-cancer analysis to identify (a) MDGs that are associated with genome-wide methylation changes across cancer types and (b) EDGs that are associated with genome-wide expression changes across cancer types, with further analysis revealing that (c) MDGs mostly consist of chromatin regulators that directly affect genome-wide methylation patterns, or of genes that regulate the expression of, or physically interact with, chromatin regulators.
The identified 32 MDGs
| MDGs | ∣Ai∣ | ∣Ti∣ | Pi | Di | T-i | T+i |
|---|---|---|---|---|---|---|
| | 15 | 8 | < 1e-06 | both (a) | BLCA BRCA HNSC LIHC | LGG (a) |
| | 14 | 3 | 0.00147 | both | LGG | STAD UCEC |
| | 11 | 2 | 0.00861 | both | LGG | BLCA |
| | 11 | 1 | 0.0277 | hyper | STAD | |
| | 10 | 1 | 0.0401 | hyper | STAD | |
| | 8 | 1 | 1.2e-05 | hypo | TGCT | |
| | 8 | 2 | 0.00575 | both | BLCA | STAD |
| | 6 | 2 | 0.000227 | hypo | LGG PCPG | |
| | 6 | 2 | 0.00185 | hypo | LIHC UCEC | |
| | 5 | 2 | 0.00351 | hyper | KIRC KIRP | |
| | 5 | 1 | 0.00681 | hyper | STAD | |
| | 4 | 1 | 9.5e-05 | hypo | LGG | |
| | 4 | 2 | 0.000266 | both | PCPG | HNSC |
| | 4 | 2 | 0.00474 | both | THCA | COAD |
| | 3 | 2 | < 1e-06 | hyper | GBM LGG | |
| | 3 | 1 | 1.3e-05 | hyper | LGG | |
| | 3 | 2 | 1.4e-05 | both | TGCT | THCA |
| | 3 | 2 | 5.1e-05 | both | STAD | COAD |
| | 3 | 1 | 0.000487 | hyper (a) | LGG (a) | |
| | 3 | 2 | 0.00151 | hyper | COAD STAD | |
| | 3 | 1 | 0.00189 | hyper | LGG | |
| | 3 | 2 | 0.00213 | hyper | BRCA STAD | |
| | 3 | 1 | 0.00578 | hypo | LUAD | |
| | 3 | 1 | 0.00687 | hypo | LUAD | |
| | 3 | 1 | 0.00953 | hypo | PRAD | |
| | 3 | 1 | 0.0104 | hyper | STAD | |
| | 2 | 1 | < 1e-06 | hypo | TGCT | |
| | 2 | 2 | 0.000511 | both | STAD | COAD |
| | 2 | 1 | 0.000697 | hypo | BLCA | |
| | 2 | 1 | 0.00193 | hypo | LUAD | |
| | 1 | 1 | < 1e-06 | hypo | HNSC | |
| | 1 | 1 | 0.000142 | hyper | LIHC | |
|Ai|: number of tumor types in which CDG i is mutated in ≥ 5 samples with available methylation data;
|Ti|: number of tumor types whose genome-wide methylation levels are significantly associated with the mutation status of CDG i;
Pi: p-value testing whether CDG i is significantly associated with genome-wide methylation changes across tumor types;
Di: direction of methylation changes associated with mutation status of CDG i;
T+i: tumor types that are hyper-methylated by CDG i;
T-i: tumor types that are hypo-methylated by CDG i;
a: Further stratified analysis by IDH1 mutation status in LGG tumor samples suggests an opposite direction from hyper- to hypo-methylation;
b: genes that are not overlapping driver genes
__ : genes that are identified as associated with genome-wide patterns of aberrant methylation by Chen et al. [5]
Fig. 2 (a) Number of genome-wide hyper- (1) and hypo-methylated (2) sites that are associated with the mutation status of the 32 identified MDGs (columns) for each of the 20 TCGA tumor types (rows). (b) Number of genome-wide up-regulated (1) and down-regulated (2) genes that are associated with the mutation status of the 29 identified EDGs (columns) for each of the 20 TCGA tumor types (rows). The color code represents the number of differentially methylated sites or differentially expressed genes. Only driver genes that were mutated in ≥ 5 samples for the given tumor type are colored.
Fig. 3 Steps to test if chromatin regulators are enriched among the dysregulated target genes associated with the mutation status of the identified MDGs
MDGs dysregulate expression levels of chromatin regulators
| MDGs | ∣Ti∣ | N1 (# of dysregulated genes) | N2 (# of dysregulated genes in list A) | Enrichment PvalueA | N3 (# of dysregulated genes in list B) | Enrichment PvalueB | Genes in list B that are dysregulated (a) | Genes in list B that are differentially methylated (b) |
|---|---|---|---|---|---|---|---|---|
| | 8 | 233 | 19 | 0.00056 | 5 | 3.2e-06 | | |
| | 3 | 1,534 | 81 | 0.00012 | 9 | 9.1e-06 | | |
| | 2 | 1,447 | 83 | 5.2e-06 | 4 | 0.056 | | |
| | 2 | 766 | 36 | 0.044 | 0 | 1 | | |
| | 2 | 5,515 | 207 | 0.12 | 12 | 0.0033 | | |
| | 2 | 1,137 | 63 | 2.0e-04 | 3 | 0.11 | | |
| | 2 | 3,128 | 133 | 0.008 | 8 | 0.0091 | | |
| | 2 | 3,560 | 208 | 2.8e-15 | 2 | 0.9 | | |
| | 2 | 1,609 | 87 | 2.8e-05 | 1 | 0.82 | | |
| | 2 | 3,212 | 157 | 4.4e-06 | 8 | 0.011 | | |
| | 2 | 1,563 | 78 | 0.00088 | 6 | 0.0039 | | |
| | 2 | 2,792 | 136 | 2.7e-05 | 14 | 3.3e-08 | | |
|Ti|: number of tumor types whose genome-wide methylation levels are significantly associated with the mutation status of CDG i;
N1: number of genes whose expression levels are dysregulated by MDG i in all |Ti| tumor types;
N2: number of genes in list A that are dysregulated in all |Ti| tumor types;
N3: number of genes in list B that are dysregulated in all |Ti| tumor types;
Enrichment PvalueA and PvalueB are calculated using hypergeometric distributions, testing whether genes in lists A and B are enriched among the genome-wide differentially expressed target genes;
aGenes in list B that are dysregulated in all |Ti| tumor types;
bGenes in list B that are differentially methylated in all |Ti| tumor types.
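The hypergeometric enrichment test named in the footnotes above can be sketched as follows. This is a minimal illustration and not the authors' code: the function name, the size of the gene universe, and the size of list A are assumptions chosen for the example, since the record does not state them.

```python
from math import comb


def hypergeom_enrichment_pvalue(universe, list_size, drawn, overlap):
    """Upper-tail hypergeometric p-value P(X >= overlap).

    universe  : total number of genes considered (assumed, not from the source)
    list_size : number of genes in the reference list, e.g. list A (assumed)
    drawn     : number of dysregulated genes (N1 in the table)
    overlap   : number of dysregulated genes that are in the list (N2 or N3)
    """
    if not 0 <= overlap <= min(list_size, drawn):
        raise ValueError("overlap must lie between 0 and min(list_size, drawn)")
    total = comb(universe, drawn)
    # Sum the probability mass for every overlap at least as large as observed.
    tail = sum(
        comb(list_size, k) * comb(universe - list_size, drawn - k)
        for k in range(overlap, min(list_size, drawn) + 1)
    )
    return tail / total


# Hypothetical call: 20,000 genes in the universe and 700 genes in list A are
# placeholders; 233 and 19 are the N1 and N2 values of the first table row.
p_value = hypergeom_enrichment_pvalue(20000, 700, 233, 19)
print(f"enrichment p-value under the assumed universe: {p_value:.3g}")
```

Because the universe and list sizes here are placeholders, the printed value is not expected to reproduce the tabulated Enrichment PvalueA.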
Chromatin regulators are enriched in genes that physically interact with MDGs
| MDGs | ∣Ti∣ | N4 (# of physically interacting genes) | N5 (# of physically interacting genes in list A) | Enrichment PvalueAi |
|---|---|---|---|---|
| | 8 | 923 | 192 | 0 |
| | 3 | 224 | 19 | 0.00035 |
| | 2 | 250 | 64 | 0 |
| | 2 | 29 | 3 | 0.079 |
| | 2 | 364 | 63 | 0 |
| | 2 | 86 | 6 | 0.08 |
| | 2 | 54 | 8 | 0.00053 |
| | 2 | 49 | 7 | 0.0015 |
| | 2 | 36 | 3 | 0.13 |
| | 2 | 18 | 7 | 1.4e-06 |
| | 2 | 17 | 1 | 0.45 |
| | 2 | 144 | 18 | 2.9e-06 |
|Ti|: number of tumor types whose genome-wide methylation levels are significantly associated with the mutation status of CDG i;
N4: number of genes that physically interact with MDG i;
N5: number of genes that physically interact with MDG i and are also in list A;
Enrichment PvalueAi is calculated using a hypergeometric distribution testing if genes in list A are enriched among selected physically interacting genes.
Identified 29 EDGs
| EDGs | ∣A′i∣ | ∣Ei∣ | p | Bi | | |
|---|---|---|---|---|---|---|
| | 14 | 11 | < 1e-06 | both | HNSC LGG SARC | BLCA BRCA COAD GBM KIRC LIHC LUAD STAD |
| | 14 | 1 | 0.00299 | up | LGG | |
| | 11 | 2 | 0.00536 | down | BRCA STAD | |
| | 10 | 3 | 9e-05 | up | BLCA LGG LUAD | |
| | 10 | 2 | 0.00557 | both | STAD | CESC |
| | 8 | 4 | 4e-06 | down | COAD LUAD PAAD TGCT | |
| | 8 | 1 | 0.0265 | down | STAD | |
| | 6 | 1 | 0.00274 | up | LGG | |
| | 5 | 1 | 0.00572 | down | LIHC | |
| SETD2† | 5 | 0 | 0.00833 | NA | NA | NA |
| | 4 | 2 | 1.1e-05 | down | COAD THCA | |
| | 4 | 2 | 5.3e-05 | up | LGG LUAD | |
| | 4 | 3 | 7e-05 | down | HNSC PCPG THCA | |
| | 3 | 1 | < 1e-06 | down | LGG | |
| | 3 | 2 | < 1e-06 | both | LGG | GBM |
| | 3 | 2 | 1e-06 | down | COAD STAD | |
| | 3 | 2 | 2e-06 | both | LGG | GBM |
| | 3 | 2 | 7e-06 | both | BRCA | STAD |
| | 3 | 2 | 0.000199 | up | TGCT THCA | |
| | 3 | 1 | 0.000809 | down | LUAD | |
| | 3 | 2 | 0.00119 | down | COAD STAD | |
| | 3 | 2 | 0.00182 | down | HNSC LGG | |
| | 3 | 1 | 0.0108 | up | LUAD | |
| | 3 | 1 | 0.0138 | down | BRCA | |
| | 2 | 2 | 0.000317 | down | COAD STAD | |
| | 2 | 1 | 0.000889 | down | TGCT | |
| | 1 | 1 | < 1e-06 | down | LGG | |
| | 1 | 1 | < 1e-06 | down | PRAD | |
| | 1 | 1 | 0.000432 | up | HNSC | |
∣A′i∣: number of tumor types in which EDG i is mutated in ≥ 5 samples with expression data;
∣Ei∣: number of tumor types whose genome-wide expression levels are significantly associated with the mutation status of CDG i;
p: p-value testing whether CDG i is significantly associated with genome-wide expression changes across tumor types;
Bi: direction of the expression changes associated with the mutation status of CDG i;
The last two columns list the tumor types that are up-regulated by CDG i and the tumor types that are down-regulated by CDG i;
a : genes that are not overlapping driver genes.
†Note that SETD2 gene has a significant p-value for testing association of genome-wide expression changes across multiple tumor types, but there is not a specific tumor type in which SETD2 mutation is significantly associated with genome-wide expression changes.
Patterns of target genes’ promoter regions methylation and expression changes by mutations of the overlapping driver genes across tumor types
| Overlapping driver | ∣T∣ | ∣E∣ | ∣T ∩ E∣ | ∣DM∣ | ∣DE∣ | (%) | p(DM ∙ DE) | p(−−) | p(+−) | p(−+) | p(++) | p.methyl | p.exp |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | 8 | 11 | 7 | 10389 | 10066 | 52 | 2.8e-32 | 1 | 9.7e-95 | 7.1e-65 | 1 | 0 | 0 |
| | 3 | 1 | 1 | 12259 | 10293 | 61 | 0.014 | 1 | 1e-15 | 1e-15 | 1 | 2.98e-10 | 0 |
| | 2 | 3 | 1 | 9049 | 8062 | 46 | 0.0066 | 0.81 | 0.05 | 1e-15 | 1 | 0 | 0 |
| | 1 | 2 | 1 | 13933 | 5822 | 73 | 5e-12 | 1 | 1e-15 | 5.4e-05 | 1 | 1.93e-11 | 0 |
| | 1 | 2 | 1 | 9487 | 8573 | 53 | 1e-15 | 0.98 | 1e-15 | 0.29 | 1 | 0 | 0 |
| | 1 | 4 | 1 | 15041 | 8427 | 73 | 1 | 1 | 7.5e-12 | 1 | 0.87 | 0 | 0 |
| | 2 | 1 | 1 | 9802 | 8485 | 52 | 4.3e-14 | 1 | 1e-15 | 0.014 | 1 | 7.37e-10 | 0 |
| | 2 | 1 | 1 | 10288 | 7435 | 53 | 9e-07 | 1 | 1e-15 | 1e-15 | 1 | 0 | 0 |
| | 2 | 1 | 1 | 7583 | 7591 | 37 | 0.64 | 1 | 1e-15 | 1 | 1 | 0 | 0 |
| | 1 | 2 | 1 | 13156 | 10606 | 66 | 0.0096 | 1 | 1e-15 | 1e-15 | 1 | 7.63e-07 | 0 |
| | 2 | 3 | 2 | 8758 | 6300 | 47 | 3.7e-19 | 0.97 | 7e-29 | 8.1e-20 | 1 | 0 | 0 |
| | 2 | 2 | 2 | 9120 | 10092 | 47 | 1.3e-06 | 0.96 | 1.3e-14 | 7e-29 | 1 | 0 | 9.99e-16 |
| | 2 | 2 | 2 | 10596 | 10134 | 55 | 6.6e-16 | 1 | 7e-29 | 7e-29 | 1 | 0 | 0 |
| | 1 | 1 | 1 | 14066 | 12718 | 71 | 1.2e-08 | 1 | 1e-15 | 1e-15 | 1 | 0 | 0 |
| | 2 | 2 | 1 | 8413 | 10581 | 43 | 5.6e-06 | 1 | 1e-15 | 1e-15 | 1 | 0 | 0 |
| | 2 | 2 | 2 | 9978 | 9292 | 51 | 4.9e-08 | 1 | 3.9e-29 | 2e-14 | 1 | 0 | 0 |
| | 1 | 2 | 1 | 11646 | 12114 | 60 | 2.2e-16 | 1 | 1e-15 | 1e-15 | 1 | 0 | 0 |
| | 2 | 2 | 2 | 8415 | 6667 | 43 | 1.1e-10 | 0.94 | 8.6e-19 | 2.1e-09 | 1 | 0 | 0 |
| | 1 | 2 | 1 | 10225 | 7849 | 55 | 1e-15 | 1 | 1e-15 | 1e-15 | 1 | 0 | 0 |
| | 2 | 2 | 2 | 8634 | 8254 | 45 | 3.6e-16 | 1 | 1.1e-23 | 3.5e-26 | 1 | 0 | 0 |
| | 1 | 1 | 1 | 10060 | 10030 | 50 | 0.4 | 1 | 1e-15 | 0.017 | 1 | 1.25e-12 | 0 |
| | 1 | 1 | 1 | 6118 | 6234 | 30 | 0.86 | 1 | 1e-15 | 0.93 | 1 | 1.11e-16 | 3.56e-11 |
| | 1 | 1 | 1 | 15035 | 10164 | 74 | 0.87 | 1 | 4.4e-16 | 0.029 | 0.95 | 2.54e-09 | 1.07e-06 |
| | 2 | 2 | 2 | 8066 | 6649 | 44 | 1.7e-24 | 1 | 7e-29 | 2.3e-24 | 1 | 0 | 0 |
| | 1 | 1 | 1 | 13052 | 8233 | 67 | 1.2e-10 | 0.55 | 1.9e-14 | 3.1e-10 | 1 | NA | NA |
|DM|: number of differentially methylated genes averaged across T ∩ E tumor types;
|DE|: number of differentially expressed genes averaged across T ∩ E tumor types;
(%): percent of differentially methylated target genes out of differentially expressed target genes, averaged across tumor types T ∩ E;
p(DM ∙ DE): p-value testing whether the number of target genes that are both differentially methylated and differentially expressed is larger than expected, using a hypergeometric distribution and combined across tumor types T ∩ E using Fisher’s method.
p(−−), p(+−), p(−+), p(++): p-values testing whether the number of target genes with a “--”, “+-”, “-+”, or “++” pattern of methylation and expression changes is larger than expected, using a hypergeometric distribution and combined across tumor types using Fisher’s method.
p.methyl: median p-value from testing if the number of overlapping target genes that are differentially methylated by the mutation of the CDG between any pair of two tumor types is larger than expected using a hypergeometric distribution.
p.exp: median p-value from testing if the number of overlapping target genes that are differentially expressed by the mutation of the CDG between any pair of two tumor types is larger than expected using a hypergeometric distribution.
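Fisher’s method, which the footnotes above use to combine per-tumor-type p-values, can be sketched as below. This is an illustrative implementation under the textbook definition (the statistic −2 Σ ln p follows a chi-square distribution with 2k degrees of freedom for k independent tests), not the authors' code. The function name and the input values are hypothetical.

```python
from math import exp, log


def fisher_combined_pvalue(pvalues):
    """Combine independent p-values with Fisher's method.

    The statistic X = -2 * sum(ln p) is chi-square with 2k degrees of freedom
    under the null. For even degrees of freedom the survival function has the
    closed form exp(-X/2) * sum_{j=0}^{k-1} (X/2)^j / j!.
    """
    k = len(pvalues)
    if k == 0:
        raise ValueError("at least one p-value is required")
    if any(p <= 0.0 or p > 1.0 for p in pvalues):
        raise ValueError("p-values must lie in (0, 1]")
    half_stat = -sum(log(p) for p in pvalues)  # this is X / 2
    term = 1.0   # (X/2)^j / j! for j = 0
    total = 1.0  # running sum of the series
    for j in range(1, k):
        term *= half_stat / j
        total += term
    return exp(-half_stat) * total


# Hypothetical per-tumor-type p-values for one driver gene.
combined = fisher_combined_pvalue([0.01, 0.20, 0.03])
print(f"combined p-value: {combined:.3g}")
```

The sketch rejects p-values of exactly 0, so inputs reported as 0 would need a small positive floor before combining; the choice of floor is left to the reader.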
Fig. 4 Chromatin remodeling is one of the major mechanisms that amplify the impact of mutations in CDGs through global methylation and gene expression changes