Peng Huang, Bin Zhang, Junsheng Zhao, Ming D Li.
Abstract
Recently, emerging evidence has indicated that aberrant enhancers, especially super-enhancers, play pivotal roles in the transcriptional reprogramming of multiple cancers, including hepatocellular carcinoma (HCC). In this study, we performed integrative analyses of ChIP-seq, RNA-seq, and whole-genome bisulfite sequencing (WGBS) data to identify intergenic differentially expressed enhancers (DEEs) and genic differentially methylated enhancers (DMEs), along with their associated differentially expressed genes (DEE/DME-DEGs), both of which were also identified in independent cohorts and further confirmed by HiC data. Functional enrichment and prognostic model construction were conducted to explore the functions and clinical significance of the identified enhancer aberrations. We identified a total of 2,051 aberrant enhancer-associated DEGs (AE-DEGs), which were highly concurrent in multiple HCC datasets. The enrichment results indicated the significant overrepresentations of crucial biological processes and pathways implicated in cancer among these AE-DEGs. A six AE-DEG-based prognostic signature, whose ability to predict the overall survival of HCC was superior to that of both clinical phenotypes and previously published similar prognostic signatures, was established and validated in TCGA-LIHC and ICGC-LIRI cohorts, respectively. In summary, our integrative analysis depicted a landscape of aberrant enhancers and associated transcriptional dysregulation in HCC and established an aberrant enhancer-derived prognostic signature with excellent predictive accuracy, which might be beneficial for the future development of epigenetic therapy for HCC.
Keywords: DNA methylation; RNA-seq; eRNA; enhancer; hepatocellular carcinoma; histone modification; prognostic model; super-enhancer
Year: 2022 PMID: 35300417 PMCID: PMC8921559 DOI: 10.3389/fcell.2022.827657
Source DB: PubMed Journal: Front Cell Dev Biol ISSN: 2296-634X
FIGURE 1. The schematic flowchart of the present study.
FIGURE 2. A comprehensive catalog of enhancers in the liver. (A) Count of active and primed enhancers in each liver-relevant ChIP-seq sample. (B) Length distribution of enhancers in each liver-relevant ChIP-seq sample. (C) Proportions of long enhancers in three types of liver-relevant ChIP-seq samples. (D) Distribution of the concurrence of all merged enhancers among 11 liver-relevant ChIP-seq samples.
Characteristics and enhancer identification strategies applied for 11 liver-relevant ChIP-seq samples.
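| Sample name | Source | H3K4me1 | H3K27ac | Enhancer identification strategy |
|---|---|---|---|---|
| HepG2 | ENCODE ( | √ | √ | H3K4me1 only (primed enhancer) + H3K4me1 and H3K27ac (active enhancer) |
| Hepatocyte | ENCODE ( | √ | √ | H3K4me1 only (primed enhancer) + H3K4me1 and H3K27ac (active enhancer) |
| LiverAdult 1 | ENCODE ( | √ | √ | H3K4me1 only (primed enhancer) + H3K4me1 and H3K27ac (active enhancer) |
| LiverAdult 2 | ENCODE ( | √ | √ | H3K4me1 only (primed enhancer) + H3K4me1 and H3K27ac (active enhancer) |
| LiverAdult 3 | ENCODE ( | √ | √ | H3K4me1 only (primed enhancer) + H3K4me1 and H3K27ac (active enhancer) |
| LiverAdult 4 | ENCODE ( | √ | X | H3K4me1 (primed or active enhancer) |
| LiverAdult 5 | The integrative epigenomic HCC study ( | √ | √ | H3K4me1 only (primed enhancer) + H3K4me1 and H3K27ac (active enhancer) |
| Cirrhosis 1 | The integrative epigenomic HCC study ( | √ | √ | H3K4me1 only (primed enhancer) + H3K4me1 and H3K27ac (active enhancer) |
| Cirrhosis 2 | The integrative epigenomic HCC study ( | √ | √ | H3K4me1 only (primed enhancer) + H3K4me1 and H3K27ac (active enhancer) |
| Tumor 1 | The integrative epigenomic HCC study ( | √ | √ | H3K4me1 only (primed enhancer) + H3K4me1 and H3K27ac (active enhancer) |
| Tumor 2 | The integrative epigenomic HCC study ( | √ | √ | H3K4me1 only (primed enhancer) + H3K4me1 and H3K27ac (active enhancer) |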
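The identification strategy in the table above amounts to a simple rule on histone-mark overlap. A minimal Python sketch of that rule follows, assuming each candidate region already carries flags for overlapping H3K4me1 and H3K27ac peaks. The function name and the flag representation are hypothetical illustrations, not the authors' pipeline.

```python
from typing import Optional


def classify_enhancer(has_h3k4me1: bool, has_h3k27ac: Optional[bool]) -> Optional[str]:
    """Label a candidate region from its histone marks.

    has_h3k27ac=None means H3K27ac was not profiled for the sample
    (the "X" case in the table), so primed and active cannot be separated.
    """
    if not has_h3k4me1:
        return None  # not called as an enhancer under this rule
    if has_h3k27ac is None:
        return "primed or active"
    return "active" if has_h3k27ac else "primed"
```

For a sample with both marks, `classify_enhancer(True, False)` would give `"primed"`; for a sample lacking H3K27ac data, any H3K4me1-marked region is reported as `"primed or active"`.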
FIGURE 3. Distinct patterns of activated and repressed intergenic enhancers in HCC. (A) (left) Distribution of the concurrence of identified intergenic DEEs among five RNA-seq datasets and (right) distribution of the number of associated intergenic DEE-DEGs per DEE. (B) (left) Distribution of the concurrence of identified intergenic DEE-DEGs among five RNA-seq datasets and (right) distribution of the number of associated DEEs per DEE-DEG. (C) Gene clusters associated with aberrant super-enhancers. Only gene clusters with at least five DEE-DEGs and super-enhancers with at least five DEEs are displayed. Activated DEEs and DEE-DEGs are shown in red, and repressed DEEs and DEE-DEGs are shown in blue.
Summary information about two representative sets of DEE-DEGs regulated by intergenic DEE clusters.
| Enhancer cluster | Average LFC.DEE | Average LogFDR.DEE | Gene name | LFC DEG | FDR DEG | Average rho | Average concurrence | Implicated cancers |
|---|---|---|---|---|---|---|---|---|
| chr16:56520399–56864888 (∼344 kb, 35 DEEs) | −3.3 | 6.62 | | − | | | | HCC ( |
| | | | | −2.01 | 1.20E-03 | 0.82 | 3.51 | Unknown |
| | | | | − | | | | HCC ( |
| | | | | − | | | | HCC ( |
| | | | | − | | | | HCC ( |
| | | | | − | | | | HCC ( |
| | | | | − | | | | HCC ( |
| | | | | − | | | | HCC ( |
| | | | | −2.58 | 5.28E-10 | 0.86 | 2.86 | Unknown |
| | | | | − | | | | HCC ( |
| | | | | − | | | | HCC ( |
| | | | | −3.35 | 4.85E-24 | 0.91 | 3.83 | Breast cancer ( |
| chr17:81740217–81874724 (∼134 kb, 14 DEEs) | 1.05 | 4.53 | | | | | | HCC ( |
| | | | | 0.95 | 3.37E-16 | 0.80 | 1.78 | Unknown |
| | | | | 0.82 | 3.53E-11 | 0.79 | 1.22 | Bladder cancer ( |
| | | | | | | | | HCC ( |
| | | | | 0.55 | 8.36E-08 | 0.76 | 1.22 | Breast cancer ( |
| | | | | | | | | HCC ( |
| | | | | | | | | HCC ( |
| | | | | | | | | HCC ( |
| | | | | | | | | HCC ( |
| | | | | 1.07 | 9.12E-15 | 0.74 | 2.00 | Glioblastoma ( |
Notes: Enhancer cluster: the cluster of intergenic DEEs that were simultaneously associated with the corresponding cluster of genes; average LFC.DEE: the arithmetic mean of the log2 fold change of the FPM of all DEEs in the enhancer cluster; average LogFDR.DEE: the arithmetic mean of the −log10 FDR of the differential expression test of all DEEs in the enhancer cluster; average rho: the arithmetic mean of the Spearman correlation coefficients of all DEE-DEG pairs between corresponding DEGs and DEEs in the enhancer cluster; average concurrence: the arithmetic mean of the concurrence of all DEE-DEG pairs between corresponding DEGs and DEEs in the enhancer cluster; implicated cancers: results of a literature search (molecular mechanism studies only) to determine the relevance of each DEE-DEG to cancers (genes implicated in HCC are highlighted in bold).
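The per-cluster summary statistics defined in the notes can be sketched in a few lines of Python. This is an illustrative sketch using only the standard library. The function names and input layout are assumptions, and the rank-based Spearman helper is a generic textbook implementation rather than the authors' code.

```python
import math
from statistics import mean


def average_ranks(values):
    """Ranks starting at 1; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman correlation, computed as the Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


def summarize_cluster(dee_lfc, dee_fdr, pair_rhos):
    """Arithmetic means as defined in the table notes (hypothetical input layout)."""
    return {
        "average_lfc_dee": mean(dee_lfc),                              # mean log2 fold change
        "average_logfdr_dee": mean(-math.log10(p) for p in dee_fdr),  # mean of -log10(FDR)
        "average_rho": mean(pair_rhos),                                # mean Spearman rho
    }
```

Here `pair_rhos` would hold one `spearman_rho` value per DEE-DEG pair, computed across samples from enhancer and gene expression vectors.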
Characteristics of five RNA-seq datasets used in the present study.
| Dataset | No. of tumor tissues | No. of adjacent tissues | Additional data type | Reference (PMID) |
|---|---|---|---|---|
| Discovery cohort | 33 | 33 | WGBS | — |
| GSE77314 | 50 | 50 | — | 27119355 ( |
| GSE124535 | 35 | 35 | — | 30814741 ( |
| GSE148355 | 62 | 47 | — | 33772139 ( |
| GSE77509 | 20 | 20 | — | 28194035 ( |
FIGURE 4. DNA methylation and histone PTM modifier-associated intergenic DEEs and DEE-DEGs. (A) Percentage of DNA methylation-associated intergenic DEEs. (B) Percentage of DNA methylation-associated intergenic DEE-DEGs. (C) Significant differential expression of seven histone modification regulators (three histone methyltransferases, two histone demethylases, and three histone deacetylases) in the discovery cohort. (D) Proportion of histone modification regulator-associated intergenic DEEs. (E) Percentage of histone modification regulator-associated intergenic DEE-DEGs.
FIGURE 5. Identification and validation of genic DMEs and associated DME-DEGs in HCC. (A) Distribution density of the number of associated DME-DEGs of hypermethylated and hypomethylated genic DMEs. (B) Count of four types of genic DME-DEGs. “HyperUp” refers to hypermethylated enhancer-associated upregulated DME-DEGs; “HyperDown” refers to hypermethylated enhancer-associated downregulated DME-DEGs; “HypoUp” refers to hypomethylated enhancer-associated upregulated DME-DEGs; and “HypoDown” refers to hypomethylated enhancer-associated downregulated DME-DEGs. (C) Distribution of the three types of replication results of genic DME-DEGs. “Successful replication” refers to the successful replication of genic DME-DEGs for correlated differential methylation and differential expression in TCGA-LIHC; “type I failure” refers to replication failure due to lack of CpG for the corresponding genic DMEs in TCGA-LIHC; and “type II failure” refers to replication failures except type I failure. (D) Platform-adjusted replication rates of four types of genic DME-DEGs in TCGA-LIHC. Platform-adjusted replication rates were calculated as (CountSuccessful Replication + CountSuccessful Replication + CountType I Failure) * 100%.
FIGURE 6. Biological functions and in silico verification of AE-DEGs. (A) Venn diagram displaying the overlap between genic DME-DEGs and intergenic DEE-DEGs. Their union was defined as aberrant enhancer-associated DEGs (AE-DEGs). (B) and (C) Top ten overrepresented pathways/GO terms of activated AE-DEGs and repressed AE-DEGs, respectively. (D) Enrichment of AE-DEGs for ten cancer hallmarks. “*” refers to hypergeometric test FDR < 0.05; “**” refers to FDR < 1e-2; and “***” refers to FDR < 1e-3. (E) Percentage of AE-DEGs that were successfully validated by TADs in HiC-profiled HepG2 and normal liver samples. “HepG2 or liver” refers to successful validation in HepG2 or the liver sample; “HepG2 and liver” refers to successful validation in both HepG2 and the liver sample.
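The hallmark enrichment in panel (D) relies on a hypergeometric test. A minimal sketch of the one-sided (overrepresentation) p-value is shown below, using only the Python standard library. The function and parameter names are hypothetical, and the multiple-testing correction that turns p-values into FDR values is not specified in this caption, so it is left out.

```python
from math import comb


def hypergeom_pvalue(population: int, annotated: int, sample: int, overlap: int) -> float:
    """P(X >= overlap) when `sample` genes are drawn without replacement
    from `population` genes, of which `annotated` belong to the gene set."""
    upper = min(annotated, sample)
    total = comb(population, sample)
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total
```

In this setting, `population` would be the background gene universe, `annotated` the size of one hallmark gene set, `sample` the number of AE-DEGs, and `overlap` the number of AE-DEGs in that hallmark set.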
FIGURE 7. Construction of a six AE-DEG-based prognostic model for HCC in TCGA-LIHC. (A) The expression heatmap of the six AE-DEGs constituting the identified prognostic model for OS of HCC in TCGA-LIHC. Multivariate Cox regression-derived coefficients used for the calculation of the risk score are given in parentheses. Patients were ranked according to their calculated risk scores. (B) Distribution of the calculated risk scores of HCC patients in TCGA-LIHC. (C) Kaplan-Meier analysis of the six AE-DEG-based prognostic signature in TCGA-LIHC. (D) Distribution of duration and survival status of HCC patients in TCGA-LIHC. (E) Box plots display the comparison of survival times between high- and low-risk HCC patients and the comparison of risk scores between alive and deceased HCC patients in TCGA-LIHC. Wilcoxon p-values were calculated and displayed with each boxplot. (F) Time-dependent ROC analyses of the six AE-DEG-based prognostic signature in TCGA-LIHC. (G) Forest plot of the multivariate Cox regression analysis in TCGA-LIHC.
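A Cox-derived risk score of the kind described in panel (A) is conventionally a weighted sum of gene expression values, with the regression coefficients as weights. The sketch below illustrates that calculation and a high/low-risk split. The gene names and coefficients are placeholders, and the median cutoff is an assumption, since the caption does not state how patients were stratified.

```python
from statistics import median


def risk_score(expression: dict, coefficients: dict) -> float:
    """Weighted sum of expression values; weights are Cox regression coefficients."""
    return sum(coefficients[gene] * expression[gene] for gene in coefficients)


def split_by_median(scores: dict) -> dict:
    """Assign 'high' or 'low' risk using the cohort median as an assumed cutoff."""
    cutoff = median(scores.values())
    return {pid: ("high" if s > cutoff else "low") for pid, s in scores.items()}


# Placeholder coefficients and expression values, for illustration only.
coefs = {"GENE_A": 0.5, "GENE_B": -0.25}
cohort = {
    "patient_1": {"GENE_A": 2.0, "GENE_B": 4.0},
    "patient_2": {"GENE_A": 6.0, "GENE_B": 0.0},
    "patient_3": {"GENE_A": 4.0, "GENE_B": 4.0},
}
scores = {pid: risk_score(expr, coefs) for pid, expr in cohort.items()}
groups = split_by_median(scores)
```

Ranking patients by `scores` reproduces the ordering used for the heatmap, and `groups` supplies the two arms compared in a Kaplan-Meier analysis.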
FIGURE 8. Nomogram for the prediction of overall survival of HCC in TCGA-LIHC. (A) A prognostic nomogram for predicting the probabilities of 1-year, 3-year, and 5-year overall survival of HCC patients in TCGA-LIHC. (B) Calibration plots for evaluation of the predictive performance of the constructed nomogram. (C) Time-dependent ROC curves displaying the comparisons of AUCs among diverse prognostic models.
FIGURE 9. Validation of the six AE-DEG-based prognostic model for OS of HCC in the ICGC-LIRI cohort. (A) The expression heatmap of the six AE-DEGs constituting the identified prognostic model for overall survival of HCC in the ICGC-LIRI cohort. Patients were ranked according to their risk scores. (B) Distribution of the calculated risk scores of HCC patients in ICGC-LIRI. (C) Kaplan-Meier analysis of the six AE-DEG-based prognostic signature in ICGC-LIRI. (D) Distribution of duration and survival status of HCC patients in ICGC-LIRI. (E) Boxplots display the comparison of survival time between high- and low-risk HCC patients and the comparison of risk score between alive and deceased HCC patients in ICGC-LIRI. (F) Time-dependent ROC analysis of the six AE-DEG-based prognostic signature in ICGC-LIRI. (G) Forest plot of the multivariate Cox regression analysis in ICGC-LIRI.
Comparison of the predictive performance of our AE-DEG-based signature with seven previously established prognostic signatures in HCC.
| Signature name | AUC for OS in discovery (1-year) | AUC for OS in discovery (3-year) | AUC for OS in discovery (5-year) | AUC for OS in validation (1-year) | AUC for OS in validation (3-year) | AUC for OS in validation (5-year) |
|---|---|---|---|---|---|---|
| Methylation-driven gene based signature 1 ( | 0.6885 | 0.6563 | 0.6548 | 0.6397 | 0.6644 | 0.5942 |
| Methylation-driven gene based signature 2 ( | 0.742 | 0.661 | — | 0.695 | 0.655 | — |
| Angiogenic gene based signature ( | 0.74 | 0.66 | 0.66 | 0.78 | 0.74 | — |
| EMT related gene based signature ( | 0.824 | 0.798 | 0.800 | 0.688 | 0.674 | 0.876 |
| Ferroptosis and iron-metabolism related gene based signature ( | 0.77 | 0.71 | 0.64 | 0.67 | 0.73 | — |
| Differentially expressed gene signature ( | 0.77 | 0.73 | 0.72 | 0.63 | 0.68 | 0.65 |
| Hypoxia related gene based signature ( | 0.78 | 0.70 | 0.70 | 0.75 | 0.77 | 0.77 |
| Our AE-DEG based signature | 0.783 | 0.797 | 0.715 | 0.795 | 0.756 | 0.800 |