Hagen Klett, Yesilda Balavarca, Reka Toth, Biljana Gigic, Nina Habermann, Dominique Scherer, Petra Schrotz-King, Alexis Ulrich, Peter Schirmacher, Esther Herpel, Hermann Brenner, Cornelia M Ulrich, Karin B Michels, Hauke Busch, Melanie Boerries.
Abstract
DNA methylation is recognized as one of several epigenetic regulators of gene expression and as a potential driver of carcinogenesis through gene-silencing of tumor suppressors and activation of oncogenes. However, abnormal methylation, even of promoter regions, does not necessarily alter gene expression levels, especially if the gene is already silenced, leaving the exact regulatory mechanisms of methylation unresolved. Using a large cohort of matching DNA methylation and gene expression samples of colorectal cancer (CRC; n = 77) and normal adjacent mucosa tissues (n = 108), we investigated the regulatory role of methylation on gene expression. We show that on a subset of genes enriched in common cancer pathways, methylation is significantly associated with gene regulation through gene-specific mechanisms. We built two classification models to infer gene regulation in CRC from methylation differences between tumor and normal tissues, taking into account both gene-silencing and gene-activation effects through hyper- and hypo-methylation of CpGs. The classification models achieve high prediction performance in both training and independent CRC testing cohorts (0.92 < AUC < 0.97) as well as in individual patient data (average AUC = 0.82), suggesting a robust interplay between methylation and gene regulation. Validation analysis in other cancerous tissues resulted in lower prediction performance (0.69 < AUC < 0.90); however, it identified genes that share robust dependencies across cancerous tissues. In conclusion, we present a robust classification approach that predicts gene-specific regulation through DNA methylation in CRC tissues, with possible transfer to other cancer entities. Furthermore, we present HMGA1 as consistently associated with methylation across cancers, suggesting a potential candidate for DNA methylation-targeting cancer therapy.
Keywords: DNA methylation; Epigenetic regulation; HMGA1; colorectal cancer; gene expression; prediction model
Year: 2018 PMID: 29697014 PMCID: PMC6140810 DOI: 10.1080/15592294.2018.1460034
Source DB: PubMed Journal: Epigenetics ISSN: 1559-2294 Impact factor: 4.528
Figure 1. Principal component analysis (PCA) of (A) normalized DNA methylation (M-values) and (B) log2-transformed gene expression data. Normal samples are shown in purple and tumor samples in green. Samples clustering within the other group (separated by the dashed line) are labeled with their sample ID. C: Number of significantly differentially methylated CpG sites (Δβ > 0.1 and FDR < 0.001), ordered by location and island relation (hyper- and hypo-methylation in CRC compared to normal tissues). D: Number of significantly differentially regulated genes (log2FC > 0.5 and FDR < 0.01) between CRC and normal tissues (up- and down-regulation in CRC). E: Overlap between differential methylation (≥2 significantly differentially methylated CpGs per gene) and significant gene regulation.
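As an illustration of the quantities named in the Figure 1 caption, the following minimal sketch converts β-values to M-values and applies the Δβ and FDR thresholds to call hyper- and hypo-methylated CpGs. It rests on assumptions not stated in the caption: that the Δβ threshold is applied to the absolute value, and that FDR correction uses the Benjamini-Hochberg procedure. The function names and the example CpG IDs are hypothetical.

```python
import math

def beta_to_m(beta, eps=1e-6):
    """Convert a methylation beta value (0..1) to an M-value: log2(beta / (1 - beta))."""
    b = min(max(beta, eps), 1 - eps)  # clamp to avoid log(0) and division by zero
    return math.log2(b / (1 - b))

def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order (assumed FDR method)."""
    n = len(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity of the adjusted values.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * n / rank)
        adjusted[i] = running_min
    return adjusted

def call_differential_cpgs(records, min_delta=0.1, max_fdr=0.001):
    """records: list of (cpg_id, delta_beta, p_value), delta_beta = tumor minus normal.

    Returns {cpg_id: "hyper" | "hypo"} for CpGs passing both thresholds.
    Assumption: the delta-beta threshold is applied to the absolute value.
    """
    fdr = benjamini_hochberg([p for _, _, p in records])
    calls = {}
    for (cpg_id, delta, _), q in zip(records, fdr):
        if abs(delta) > min_delta and q < max_fdr:
            calls[cpg_id] = "hyper" if delta > 0 else "hypo"
    return calls

# Hypothetical example data, not taken from the study.
example = [("cg1", 0.25, 1e-8), ("cg2", -0.30, 1e-9), ("cg3", 0.05, 1e-10), ("cg4", 0.40, 0.5)]
calls = call_differential_cpgs(example)
```

In this toy input, `cg3` is dropped for its small Δβ and `cg4` for its large adjusted p-value, leaving one hyper- and one hypo-methylated site.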
Figure 2. Proportions of hyper- and hypo-methylated regions that display a significant relationship (FDR-corrected P value < 0.05 and |ϱ| > 0.2) to gene expression for the promoter (A) and the gene body (B). C: Methylation pattern of all genes (8491) that contain at least one methylation region significantly correlated with its gene expression values.
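The correlation criterion in the Figure 2 caption can be sketched as a Spearman rank correlation between a region's methylation values and the gene's expression values across samples. The version below is a minimal standard-library sketch: it computes ϱ with average ranks for ties and checks only the |ϱ| > 0.2 threshold. The P value and its FDR correction are omitted here, and the function names and example values are hypothetical.

```python
import math

def average_ranks(values):
    """Return 1-based ranks, assigning tied values the mean of their rank positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1  # average 1-based rank of the tie group
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks

def pearson(x, y):
    """Pearson correlation of two equal-length sequences with non-zero variance."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))

def passes_rho_threshold(methylation, expression, min_abs_rho=0.2):
    """Check only the |rho| criterion; significance testing is not covered here."""
    return abs(spearman(methylation, expression)) > min_abs_rho

# Hypothetical per-sample values for one region and one gene.
rho = spearman([0.1, 0.2, 0.3, 0.4, 0.5], [10.0, 8.0, 6.0, 4.0, 1.0])
```

A negative ϱ, as in this toy input, corresponds to the "negatively correlated" gene subsets referred to in the later figures.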
Figure 3. A: Workflow of training Random Forest classification models on different subsets of genes. Prediction performances (AUCs) obtained from three-times-repeated 10-fold cross-validation for different subsets of genes according to log2FC and Spearman correlation coefficient thresholds. Below, the importance of predictors across all prediction models is shown (scaled to [0, 1]). B: Subsets of negatively and positively correlated genes. C: Subsets of negatively correlated genes. D: Subsets of positively correlated genes.
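The resampling scheme named in the Figure 3 caption, three-times-repeated 10-fold cross-validation, can be sketched independently of the classifier. The generator below only produces train/test index splits; fitting a Random Forest on each split is left out, and the shuffling strategy, seed, and function name are assumptions for illustration rather than the authors' implementation.

```python
import random

def repeated_kfold(n_samples, n_folds=10, n_repeats=3, seed=0):
    """Yield (repeat, fold, train_indices, test_indices) for repeated k-fold CV.

    Each repeat reshuffles the sample indices and partitions them into
    n_folds disjoint test folds; the remaining indices form the training set.
    """
    rng = random.Random(seed)
    for repeat in range(n_repeats):
        indices = list(range(n_samples))
        rng.shuffle(indices)
        # Striding gives folds whose sizes differ by at most one.
        folds = [indices[i::n_folds] for i in range(n_folds)]
        for f, test_idx in enumerate(folds):
            train_idx = [i for g, fold in enumerate(folds) if g != f for i in fold]
            yield repeat, f, train_idx, test_idx

# Hypothetical number of genes; each yielded split would be used to fit and score one model.
splits = list(repeated_kfold(n_samples=25))
```

With the defaults this yields 30 splits (3 repeats × 10 folds); averaging the per-split AUCs gives one cross-validated performance estimate per gene subset.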
Figure 4. A: Left panel: Heatmap of methylation profiles (Δβ values) for all genes from the subsets of negatively (327) and positively (129) correlated genes (see Figure 3C and D; black boxes) with regard to their regulation, prediction outcome, and prediction model association. The black box represents an example of gene-specific regulation (up and down) from similar methylation patterns. Right panel: Detailed view of 84 cancer-associated genes (cancer genes from Bushman's Lab, Suppl. Table 2) and their associated function in cancer. B: Significantly enriched consensus pathways (P value < 0.05) in negatively correlated (purple nodes), positively correlated (grey nodes), and both (yellow nodes) genes. Edges are drawn if pathways share 30% of their genes. Node and font size are proportional to the size of the gene sets.
Figure 5. A: ROC curves of the validation analysis of the prediction models for negatively correlated (solid line) and positively correlated (dashed line) genes in independent CRC methylation (meth) and gene expression (exp) data. B: Average gene regulation prediction performance (AUC) and its standard deviation for 16 individual CRC patients from the TCGA repository.
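The AUC values summarized in Figure 5 can be understood through the rank interpretation of the ROC curve: the AUC is the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. The sketch below computes this directly and then summarizes per-patient AUCs as mean and standard deviation. The labels, scores, and per-patient values are hypothetical and are not data from the study.

```python
from statistics import mean, stdev

def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positive and negative scores.

    labels: 1 for the positive class (e.g. up-regulated), 0 for the negative class.
    scores: predicted scores, where higher means more likely positive.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos) * len(neg))

# Hypothetical predictions for four genes.
auc = roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1])

# Hypothetical per-patient AUCs, summarized as in a mean +/- SD plot.
patient_aucs = [0.80, 0.85, 0.78, 0.84]
summary = (mean(patient_aucs), stdev(patient_aucs))
```

The pairwise loop is quadratic in the number of cases, which is acceptable for a few hundred genes; a rank-sum formulation would scale better for larger inputs.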
Figure 6. A: ROC curves of the validation analysis of the prediction models for negatively correlated (solid line) and positively correlated (dashed line) genes in BRCA, LUAD, and THCA data. B: Overlap between the subsets of negatively and positively correlated genes used in the prediction models of CRC, BRCA, LUAD, and THCA (see ROC curves in Figure 6A). C: Illumina 450K methylation profile of the HMGA1 gene. Hypermethylated CpGs in cancer tissues are shown in yellow and hypomethylated CpGs in blue. The size of the methylation sites corresponds to the significance of the Spearman correlation coefficient between HMGA1 gene expression and methylation levels at the respective loci. D: Mutational frequencies of HMGA1, MT1E, AGR2, FAS, and NFE2L3 across more than 9000 cancer patients from TCGA.