Majed Mohamed Magzoub, Marcos Prunello, Kevin Brennan, Olivier Gevaert.
Abstract
Aberrant DNA methylation disrupts normal gene expression in cancer and broadly contributes to oncogenesis. We previously developed MethylMix, a model-based algorithmic approach to identify epigenetically regulated driver genes. MethylMix identifies genes where methylation likely plays a functional role by using transcriptomic data to select only methylation events that can be linked to changes in gene expression. However, because proteins more closely link genotype to phenotype, recent high-throughput proteomic data provide an opportunity to identify functionally relevant abnormal methylation events more accurately. Here we present a MethylMix analysis that refines nominations for epigenetic driver genes by leveraging quantitative high-throughput proteomic data to select only genes where DNA methylation is predictive of protein abundance. Applying our algorithm across three cancer cohorts, we find that using protein abundance data narrows the candidate nominations, suggesting that the effect of DNA methylation is often buffered at the protein level. Next, we find that MethylMix genes predictive of protein abundance are enriched for biological processes involved in cancer, including functions involved in epithelial and mesenchymal transition. Moreover, our results are also enriched for tumor markers that are predictive of clinical features such as tumor stage, and we find that clustering using MethylMix genes predictive of protein abundance captures cancer subtypes.
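The selection step described in the abstract (keeping only genes whose DNA methylation is predictive of protein abundance) can be illustrated with a minimal sketch. This is not the MethylMix implementation: the use of a simple per-gene least-squares fit, the requirement of a negative slope, the `min_r_squared` cutoff, the function names, and the toy data are all assumptions made for illustration.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least squares fit of y on x; returns (slope, r_squared)."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        return 0.0, 0.0  # constant input: no usable relationship
    return sxy / sxx, (sxy * sxy) / (sxx * syy)


def select_predictive_genes(methylation, protein, min_r_squared=0.1):
    """Keep genes whose methylation is inversely related to protein abundance.

    min_r_squared is a hypothetical cutoff chosen for illustration only.
    """
    selected = []
    for gene, meth_values in methylation.items():
        if gene not in protein:
            continue  # no proteomic measurement for this gene
        slope, r_squared = fit_line(meth_values, protein[gene])
        if slope < 0 and r_squared >= min_r_squared:
            selected.append(gene)
    return selected


# Toy data: made-up values for two hypothetical genes across five samples.
methylation = {
    "GENE_A": [0.1, 0.3, 0.5, 0.7, 0.9],
    "GENE_B": [0.2, 0.4, 0.6, 0.8, 0.5],
}
protein = {
    "GENE_A": [2.0, 1.6, 1.1, 0.7, 0.2],
    "GENE_B": [1.0, 1.1, 0.9, 1.1, 1.0],
}
selected = select_predictive_genes(methylation, protein)
```

In this toy input only `GENE_A` shows a clear inverse relationship, so it is the only gene retained. Replacing the `protein` dictionary with expression values would give the analogous transcript-level filter.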
Year: 2019 PMID: 31356589 PMCID: PMC6695193 DOI: 10.1371/journal.pcbi.1007245
Source DB: PubMed Journal: PLoS Comput Biol ISSN: 1553-734X Impact factor: 4.475
Fig 1. Workflow for MethylMix-GE and MethylMix-PA, which identify genes that are differentially methylated between cancer and normal tissues and whose methylation has a significant relationship with gene expression and protein abundance, respectively.
Overview of the number of genes, CpG clusters, and samples used for each TCGA cancer site analysis.
| Cancer | N Genes | N CpG Clusters | N Samples: Gene Expression & Protein Abundance | N Samples: Tumor Tissue Methylation | N Samples: Normal Tissue Methylation |
|---|---|---|---|---|---|
| BRCA | 6248 | 9221 | 76 | 972 | 123 |
| COADREAD | 2855 | 4299 | 85 | 614 | 78 |
| OV | 3876 | 5402 | 149 | 582 | 8 |
Fig 2. Visualization of MethylMix-PA and MethylMix-GE genes.
A) Methylation of hyper- and hypomethylated genes for each of the three cancer sites: breast cancer (BRCA), colorectal cancer (COADREAD) and ovarian cancer (OV). Red circles: uniquely MethylMix-GE genes, blue circles: uniquely MethylMix-PA genes, and purple circles: overlapping genes between MethylMix-GE and MethylMix-PA. B) Venn diagrams comparing the number of reported genes that are differentially methylated and functionally predictive for MethylMix-GE and MethylMix-PA.
Fig 3. Correlation analysis between gene expression and protein abundance for the three cancer sites: breast cancer (BRCA), colorectal cancer (COADREAD) and ovarian cancer (OV).
A) Regression analysis between gene expression and protein abundance. Regression line in purple with confidence interval. Red circles: uniquely MethylMix-GE genes, blue circles: uniquely MethylMix-PA genes, and purple circles: overlapping genes between MethylMix-GE and MethylMix-PA. B) Histogram of correlation between gene expression and protein abundance across samples. Red line: MethylMix-GE genes, blue line: MethylMix-PA genes, and purple line: overlapping genes between MethylMix-GE and MethylMix-PA, showing that MethylMix-PA genes have higher correlation.
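The per-gene correlation summarized in panel B can be sketched as follows. The caption does not say which correlation coefficient was used, so Pearson correlation is an assumption here, and the gene names and values are invented toy data.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (NaN if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        return float("nan")
    return sxy / sqrt(sxx * syy)


def per_gene_correlation(expression, protein):
    """Correlate expression with protein abundance across samples, gene by gene."""
    return {
        gene: pearson(values, protein[gene])
        for gene, values in expression.items()
        if gene in protein
    }


# Toy data: hypothetical genes measured in the same four samples.
expression = {"GENE_A": [1.0, 2.0, 3.0, 4.0], "GENE_B": [1.0, 2.0, 3.0, 4.0]}
protein = {"GENE_A": [2.0, 4.0, 6.0, 8.0], "GENE_B": [4.0, 3.0, 2.0, 1.0]}
correlations = per_gene_correlation(expression, protein)
```

Binning the resulting values separately for each gene list (MethylMix-GE only, MethylMix-PA only, overlapping) would give histograms comparable in spirit to those in the figure.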
Gene set enrichment analysis results for each cancer site including MethylMix-GE and MethylMix-PA genes, showing only results where the MethylMix-PA adjusted P-value<0.10.
Complete results are in S3 Table. Genes in bold are specific to the MethylMix-PA analysis.
| Gene Ontology term | Genes | MethylMix gene list | Nr of genes overlap | Adjusted P Value |
|---|---|---|---|---|
| BRCA | | | | |
| regulation of leukocyte activation | BCL2, CARD11, CBFB, | pa | 9 | 0.10 |
| | | ge | 27 | 0.08 |
| fatty acid metabolic process | ACAA2, | pa | 14 | 0.09 |
| | | ge | 27 | 0.01 |
| leukocyte cell-cell adhesion | CARD11, CBFB, | pa | 9 | 0.03 |
| | | ge | 22 | 0.05 |
| interleukin-2 production | CARD11, | pa | 4 | 0.10 |
| | | ge | 6 | 0.37 |
| response to interferon-gamma | | pa | 6 | 0.09 |
| | | ge | 16 | 0.04 |
| T cell activation | BCL2, CARD11, CASP8, CBFB, | pa | 15 | 0.10 |
| | | ge | 31 | 0.01 |
| small molecule catabolic process | ACAA2, | pa | 13 | 0.08 |
| | | ge | 35 | 0.01 |
| positive regulation of cell activation | CBFB, | pa | 8 | 0.06 |
| | | ge | 13 | 0.03 |
| leukocyte migration | APOD, | pa | 15 | 0.09 |
| | | ge | 14 | 0.07 |
| regulation of peptidase activity | | pa | 9 | 0.09 |
| | | ge | 15 | 0.07 |
| STAT cascade | AKR1B1, | pa | 5 | 0.06 |
| | | ge | 9 | 0.16 |
| cell-cell adhesion via plasma-membrane adhesion molecules | APOA1, CDH13, CDH5, | pa | 5 | 0.09 |
| | | ge | 9 | 0.26 |
| COADREAD | | | | |
| cell-cell adhesion via plasma-membrane adhesion molecules | APOA1, | pa | 4 | 0.03 |
| | | ge | 6 | 0.01 |
| OV | | | | |
| cellular modified amino acid metabolic process | | pa | 9 | 0.05 |
| | | ge | 7 | 0.43 |
| I-kappaB kinase/NF-kappaB signaling | | pa | 10 | 0.05 |
| | | ge | 12 | 0.05 |
| interleukin-1 production | APOA1, | pa | 5 | 0.05 |
| | | ge | 6 | 0.01 |
| response to type I interferon | | pa | 6 | 0.09 |
| | | ge | 6 | 0.14 |
| benzene-containing compound metabolic process | | pa | 4 | 0.03 |
| | | ge | 2 | 0.47 |
Overlap between MethylMix-GE and MethylMix-PA genes and tumor progression markers, assessed using Fisher's exact test.
| | Cancer | model | met drivers | overlap | percentage | p.value |
|---|---|---|---|---|---|---|
| A. Tumor Stage | BRCA | MethylMix-GE | 148 | 10 | 7.5% | 9.83E-04 |
| | | MethylMix-PA | 46 | 5 | 12.2% | 2.38E-03 |
| | COADREAD | MethylMix-GE | 125 | 15 | 12.9% | 1.74E-06 |
| | | MethylMix-PA | 28 | 4 | 14.8% | 3.86E-04 |
| | OV | MethylMix-GE | 70 | 2 | 3.3% | 2.42E-02 |
| | | MethylMix-PA | 52 | 2 | 4.4% | 2.43E-02 |
| B. Tumor Size | BRCA | MethylMix-GE | 148 | 28 | 21.1% | 2.13E-11 |
| | | MethylMix-PA | 46 | 12 | 29.3% | 1.55E-08 |
| | COADREAD | MethylMix-GE | 125 | 3 | 2.6% | 1.69E-01 |
| | | MethylMix-PA | 28 | 2 | 7.4% | 4.25E-02 |
| | OV | MethylMix-GE | 70 | 4 | 6.6% | 3.51E-01 |
| | | MethylMix-PA | 52 | 4 | 8.9% | 1.76E-01 |
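An overlap test of this kind can be sketched as a one-sided Fisher's exact test, which is the upper tail of a hypergeometric distribution. The record does not state the background gene universe, the size of the marker lists, or whether the test was one- or two-sided, so the one-sided form, the function name, and the toy numbers below are assumptions and do not reproduce the table.

```python
from math import comb


def enrichment_p_value(universe, markers, selected, overlap):
    """One-sided Fisher's exact test for over-representation.

    universe: number of genes in the background
    markers:  number of background genes that are tumor progression markers
    selected: number of genes in the list being tested
    overlap:  number of selected genes that are also markers
    Returns P(X >= overlap) for X ~ Hypergeometric(universe, markers, selected).
    """
    total = comb(universe, selected)
    upper = min(markers, selected)
    tail = sum(
        comb(markers, i) * comb(universe - markers, selected - i)
        for i in range(overlap, upper + 1)
    )
    return tail / total


# Toy numbers (hypothetical): 10 background genes, 5 markers, 5 selected genes.
p_all_overlap = enrichment_p_value(universe=10, markers=5, selected=5, overlap=5)
p_no_overlap = enrichment_p_value(universe=10, markers=5, selected=5, overlap=0)
```

With every selected gene being a marker, only one of the 252 possible gene lists is that extreme, so the p-value is 1/252. With an overlap of zero the tail covers every outcome and the p-value is 1.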
Summary statistics from consensus clustering analysis across K = 2–6 for each cancer; we report inter- and intra-cluster scores along with PAC score.
| Cancer | K | MethylMix-GE Intra | MethylMix-GE Inter | MethylMix-GE PAC | MethylMix-PA Intra | MethylMix-PA Inter | MethylMix-PA PAC |
|---|---|---|---|---|---|---|---|
| BRCA | 2 | 99.6 | 28.6 | 0.013 | 99.8 | 27.6 | 0.010 |
| | 3 | 94.5 | 21.8 | 0.326 | 90.0 | 24.2 | 0.355 |
| | 4 | 92.3 | 15.4 | 0.190 | 88.9 | 16.0 | 0.222 |
| | 5 | 84.1 | 13.5 | 0.233 | 81.5 | 14.2 | 0.230 |
| | 6 | 81.7 | 11.8 | 0.235 | 79.5 | 11.9 | 0.252 |
| COADREAD | 2 | 97.0 | 27.4 | 0.091 | 97.3 | 26.6 | 0.081 |
| | 3 | 71.0 | 26.1 | 0.554 | 85.3 | 21.4 | 0.410 |
| | 4 | 77.2 | 18.1 | 0.387 | 68.5 | 19.0 | 0.405 |
| | 5 | 73.3 | 14.9 | 0.358 | 70.6 | 15.6 | 0.366 |
| | 6 | 72.1 | 12.3 | 0.291 | 72.0 | 12.5 | 0.333 |
| OV | 2 | 94.1 | 28.7 | 0.150 | 93.3 | 29.3 | 0.205 |
| | 3 | 86.4 | 21.4 | 0.291 | 82.3 | 22.1 | 0.445 |
| | 4 | 61.1 | 20.1 | 0.497 | 68.3 | 18.6 | 0.470 |
| | 5 | 60.2 | 16.5 | 0.454 | 57.5 | 16.1 | 0.461 |
| | 6 | 62.0 | 13.5 | 0.395 | 54.1 | 13.7 | 0.402 |
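The PAC (proportion of ambiguous clustering) column measures the fraction of sample pairs whose consensus index is neither clearly 0 nor clearly 1, so lower values indicate a more stable clustering. In the table, K = 2 gives the lowest PAC for every cancer and both gene lists. A minimal sketch of the computation follows; the record does not state the interval used, so the conventional bounds of 0.1 and 0.9 are an assumption, as are the function name and the toy matrices.

```python
def pac_score(consensus, lower=0.1, upper=0.9):
    """Proportion of ambiguous clustering for a symmetric consensus matrix.

    consensus[i][j] is the fraction of resampling runs in which samples i and j
    were placed in the same cluster. PAC = CDF(upper) - CDF(lower), computed
    over the distinct sample pairs (i < j).
    The default bounds are assumed conventional values, not taken from the source.
    """
    n = len(consensus)
    values = [consensus[i][j] for i in range(n) for j in range(i + 1, n)]
    if not values:
        raise ValueError("need at least two samples")
    ambiguous = sum(1 for v in values if lower < v <= upper)
    return ambiguous / len(values)


# Toy consensus matrices for four samples (made-up values).
clean = [
    [1.0, 1.0, 0.0, 0.0],
    [1.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 1.0],
    [0.0, 0.0, 1.0, 1.0],
]
noisy = [
    [1.0, 1.0, 0.0, 0.5],
    [1.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 1.0],
    [0.5, 0.0, 1.0, 1.0],
]
pac_clean = pac_score(clean)
pac_noisy = pac_score(noisy)
```

The clean matrix has no ambiguous pairs, while the noisy one has a single ambiguous pair out of six.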
Fig 4. Consensus clustering and methylation profiles for three cancer sites at K = 2.
(A) breast cancer (BRCA); (B) colorectal cancer (COADREAD); (C) ovarian cancer (OV). For each cancer site: Left panel: visualization of the consensus clustering with blue indicating high consensus and white indicating low consensus. Middle panels: association with published gene expression and protein abundance subtypes, and additional molecular and clinical features for each cancer, followed by association with somatic mutation status. Statistically significant overlaps, found using Chi-squared and Kruskal-Wallis tests, are marked with asterisks.