Pierre-Louis Cedoz, Marcos Prunello, Kevin Brennan, Olivier Gevaert.
Abstract
Summary: DNA methylation is an important mechanism regulating gene transcription, and its role in carcinogenesis has been extensively studied. Hyper- and hypomethylation of genes is a major mechanism of gene expression deregulation in a wide range of diseases. At the same time, high-throughput DNA methylation assays have been developed, generating vast amounts of genome-wide DNA methylation measurements. We developed MethylMix, an algorithm implemented in R to identify disease-specific hyper- and hypomethylated genes. Here we present a new version of MethylMix that automates the construction of DNA methylation and gene expression datasets from The Cancer Genome Atlas (TCGA). More precisely, MethylMix 2.0 incorporates two major updates: the automated downloading of DNA methylation and gene expression datasets from TCGA, and the automated preprocessing of these datasets, namely value imputation, batch correction and clustering of CpG sites within each gene. The resulting datasets can subsequently be analyzed with MethylMix to identify transcriptionally predictive methylation states. We show that the Differential Methylation Values created by MethylMix can be used for cancer subtyping. Availability and implementation: MethylMix 2.0 was implemented as an R package and is available from Bioconductor: https://www.bioconductor.org/packages/release/bioc/html/MethylMix.html
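The preprocessing steps named in the abstract (value imputation and clustering of CpG sites within each gene) can be illustrated with a small toy sketch. This is not the MethylMix implementation and does not use the package's R functions: it is plain Python, and the mean-based imputation, the greedy correlation grouping, the `min_corr` threshold of 0.7, the function names and the example values are all assumptions chosen only for illustration. Batch correction is omitted.

```python
from statistics import mean


def impute_probe_means(rows):
    """Replace missing values (None) in each probe row with that probe's mean.

    Assumption: simple mean imputation, used here only as a stand-in.
    """
    imputed = []
    for row in rows:
        observed = [v for v in row if v is not None]
        fill = mean(observed)
        imputed.append([fill if v is None else v for v in row])
    return imputed


def pearson(x, y):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy) if sx and sy else 0.0


def cluster_probes(rows, min_corr=0.7):
    """Greedy grouping: a probe joins the first cluster whose first member
    correlates with it at >= min_corr; otherwise it starts a new cluster.

    Assumption: min_corr=0.7 is an arbitrary illustrative threshold.
    """
    clusters = []  # each cluster is a list of row indices
    for i, row in enumerate(rows):
        for members in clusters:
            if pearson(rows[members[0]], row) >= min_corr:
                members.append(i)
                break
        else:
            clusters.append([i])
    return clusters


def summarize(rows, clusters):
    """Average the probes of each cluster, sample by sample."""
    n_samples = len(rows[0])
    return [
        [mean(rows[i][s] for i in members) for s in range(n_samples)]
        for members in clusters
    ]


# Toy beta values: 3 CpG probes of one hypothetical gene across 4 samples.
probes = [
    [0.10, 0.20, 0.80, 0.90],
    [0.15, None, 0.85, 0.95],  # one missing measurement
    [0.90, 0.80, 0.20, 0.10],  # anti-correlated with the first two probes
]

filled = impute_probe_means(probes)
clusters = cluster_probes(filled)
gene_level = summarize(filled, clusters)
```

In this toy input the first two probes are grouped together and averaged, while the third, anti-correlated probe forms its own cluster, so the hypothetical gene ends up represented by two rows instead of three.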
Year: 2018 PMID: 29668835 PMCID: PMC6129298 DOI: 10.1093/bioinformatics/bty156
Source DB: PubMed Journal: Bioinformatics ISSN: 1367-4803 Impact factor: 6.937
Fig. 1. Subtyping of lung squamous cell carcinoma (LUSC) patients based on MethylMix 2.0 analysis. This heatmap illustrates ‘DM values’ for 638 MethylMix genes (rows) in 503 LUSC patient primary tumors (columns). Patients are ordered by DNA methylation subtype, indicated in the horizontal sidebar. DNA methylation subtypes represent patient groups with distinct DNA methylation profiles that are homogeneous within subtypes. The NSD1-inactivated subtype is highlighted in red, with other (NSD1-proficient) subtypes indicated in grey. Horizontal sidebars indicate the category of each patient with regard to key etiological variables, including NSD1 mutations (white space reflects patients without NSD1 mutation data), smoking status, sex and pathological stage. MethylMix genes (rows) are ordered by hierarchical clustering.