Sheng Li, Francine E Garrett-Bakelman, Altuna Akalin, Paul Zumbo, Ross Levine, Bik L To, Ian D Lewis, Anna L Brown, Richard J D'Andrea, Ari Melnick, Christopher E Mason.
Abstract
BACKGROUND: DNA methylation profiling reveals important differentially methylated regions (DMRs) of the genome that are altered during development or perturbed by disease. To date, few programs exist for regional analysis of enriched or whole-genome bisulfite conversion sequencing data, even though such data are increasingly common. Here, we describe an open-source, optimized method for determining empirically based DMRs (eDMR) from high-throughput sequence data that is applicable to enriched whole-genome methylation profiling datasets, as well as other globally enriched epigenetic modification data.
Year: 2013 PMID: 23735126 PMCID: PMC3622633 DOI: 10.1186/1471-2105-14-S5-S10
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Figure 1. Workflow of DMR analysis. Objects and data frames from the R package methylKit (top, grey), or other DNA methylation base-pair data outputs, can immediately be used in all the functions in eDMR (white, below).
Figure 2. Identification of the optimal cutoff for calling a gap between two DMRs. (A) Histogram of the log2 distance of the nearest CpGs in Sample 1. A spike at zero log2 base pairs distance represents the reverse complement of CpGs (GpC) on the other strand. (B) Bimodal normal distribution fitting on the log2 distance of adjacent CpGs genome-wide in AML sample 1. Two distributions (red and green) are shown that account for two separate data densities (dotted line). (C) Weighted sum of penalty changes (blue line) over log2 distances. The red line is the optimized log2 DMR distance with the lowest weighted penalty from the cost function.
Figure 3. DMR analysis and output of eDMR for leukemia samples. (A) Fitting of the bimodal normal distribution to CpGs common to the IDH AML and normal bone marrow control samples. (B) Fitting of the bimodal normal distribution to CpGs common to the MLL AML and normal bone marrow control samples. Both datasets have similar distributions. (C) The number of hypermethylated (red) and hypomethylated (blue) DMRs identified in each leukemia subtype. (D) Boxplots of the DMR length distributions in both leukemia subtypes. (E) Gene body distributions for CDS (red), introns (mustard), promoters (green), 3'UTRs (blue), and 5'UTRs (purple). (F) CpG island (red) and shore (blue) DMR count distribution in the IDH and MLL AML tumor types.
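
Illustrative sketch. The bimodal fit described in Figure 2B can be illustrated with a two-component Gaussian mixture fitted to log2 inter-CpG distances by expectation-maximization. This is not the eDMR implementation (eDMR is an R package) and it does not reproduce the weighted-penalty cost function of Figure 2C, which is not specified in this record. As a plainly different stand-in, the hypothetical helper `crossover_cutoff` picks the log2 distance at which the two weighted component densities are equal. All function names and the simulated data are assumptions made for illustration only.

```python
import numpy as np


def fit_two_gaussians(x, n_iter=200, tol=1e-8):
    """Fit a two-component 1-D Gaussian mixture to x by EM.

    Returns (weights, means, sigmas), ordered by increasing mean.
    """
    x = np.asarray(x, dtype=float)
    # Crude initialisation: quartiles as means, overall spread as sigma.
    mu = np.array([np.percentile(x, 25), np.percentile(x, 75)])
    sigma = np.array([x.std(), x.std()])
    w = np.array([0.5, 0.5])
    prev_ll = -np.inf
    for _ in range(n_iter):
        # E-step: weighted density of every point under each component.
        z = (x[:, None] - mu) / sigma
        dens = w * np.exp(-0.5 * z ** 2) / (sigma * np.sqrt(2.0 * np.pi))
        total = np.maximum(dens.sum(axis=1), 1e-300)  # guard against underflow
        resp = dens / total[:, None]
        # M-step: re-estimate weights, means and standard deviations.
        nk = resp.sum(axis=0)
        w = nk / x.size
        mu = (resp * x[:, None]).sum(axis=0) / nk
        var = (resp * (x[:, None] - mu) ** 2).sum(axis=0) / nk
        sigma = np.maximum(np.sqrt(var), 1e-6)
        # Stop once the log-likelihood no longer improves.
        ll = np.log(total).sum()
        if abs(ll - prev_ll) < tol:
            break
        prev_ll = ll
    order = np.argsort(mu)
    return w[order], mu[order], sigma[order]


def crossover_cutoff(w, mu, sigma, n_grid=2001):
    """Stand-in cutoff (NOT the eDMR cost function): the point between the
    two means where the weighted component densities are closest to equal."""
    grid = np.linspace(mu[0], mu[1], n_grid)
    z = (grid[:, None] - mu) / sigma
    dens = w * np.exp(-0.5 * z ** 2) / (sigma * np.sqrt(2.0 * np.pi))
    return grid[np.argmin(np.abs(dens[:, 0] - dens[:, 1]))]


# Simulated log2 distances (assumed values, not data from the paper):
# a "short-gap" population and a "long-gap" population.
rng = np.random.default_rng(0)
log2_dist = np.concatenate([
    rng.normal(4.0, 1.0, 6000),
    rng.normal(10.0, 1.5, 4000),
])

w, mu, sigma = fit_two_gaussians(log2_dist)
cutoff = crossover_cutoff(w, mu, sigma)
print("component means (log2 bp):", mu)
print("stand-in cutoff (log2 bp):", cutoff, "=", 2 ** cutoff, "bp")
```

In this sketch, CpGs closer together than the cutoff would be treated as belonging to the same region and larger gaps would separate regions. The published method instead selects the distance that minimizes its weighted penalty.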