Michael P Trimarchi, Mark Murphy, David Frankhouser, Benjamin A T Rodriguez, John Curfman, Guido Marcucci, Pearlly Yan, Ralf Bundschuh.
Abstract
BACKGROUND: DNA methylation is an important epigenetic mark, and dysregulation of DNA methylation is associated with many diseases, including cancer. Advances in next-generation sequencing now allow unbiased methylome profiling of entire patient cohorts, greatly facilitating biomarker discovery and presenting new opportunities to understand the biological mechanisms by which changes in methylation contribute to disease. Enrichment-based sequencing assays such as MethylCap-seq are a cost-effective solution for genome-wide determination of methylation status, but the technical reliability of methylation reconstruction from raw sequencing data has not been well characterized.
Year: 2012 PMID: 23281662 PMCID: PMC3535705 DOI: 10.1186/1471-2164-13-S8-S6
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Differentially methylated regions, endometrial tumors vs. nonmalignant endometrial tissue
| Genomic feature | All samples | Samples passing QC only |
|---|---|---|
| CpG islands | 4717 | 7541 |
| Promoter-associated | 3806 | 3980 |
| CpG shores | 7515 | 15371 |
| Promoters | 314 | 6803 |
Figure 1. QC exclusion criteria reduce noise in methylation signal. The percentages of uniquely aligned reads falling in 500 bp bins containing no CpG dinucleotides, pre- and post-QC analysis, are plotted as a standard boxplot for samples prior to QC filtering, samples that passed QC, and samples that did not pass QC. An input from a sample that was not subjected to methylation capture is included for reference. The number of samples in each group is shown above the baseline. Values for replicate lanes in each group were averaged, and samples were statistically compared using a Wilcoxon rank-sum test. Whiskers indicate the 10th and 90th percentiles. 13.5% of 500 bp bins in the genome are classified as CpG barren.
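The noise measure plotted in Figure 1 can be illustrated with a minimal Python sketch. It assumes each 500 bp bin is available as a sequence string together with its count of uniquely aligned reads; the function names `count_cpg` and `percent_reads_in_barren_bins` are hypothetical and are not taken from the authors' pipeline.

```python
from typing import Sequence


def count_cpg(seq: str) -> int:
    """Count CpG dinucleotides (a C immediately followed by a G)."""
    s = seq.upper()
    return sum(1 for i in range(len(s) - 1) if s[i:i + 2] == "CG")


def percent_reads_in_barren_bins(
    bin_sequences: Sequence[str],
    bin_read_counts: Sequence[int],
) -> float:
    """Percentage of reads that fall in bins containing no CpG dinucleotide.

    Both inputs are assumed to be in the same bin order (hypothetical layout).
    """
    if len(bin_sequences) != len(bin_read_counts):
        raise ValueError("one read count is required per bin")
    total = sum(bin_read_counts)
    if total == 0:
        raise ValueError("no aligned reads")
    barren = sum(
        n for seq, n in zip(bin_sequences, bin_read_counts)
        if count_cpg(seq) == 0
    )
    return 100.0 * barren / total


if __name__ == "__main__":
    # Toy bins (4 bp instead of 500 bp) purely for illustration.
    seqs = ["ATAT", "CGCG", "TTTT", "ACGA"]
    reads = [5, 80, 5, 10]
    print(percent_reads_in_barren_bins(seqs, reads))
```

A lower percentage suggests that fewer reads land where no methylated CpG could have been captured. The group comparison by Wilcoxon rank-sum test is not reproduced here.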
Figure 2. Replicate sequencing lanes for MethylCap-seq experiments correlate highly. Replicate lanes for each sample were randomly assigned to two partitions, and the average rpm values of 6000 (of 6 M) randomly selected 500 bp bins were compared between partitions.
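The partition comparison described for Figure 2 might be sketched as follows. This is an illustration under stated assumptions rather than the authors' code: each lane is a list of per-bin rpm values in a shared bin order, agreement is summarized with a Pearson correlation (the caption does not name the statistic), and `replicate_correlation` is a hypothetical name.

```python
import random
from math import sqrt
from typing import List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def replicate_correlation(
    lanes: List[Sequence[float]], n_bins_sampled: int, seed: int = 0
) -> float:
    """Split lanes randomly into two partitions and correlate mean rpm.

    lanes: per-lane rpm values, one entry per 500 bp bin (assumed layout).
    """
    if len(lanes) < 2:
        raise ValueError("at least two replicate lanes are required")
    rng = random.Random(seed)
    order = list(range(len(lanes)))
    rng.shuffle(order)
    half = len(order) // 2
    part_a = [lanes[i] for i in order[:half]]
    part_b = [lanes[i] for i in order[half:]]
    n_bins = len(lanes[0])
    # Sample a random subset of bins, e.g. 6000 of ~6 M in the caption.
    bins = rng.sample(range(n_bins), min(n_bins_sampled, n_bins))
    mean_a = [sum(lane[b] for lane in part_a) / len(part_a) for b in bins]
    mean_b = [sum(lane[b] for lane in part_b) / len(part_b) for b in bins]
    return pearson(mean_a, mean_b)


if __name__ == "__main__":
    toy_lanes = [[1.0, 2.0, 3.0, 4.0], [2.1, 3.9, 6.2, 7.8], [0.9, 2.2, 2.8, 4.1]]
    print(replicate_correlation(toy_lanes, n_bins_sampled=4))
```

Fixing the seed keeps the random partition and bin sample reproducible between runs.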
Figure 3. Additional lanes of sequencing data moderately increase saturation and greatly increase 5X CpG coverage. Variation in CpG enrichment (A), saturation (B), and 5X coverage (C) was assessed for 15 lanes of data in the ovarian study corresponding to 7 samples by generating plots of individual lanes and combined replicate lanes for each sample. (D) Average percent deviation of the individual lanes from the combined lane for each sample was plotted for each parameter. Error bars for (D) represent standard error. Asterisks represent Student t-test p < 0.05.
Figure 4. Global methylation indicator scales inversely with read counts from a spiked-in methylated plasmid. The pIRES2-EGFP plasmid was in vitro methylated and spiked in at a set concentration into each of 14 samples from the decitabine study prior to sequencing. After sequencing, the global methylation indicator (GMI) was calculated and plotted against the inverse of the number of normalized reads aligning to the plasmid, and a linear best fit was drawn through the points (p = 0.036, R² = 0.318).
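The fit described for Figure 4 can be sketched as an ordinary least-squares line of GMI against the inverse of the normalized plasmid read count. The sketch below assumes GMI values have already been computed elsewhere (its definition is not given in this record), and the names `fit_line` and `gmi_vs_inverse_reads` are hypothetical. The p-value is not computed.

```python
from typing import Sequence, Tuple


def fit_line(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float, float]:
    """Ordinary least-squares fit y = intercept + slope * x.

    Returns (slope, intercept, r_squared).
    """
    n = len(x)
    if n < 2 or n != len(y):
        raise ValueError("need two or more paired observations")
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    ss_res = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - my) ** 2 for b in y)
    return slope, intercept, 1.0 - ss_res / ss_tot


def gmi_vs_inverse_reads(
    gmi: Sequence[float], plasmid_reads: Sequence[float]
) -> Tuple[float, float, float]:
    """Fit GMI against 1 / (normalized reads aligning to the spike-in)."""
    inverse_reads = [1.0 / r for r in plasmid_reads]
    return fit_line(inverse_reads, gmi)


if __name__ == "__main__":
    # Made-up numbers for illustration only; not data from the study.
    reads = [1.0, 2.0, 4.0, 5.0]
    gmi = [3.1, 1.9, 1.6, 1.3]
    slope, intercept, r2 = gmi_vs_inverse_reads(gmi, reads)
    print(slope, intercept, r2)
```

A positive slope in this parameterization corresponds to the inverse relationship in the caption: samples with more spike-in reads have lower GMI.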