Jerry Guintivano, Michal Arad, Kellie L K Tamashiro, Todd D Gould, Zachary A Kaminsky.
Abstract
BACKGROUND: Genome-wide tiling array experiments are increasingly used for the analysis of DNA methylation. Because DNA methylation patterns are tissue and cell type specific, the detection of differentially methylated regions (DMRs) with small effect size is a necessary feature of tiling microarray 'peak' finding algorithms, as cellular heterogeneity within a studied tissue may lead to a dilution of the phenotypically relevant effects. Additionally, the ability to detect short DMRs is necessary, as biologically relevant signal may occur in focused regions throughout the genome.
Year: 2013 PMID: 23452827 PMCID: PMC3599767 DOI: 10.1186/1471-2105-14-76
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Figure 1. Meta-analysis vs. t statistic performance. (a) Heat maps of randomly generated DMRs among 5 cases and 5 controls where 20, 40, 60, 80, and 100% of probes are significant. The foreground and background panes depict DMRs spanning 5 and 30 probes, respectively. (b) Density plots of the percentage of significant probes within DMRs detected by BioTile (blue) and the average t-test method (red).
Figure 2. Data simulation and smoothing performance. (a) Heat maps of 100,000 probes derived from DNA methylation profiling of 5 control hippocampal samples from OVX C57BL/6J mice and a simulated data matrix where probe mean and probe variance were modeled on the empirical dataset. An example of a simulated data matrix with an inserted DMR is depicted. (b) A plot of an example DMR 5 probes long exhibiting an average log2 fold change of 1. Mean group-wise differences are plotted in black (y-axis, top) as a function of the relative probe position (x-axis, top). The smoothed t-statistics generated by the CHARM algorithm (left y-axis, middle, red) and the relative meta-analysis weights generated by BioTile (right y-axis, middle, blue) are plotted. A heat map of the location is depicted (bottom). Vertical green lines denote the location of the inserted DMR.
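The simulation described in the Figure 2 caption can be illustrated with a minimal sketch. It assumes that each probe is drawn from a normal distribution with that probe's empirical mean and standard deviation, and that a DMR is inserted by shifting the case samples by a fixed log2 fold change. Both are assumptions, since the caption does not state the sampling distribution or the insertion procedure. The function names, the matrix dimensions, and the per-probe summaries below are hypothetical placeholders, not values from the study.

```python
import random

def simulate_matrix(probe_means, probe_sds, n_samples, rng):
    """Draw n_samples values per probe from a normal distribution
    (an assumed model) using that probe's mean and standard deviation."""
    return [
        [rng.gauss(mu, sd) for _ in range(n_samples)]
        for mu, sd in zip(probe_means, probe_sds)
    ]

def insert_dmr(matrix, start, length, case_columns, log2_fc):
    """Shift the case samples by log2_fc across `length` consecutive probes."""
    for probe in range(start, start + length):
        for col in case_columns:
            matrix[probe][col] += log2_fc
    return matrix

rng = random.Random(0)
n_probes, n_cases, n_controls = 1000, 5, 5

# Hypothetical per-probe summaries; in practice these would be estimated
# from the empirical control arrays.
probe_means = [rng.uniform(-1.0, 1.0) for _ in range(n_probes)]
probe_sds = [rng.uniform(0.2, 0.6) for _ in range(n_probes)]

sim = simulate_matrix(probe_means, probe_sds, n_cases + n_controls, rng)
case_columns = list(range(n_cases))  # assumption: cases occupy the first columns

# Keep a copy of the target probes, then insert a DMR 5 probes long
# with a log2 fold change of 1, as in the example in panel (b).
before = [row[:] for row in sim[500:505]]
insert_dmr(sim, start=500, length=5, case_columns=case_columns, log2_fc=1.0)
```

Because the shift is applied only to the case columns, the control values at the inserted location are unchanged, which is what makes the inserted region a known "hidden" DMR for benchmarking.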
Algorithm power
| Parameter value | BioTile power | TileMap power | CHARM power | BioTile vs. TileMap | BioTile vs. CHARM |
|---|---|---|---|---|---|
| Sample size | | | | | |
| 5 | 0.06 | 0.56 | 0.09 | OR = 0.05, p = 4.6 × 10^-182 | OR = 0.60, p = 1.2 × 10^-03 |
| 10 | 0.83 | 0.48 | 0.27 | OR = 5.45, p = 5.1 × 10^-81 | OR = 14.66, p = 1.1 × 10^-196 |
| 15 | 0.89 | 0.48 | 0.44 | OR = 8.34, p = 1.7 × 10^-112 | OR = 9.83, p = 5.9 × 10^-132 |
| 20 | 0.93 | 0.67 | 0.49 | OR = 5.17, p = 7.7 × 10^-52 | OR = 11.70, p = 4.2 × 10^-133 |
| DMR length (probes) | | | | | |
| 5 | 0.82 | 0.25 | 0.15 | OR = 11.39, p = 8.9 × 10^-29 | OR = 20.14, p = 7.2 × 10^-40 |
| 10 | 0.91 | 0.60 | 0.24 | OR = 3.64, p = 1.1 × 10^-07 | OR = 19.92, p = 2.6 × 10^-39 |
| 15 | 0.93 | 0.72 | 0.39 | OR = 4.86, p = 1.0 × 10^-08 | OR = 17.82, p = 2.6 × 10^-32 |
| 20 | 0.97 | 0.78 | 0.62 | OR = 9.25, p = 2.4 × 10^-09 | OR = 22.21, p = 2.4 × 10^-22 |
| 25 | 0.95 | 0.83 | 0.69 | OR = 3.65, p = 1.2 × 10^-04 | OR = 7.12, p = 3.5 × 10^-11 |
| 30 | 0.99 | 0.84 | 0.77 | OR = 18.65, p = 2.2 × 10^-08 | OR = 29.94, p = 2.9 × 10^-13 |
| log2 fold change | | | | | |
| 0.1 | 0.74 | 0.36 | 0.35 | OR = 4.93, p = 7.7 × 10^-19 | OR = 5.18, p = 7.0 × 10^-20 |
| 0.5 | 0.90 | 0.58 | 0.36 | OR = 7.26, p = 1.1 × 10^-14 | OR = 18.17, p = 1.8 × 10^-33 |
| 0.75 | 0.97 | 0.57 | 0.37 | OR = 20.17, p = 1.1 × 10^-10 | OR = 41.03, p = 2.4 × 10^-18 |
| 1 | 0.97 | 0.83 | 0.50 | OR = 7.90, p = 5.0 × 10^-08 | OR = 43.53, p = 3.5 × 10^-41 |
| 1.5 | 1.00 | 0.85 | 0.63 | OR = 55.96, p = 2.9 × 10^-14 | OR = 194.39, p = 1.9 × 10^-41 |
| 2 | 1.00 | 0.88 | 0.75 | OR = Inf, p = 1.4 × 10^-03 | OR = Inf, p = 4.0 × 10^-07 |
A table depicting the power of each algorithm to identify ‘hidden’ DMRs inserted into the simulated data matrix. Fisher’s odds ratios over 1 denote a higher proportion of DMRs identified by BioTile relative to TileMap and CHARM, respectively.
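To make the odds-ratio comparison concrete, the following sketch computes a sample odds ratio and a two-sided Fisher's exact p-value for a 2×2 table of detected versus missed DMRs for two algorithms. The function name and the counts in the example are hypothetical, since the record does not report the number of DMRs tested per condition. The sketch uses the simple cross-product odds ratio, which may differ from the estimator used by the authors (some statistical packages report a conditional maximum likelihood estimate instead).

```python
from math import comb

def fisher_exact_2x2(a, b, c, d):
    """Two-sided Fisher's exact test for the table [[a, b], [c, d]].

    Rows are algorithms; columns are (detected, missed) DMR counts.
    Returns the sample (cross-product) odds ratio and the p-value.
    """
    if b * c == 0:
        odds_ratio = float("inf") if a * d > 0 else float("nan")
    else:
        odds_ratio = (a * d) / (b * c)

    row1, col1, n = a + b, a + c, a + b + c + d

    def prob(x):
        # Hypergeometric probability of x detections in row 1,
        # with all table margins held fixed.
        return comb(col1, x) * comb(n - col1, row1 - x) / comb(n, row1)

    lo, hi = max(0, row1 - (n - col1)), min(row1, col1)
    p_obs = prob(a)
    # Sum every table that is no more probable than the observed one;
    # the small relative tolerance guards against floating-point ties.
    p_value = sum(
        p for p in (prob(x) for x in range(lo, hi + 1))
        if p <= p_obs * (1 + 1e-7)
    )
    return odds_ratio, min(1.0, p_value)

# Hypothetical counts: algorithm A detects 83 of 100 inserted DMRs,
# algorithm B detects 48 of 100.
odds_ratio, p_value = fisher_exact_2x2(83, 17, 48, 52)
print(f"OR = {odds_ratio:.2f}, p = {p_value:.2e}")
```

An odds ratio above 1 here means the first algorithm detected a higher proportion of DMRs than the second, which is the reading the table caption gives. An infinite odds ratio arises when the first algorithm misses no DMRs, as in the rows where BioTile's power is 1.00.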
Figure 3. Algorithm performance. Plots of the proportion of hidden DMRs identified (power) by BioTile (blue triangles), TileMap (purple squares), and CHARM (red circles) as a function of sample size (a), DMR length (b), and log2 fold change (c).
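Power, as defined in the Figure 3 caption, is the proportion of hidden DMRs that an algorithm identifies. A minimal sketch of that calculation follows. It assumes that an inserted DMR counts as identified when at least one called region overlaps it by one or more probes. This matching rule is an assumption, as the caption does not specify one, and the function names and intervals are hypothetical.

```python
def overlaps(region, dmr):
    """True if two half-open probe-index intervals (start, end) share a probe."""
    return region[0] < dmr[1] and dmr[0] < region[1]

def power(inserted_dmrs, called_regions):
    """Fraction of inserted DMRs overlapped by at least one called region."""
    if not inserted_dmrs:
        raise ValueError("at least one inserted DMR is required")
    hits = sum(
        any(overlaps(region, dmr) for region in called_regions)
        for dmr in inserted_dmrs
    )
    return hits / len(inserted_dmrs)

# Hypothetical probe-index intervals for inserted DMRs and algorithm calls.
inserted = [(100, 105), (400, 430), (900, 905), (1500, 1510)]
called = [(98, 104), (410, 420), (2000, 2010)]

print(power(inserted, called))  # 0.5: two of the four inserted DMRs are overlapped
```

A stricter rule, such as requiring a minimum fraction of the DMR to be covered, would lower the estimated power, so the matching criterion should be fixed before algorithms are compared.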
Figure 4. Interaction of DMR characteristics on algorithm performance. Three-dimensional depictions of non-parametric loess models fitting power (y-axis) as a function of DMR size (x-axis) and effect size (z-axis) for BioTile (a), TileMap (b), and CHARM (c).
Algorithm performance in biological datasets
| | | | | | | |
| 1 | H3K9me3 | 12 | 9 | 5,315 | 62 | 83 |
| 2 | DNA methylation | 6 | 9 | 5,767 | 27 | 105 |
| 3 | DNA methylation | 10 | 9 | 5,558 | 821 | 100 |
| | | | | | | |
| 1 | H3K9me3 | 12 | 9 | 100% | 28% | 0% |
| 2 | DNA methylation | 6 | 9 | 100% | 9% | 0% |
| 3 | DNA methylation | 10 | 9 | 100% | 9% | 0% |
| | | | | | | |
| 1 | H3K9me3 | 12 | 9 | 71% | 28% | 0% |
| 2 | DNA methylation | 6 | 9 | 45% | 9% | 0% |
| 3 | DNA methylation | 10 | 9 | 55% | 9% | 0% |
A table depicting algorithm performance when applied to datasets derived from the McGowan et al. and Suderman et al. studies, which investigated the effects of early life environmental influence on epigenetic marks in the hippocampus.