Mingxiang Teng, Yadong Wang, Seongho Kim, Lang Li, Changyu Shen, Guohua Wang, Yunlong Liu, Tim H M Huang, Kenneth P Nephew, Curt Balch.
Abstract
A number of empirical Bayes models (each with different statistical distribution assumptions) have now been developed to analyze differential DNA methylation using high-density oligonucleotide tiling arrays. However, it remains unclear which model performs best. For example, when differentially methylated regions are analyzed for conservative and functional sequence characteristics (e.g., enrichment of transcription factor-binding sites (TFBSs)), the sensitivity of such analyses under the various empirical Bayes models remains unclear. In this paper, five empirical Bayes models were constructed, based on either a gamma distribution or a log-normal distribution, for the identification of differentially methylated loci and their cell-division-dependent (1, 3, and 5 divisions) and drug-treatment-dependent (cisplatin) methylation patterns. While differential methylation patterns generated by log-normal models were enriched with numerous TFBSs, we observed almost no TFBS-enriched sequences using gamma assumption models. Statistical and biological results suggest that the log-normal, rather than the gamma, empirical Bayes model distribution is a highly accurate and precise method for differential methylation microarray analysis. In addition, we presented one of the log-normal models for differential methylation analysis and tested its reproducibility by a simulation study. We believe this research to be the first extensive comparison of statistical modeling for the analysis of differential DNA methylation, an important biological phenomenon that precisely regulates gene transcription.
Year: 2012 PMID: 22956892 PMCID: PMC3432337 DOI: 10.1155/2012/376706
Source DB: PubMed Journal: Comp Funct Genomics ISSN: 1531-6912
Five empirical Bayes model frameworks.
| Empirical Bayes model | H0 | HA | Likelihood |
|---|---|---|---|
| BGG | | | |
| BNGG | | | |
| BNNGG | | | |
| BLNN | | | |
| BLNNN | | | |
Five empirical Bayes models parameter list.
| Empirical Bayes model | Parameters | Observed data | Missing data |
|---|---|---|---|
| BGG | | | |
| BNGG | | | |
| BNNGG | | | |
| BLNN | | | |
| BLNNN | | | |
Note: i, j and k represent probe, sample and replicate, respectively.
Time-dependent methylation pattern definitions. Between the parent A2780 cell and its cisplatin-treated 1st, 3rd, and 5th generation daughter cells, a probe with increased methylation (probability ≥ 0.8) is defined as hypermethylation (i.e., up), a probe with decreased methylation (probability ≥ 0.8) is defined as hypomethylation (i.e., down), and otherwise, the methylation change is even. Probes showing decreased methylation from generations 1 to 3 to 5 were defined as having “stochastic hypomethylation.” Analogously, probes showing increased methylation from generations 1 to 3 to 5 were considered to exhibit “stochastic hypermethylation.” Finally, probes showing mixed increased and decreased methylation from generations 1 to 3 to 5 were defined as having “random differential methylation.”
| Category | Parental versus generation 1 | Parental versus generation 3 | Parental versus generation 5 |
|---|---|---|---|
| Stochastic hypomethylation | Down | Down | Down |
| | Even | Down | Down |
| | Even | Even | Down |
| Stochastic hypermethylation | Up | Up | Up |
| | Even | Up | Up |
| | Even | Even | Up |
| Random differential methylation | Down | Up | Down |
| | Down | Up | Even |
| | Down | Even | Down |
| | Even | Up | Down |
| | Even | Up | Even |
| | Even | Down | Even |
| | Even | Down | Up |
| | Up | Down | Up |
| | Up | Down | Even |
| | Up | Even | Up |
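The pattern definitions above amount to a lookup from three per-generation calls to a category. The following is a minimal sketch of that lookup, not the authors' code. The names `PATTERNS`, `call_probe`, and `classify_pattern`, and the `+1`/`-1` encoding of the direction of change, are assumptions made for illustration. Combinations not listed in the table return `None` rather than a guessed category.

```python
# Pattern table transcribed from the category definitions above.
# Keys are (generation 1, generation 3, generation 5) calls versus the parental line.
HYPO = "Stochastic hypomethylation"
HYPER = "Stochastic hypermethylation"
RANDOM = "Random differential methylation"

PATTERNS = {
    ("Down", "Down", "Down"): HYPO,
    ("Even", "Down", "Down"): HYPO,
    ("Even", "Even", "Down"): HYPO,
    ("Up", "Up", "Up"): HYPER,
    ("Even", "Up", "Up"): HYPER,
    ("Even", "Even", "Up"): HYPER,
    ("Down", "Up", "Down"): RANDOM,
    ("Down", "Up", "Even"): RANDOM,
    ("Down", "Even", "Down"): RANDOM,
    ("Even", "Up", "Down"): RANDOM,
    ("Even", "Up", "Even"): RANDOM,
    ("Even", "Down", "Even"): RANDOM,
    ("Even", "Down", "Up"): RANDOM,
    ("Up", "Down", "Up"): RANDOM,
    ("Up", "Down", "Even"): RANDOM,
    ("Up", "Even", "Up"): RANDOM,
}


def call_probe(probability, direction, threshold=0.8):
    """Return 'Up', 'Down', or 'Even' for one comparison.

    probability: posterior probability of differential methylation.
    direction: +1 for increased, -1 for decreased methylation
               (hypothetical encoding, not from the source).
    """
    if probability >= threshold and direction > 0:
        return "Up"
    if probability >= threshold and direction < 0:
        return "Down"
    return "Even"


def classify_pattern(calls):
    """Map three per-generation calls to a category, or None if unlisted."""
    return PATTERNS.get(tuple(calls))


# Example: a probe unchanged at generation 1, then hypomethylated at 3 and 5.
calls = [call_probe(0.35, -1), call_probe(0.91, -1), call_probe(0.88, -1)]
print(calls, classify_pattern(calls))
```

Keeping the table as data rather than as nested conditionals makes it easy to check the code against the published definitions row by row.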
Figure 1. Model performance comparisons in differential methylation data analysis. Five empirical Bayes models were compared: (1) binary-gamma-gamma (BGG); (2) binary-normal-gamma-gamma (BNGG); (3) binary-normal-normal-gamma-gamma (BNNGG); (4) binary-log-normal-normal (BLNN); (5) binary-log-normal-normal-normal (BLNNN). (a) Negative log-likelihoods and (b) numbers of identified differentially methylated CpG islands for the five models, as applied to compare methylation differences between A2780 parental cells and their cisplatin-treated 1st, 3rd, and 5th generation daughter cells.
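To illustrate what comparing models by negative log-likelihood means, the sketch below fits a plain log-normal and a plain gamma distribution to the same positive-valued sample and reports the negative log-likelihood of each (lower is a better fit). This is a simplified stand-in and not the paper's hierarchical empirical Bayes models: the function names are hypothetical, the gamma parameters are estimated by the method of moments rather than maximum likelihood, and the data are simulated.

```python
import math
import random


def lognormal_nll(xs):
    """Negative log-likelihood of xs under a log-normal fitted by maximum likelihood."""
    logs = [math.log(x) for x in xs]
    n = len(logs)
    mu = sum(logs) / n
    var = sum((v - mu) ** 2 for v in logs) / n  # MLE variance of log(x)
    # -log f(x) = log(x) + 0.5*log(2*pi*var) + (log(x) - mu)^2 / (2*var)
    return sum(
        v + 0.5 * math.log(2 * math.pi * var) + (v - mu) ** 2 / (2 * var)
        for v in logs
    )


def gamma_nll(xs):
    """Negative log-likelihood of xs under a gamma fitted by the method of moments."""
    n = len(xs)
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs) / n
    shape = mean ** 2 / var   # moment estimate, not the MLE
    scale = var / mean
    # -log f(x) = lgamma(k) + k*log(theta) - (k - 1)*log(x) + x/theta
    return sum(
        math.lgamma(shape) + shape * math.log(scale)
        - (shape - 1) * math.log(x) + x / scale
        for x in xs
    )


# Simulated intensities drawn from a log-normal (illustrative data only).
random.seed(0)
sample = [random.lognormvariate(0.0, 1.0) for _ in range(2000)]
print("log-normal NLL:", lognormal_nll(sample))
print("gamma NLL:     ", gamma_nll(sample))
```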
Figure 2. Differentially methylated CpG islands before and after cisplatin treatment identified by empirical Bayes models. Scatter plots of the logarithmically transformed DNA methylation intensities before and after 1, 3, and 5 cell divisions of cisplatin-treated A2780 cells, in which the x-axis represents the parental A2780 cell line and the y-axis represents the cisplatin-treated A2780 progeny sublines. Rows 1, 2, and 3 represent the A2780 sublines following 1, 3, and 5 cell divisions coincident with treatment with the DNA crosslinking agent cisplatin. Columns 1, 2, 3, 4, and 5 represent the binary-gamma-gamma (BGG), binary-normal-gamma-gamma (BNGG), binary-normal-normal-gamma-gamma (BNNGG), binary-log-normal-normal (BLNN), and binary-log-normal-normal-normal (BLNNN) models, respectively. Red, blue, and green represent differentially methylated CpG islands (Z ≥ 0.8), undetermined CpG islands (0.2 < Z < 0.8), and CpG islands that are not differentially methylated (Z ≤ 0.2), respectively.
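The three-way colour coding in this caption is a pair of cut-offs on Z. A minimal sketch of that rule follows; the function name `z_category` and the example Z values are illustrative assumptions, while the 0.8 and 0.2 thresholds come from the caption.

```python
from collections import Counter


def z_category(z, high=0.8, low=0.2):
    """Assign a CpG island to one of the three classes used in the scatter plots."""
    if z >= high:
        return "differentially methylated"      # red
    if z <= low:
        return "not differentially methylated"  # green
    return "undetermined"                       # blue


# Illustrative Z values (not data from the study).
z_values = [0.95, 0.80, 0.50, 0.21, 0.20, 0.03]
counts = Counter(z_category(z) for z in z_values)
print(counts)
```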
Figure 3. Numbers of CpG islands, as identified by empirical Bayes models, segregating into our three previously defined methylation heritability categories [30].
Figure 4. Overlaps of stochastically hypo- and hypermethylated CpG islands identified by empirical Bayes models.
Number of significantly enriched TFBSs in time-dependent methylation patterns.
| Empirical Bayes model | Stochastic hypomethylation | Stochastic hypermethylation | Random differential methylation |
|---|---|---|---|
| BGG | 0 | 0 | 4 |
| BNGG | 0 | 0 | 0 |
| BNNGG | 0 | 0 | 0 |
| BLNN | 71 | 51 | 19 |
| BLNNN | 36 | 58 | 0 |