Zijing Mao, Chifeng Ma, Tim H-M Huang, Yidong Chen, Yufei Huang.
Abstract
DNA methylation is a common epigenetic marker that regulates gene expression. A robust and cost-effective way to measure whole-genome methylation is methyl-CpG binding domain-based capture followed by sequencing (MBDCap-seq). In this study, we propose BIMMER, a bi-layer hidden Markov model (HMM) for identifying differentially methylated regions (DMRs). In the first layer, HMMs model the methylation status in normal and cancer samples; in the second layer, another HMM models the relationship between differential methylation and the methylation statuses in normal and cancer samples. To carry out prediction with BIMMER, an expectation-maximization algorithm was derived. BIMMER was validated on simulated data and applied to real MBDCap-seq data from normal and cancer samples. BIMMER revealed that 8.83% of the breast cancer genome is differentially methylated and that the majority of these regions are hypo-methylated in breast cancer.
Year: 2014 PMID: 25474268 PMCID: PMC4243086 DOI: 10.1186/1471-2105-15-S12-S6
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Figure 1. The proposed bi-layer HMM for differential methylation analysis. The first-layer HMMs model the methylation statuses of the bins in the normal and disease samples, with one chain of read counts and hidden methylation statuses for the normal samples and another for the cancer samples. The second-layer HMM models the relationship between the differential methylation status, dm, and the methylation statuses of the normal and cancer samples.
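As a point of reference for how a single first-layer chain can be scored, the sketch below computes the log-likelihood of a sequence of discretized read counts under a two-state HMM using the scaled forward algorithm. It is a minimal illustration and not BIMMER itself: it covers one chain only, and the function name `forward_log_likelihood` and every parameter value in the usage note are hypothetical.

```python
import math


def forward_log_likelihood(obs, initial, transition, emission):
    """Log-likelihood of `obs` under a discrete HMM (scaled forward algorithm).

    obs        : list of observation indices (e.g. discretized read counts)
    initial    : initial[s] = P(state_0 = s)
    transition : transition[r][s] = P(state_t = s | state_{t-1} = r)
    emission   : emission[s][o] = P(obs_t = o | state_t = s)
    """
    n_states = len(initial)
    # Forward variable for the first bin.
    alpha = [initial[s] * emission[s][obs[0]] for s in range(n_states)]
    log_lik = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Propagate through the transition matrix, then weight by the emission.
            alpha = [
                sum(alpha[r] * transition[r][s] for r in range(n_states))
                * emission[s][obs[t]]
                for s in range(n_states)
            ]
        # Rescale at every step so long sequences do not underflow.
        scale = sum(alpha)
        log_lik += math.log(scale)
        alpha = [a / scale for a in alpha]
    return log_lik
```

For example, `forward_log_likelihood([0, 1], [0.5, 0.5], [[0.7, 0.3], [0.4, 0.6]], [[0.9, 0.1], [0.2, 0.8]])` sums over the four possible state paths. Rescaling at each step matters for genome-length sequences, where the unscaled forward variables would underflow.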
Parameter set used for simulation.
| Table 1-1 | ||||||
|---|---|---|---|---|---|---|
| 0.99999 | 0.7296 | 0.2704 | -1.619118E7 | |||
| 0.00001 | 0.0225 | 0.9775 | ||||
| 0.99999 | 0.7614 | 0.2386 | -2.928487E7 | |||
| 0.00001 | 0.0563 | 0.9437 | ||||
| 0.9 | 0.04 | 0.03 | 0.01 | 0.01 | 0.01 | |
| 0.26 | 0.24 | 0.2 | 0.18 | 0.08 | 0.04 | |
| 0.8 | 0.08 | 0.07 | 0.03 | 0.01 | 0.01 | |
| 0.22 | 0.26 | 0.20 | 0.16 | 0.1 | 0.06 | |
| Weight | ||||||
| 0.99999 | 0.9705 | 0.0295 | 0.3519 | -4.538562E8 | ||
| 0.00001 | 0.2862 | 0.7138 | ||||
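To make the role of such a parameter set concrete, the following sketch draws a sequence of hidden states and observations from a single two-state HMM. The numbers are copied from the first rows of the table above, but the row and column labels did not survive in this copy, so the pairing of each emission row with a state, and the reading of the six emission columns as six discrete read-count levels, are assumptions. The names `draw` and `simulate` are hypothetical.

```python
import random

# Values copied from the table above. Which row belongs to which hidden
# state, and what the six emission columns stand for, are ASSUMPTIONS.
INITIAL = [0.99999, 0.00001]
TRANSITION = [[0.7296, 0.2704], [0.0225, 0.9775]]
EMISSION = [
    [0.9, 0.04, 0.03, 0.01, 0.01, 0.01],
    [0.26, 0.24, 0.2, 0.18, 0.08, 0.04],
]


def draw(probs, rng):
    """Sample an index from a discrete distribution."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


def simulate(n_bins, seed=0):
    """Return (hidden states, observed levels) for n_bins consecutive bins."""
    rng = random.Random(seed)
    states, counts = [], []
    state = draw(INITIAL, rng)
    for _ in range(n_bins):
        states.append(state)
        counts.append(draw(EMISSION[state], rng))
        # Move to the next bin according to the current state's transition row.
        state = draw(TRANSITION[state], rng)
    return states, counts
```

Calling `simulate(500)` yields two lists of length 500. Because every row of `TRANSITION` and `EMISSION` sums to one, each row can be passed directly as sampling weights.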
Figure 2. The performance of BIMMER on simulated data. A. The precision-recall curves and the ROC curves for different transition probabilities of the differential methylation status; the top, middle, and bottom rows each correspond to one setting. B. The precision-recall curves and the ROC curves for different weights; the top, middle, and bottom rows each correspond to one weight.
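For readers unfamiliar with how such curves are built, the sketch below computes one point of a precision-recall curve and of a ROC curve from per-bin scores and ground-truth labels at a given threshold. It assumes that each bin carries a score (for example a posterior probability of differential methylation) and a known simulated label; the function names and the toy data in the usage note are hypothetical.

```python
def confusion_counts(scores, labels, threshold):
    """Count TP, FP, FN, TN when bins with score >= threshold are called differential."""
    tp = fp = fn = tn = 0
    for score, label in zip(scores, labels):
        predicted = score >= threshold
        if predicted and label:
            tp += 1
        elif predicted and not label:
            fp += 1
        elif not predicted and label:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def curve_point(scores, labels, threshold):
    """Return (precision, recall, false positive rate) at one threshold."""
    tp, fp, fn, tn = confusion_counts(scores, labels, threshold)
    # Convention: precision is 1.0 when nothing is called positive.
    precision = tp / (tp + fp) if tp + fp else 1.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    fpr = fp / (fp + tn) if fp + tn else 0.0
    return precision, recall, fpr
```

With scores `[0.9, 0.8, 0.4, 0.2]`, labels `[1, 0, 1, 0]`, and threshold `0.5`, `curve_point` returns `(0.5, 0.5, 0.5)`. Sweeping the threshold over all observed scores traces the full curves: (recall, precision) pairs give the precision-recall curve, and (false positive rate, recall) pairs give the ROC curve.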
Figure 3. The performance of BIMMER for different initial weights (0.01, 0.3). A. Results for the true weight ω = 0.1; B. Results for the true weight ω = 0.2; C. Results for the true weight ω = 0.3.
Estimated parameters after training with two different initial weights: 0.01 and 0.3
| Transition of simulator | Parameter | Simulator weight 0.3, initial weight 0.01 | Simulator weight 0.3, initial weight 0.3 | Simulator weight 0.2, initial weight 0.01 | Simulator weight 0.2, initial weight 0.3 | Simulator weight 0.1, initial weight 0.01 | Simulator weight 0.1, initial weight 0.3 |
|---|---|---|---|---|---|---|---|
| | Weight | 0.2820 | 0.2850 | 0.1993 | 0.2002 | 0.0974 | 0.0980 |
| 0.97 | Differential Transition | 0.9705 | 0.9710 | 0.9698 | 0.9699 | 0.9689 | 0.9690 |
| 0.71 | Differential Transition | 0.7146 | 0.7173 | 0.7186 | 0.7192 | 0.7048 | 0.7044 |
| 0.76 | Patient Transition | 0.7584 | 0.7584 | 0.7545 | 0.7545 | 0.7594 | 0.7594 |
| 0.97 | Patient Transition | 0.9698 | 0.9698 | 0.9698 | 0.9698 | 0.9699 | 0.9699 |
| 0.66 | Normal Transition | 0.6300 | 0.6300 | 0.6562 | 0.6562 | 0.6867 | 0.6867 |
| 0.92 | Normal Transition | 0.9173 | 0.9173 | 0.9217 | 0.9217 | 0.9289 | 0.9289 |
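One way to read the weight row of this table is to compare each estimate against the simulator weight and to check how much the two initializations disagree. The sketch below does that arithmetic on the values from the table; the function names are hypothetical, and relative error is an illustrative choice of metric rather than one the source reports.

```python
# Weight estimates copied from the table above, keyed by the simulator weight.
# Each value is (estimate from initial weight 0.01, estimate from initial weight 0.3).
ESTIMATES = {
    0.3: (0.2820, 0.2850),
    0.2: (0.1993, 0.2002),
    0.1: (0.0974, 0.0980),
}


def relative_error(true_value, estimate):
    """Absolute deviation from the true value, as a fraction of the true value."""
    return abs(estimate - true_value) / true_value


def summarize(estimates):
    """One summary row per simulator weight."""
    rows = []
    for true_w, (from_low, from_high) in estimates.items():
        rows.append({
            "true_weight": true_w,
            "rel_err_init_0.01": relative_error(true_w, from_low),
            "rel_err_init_0.3": relative_error(true_w, from_high),
            # How far apart the two initializations end up.
            "gap_between_inits": abs(from_low - from_high),
        })
    return rows


if __name__ == "__main__":
    for row in summarize(ESTIMATES):
        print(row)
```

On these values the two initializations end within 0.003 of each other for every simulator weight, which is the sense in which the estimates appear insensitive to the initial weight.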
Initial values and the prior probabilities of BIMMER.
| State | Table 3-1 (normal): 1 | Table 3-1 (normal): 0 | Table 3-2 (cancer): 1 | Table 3-2 (cancer): 0 | Table 3-3 (dm): 1 | Table 3-3 (dm): 0 | Table 3-4: normal | Table 3-4: cancer | Table 3-4: dm |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 0.9 | 0.1 | 0.9 | 0.1 | 0.9 | 0.1 | 0.1 | 0.1 | 0.9 |
| 0 | 0.1 | 0.9 | 0.1 | 0.9 | 0.1 | 0.9 | 0.9 | 0.9 | 0.1 |
Tables 3-1 to 3-3 give the initial transition probabilities for the methylation statuses of the normal samples, of the cancer samples, and for dm; Table 3-4 lists the initial probabilities of the same three variables.
The estimated parameters of the second hidden layer
| Transition matrix, column 1 | Transition matrix, column 2 | Weight |
|---|---|---|
| 0.9705 | 0.0295 | 0.3519 |
| 0.2862 | 0.7138 | |
Figure 4. Differential rate of four types of genomic regions on different chromosomes.
Differential rate of 4 regions on 24 chromosomes
| Chromosome | Promoter | Exon | Enhancer | Gene Body |
|---|---|---|---|---|
| Chr1 | 0.005729 | 0.015478 | 2.331E-4 | 3.890E-4 |
| Chr2 | 0.003440 | 0.009209 | 1.528E-4 | 2.357E-4 |
| Chr3 | 0.001957 | 0.006006 | 9.138E-4 | 1.402E-4 |
| Chr4 | 0.002017 | 0.005697 | 8.264E-5 | 1.336E-4 |
| Chr5 | 0.001348 | 0.004764 | 9.434E-5 | 9.545E-5 |
| Chr6 | 0.001857 | 0.005404 | 1.131E-4 | 1.300E-4 |
| Chr7 | 0.002172 | 0.005126 | 1.022E-4 | 1.287E-4 |
| Chr8 | 0.002223 | 0.004775 | 8.555E-5 | 1.182E-4 |
| Chr9 | 0.001470 | 0.003957 | 6.925E-5 | 1.079E-4 |
| Chr10 | 0.001583 | 0.004071 | 6.994E-5 | 1.044E-4 |
| Chr11 | 0.001412 | 0.003576 | 6.113E-5 | 9.748E-5 |
| Chr12 | 0.001212 | 0.003218 | 5.683E-5 | 8.236E-5 |
| Chr13 | 0.001256 | 0.002956 | 5.161E-5 | 7.356E-5 |
| Chr14 | 0.001070 | 0.002745 | 4.777E-5 | 6.970E-5 |
| Chr15 | 0.001162 | 0.002940 | 4.717E-5 | 7.995E-5 |
| Chr16 | 0.001172 | 0.002930 | 5.095E-5 | 8.412E-5 |
| Chr17 | 0.001196 | 0.002969 | 5.242E-5 | 8.350E-5 |
| Chr18 | 0.001090 | 0.002745 | 4.886E-5 | 7.272E-5 |
| Chr19 | 8.860E-4 | 0.002388 | 4.385E-5 | 6.917E-5 |
| Chr20 | 0.001014 | 0.002686 | 5.052E-5 | 7.976E-5 |
| Chr21 | 9.763E-4 | 0.002416 | 4.783E-5 | 6.888E-5 |
| Chr22 | 9.368E-4 | 0.002513 | 4.462E-5 | 7.624E-5 |
| ChrX | 6.024E-4 | 0.001642 | 2.793E-5 | 4.002E-5 |
| ChrY | 4.400E-4 | 0.001178 | 2.086E-5 | 2.979E-5 |
| Total | 0.001225 | 0.003201 | 5.644E-5 | 8.452E-5 |
Top 20 differentially methylated genes
| GENE SYMBOL | DIFFMETHY RATE | METHYLATION STATUS |
|---|---|---|
| 0.380952381 | 0 | |
| 0.333333333 | 0 | |
| 0.333333333 | 0 | |
| 0.333333333 | 0 | |
| 0.333333333 | 0 | |
| 0.333333333 | 0 | |
| 0.333333333 | 0 | |
| 0.333333333 | 0 | |
| 0.333333333 | 0 | |
| 0.333333333 | 0 | |
| 0.333333333 | 0 | |
| 0.333333333 | 0 | |
| 0.333333333 | 0 | |
| 0.333333333 | 0 | |
| 0.333333333 | 0 | |
| 0.333333333 | 0 | |
| 0.333333333 | 0 | |
| 0.317460317 | 1 | |
| 0.285714286 | 0 | |
| 0.285714286 | 0 |
Differential rate of normal and patient samples for 22 breast cancer related genes
| Gene Name | Relation with Breast Cancer | Differential Methylation Status | Differential Rate |
|---|---|---|---|
| Related to Survival | Yes | 0.2195 | |
| Related to Survival | No | ||
| Related to Survival | Not mapped | ||
| Related to Survival | No | ||
| Related to Survival | Yes | 0.3659 | |
| Related to Survival | Yes | 0.2195 | |
| Related to Tumor Size | No | ||
| Related to Tumor Size | Yes | 0.2683 | |
| Related to Tumor Size | Yes | 0.3171 | |
| Related to Tumor Size | No | ||
| Related to Tumor Size | No | ||
| Related to Tumor Size | No | ||
| Related to Tumor Size | No | ||
| Related to Tumor Size | No | ||
| Related to ER+ | No | ||
| Related to ER+ | Yes | 0.2195 | |
| Related to ER+ | No | ||
| Related to ER+ | No | ||
| Related to ER+ | No | ||
| Related to ER+ | Yes | 0.1951 | |
| Related to ER+ | No | ||
| Related to ER+ | No |
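The table does not state how the differential rate is computed. One reading, assumed here, is the fraction of bins in a gene's window that are called differentially methylated. The reported values are numerically consistent with counts out of 41 bins (for example 9/41 ≈ 0.2195 and 15/41 ≈ 0.3659), but the bin count is an inference from the numbers, not something the source states. The function name `differential_rate` and the example call vector are hypothetical.

```python
def differential_rate(dm_calls):
    """Fraction of bins called differentially methylated (1) among all bins."""
    if not dm_calls:
        raise ValueError("need at least one bin")
    return sum(dm_calls) / len(dm_calls)


# Hypothetical gene window: 41 bins, 9 of them called differential.
calls = [1] * 9 + [0] * 32
rate = round(differential_rate(calls), 4)
```

Under this assumption, the window above reproduces the rate of 0.2195 listed for several genes in the table.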
Figure 5. Density plots of breast cancer related differentially methylated genes. A. Density plot for ACADL (Chr11:1,986,988-1,996,988). B. Density plot for DAB2IP (Chr20:42,346,800-42,356,800). C. Density plot for IL6 (Chr18:5,228,722-5,238,722). D. Density plot for IMPACT (Chr9:2,150,455-2,160,455). E. Density plot for RARA (Chr7:127,223,462-127,233,462). Each sub-figure includes three panels. The top panel shows a single line of squares, each representing the predicted differential methylation status of a bin; a red square denotes differential methylation. The second panel shows the read density of 10 normal samples together with the predicted methylation status (the top indicator line). The read density is shown in red, with color intensity proportional to the read count. A green square in the indicator line denotes that the bin is predicted to be methylated. The third panel shows the read density of 10 breast cancer patients and the corresponding predicted methylation status.