Emanuele Libertini, Simon C Heath, Rifat A Hamoudi, Marta Gut, Michael J Ziller, Agata Czyz, Victor Ruotti, Hendrik G Stunnenberg, Mattia Frontini, Willem H Ouwehand, Alexander Meissner, Ivo G Gut, Stephan Beck.
Abstract
The cost of whole-genome bisulfite sequencing (WGBS) remains a bottleneck for many studies and it is therefore imperative to extract as much information as possible from a given dataset. This is particularly important because even at the recommended 30X coverage for reference methylomes, up to 50% of high-resolution features such as differentially methylated positions (DMPs) cannot be called with current methods as determined by saturation analysis. To address this limitation, we have developed a tool that dynamically segments WGBS methylomes into blocks of comethylation (COMETs) from which lost information can be recovered in the form of differentially methylated COMETs (DMCs). Using this tool, we demonstrate recovery of ∼30% of the lost DMP information content as DMCs even at very low (5X) coverage. This constitutes twice the amount that can be recovered using an existing method based on differentially methylated regions (DMRs). In addition, we explored the relationship between COMETs and haplotypes in lymphoblastoid cell lines of African and European origin. Using best-fit analysis, we show COMETs to be correlated in a population-specific manner, suggesting that this type of dynamic segmentation may be useful for integrated (epi)genome-wide association studies in the future.
Year: 2016 PMID: 27346250 PMCID: PMC4931220 DOI: 10.1038/ncomms11306
Source DB: PubMed Journal: Nat Commun ISSN: 2041-1723 Impact factor: 14.919
Figure 1. Relationship between methylation values and oscillation of methylation (OM) for chromosome 1 of M1.
(a) Patterns of oscillations as estimated by OM. Values were scaled to 0–1. (b) Relationship between methylation value and OM distribution in a representative region of M1. Delta (OM) values were scaled to 0–1. (c) Quantile distribution of OM values. Most oscillations are around 0; significant oscillations represent a deviation from co-methylation and are used to call successive COMET boundaries. (d) Rounded quantile distribution of OM values. COMETs are called using the dynamic OMg threshold, which is defined by significant deviations in the OM distribution, representing roughly 8% of the OM values for the methylomes included here.
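The boundary-calling idea in panels (c) and (d) can be illustrated with a minimal Python sketch. This is not the COMETgazer implementation: the OM statistic is not defined in this record, so the absolute change between neighbouring methylation values is used as a hypothetical stand-in, and `call_blocks` and `top_fraction` are illustrative names. The sketch scales the stand-in values to 0–1, treats roughly the top 8% as significant, and starts a new block at each significant oscillation.

```python
def min_max_scale(values):
    """Scale a list of numbers to the 0-1 range (all zeros if constant)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def call_blocks(methylation, top_fraction=0.08):
    """Split ordered methylation values into blocks of similar methylation.

    ASSUMPTION: the oscillation measure is the absolute difference between
    neighbouring values, a stand-in for the OM statistic of the paper.
    Returns (first_index, last_index) pairs, both inclusive.
    """
    if len(methylation) < 2:
        return [(0, 0)] if methylation else []

    # Stand-in oscillation between position i and position i + 1.
    om = min_max_scale(
        [abs(b - a) for a, b in zip(methylation, methylation[1:])]
    )

    # Threshold chosen so that about `top_fraction` of values are significant.
    n_significant = max(1, round(top_fraction * len(om)))
    threshold = sorted(om, reverse=True)[n_significant - 1]

    blocks, start = [], 0
    for i, value in enumerate(om):
        if value >= threshold and value > 0:
            blocks.append((start, i))  # close the block at position i
            start = i + 1              # next block starts after the jump
    blocks.append((start, len(methylation) - 1))
    return blocks


print(call_blocks([0.9, 0.92, 0.91, 0.1, 0.12, 0.11]))
```

In this toy input the single large drop between the third and fourth values is the only significant oscillation, so the sketch reports two blocks.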
Figure 2. COMETgazer and MethylSeekR segmentation for M1 with corresponding methylation values.
A red box highlights the finer-grained segmentation produced by COMET analysis compared with the features defined by MethylSeekR. This example region shows multiple MethylSeekR features broken down into COMETs.
Figure 3. Information recovery by DMC analysis.
(a) Semi-quantitative DMP content recovery rates for DMR and DMC analysis based on the results from the RADmeth replicate analysis. DMP calls were set at P<0.05 after BH adjustment. DMRs are typically short compared to DMCs, which accounts for the difference in DMP counts. (b) Example DMC (boxed in red) showing methylation level (tracks 1–2), COMETs (tracks 3–5), DMPs (tracks 6–7), and DMRs (tracks 8–9) for M1 at maximum and 30X coverage, as well as M3. DMP calls are shown as adjusted P values for differential methylation between M1–2 and M7–10. DMR values representing differential methylation between M1–2 and M7–10 correspond to areaStat, a parameter of compound t-statistics for the included DMPs. COMET values correspond to the average value inside each block.
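For readers unfamiliar with the BH adjustment used for the DMP calls, the following Python sketch shows the standard Benjamini-Hochberg step-up procedure followed by the P<0.05 cut-off. It is a generic illustration, not the RADmeth code; the function name `bh_adjust` and the example P values are made up for demonstration.

```python
def bh_adjust(p_values):
    """Benjamini-Hochberg adjusted P values, returned in the input order."""
    m = len(p_values)
    # Indices of the P values from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


raw = [0.01, 0.04, 0.03, 0.5]  # illustrative values only
adj = bh_adjust(raw)
called = [i for i, p in enumerate(adj) if p < 0.05]
print(adj, called)
```

Note that positions with raw P values of 0.03 and 0.04 would pass an unadjusted 0.05 threshold but are no longer called after adjustment in this example.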
Figure 4. Correlation between African (YRU) haplotype blocks and YRU COMETs derived from M5.
Median haplotype block size defined by r²>0.9 versus median COMET size defined by OMg=0.1. Data were tiled over fixed windows of 100,000 bp and scaled over 0–1 (Methods).
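The tiling and scaling described in this caption can be sketched in Python as follows. The Methods are not part of this record, so several details are assumptions: blocks are given as hypothetical half-open (start, end) coordinates in bp, each block is assigned to the window containing its start, and the function names `tile_medians` and `scale_01` are illustrative.

```python
from statistics import median


def tile_medians(blocks, window=100_000):
    """Median block size (bp) per fixed-size window.

    ASSUMPTION: a block belongs to the window that contains its start.
    """
    sizes = {}
    for start, end in blocks:
        sizes.setdefault(start // window, []).append(end - start)
    return {w: median(v) for w, v in sorted(sizes.items())}


def scale_01(by_window):
    """Min-max scale the per-window values to the 0-1 range."""
    lo, hi = min(by_window.values()), max(by_window.values())
    span = hi - lo
    return {w: (v - lo) / span if span else 0.0 for w, v in by_window.items()}


# Illustrative coordinates only, not data from the study.
blocks = [(0, 1_000), (50_000, 53_000), (120_000, 122_000), (250_000, 260_000)]
print(scale_01(tile_medians(blocks)))
```

Applying the same two steps to haplotype blocks and to COMETs yields per-window values on a common 0–1 scale, which is what allows the two tracks to be compared window by window.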