Raivo Kolde, Kaspar Märtens, Kaie Lokk, Sven Laur, Jaak Vilo.
Abstract
MOTIVATION: One of the main goals of large-scale methylation studies is to detect differentially methylated loci. One way is to approach this problem sitewise, i.e. to find differentially methylated positions (DMPs). However, it has been shown that methylation is regulated in longer genomic regions, so it is more desirable to identify differentially methylated regions (DMRs) instead of DMPs. The new high-coverage arrays, like Illumina's 450k platform, make this possible at a reasonable cost. A few tools exist for DMR identification from this type of data, but there is no standard approach.
Year: 2016 PMID: 27187204 PMCID: PMC5013909 DOI: 10.1093/bioinformatics/btw304
Source DB: PubMed Journal: Bioinformatics ISSN: 1367-4803 Impact factor: 6.937
Fig. 1. Method workflow. First, the genome is segmented based on the distance between consecutive probes. The boxplots show the dependence between the distance and the correlation of methylation patterns. Second, the resulting segments are divided further into regions with consistent methylation profiles. Finally, differential methylation is tested using a linear mixed model (Color version of this figure is available at Bioinformatics online.)
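The first step of the workflow, splitting the genome wherever neighbouring probes lie too far apart, can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `segment_by_distance` and the `max_gap` threshold are hypothetical, and the sketch assumes probe positions on a single chromosome given as base-pair coordinates.

```python
from typing import List


def segment_by_distance(positions: List[int], max_gap: int) -> List[List[int]]:
    """Group probe positions into segments.

    A new segment starts whenever the distance to the previous probe
    exceeds max_gap (a hypothetical threshold in base pairs).
    """
    segments: List[List[int]] = []
    for pos in sorted(positions):
        if segments and pos - segments[-1][-1] <= max_gap:
            # Close enough to the previous probe: extend the current segment.
            segments[-1].append(pos)
        else:
            # First probe, or the gap is too large: open a new segment.
            segments.append([pos])
    return segments


if __name__ == "__main__":
    # Illustrative coordinates only, not real probe positions.
    print(segment_by_distance([100, 150, 5000, 5100, 20000], max_gap=1000))
```

Each resulting segment would then be passed on to the later steps (splitting into regions with consistent methylation profiles, then testing), which this sketch does not cover.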
Detected DMRs in the simulation study with 5000 generated DMRs, with an average effect size μ
| μ | Bumphunter TP (# regions) | Bumphunter TP (# sites) | Bumphunter FP (# regions) | Bumphunter FP (# sites) | Bumphunter missed (# regions) | Bumphunter missed (# sites) | Aclust TP (# regions) | Aclust TP (# sites) | Aclust FP (# regions) | Aclust FP (# sites) | Aclust missed (# regions) | Aclust missed (# sites) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0.0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 87 | 255 | 0 | 0 |
| 0.025 | 0 | 0 | 0 | 0 | 5000 | 28697 | 1568 | 6385 | 352 | 1273 | 3464 | 18545 |
| 0.050 | 1 | 36 | 0 | 0 | 4999 | 28946 | 3241 | 15485 | 537 | 1980 | 1947 | 8903 |
| 0.075 | 5 | 174 | 0 | 0 | 4994 | 28958 | 4039 | 20952 | 581 | 2068 | 1213 | 5067 |
| 0.10 | 18 | 788 | 0 | 0 | 4978 | 28486 | 4423 | 24077 | 619 | 2158 | 810 | 3047 |
| 0.15 | 33 | 1401 | 0 | 0 | 4956 | 28079 | 4642 | 26681 | 646 | 2254 | 456 | 1656 |
| 0.20 | 63 | 2046 | 0 | 0 | 4930 | 26652 | 4715 | 26771 | 641 | 2224 | 356 | 1002 |
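
| μ | comb-p TP (# regions) | comb-p TP (# sites) | comb-p FP (# regions) | comb-p FP (# sites) | comb-p missed (# regions) | comb-p missed (# sites) | seqlm TP (# regions) | seqlm TP (# sites) | seqlm FP (# regions) | seqlm FP (# sites) | seqlm missed (# regions) | seqlm missed (# sites) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|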
| 0.0 | 0 | 0 | 71 | 555 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 0.025 | 979 | 6005 | 61 | 506 | 3967 | 21005 | 1004 | 4792 | 65 | 330 | 3941 | 22803 |
| 0.050 | 2353 | 15518 | 40 | 295 | 2538 | 11540 | 3014 | 15713 | 172 | 862 | 1988 | 11079 |
| 0.075 | 3237 | 20921 | 29 | 181 | 1632 | 6731 | 4060 | 21450 | 201 | 1027 | 1055 | 5892 |
| 0.10 | 3686 | 23897 | 28 | 169 | 1169 | 4475 | 4626 | 24649 | 232 | 1241 | 591 | 3514 |
| 0.15 | 4140 | 26599 | 23 | 158 | 689 | 2343 | 5082 | 27137 | 257 | 1373 | 252 | 1734 |
| 0.20 | 4360 | 27005 | 27 | 185 | 467 | 1388 | 5280 | 27457 | 263 | 1354 | 112 | 917 |
For each method and each μ, the total number of detected regions and the corresponding number of sites are given (divided into true and false positives, 'TP' and 'FP'), together with the number of missed regions.
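To compare methods from the counts above, one option is the share of detected regions that are true positives. The sketch below is an illustration only: the helper name `precision` is hypothetical, and this summary statistic is an added convenience, not a measure reported in the source.

```python
from typing import Optional


def precision(true_pos: int, false_pos: int) -> Optional[float]:
    """Fraction of detections that are true positives.

    Returns None when nothing was detected, since the ratio is undefined.
    """
    detected = true_pos + false_pos
    if detected == 0:
        return None
    return true_pos / detected


if __name__ == "__main__":
    # Region-level counts at mu = 0.10, copied from the table above.
    for method, tp, fp in [("Aclust", 4423, 619), ("comb-p", 3686, 28), ("seqlm", 4626, 232)]:
        print(method, precision(tp, fp))
```

Rows with no detections at all, such as Bumphunter at μ = 0.0, yield `None` rather than a misleading zero.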
For each approach, the number of significant DMRs and the corresponding number of sites are given
| Comparison | Single site (# sites) | Bumphunter (# regions) | Bumphunter (# sites) | Aclust (# regions) | Aclust (# sites) | Comb-p (# regions) | Comb-p (# sites) | seqlm (# regions) | seqlm (# sites) |
|---|---|---|---|---|---|---|---|---|---|
| Lymph node versus others | 58 | 1 | 4 | 2543 | 6943 | 93 | 564 | 21 | 61 |
| Gall bladder versus others | 2875 | 5 | 99 | 6585 | 16054 | 722 | 3062 | 1359 | 2758 |
| Gastric mucosa versus others | 23359 | 2 | 64 | 16661 | 26494 | 2663 | 25248 | 5218 | 24717 |
| Artery versus others | 13639 | 30 | 369 | 16823 | 32626 | 2095 | 9418 | 9126 | 16775 |
| Bone, joint-cartilage versus others | 11553 | 36 | 513 | 17208 | 38107 | 2104 | 9042 | 6816 | 13232 |
| Bladder versus others | 19822 | 39 | 537 | 31413 | 68352 | 3785 | 11865 | 13529 | 22938 |
| Adipose versus others | 28379 | 32 | 407 | 34191 | 63725 | 4182 | 17910 | 15776 | 33196 |
| Ischiatic nerve versus others | 27163 | 20 | 275 | 37846 | 76119 | 4947 | 15895 | 16894 | 30337 |
| Aorta versus others | 40347 | 116 | 1081 | 40063 | 64206 | 6307 | 20809 | 27514 | 47593 |
| Tonsils versus others | 87950 | 12 | 283 | 67344 | 95188 | 6695 | 53474 | 33549 | 94807 |
| Medulla oblongata versus others | 98352 | 179 | 1887 | 97618 | 139954 | 14461 | 51007 | 62370 | 119795 |
| Bone marrow versus others | 173080 | 507 | 4217 | 153184 | 191147 | 33698 | 108590 | 100910 | 187950 |
Fig. 2. Validation of identified DMRs. Panel A shows an example from the tissue dataset together with a DMR detected by seqlm (large grey box). The top plot shows the data at array resolution; in the bottom plot, a portion of the DMR is zoomed in and the methylation levels of intermediate CpG sites obtained by Sanger sequencing are shown. Panel B shows the effect sizes estimated from array data and from Sanger sequencing data for all 14 regions. The 95% confidence intervals are shown for both sources of estimates (Color version of this figure is available at Bioinformatics online.)