Agnieszka Cecotka, Joanna Polanska.
Abstract
Alteration of DNA methylation level in cancer leads to deregulation of gene expression: silencing of tumor suppressor genes and enhancement of proto-oncogenes. Several tools address the identification of demethylated CpG sites, but most of them focus on the single-site level and do not allow quantification of methylation changes across a region. The aim was to create an adaptive algorithm supporting detection of differentially methylated CpG sites and genomic regions specific for acute myeloid leukemia (AML). Knowledge of the AML methylation fingerprint helps in better understanding the epigenetics of leukemogenesis. The proposed algorithm is data driven and does not use predefined quantification thresholds. Gaussian mixture modeling supports classification of CpG sites into several levels of demethylation. p value integration allows translation from single-site demethylation to the demethylation of gene promoter and body regions. Methylation profiles of healthy controls and AML patients were examined (GEO:GSE63409). Differences in whole genome methylation profiles were observed. The methylation profile differs significantly among genomic regions. The lowest methylation level was observed for promoter regions, while sites from intergenic regions were on average more highly methylated. The observed number of AML-related down methylated sites did not substantially exceed the number expected by chance. Intergenic regions were characterized by the highest percentage of AML up methylated sites. Methylation enhancement/diminution is most frequent in intergenic regions, while methylation compensation (positive or negative) is specific to promoter regions. Functional analysis performed for AML down methylated or extreme high up methylated genes showed a strong connection to leukemic processes.
Keywords: AML; Acute myeloid leukemia; DMR; DNA methylation; Data driven algorithm; Differentially methylated regions; Epigenetics; Gaussian mixture modeling
Year: 2018 PMID: 29405013 PMCID: PMC5838208 DOI: 10.1007/s12539-018-0285-4
Source DB: PubMed Journal: Interdiscip Sci ISSN: 1867-1462 Impact factor: 2.233
Fig. 1 Whole genome pooled empirical cdf for HSC and AML samples
Number of low, medium, and high methylated CpG sites in HSC and AML samples
| HSC methylation level | AML low | AML medium | AML high | Total |
|---|---|---|---|---|
| Low | 191,043 | 14,739 | 2985 | 208,767 |
| Medium | 5668 | 11,286 | 33,931 | 50,885 |
| High | 2297 | 10,093 | 213,470 | 225,860 |
| Total | 199,008 | 36,118 | 250,386 | 485,512 |
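The cross-tabulation above can be summarized in a few lines of Python. This is only an illustrative sketch using the counts from the table; the split into "unchanged", "moved up", and "moved down" is a reading aid introduced here, not a statistic reported by the authors.

```python
# Counts copied from the table above: rows are HSC classes, columns are AML classes,
# both in the order low, medium, high.
counts = [
    [191_043, 14_739, 2_985],    # HSC low
    [5_668, 11_286, 33_931],     # HSC medium
    [2_297, 10_093, 213_470],    # HSC high
]

total = sum(sum(row) for row in counts)
row_totals = [sum(row) for row in counts]          # HSC marginals
col_totals = [sum(col) for col in zip(*counts)]    # AML marginals

# Diagonal cells: sites assigned to the same class in HSC and AML.
unchanged = sum(counts[i][i] for i in range(3))
# Cells above the diagonal: higher class in AML than in HSC; below: lower class.
moved_up = sum(counts[i][j] for i in range(3) for j in range(3) if j > i)
moved_down = sum(counts[i][j] for i in range(3) for j in range(3) if j < i)

print(f"total sites:      {total}")
print(f"unchanged class:  {unchanged} ({100 * unchanged / total:.2f}%)")
print(f"higher in AML:    {moved_up} ({100 * moved_up / total:.2f}%)")
print(f"lower in AML:     {moved_down} ({100 * moved_down / total:.2f}%)")
```

The marginal sums reproduce the Total row and column of the table, which is a quick consistency check on the extracted values.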
Number of CpG sites assigned to each genome region
| TSS region | Gene body region | Intergenic region |
|---|---|---|
| 189,524 | 227,032 | 93,520 |
Fig. 2 Empirical cdfs for (a) TSS, (b) gene body, and (c) intergenic regions
Distribution of low, medium, and high methylated CpG sites in HSC and AML for different genomic regions
| Region | HSC methylation | AML low | AML medium | AML high |
|---|---|---|---|---|
| TSS region | Low | 121,393 | 5693 | 1059 |
| TSS region | Medium | 2101 | 3578 | 8643 |
| TSS region | High | 614 | 2741 | 43,702 |
| Gene body region | Low | 73,494 | 5874 | 1300 |
| Gene body region | Medium | 2417 | 4639 | 16,438 |
| Gene body region | High | 1087 | 4661 | 117,122 |
| Intergenic region | Low | 13,548 | 3295 | 711 |
| Intergenic region | Medium | 1154 | 3065 | 9893 |
| Intergenic region | High | 559 | 2722 | 58,573 |
Total and region-specific numbers of significantly AML demethylated sites, based on unadjusted p values (one-sided tests, significance level α = 0.025)
| Region | Down | Up |
|---|---|---|
| Whole genome | 15,260 (3.14%) | 84,073 (17.32%) |
| TSS region | 5287 (2.79%) | 28,492 (15.03%) |
| Gene body | 7010 (3.09%) | 39,622 (17.45%) |
| Intergenic region | 3075 (3.29%) | 19,737 (21.10%) |
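To relate these counts to the "expected by chance" statement in the abstract, the sketch below compares each observed count with α·N, the number of sites a one-sided test at level α would flag if no site were truly changed. This is a simplified reading that assumes uniformly distributed null p values; it is not the authors' procedure. The region sizes come from the table of CpG sites per genome region, and the whole-genome size from the HSC/AML cross-tabulation.

```python
ALPHA = 0.025  # one-sided significance level from the table caption

# region: (number of CpG sites, significant down, significant up)
regions = {
    "Whole genome": (485_512, 15_260, 84_073),
    "TSS region": (189_524, 5_287, 28_492),
    "Gene body": (227_032, 7_010, 39_622),
    "Intergenic region": (93_520, 3_075, 19_737),
}

def summarize(n_sites, n_down, n_up, alpha=ALPHA):
    """Return percentages and observed/expected ratios for one region."""
    expected = alpha * n_sites  # flagged sites expected under a global null (assumption)
    return {
        "expected": expected,
        "down_pct": 100 * n_down / n_sites,
        "up_pct": 100 * n_up / n_sites,
        "down_ratio": n_down / expected,
        "up_ratio": n_up / expected,
    }

summary = {name: summarize(*values) for name, values in regions.items()}

for name, s in summary.items():
    print(
        f"{name:18s} expected {s['expected']:9.1f} | "
        f"down {s['down_pct']:5.2f}% (x{s['down_ratio']:.2f}) | "
        f"up {s['up_pct']:5.2f}% (x{s['up_ratio']:.2f})"
    )
```

Under this reading, the down methylated counts stay close to the 2.5% baseline in every region, while the up methylated counts exceed it several times over.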
Fig. 3 Distribution of HL statistics, its GM model, and the identified classes of low, medium, high, and extreme high AML up methylation
Parameters of the HL related GMM components
| Component ID | Mean value | Standard deviation | Weight |
|---|---|---|---|
| 1 | 0.0128 | 0.0189 | 0.2645 |
| 2 | 0.0019 | 0.0051 | 0.2148 |
| 3 | 0.0348 | 0.0334 | 0.2045 |
| 4 | 0.0427 | 0.0748 | 0.1942 |
| 5 | 0.1792 | 0.1269 | 0.0597 |
| 6 | −0.1248 | 0.1107 | 0.0358 |
| 7 | −0.3006 | 0.1940 | 0.0158 |
| 9 | 0.3818 | 0.1775 | 0.0107 |
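With the parameters above, the fitted mixture density and the posterior probability of each component for a given HL value can be evaluated directly. The following is a minimal sketch using only the Python standard library. The function names are hypothetical, and assigning a site to its most probable component is a common convention assumed here; the class boundaries shown in Fig. 3 were derived by the authors and are not reproduced by this code.

```python
import math

# (mean, standard deviation, weight) for each component, in the order listed in the table.
components = [
    (0.0128, 0.0189, 0.2645),
    (0.0019, 0.0051, 0.2148),
    (0.0348, 0.0334, 0.2045),
    (0.0427, 0.0748, 0.1942),
    (0.1792, 0.1269, 0.0597),
    (-0.1248, 0.1107, 0.0358),
    (-0.3006, 0.1940, 0.0158),
    (0.3818, 0.1775, 0.0107),
]

def normal_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and standard deviation."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def mixture_pdf(x):
    """Density of the Gaussian mixture: weighted sum of component densities."""
    return sum(w * normal_pdf(x, m, s) for m, s, w in components)

def responsibilities(x):
    """Posterior probability of each component given the value x (Bayes' rule)."""
    joint = [w * normal_pdf(x, m, s) for m, s, w in components]
    total = sum(joint)
    return [j / total for j in joint]

def most_likely_component(x):
    """0-based position (in table order) of the component with the highest posterior."""
    post = responsibilities(x)
    return max(range(len(post)), key=post.__getitem__)

for hl in (-0.5, 0.0, 0.2, 0.6):
    print(hl, round(mixture_pdf(hl), 4), most_likely_component(hl))
```

The eight weights sum to one, so the listed components form a complete mixture.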
Number of significantly AML up methylated CpG sites depending on genomic region and up methylation level
| Level of AML demethylation | Whole genome (sites) | Whole genome (%) | TSS region (sites) | TSS region (%) | Gene body (sites) | Gene body (%) | Intergenic (sites) | Intergenic (%) |
|---|---|---|---|---|---|---|---|---|
| Up methylation | 84,073 | 17.32 | 28,492 | 15.03 | 39,622 | 17.45 | 19,737 | 21.10 |
| At least medium | 47,659 | 9.82 | 14,196 | 7.49 | 22,177 | 9.77 | 12,738 | 13.62 |
| At least high | 17,317 | 3.57 | 5577 | 2.94 | 7414 | 3.27 | 4734 | 5.06 |
| Extreme high | 8149 | 1.86 | 2716 | 1.43 | 3477 | 1.53 | 2142 | 2.29 |
AML up and down methylation in relation to HSC methylation status
| Region | AML demethylation | HSC low | HSC medium | HSC high |
|---|---|---|---|---|
| Whole genome | Down | 5374 | 2373 | 7513 |
| Whole genome | No change | 172,711 | 37,773 | 175,735 |
| Whole genome | Up | 30,682 | 10,779 | 42,612 |
| TSS | Down | 2764 | 774 | 1749 |
| TSS | No change | 109,047 | 10,694 | 36,014 |
| TSS | Up | 16,334 | 2864 | 9294 |
| Body | Down | 2189 | 1046 | 3775 |
| Body | No change | 66,433 | 17,439 | 96,528 |
| Body | Up | 12,046 | 5009 | 22,567 |
| Intergenic | Down | 523 | 535 | 2017 |
| Intergenic | No change | 12,575 | 10,364 | 47,769 |
| Intergenic | Up | 4456 | 3213 | 12,068 |
Number of demethylated genes after p value integration with respect to demethylated TSS and gene body regions
| AML associated demethylation at gene level | Unadjusted: genes with demethylated TSS region | % | Unadjusted: genes with demethylated body region | % | Storey's corrected: genes with demethylated TSS region | % | Storey's corrected: genes with demethylated body region | % |
|---|---|---|---|---|---|---|---|---|
| Down methylation | 90 | 0.43 | 112 | 0.55 | 22 | 0.11 | 14 | 0.07 |
| Up methylation | 945 | 4.53 | 948 | 4.62 | 600 | 2.88 | 598 | 2.91 |
| At least medium | 385 | 1.85 | 422 | 2.06 | 187 | 0.90 | 162 | 0.79 |
| At least high | 105 | 0.50 | 115 | 0.56 | 53 | 0.25 | 25 | 0.12 |
| Extreme high | 31 | 0.15 | 35 | 0.17 | 18 | 0.09 | 5 | 0.02 |
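The text here does not state which p value integration rule was used to move from sites to regions. As one possible illustration, the sketch below combines site-level p values with Fisher's method, which is an assumption made for this example and may differ from the authors' choice. The function name `fisher_combine` and the example p values are hypothetical.

```python
import math

def fisher_combine(p_values):
    """Combine p values with Fisher's method (assumes independent tests).

    The statistic X = -2 * sum(ln p_i) follows a chi-square distribution with
    2k degrees of freedom under the null, where k is the number of p values.
    """
    if not p_values:
        raise ValueError("need at least one p value")
    if any(not (0.0 < p <= 1.0) for p in p_values):
        raise ValueError("p values must lie in (0, 1]")
    k = len(p_values)
    half_stat = -sum(math.log(p) for p in p_values)  # X / 2
    # Closed-form chi-square survival function for even degrees of freedom 2k:
    # sf(X) = exp(-X/2) * sum_{i=0}^{k-1} (X/2)^i / i!
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half_stat / i
        total += term
    return math.exp(-half_stat) * total

# Made-up site-level p values for the CpG sites of one hypothetical TSS region.
site_p = [0.04, 0.10, 0.03, 0.20]
print(f"combined p value: {fisher_combine(site_p):.4g}")
```

Fisher's method assumes independent tests, while neighbouring CpG sites are often correlated, so a combined value obtained this way should be treated as a rough indication only.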
Fig. 4 Methylation profiles for exemplary genes
Number of significantly overrepresented GO terms
| Gene Ontology terms | TSS down | TSS extreme high | Body down | Body extreme high |
|---|---|---|---|---|
| Biological process | 113 | 74 | 8 | 56 |
| Molecular function | 13 | 4 | 7 | 2 |
| Cellular component | 25 | 7 | 10 | 0 |
Number of demethylated lincRNAs, enhancers and transposable elements
| AML demethylation | Down methylation | Up methylation | At least medium | At least high | Extreme high |
|---|---|---|---|---|---|
| lincRNAs | 289 | 1368 | 814 | 269 | 112 |
| Enhancers | 74 | 262 | 143 | 53 | 19 |
| Transposable elements | 838 | 5325 | 3111 | 727 | 180 |