| Literature DB >> 36241624 |
Eric M Nickels1,2, Shaobo Li2, Swe Swe Myint2, Katti Arroyo2, Qianxi Feng2, Kimberly D Siegmund2, Adam J de Smith2, Joseph L Wiemels3.
Abstract
Aberrant DNA methylation constitutes a key feature of pediatric acute lymphoblastic leukemia at diagnosis, however its role as a predisposing or early contributor to leukemia development remains unknown. Here, we evaluate DNA methylation at birth in 41 leukemia-discordant monozygotic twin pairs using the Illumina EPIC array on archived neonatal blood spots to identify epigenetic variation associated with development of pediatric acute lymphoblastic leukemia, independent of genetic influence. Through conditional logistic regression we identify 240 significant probes and 10 regions associated with the discordant onset of leukemia. We identify a significant negative coefficient bias, indicating DNA hypomethylation in cases, across the array and enhanced in open sea, shelf/shore, and gene body regions compared to promoter and CpG island regions. Here, we show an association between global DNA hypomethylation and future development of pediatric acute lymphoblastic leukemia across disease-discordant genetically identical twins, implying DNA hypomethylation may contribute more generally to leukemia risk.Entities:
Mesh:
Substances:
Year: 2022 PMID: 36241624 PMCID: PMC9568651 DOI: 10.1038/s41467-022-33677-z
Source DB: PubMed Journal: Nat Commun ISSN: 2041-1723 Impact factor: 17.694
Subject characteristics
| Sex | Female | 27 (63.8%) | |
| Male | 16 (37.2%) | ||
| Race/ethnicity | Non-Hispanic White | 11 (27%) | |
| Non-Hispanic Black | 3 (7%) | ||
| Hispanic | 24 (59%) | ||
| Asian-Pacific Islander | 3 (7%) | ||
| Median gestational age (range) | 258 days (184–306) | ||
| Birthweight (mean) | Case twin | 2427.4 g | |
| Unaffected sibling | 2363.3 g | ||
| Age of diagnosis (years) | 0–23 (median 5) | ||
| Diagnosis | Precursor cell lymphoblastic leukemia, NOS | 19 (44.1%) | |
| B-lymphoblastic leukemia/lymphoma, NOS | 14 (32.6%) | ||
| Precursor B-cell lymphoblastic leukemia | 7 (16.3%) | ||
| T-lymphoblastic leukemia/lymphoma | 2 (4.7%) | ||
| Leukemia/lymphoma with t(12;21)(p13;q22);TEL-AML1(ETV6-RUNX1) | 1 (2.3%) | ||
| Cell Lineage | B-cell | 32 | |
| T-cell | 4 | ||
| Not listed | 7 |
Birth and diagnostic features of the 43 twin pairs who underwent DNA methylation array analysis are demonstrated in Table.
1Two-sided paired Wilcoxon test. NOS not otherwise specified.
Top differentially methylated probes from conditional regression analysis
| Chr. | Position (hg19) | CpG name | UCSC RefGene name | Relation to Island region | Promoter associated | Mean beta (SD) | Mean delta-beta (SD) | Regression results | |
|---|---|---|---|---|---|---|---|---|---|
| Coef. | FDR | ||||||||
| 14 | 73976155 | cg07122798 | Open sea | No | 0.920 (0.025) | −0.00675 (0.0144) | −2.954 | 2.22E-05 | |
| 19 | 51375975 | cg19582822 | Open sea | No | 0.852 (0.045) | 0.0101 (0.0156) | 1.833 | 1.83E-04 | |
| 1 | 24104752 | cg22131571 | Island | Yes | 0.069 (0.008) | −0.00512 (0.00873) | −2.918 | 1.83E-04 | |
| 15 | 90447350 | cg10093558 | Open sea | No | 0.942 (0.018) | 0.00782 (0.0148) | 2.029 | 1.83E-04 | |
| 21 | 32649215 | cg05299823 | Open sea | No | 0.944 (0.029) | 0.00248 (0.00663) | 3.157 | 6.19E-04 | |
| 12 | 27462965 | cg22709041 | Open sea | No | 0.919 (0.014) | 0.00511 (0.0117) | 2.590 | 6.56E-04 | |
| 2 | 111493356 | cg09756855 | Shelf | No | 0.666 (0.027) | −0.0128 (0.0231) | −3.873 | 6.77E-04 | |
| 1 | 27114177 | cg12806381 | Island | No | 0.071 (0.011) | 0.00734 (0.0123) | 2.036 | 6.77E-04 | |
| 9 | 2717531 | cg26428825 | Shore | No | 0.950 (0.015) | −0.00600 (0.0107) | −2.503 | 0.00217 | |
| 17 | 79479583 | cg05984533 | Island | No | 0.029 (0.006) | 0.00336 (0.00718) | 1.698 | 0.00217 | |
| 1 | 101361740 | cg16449184 | Island | Yes | 0.053 (0.009) | −0.00564 (0.0117) | −2.446 | 0.00306 | |
| 6 | 31934523 | cg06250213 | Open sea | No | 0.914 (0.014) | 0.00738 (0.0172) | 2.212 | 0.00318 | |
| 2 | 170682238 | cg23927974 | Shore | No | 0.927 (0.012) | −0.00970 (0.0157) | −2.352 | 0.00331 | |
| 21 | 43240484 | cg14282798 | Island | No | 0.904 (0.015) | −0.0120 (0.0187) | −2.327 | 0.00362 | |
| 19 | 14089711 | cg18642503 | Island | No | 0.056 (0.006) | −0.00349 (0.00656) | −3.349 | 0.00389 | |
| 20 | 32700182 | cg19881050 | Island | Yes | 0.036 (0.006) | 0.00430 (0.00814) | 1.721 | 0.00425 | |
| 5 | 131826485 | cg06942904 | Island | Yes | 0.066 (0.013) | 0.00964 (0.0189) | 1.901 | 0.00486 | |
| 2 | 74919244 | cg05363574 | Intergenic | Open sea | No | 0.955 (0.015) | −0.00366 (0.00620) | −3.168 | 0.00486 |
| 7 | 105679949 | cg06475386 | Intergenic | Open sea | No | 0.940 (0.013) | 0.00418 (0.0106) | 3.164 | 0.00486 |
| 11 | 32915131 | cg05662684 | Island | Yes | 0.078 (0.007) | −0.00338 (0.00885) | −3.647 | 0.00486 | |
Conditional logistic regression analysis was used to assess for a relationship between leukemia status and DNA methylation at the 710,010 array probes included in the study, controlling for sex, array plate, nucleated cell proportions, and clustering by twin pair identity. The top 20 differentially methylated probes (DMPs) ranked by FDR-corrected conditional regression P value are shown out of a total of 240 significant (FDR < 0.05) DMPs. Probes without a genetic association are denoted as intergenic. Mean beta and delta-beta values are shown for twin pairs included in the conditional regression analysis (n = 37). Chr Chromosome. Coef Coefficient. FDR False discovery rate. SD Standard deviation. UCSC University of California Santa Cruz.
Fig. 1Conditional regression analysis of 37 twin pairs discordant for ALL.
a Volcano plot demonstrating the distribution of coefficients and −log10 P values for 710,010 probes assessed from the DNA methylation array. Coefficients and P values were calculated using conditional logistic regression to test the association of DNA methylation at each CpG with ALL development, adjusting for sex, array plate, nucleated cell proportions, and clustering by paired twin relationships. Dotted line indicates the threshold for false discovery rate (FDR) adjusted significance. Significant probes (FDR < 0.05, n = 240) are highlighted in red. b Q–Q plot demonstrating observed versus expected P values across the array. Genomic inflation is represented by λ = 1.02. Trend line represents observed −log10(P) = expected −log10(P). Gray-shaded area represents 95% confidence interval. c Manhattan plot of regionally adjusted P values showing the genomic location of the 10 significant differentially methylated regions (DMRs) identified in regional analysis. Significant regions (shown in red) were defined as Šidák-corrected P < 0.05 identified using comb-p. The most significant DMR, associated with TRIM39-RPP21, includes a 454 bp region encompassing 9 probes on chromosome 6 (note only four significant probes located within this region are shown due to overlapping P values). A second 167 bp region on chromosome 19 including four probes associated with AMH overlaps with a CoRSIV site. Source data are provided as a Source data file.
Differentially methylated regions
| Chr | Start (hg19) | End (hg19) | Width | # Probes | Coefficient Direction | Mean Region Beta (SD) | Mean Region Delta-Beta (SD) | Šidák | Associated gene | Promoter associated | CoRSIV overlap |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 6 | 30297174 | 30297627 | 454 | 9 | - - - - - - - - - | 0.546 (0.0784) | −0.0169 (0.00383) | 2.39E-09 | No | No | |
| 2 | 240033164 | 240033412 | 249 | 3 | - - - | 0.913 (0.0262) | −0.0108 (0.00740) | 6.71E-04 | No | No | |
| 21 | 32649155 | 32649215 | 61 | 3 | - - + | 0.946 (0.00710) | −0.00124 (0.00333) | 0.005991 | No | No | |
| 19 | 2250901 | 2251067 | 167 | 4 | - - - - | 0.142 (0.0837) | −0.00592 (0.00251) | 0.007136 | No | Yes | |
| 1 | 215804951 | 215805150 | 200 | 2 | - - | 0.725 (0.156) | −0.0121 (0.00434) | 0.007565 | No | No | |
| 16 | 85007113 | 85007184 | 72 | 2 | - - | 0.892 (0.0622) | −0.00729 (0.00539) | 0.01132 | No | No | |
| 11 | 464353 | 464554 | 202 | 2 | - - | 0.949 (0.0223) | −0.00539 (0.000677) | 0.01232 | No | No | |
| 9 | 2717325 | 2717531 | 207 | 3 | + - - | 0.952 (0.00815) | −0.00357 (0.00303) | 0.0153 | Yes | No | |
| 7 | 36500566 | 36500624 | 59 | 2 | + - | 0.948 (0.00193) | −0.000239 (0.00394) | 0.02489 | No | No | |
| 14 | 65231319 | 65231406 | 88 | 2 | – | 0.500 (0.177) | −0.0101 (0.00722) | 0.02852 | No | No |
Differentially methylated regions were identified using comb-p, which assesses the spatial correlation of P values obtained from the conditional regression model. Šidák-corrected P value accounts for multiple comparisons using the comb-p algorithm. Regions encompassing at least two probes with Šidák-corrected P < 0.05 were considered significant. The coefficient direction shows positive and negative coefficients for probes within the region based on genomic position. Mean beta and delta-beta values for probes within each region are shown. A region associated with AMH overlaps with a CoRSIV region and has been previously identified to harbor abnormal DNA methylation at diagnosis. Chr Chromosome. CoRSIV Correlated region of systemic interindividual variation. SD Standard deviation.
Fig. 2Bias in negative coefficients from regression analysis indicates regionally specific DNA hypomethylation profile in ALL cases.
a Box and whisker plots demonstrate median DNA methylation content in n = 41 case and n = 41 control twins (pairs connected by gray lines). Significantly lower global DNA methylation was identified in the case of twins when compared to their sibling controls (Two-sided paired Wilcoxon test P = 0.048). b Bar plot demonstrating proportion of positive (indicating hypermethylation in cases) and negative (indicating hypomethylation) regression coefficients in all probes, by CpG island context, and by genetic context. The percentage of positive and negative probes are shown over bars. All regions are significantly biased toward negative regression coefficients. Regions are arranged by the strength of negative coefficient bias, with CpGs in 3′-UTR, gene body, open sea, exon boundaries (Exon Bnd), and CpG island shelf/shore regions showing a stronger negative bias than the overall array results. ** indicates FDR-corrected P < 0.001. Evaluation of DNA hypomethylation signature identified in conditional regression analysis using raw beta values paired by twin relationship (delta beta, or ALL-case DNA methylation beta value minus control beta value) according to c CpG context and d UCSC RefGene Group for n = 41 independent twin pairs. *FDR-corrected P < 0.05. The distribution of median delta beta values (case beta minus control beta) for probes associated with each genomic region is shown for the full set of 41 twin pairs. A significant shift toward DNA hypomethylation in cases (negative delta beta values) was identified for probes associated with the overall array (All), as well as open sea and shelf/shore regions, however not in island regions. When looking at pair-specific median values (e), twin pairs with negative median delta beta values across all probes (globally hypomethylated in cases, n = 30) tended to have median island values near zero, demonstrating regional specificity of the identified DNA hypomethylation profile. Plot colors represent individual twin pairs. Similarly, by RefGene group (f), a significant negative shift was noted in the gene body, 3′-UTR, and 5′-UTR, at exon boundaries, and at TSS1500 sites, however, there was no negative shift associated with 1st exon and TSS200 probes. Among globally hypomethylated and hypermethylated cases, a pair-specific plot (g, h) shows specificity of findings between gene body and promoter-associated probes. Source data are provided as a Source data file. In boxplots, the box represents the interquartile range (IQR, first through the third quartile) with the centerline showing the median value for all subjects/twin pairs, whiskers show a minimum (first quartile minus 1.5 × IQR) and maximum (third quartile plus 1.5 × IQR) data range.
Coefficient directional bias based on associated CpG region
| Coefficients | Delta-Beta Values | ||||||
|---|---|---|---|---|---|---|---|
| Region | Negative (%) | Positive | Total | Median | IQR | ||
| Full array | |||||||
| All probes | 409819 (57.7%) | 300191 | 710010 | 2.5E-323 | −0.00137 | 0.00353 | 0.00889 |
| CpG context | |||||||
| Open sea | 228222 (58.9%) | 158886 | 387108 | 2E-323 | −0.00189 | 0.00599 | 0.0160 |
| Island | 76982 (53.2%) | 67578 | 144560 | 4.00E-135 | 1.66E-04 | 0.00187 | 0.555 |
| Shelf/shore | 52579 (58.7%) | 36971 | 89550 | 1E-323 | −0.00163 | 0.00396 | 0.00926 |
| UCSC RefGene group | |||||||
| TSS200 | 30824 (53.0%) | 27326 | 58150 | 1.11E-47 | 1.56E-05 | 0.00183 | 0.6535 |
| TSS1500 | 50787 (56.7%) | 38788 | 89575 | 1E-323 | −1.00E-03 | 0.00250 | 0.0199 |
| 3′-UTR | 10793 (61.1%) | 6859 | 17652 | 3.44E-194 | −0.00201 | 0.00623 | 0.0207 |
| 5′-UTR | 35295 (57.0%) | 26589 | 61884 | 3.77E-269 | −9.53E-04 | 0.00239 | 0.0127 |
| Body | 156153 (59.5%) | 106283 | 262436 | 1.5E-323 | −0.00161 | 0.00492 | 0.0160 |
| Exon bnd | 2884 (59.0%) | 2006 | 4890 | 2.89E-36 | −0.00157 | 0.00635 | 0.0284 |
| 1st exon | 12511 (53.5%) | 10865 | 23376 | 5.11E-27 | 1.22E-04 | 0.00171 | 0.682 |
| Regulatory feature group | |||||||
| Promoters | 27523 (52.0%) | 25416 | 52939 | 5.47E-20 | 2.57E-04 | 0.00230 | 0.456 |
| Genes | 1253 (60.1%) | 833 | 2086 | 3.42E-20 | −0.00134 | 0.00462 | 0.0143 |
| Non-genes | 381 (51.9%) | 353 | 734 | 0.319 | 1.90E-04 | 0.00187 | 0.868 |
| Unclassified | 22572 (56.2%) | 17589 | 40161 | 9.08E-137 | −4.02E-04 | 0.00168 | 0.226 |
Coefficient frequency is shown based upon probe associations with CpG island region (CGI Shelf/Shore, Island, Open Sea), UCSC RefGene Group (TSS1500, TSS200, 3′-UTR, 5′-UTR), and Regulatory Feature Group (Promoters, Genes) based on Illumina EPIC array probe annotations. Probes may overlap multiple regions. A significant negative bias in coefficients was identified in all groups by a two-sided binomial test*. While 57.7% of coefficients were negative across all probes, probes in CGI shelf/shore regions (58.7%), genes (60.1%), open sea (59.0%), and 3′-UTR (61.1%) were more highly biased toward negative coefficients compared to all probes on the array. In contrast, promoters (52.0%), TSS1500 (56.7%), TSS200 (53.0%), and island (53.3%) were less significantly biased. The median delta beta values of probes associated with the various regions are shown in the right half of the table. Delta-beta values were significantly shifted toward DNA hypomethylation (negative values) in cases in the overall array as well as specific regions including open sea probes (Wilcoxon signed-rank test**). In contrast, delta beta values were not significantly different from 0 in island, promoter, and TSS200 probes. Exon Bnd Exon boundary, IQR interquartile range, TSS transcriptional start site, UTR untranslated region.