Olga Krali, Josefine Palle, Christofer L Bäcklin, Jonas Abrahamsson, Ulrika Norén-Nyström, Henrik Hasle, Kirsi Jahnukainen, Ólafur Gísli Jónsson, Randi Hovland, Birgitte Lausen, Rolf Larsson, Lars Palmqvist, Anna Staffas, Bernward Zeller, Jessica Nordlund.
Abstract
Pediatric acute myeloid leukemia (AML) is a heterogeneous disease composed of clinically relevant subtypes defined by recurrent cytogenetic aberrations. Most of the aberrations used in risk grouping for treatment decisions have been extensively studied, yet a large proportion of pediatric AML patients remain cytogenetically undefined and would therefore benefit from additional molecular investigation. As aberrant epigenetic regulation has been widely observed during leukemogenesis, we hypothesized that DNA methylation signatures could be used to predict molecular subtypes and to identify signatures with prognostic impact in AML. To study genome-wide DNA methylation, we analyzed 123 diagnostic and 19 relapse AML samples on Illumina 450k DNA methylation arrays. We designed and validated DNA methylation-based classifiers for AML cytogenetic subtype, resulting in an overall test accuracy of 91%. Furthermore, we identified methylation signatures associated with outcome in the t(8;21)/RUNX1-RUNX1T1, normal karyotype, and MLL/KMT2A-rearranged subgroups (p < 0.01). Overall, these results further underscore the clinical value of DNA methylation analysis in AML.
Keywords: 450k array; DNA methylation; acute myeloid leukemia; classification; epigenetics; pediatric AML; subtyping
Year: 2021 PMID: 34200630 PMCID: PMC8229099 DOI: 10.3390/genes12060895
Source DB: PubMed Journal: Genes (Basel) ISSN: 2073-4425 Impact factor: 4.096
Cytogenetic subtype, age range, central nervous system (CNS) involvement, stem cell transplantation (SCT), and FAB classification for the 142 pediatric AML patient samples. The numbers of samples taken at AML diagnosis and at relapse are presented. When known, the subtype of a relapse sample is given as the subtype of its diagnostic pair.
| Cytogenetic Subtype | Normal | MLL/KMT2A | Undefined | t(8;21) | inv(16) | mono 7 | t(15;17) | sole +8 | 3q21q26 |
|---|---|---|---|---|---|---|---|---|---|
| Number of diagnostic Samples | 30 | 25 | 24 | 19 | 12 | 5 | 4 | 3 | 1 |
| Number of relapse samples | 8 | 1 | 5 | 4 | 1 | - | - | - | - |
| Age range | 1–18 | 0–16 | 0–17 | 2–16 | 1–17 | 1–5 | 3–16 | 3–14 | 14 |
| CNS involvement | - | - | - | - | - | - | - | - | - |
| Yes | 4 | 1 | 2 | 6 | - | - | - | - | - |
| No | 34 | 25 | 27 | 17 | 12 | 4 | 3 | 3 | 1 |
| Missing | - | - | - | - | 1 | 1 | 1 | - | - |
| SCT | - | - | - | - | - | - | - | - | - |
| Yes | 8 | 3 | 1 | - | - | 5 | - | 2 | - |
| No | 30 | 23 | 27 | 23 | 13 | - | 3 | 1 | 1 |
| Missing | - | - | 1 | - | - | - | 1 | - | - |
| FAB | - | - | - | - | - | - | - | - | - |
| M0 | 2 | - | - | 1 | - | 2 | - | 1 | - |
| M1 | 7 | - | 3 | 1 | - | 1 | - | 1 | 1 |
| M2 | 14 | 1 | 3 | 19 | 3 | 1 | - | - | - |
| M3 | 1 | - | 1 | - | - | - | 4 | - | - |
| M4 | 10 | 1 | 10 | 2 | 9 | - | - | - | - |
| M5 | - | 24 | 4 | - | - | 1 | - | 1 | - |
| M6 | 3 | - | 2 | - | - | - | - | - | - |
| M7 | - | - | 4 | - | - | - | - | - | - |
| Missing | 1 | - | 2 | - | 1 | - | - | - | - |
Figure 1. Data representation in space for the 142 pediatric AML patient samples for the selected 1300 CpG sites. (a) DNA methylation data projected into 2D space in UMAP plots for UMAP dimensions 1–2 (left) and 3–4 (right). Each point represents a patient sample. All samples are labeled by cytogenetic subtype; samples with unknown subtype (undefined) are shown as black circles with no fill (N = 24, diagnostic samples), and samples taken at relapse are shown as filled triangles colored by cytogenetic subtype (N = 19, relapse samples). (b) DNA methylation heatmap ordered by hierarchical clustering. Each row in the heatmap denotes a CpG site and each column is a patient. Cytogenetic subtype, FAB classification, and relapse status are shown as annotation bars above the plot.
Figure 2. (a) Confusion matrix for the cytogenetically defined (diagnostic-known) vs. DNA methylation predicted cytogenetic subtypes (N = 33). (b) Heatmap and hierarchical clustering of 127 samples taken at AML diagnosis or relapse from patients belonging to the four most common subtypes: t(8;21)/RUNX1-RUNX1T1, inv(16)/CBFB-MYH11, MLL/KMT2A-rearranged, and NK. The methylation status of the selected 1300 CpG sites is plotted in the heatmap. The samples are in columns labeled by AML subtype, by whether the sample was cytogenetically defined (true/diagnostic-known) or cytogenetically undefined (DNA methylation predicted), and by whether the sample was taken at AML diagnosis or relapse.
Precision, recall, and F1 score per subtype, together with the overall accuracy of the classifier, for the four subtypes with >5 patients in the group. A high precision score indicates few false positives (FP), while a high recall score indicates few false negatives (FN).
| Subtype | Precision | Recall | F1 Score | Total Samples Test Set | Total Samples Train Set |
|---|---|---|---|---|---|
| MLL/KMT2A-rearranged | 1 | 0.91 | 0.95 | 11 | 14 |
| inv(16)/CBFB-MYH11 | 1 | 1 | 1 | 3 | 9 |
| Normal Karyotype (NK) | 0.78 | 0.88 | 0.82 | 8 | 22 |
| t(8;21)/RUNX1-RUNX1T1 | 1 | 1 | 1 | 7 | 12 |
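The per-subtype scores in the table follow from the standard definitions of precision, recall, and F1. The sketch below is illustrative only: the function name and the true-positive, false-positive, and false-negative counts are assumptions chosen to be consistent with the reported NK row (8 test samples), not values reported in the source.

```python
def precision_recall_f1(tp, fp, fn):
    """Return (precision, recall, F1) from TP, FP and FN counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical counts (assumed, not reported in the source): 7 of 8 NK test
# samples recovered, plus 2 samples from other subtypes called NK.
nk = precision_recall_f1(tp=7, fp=2, fn=1)
nk_rounded = tuple(round(v, 2) for v in nk)
print(nk_rounded)
```

With these assumed counts the rounded scores agree with the NK row; other count combinations could produce the same rounded values.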
Figure 3. Heatmap and hierarchical clustering of the 45 NK samples across 1300 CpG sites, including 30 cytogenetically defined (diagnostic-known) patients and 15 DNA methylation predicted NK patients. Each row in the heatmap denotes a CpG site and each column is a patient. The labels at the top of the heatmap represent clinical and molecular features of interest, including the cluster designations A and B.
Top selected CpGs and genes per subtype based on a one-vs.-rest CpG site selection with an adjusted p-value threshold of 0.05.
| Subtype | N CpG Sites (Adjusted p < 0.05) | N CpG Sites Unique to Subtype (N Genes) | Gene Names (CpG IDs) Unique to Subtype |
|---|---|---|---|
| Normal Karyotype | 569 | 6 (5) | |
| MLL/KMT2A-rearranged | 873 | 59 (33) | |
| mono7 | 330 | 24 (12) | |
| inv(16)/CBFB-MYH11 | 571 | 22 (15) | |
| t(8;21)/RUNX1-RUNX1T1 | 723 | 27 (15) | |
| t(15;17)/PML-RARA | 328 | 8 (8) | |
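The table relies on an adjusted p-value threshold of 0.05, but this section does not state which multiple-testing correction was used. As a minimal sketch, assuming a Benjamini-Hochberg adjustment (an assumption, not confirmed by the source), the selection step for one subtype-vs-rest comparison could look as follows. The CpG identifiers and raw p-values are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw p-values for five CpG sites (illustrative, not study data).
raw = {"cg_a": 0.001, "cg_b": 0.008, "cg_c": 0.039, "cg_d": 0.041, "cg_e": 0.60}
adj = dict(zip(raw, benjamini_hochberg(list(raw.values()))))

# Keep only the sites that pass the adjusted threshold of 0.05.
selected = [cpg for cpg, p in adj.items() if p < 0.05]
print(selected)
```

In this toy input, two sites with raw p-values below 0.05 are no longer selected after adjustment, which is why the threshold is applied to adjusted rather than raw values.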
Figure 4. Survival analysis for t(8;21)/RUNX1-RUNX1T1 (a,b), NK (c,d), and MLL/KMT2A-rearranged (e,f) patient samples based on the 50 most significant CpGs that separate the diagnostic samples of patients who later relapsed from those of patients who did not. Panels (a,c,e) contain patients with a confirmed diagnostic subtype (diagnostic-known). Panels (b,d,f) contain the diagnostic-known samples in addition to the DNA methylation predicted samples from the undefined group. In each panel, the heatmap (left) is ordered by hierarchical clustering. Samples along the x-axis are split into two groups, color coded by high (red) or low (blue) overall methylation level. The numbers in the legend represent the fraction of patients who did not experience an event. Kaplan–Meier curves for relapse-free survival (upper right) and overall survival (lower right) are shown for the two groups identified by clustering. The x-axis represents the time until event (months) and events are plotted on the y-axis.
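For readers unfamiliar with the Kaplan–Meier curves in these panels, the following is a minimal sketch of the estimator for a single group. The function name and the follow-up data are hypothetical and do not come from the study; an event is a relapse or death, and a censored sample is one that left follow-up without an event.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time.

    times  : follow-up in months
    events : 1 if the event occurred, 0 if the sample was censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Group all samples that share the same time point.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events:
            survival *= 1 - n_events / at_risk
            curve.append((t, survival))
        # Samples with an event and censored samples both leave the risk set.
        at_risk -= n_leaving
    return curve


# Hypothetical follow-up for one methylation cluster (not study data).
months = [6, 12, 12, 20, 30, 45]
event = [1, 1, 0, 1, 0, 0]
curve = kaplan_meier(months, event)
for t, s in curve:
    print(t, round(s, 3))
```

Running the same function separately on the high- and low-methylation groups would give the two curves compared in each panel; a formal comparison would additionally need a test such as the log-rank test, which is not sketched here.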