Benjamin J Huang1,2, Jenny L Smith3, Jason E Farrar4, Yi-Cheng Wang5, Masayuki Umeda6, Rhonda E Ries3, Amanda R Leonti3, Erin Crowgey7, Scott N Furlan3,8, Katherine Tarlock3,8, Marcos Armendariz9, Yanling Liu10, Timothy I Shaw10, Lisa Wei11, Robert B Gerbing5, Todd M Cooper8, Alan S Gamis12, Richard Aplenc13, E Anders Kolb7, Jeffrey Rubnitz14, Jing Ma6, Jeffery M Klco6, Xiaotu Ma10, Todd A Alonzo15, Timothy Triche16, Soheil Meshinchi3,8.
Abstract
Relapsed or refractory pediatric acute myeloid leukemia (AML) is associated with poor outcomes, and relapse risk prediction approaches have not changed significantly in decades. To build a robust transcriptional risk prediction model for pediatric AML, we perform RNA-sequencing on 1503 primary diagnostic samples. While a 17-gene leukemia stem cell signature (LSC17) is predictive in our aggregated pediatric study population, LSC17 is no longer predictive within established cytogenetic and molecular (cytomolecular) risk groups. Therefore, we identify distinct LSC signatures on the basis of AML cytomolecular subtypes (LSC47) that are more predictive than LSC17. Based on these findings, we build a robust relapse prediction model within a training cohort and then validate it within independent cohorts. Here, we show that LSC47 increases the predictive power of conventional risk stratification and that applying biomarkers in a manner that is informed by cytomolecular profiling outperforms a uniform biomarker approach.
Year: 2022 PMID: 36123353 PMCID: PMC9485122 DOI: 10.1038/s41467-022-33244-6
Source DB: PubMed Journal: Nat Commun ISSN: 2041-1723 Impact factor: 17.694
Fig. 1 LSC17 in TCGA AML based on age.
a–d Kaplan–Meier estimates for the probability of event free survival (EFS) and overall survival (OS) in patients from The Cancer Genome Atlas (TCGA) AML cohort segregated based on age. a, b LSC17 scores predict survival in younger adults (<60 years of age). c, d Conversely, LSC17 scores do not discriminate between favorable and unfavorable outcomes in older adults (≥60 years of age). Survival differences were determined using the log-rank test (two-sided and without multiple-testing adjustments). e Schematic diagram for our experimental design. Our data set consists of primary samples that were obtained at the time of diagnosis after enrollment in one of four clinical trials listed in the left panel (black). Specific data analyses and associated figures are noted in the bottom panel (gray). Samples underwent either polyadenylation enrichment or ribosomal RNA depletion. Stratified randomization was performed based on fusion category to generate two cohorts for risk model training and validation.
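The stratified randomization mentioned in panel e can be illustrated with a minimal sketch. The caption does not describe the exact procedure, so everything here is an assumption for illustration: the function name `stratified_split`, the fixed seed, the rule that an odd-sized stratum gives its extra sample to the training cohort, and the toy sample labels.

```python
import random
from collections import defaultdict

def stratified_split(samples, stratum_of, seed=0):
    """Split samples into two cohorts so each stratum is divided about evenly.

    `stratum_of` maps a sample to its stratum label (here, a fusion category).
    Assumption: odd-sized strata put the extra sample in the training cohort.
    """
    rng = random.Random(seed)
    by_stratum = defaultdict(list)
    for sample in samples:
        by_stratum[stratum_of(sample)].append(sample)

    training, validation = [], []
    for stratum in sorted(by_stratum):      # fixed order for reproducibility
        members = by_stratum[stratum]
        rng.shuffle(members)                # randomize within the stratum
        half = (len(members) + 1) // 2
        training.extend(members[:half])
        validation.extend(members[half:])
    return training, validation

# Hypothetical toy data: (sample_id, fusion_category) pairs.
toy = [(i, "KMT2A" if i % 3 == 0 else "Other or no fusion") for i in range(30)]
training, validation = stratified_split(toy, stratum_of=lambda s: s[1])
```

Because shuffling happens inside each stratum, both cohorts end up with nearly identical fusion-category proportions, which is the property the patient characteristics table reports.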
Patient characteristics
| Characteristic | Training cohort (n = 753), n (%) | Validation cohort (n = 750), n (%) | P-value |
|---|---|---|---|
| Sex | |||
| Female | 359 (47.7) | 370 (49.3) | 0.520 |
| Male | 394 (52.3) | 380 (50.7) | |
| Age | |||
| <3 years | 175 (23.2) | 174 (23.2) | 0.985 |
| 3–5 years | 60 (8.0) | 63 (8.4) | 0.760 |
| 5–10 years | 134 (17.8) | 145 (19.3) | 0.443 |
| 10–18 years | 323 (42.9) | 318 (42.4) | 0.846 |
| >18 years | 61 (8.1) | 50 (6.7) | 0.288 |
| WBC Count | |||
| <100,000/µL | 594 (79.0) | 586 (78.1) | 0.686 |
| ≥100,000/µL | 158 (21.0) | 164 (21.9) | |
| Unknown | 1 | 0 | |
| Cytomolecular risk group | |||
| Low | 291 (38.6) | 289 (38.5) | 0.964 |
| Standard | 212 (28.2) | 197 (26.3) | 0.411 |
| High | 250 (33.2) | 264 (35.2) | 0.414 |
| MRD at end of induction I | |||
| No | 462 (68.8) | 468 (69.0) | 0.796 |
| Yes | 210 (31.3) | 210 (31.0) | |
| Unknown | 81 | 72 | |
| SCT in CR1 | |||
| No | 650 (86.3) | 659 (87.9) | 0.372 |
| Yes | 103 (13.7) | 91 (12.1) | |
| No | 713 (94.7) | 707 (94.3) | 0.721 |
| Yes | 40 (5.3) | 43 (5.7) | |
| No | 609 (80.9) | 602 (80.3) | 0.765 |
| <0.1 | 26 (3.5) | 28 (3.7) | 0.770 |
| ≥0.1 | 118 (15.7) | 120 (16.0) | 0.861 |
| Fusion category | |||
| | 101 (13.4) | 101 (13.5) | 0.976 |
| | 82 (10.9) | 82 (10.9) | 0.978 |
| | 158 (21.0) | 157 (20.9) | 0.981 |
| | 60 (8.0) | 59 (7.9) | 0.942 |
| | 14 (1.9) | 13 (1.7) | 0.854 |
| Other or no fusion | 338 (44.9) | 338 (45.1) | 0.944 |
Demographic and molecular characteristics of our study cohort. Abbreviations include WBC white blood cells, CNS central nervous system, MRD minimal residual disease, SCT stem cell transplant, CR1 first complete remission, ITD internal tandem duplication, KD kinase domain. P-values were based on the chi-squared test.
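As a worked illustration of the chi-squared comparison, the sketch below runs a Pearson chi-squared test on the sex row of the table (359 and 394 in the training cohort, 370 and 380 in the validation cohort). It assumes no continuity correction, which the table note does not specify; under that assumption the result comes out close to the reported 0.520. The function name `chi2_2x2` is illustrative.

```python
import math

def chi2_2x2(a, b, c, d):
    """Pearson chi-squared test for the 2x2 table [[a, b], [c, d]].

    Returns (statistic, p_value) with 1 degree of freedom.
    Assumption: no Yates continuity correction is applied.
    """
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    observed = ((a, b), (c, d))
    stat = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed[i][j] - expected) ** 2 / expected
    # With 1 degree of freedom, P(X > stat) = erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value

# Rows: female, male. Columns: training cohort, validation cohort.
stat, p = chi2_2x2(359, 370, 394, 380)
print(f"chi2 = {stat:.3f}, p = {p:.3f}")
```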
Fig. 2 LSC17 in pediatric AML.
Kaplan–Meier estimates for the probability of a EFS and b OS in patients within our entire cohort (n = 1503) stratified based on low versus high LSC17 scores. LSC17 scores significantly predict survival for the entire non-stratified cohort. Conversely, LSC17 scores do not improve upon previously established risk stratification models based on cytogenetic and molecular alterations with regard to either c EFS or d OS. e Hazard ratios with 95% confidence intervals for EFS and OS as a function of LSC17 risk group (high versus low) across historical clinical trial cytomolecular risk stratification schema (n = 1503 patients). f Driver gene fusion frequencies within our entire study cohort (n = 1503). g Uniform manifold approximation and projection (UMAP) performed on selected genes based on the nearest shrunken centroids approach clearly discriminates fusion classes. h Gene set enrichment analysis on a 47 LSC gene signature reveals that LSC genes are significantly enriched among fusion-predictive genes. GSEA p-values are calculated by permutation (n = 1000) across the gene set of interest combined with every gene set within the Broad Institute Molecular Signature Database v6.2. i Normalized enrichment scores based on hematopoietic hierarchical cell populations reveal that gene fusion transcriptional signatures align with distinct hematopoietic stem cell and myeloid progenitor cell population states. NES normalized enrichment score. j Box plot of LSC17 scores categorized based on cytogenetic or fusion status reveals that LSC17 scores significantly correlate with underlying alteration (n = 1503 patients). Box plot data are presented as median values with hinges corresponding to the 25th or 75th percentiles and whiskers corresponding to 1.5 times the inter-quartile range. P-values were calculated based on two-sided t-tests. Source data are provided as a Source Data file. k Survival outcomes stratified based on fusion status. Survival differences were determined using the log-rank test (two-sided and without multiple-testing adjustments).
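The Kaplan–Meier estimates and two-sided log-rank comparisons used throughout these figures can be sketched as follows. This is a minimal illustration on made-up follow-up times, not the authors' analysis code; the function names and toy data are assumptions, and a real analysis would normally rely on an established survival analysis package.

```python
import math

def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time with at least one event.

    `events[i]` is 1 if the event was observed at `times[i]`, 0 if censored.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events:
            survival *= 1 - n_events / at_risk
            curve.append((t, survival))
        at_risk -= n_leaving  # events and censored subjects leave the risk set
    return curve

def logrank_p(times_a, events_a, times_b, events_b):
    """Two-sided log-rank p-value for two groups (1 degree of freedom)."""
    all_pairs = list(zip(times_a, events_a)) + list(zip(times_b, events_b))
    event_times = sorted({t for t, e in all_pairs if e})
    obs_minus_exp = 0.0
    variance = 0.0
    for t in event_times:
        n_a = sum(1 for x in times_a if x >= t)
        n_b = sum(1 for x in times_b if x >= t)
        d_a = sum(1 for x, e in zip(times_a, events_a) if x == t and e)
        d_b = sum(1 for x, e in zip(times_b, events_b) if x == t and e)
        n, d = n_a + n_b, d_a + d_b
        obs_minus_exp += d_a - d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    chi2 = obs_minus_exp ** 2 / variance
    return math.erfc(math.sqrt(chi2 / 2))

# Hypothetical follow-up times (arbitrary units); every event is observed.
low_score = [6, 7, 8, 9, 10]
high_score = [1, 2, 3, 4, 5]
p = logrank_p(high_score, [1] * 5, low_score, [1] * 5)
```

In this toy data the high-score group fails strictly earlier than the low-score group, so the test returns a small p-value; identical groups would return 1.0.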
Fig. 3 Leukemia stem cell transcriptional signature for pediatric AML.
a The LSC17 gene signature was previously generated based on LASSO Cox regression on 47 genes enriched in LSC AML cell populations (LSC47). Analyzing LSC47 gene expression data within our cohort, AMLs cluster based on underlying fusion category. b The circos plot on the left indicates the previously described LSC17 gene set. Conversely, the circos plot on the right indicates the 17 most predictive genes within our training cohort using the same LASSO-based Cox regression analysis. Subsequent risk stratification model building considers all 47 upregulated LSC genes (LSC47). c Kaplan–Meier estimates for the probability of EFS based on LSC17 versus LSC47 gene signatures and associated area under the curve receiver operating characteristic (AUC ROC) curve plotting true positive rates versus false positive rates as a function of LSC17 and LSC47 score thresholds. d Additionally, when AMLs are grouped based on underlying fusion, each class is associated with a distinct LSC gene set. e LSC47 variance and t-tests based on fusion category (n = 753 patients from the training cohort). Box plot data are presented as median values with hinges corresponding to the 25th or 75th percentiles and whiskers corresponding to the 10th or 90th percentiles (left panel). P-values were calculated based on two-sided t-tests (right panel). Source data are provided as a Source Data file. Kaplan–Meier estimates for the probability of EFS and AUC ROC curves among f KMT2A and g Other or No Fusion AML cohorts based on LSC17 versus LSC47. Survival differences were determined using the log-rank test (two-sided and without multiple-testing adjustments).
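A signature score of this kind is typically a weighted sum of gene expression values, with weights taken from the fitted Cox model, and the ROC AUC summarizes how well the score separates patients with and without an event across all thresholds. The sketch below illustrates both ideas under stated assumptions: the gene names and weights are placeholders rather than the published LSC17 or LSC47 coefficients, the median cut-point is an assumption, and the AUC treats the outcome as a plain binary label, which ignores censoring.

```python
from statistics import median

def signature_score(expression, weights):
    """Weighted sum of expression values over the signature genes."""
    return sum(w * expression[gene] for gene, w in weights.items())

def split_at_median(scores):
    """Label each score 'high' or 'low' relative to the cohort median."""
    cut = median(scores)
    return ["high" if s > cut else "low" for s in scores]

def roc_auc(scores, events):
    """AUC as the probability that a patient with an event outscores one
    without, counting ties as one half (rank-based formulation)."""
    pos = [s for s, e in zip(scores, events) if e]
    neg = [s for s, e in zip(scores, events) if not e]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Placeholder weights and expression values; not real coefficients.
weights = {"GENE_A": 0.5, "GENE_B": -0.2}
patient = {"GENE_A": 2.0, "GENE_B": 1.0}
score = signature_score(patient, weights)

cohort_scores = [0.1, 0.4, 0.35, 0.8]
cohort_events = [0, 0, 1, 1]
groups = split_at_median(cohort_scores)
auc = roc_auc(cohort_scores, cohort_events)
```

An AUC of 0.5 corresponds to a score with no discriminative value, and 1.0 to perfect separation, which is the scale on which the LSC17 and LSC47 curves are compared.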
Fig. 4 Additional transcriptional biomarkers for pediatric AML.
a Isolating AMLs that did not have one of the five core fusion alterations, additional cytomolecular subtypes clustered with one another based on LSC47: NPM1, CEBPA, and FLT3 internal tandem duplication (ITD) mutation. b Kaplan–Meier estimates for the probability of EFS within the training cohort for CEBPA, FLT3-ITD, and Other Subtype AMLs. Survival differences were determined using the log-rank test (two-sided and without multiple-testing adjustments). c Notable genes included in LSC47 but not LSC17. Genes are connected to the cytomolecular classes based on whether they contribute to the associated LSC signature and score.
Fig. 5 LSC47 risk stratification model.
a To build a robust risk prediction model for pediatric AML, we aggregated LSC47-based signatures with other validated biomarkers (e.g., RUNX1-RUNX1T1 transcriptional signature and CBFB-MYH11 fusion breakpoint location) within our training cohort. NUP98 partner fusion and CBFA2T3-GLIS2 AMLs are associated with 5-year EFS of <20% and were therefore assigned to the high-risk stratum without further stratification. Kaplan–Meier estimates for the probability of EFS based on b cytomolecular (CM) risk factors, c LSC17, and d, e combined LSC47 model in training and validation cohorts. Survival differences were determined using the log-rank test (two-sided and without multiple-testing adjustments).