Makoto Iwasaki1, Junya Kanda1, Yasuyuki Arai1, Tadakazu Kondo1, Takayuki Ishikawa2, Yasunori Ueda3, Kazunori Imada4, Takashi Akasaka5, Akihito Yonezawa6, Kazuhiro Yago7, Masaharu Nohgawa8, Naoyuki Anzai9, Toshinori Moriguchi10, Toshiyuki Kitano11, Mitsuru Itoh12, Nobuyoshi Arima13, Tomoharu Takeoka14, Mitsumasa Watanabe15, Hirokazu Hirata16, Kosuke Asagoe17, Isao Miyatsuka18, Le My An18, Masanori Miyanishi18, Akifumi Takaori-Kondo1.
Abstract
Graft-versus-host disease-free, relapse-free survival (GRFS) is a useful composite end point that measures survival without relapse or significant morbidity after allogeneic hematopoietic stem cell transplantation (allo-HSCT). We aimed to develop a novel analytical method that appropriately handles right-censored data and competing risks to understand the risk for GRFS and each component of GRFS. This study was a retrospective data-mining study on a cohort of 2207 adult patients who underwent their first allo-HSCT within the Kyoto Stem Cell Transplantation Group, a multi-institutional joint research group of 17 transplantation centers in Japan. The primary end point was GRFS. A stacked ensemble of Cox Proportional Hazard (Cox-PH) regression and 7 machine-learning algorithms was applied to develop a prediction model. The median age for the patients was 48 years. For GRFS, the stacked ensemble model achieved better predictive accuracy evaluated by C-index than other state-of-the-art competing risk models (ensemble model: 0.670; Cox-PH: 0.668; Random Survival Forest: 0.660; Dynamic DeepHit: 0.646). The probability of GRFS after 2 years was 30.54% for the high-risk group and 40.69% for the low-risk group (hazard ratio compared with the low-risk group: 2.127; 95% CI, 1.19-3.80). We developed a novel predictive model for survival analysis that showed superior risk stratification to existing methods using a stacked ensemble of multiple machine-learning algorithms.
Year: 2022 PMID: 34933327 PMCID: PMC9043925 DOI: 10.1182/bloodadvances.2021005800
Source DB: PubMed Journal: Blood Adv ISSN: 2473-9529
Patient characteristics
| Variable | Total, n = 2207 (%) | Training set, n = 1765 (%) | Validation set, n = 442 (%) | P |
|---|---|---|---|---|
| Age, y | | | | .327 |
| ≤30 | 339 (15.4) | 265 (15.0) | 74 (16.7) | |
| >30-40 | 340 (15.4) | 277 (15.7) | 63 (14.3) | |
| >40-50 | 480 (21.7) | 376 (21.3) | 104 (23.5) | |
| >50-60 | 631 (28.6) | 509 (28.8) | 122 (27.6) | |
| >60 | 417 (18.9) | 338 (19.2) | 79 (17.9) | |
| Sex | | | | .572 |
| Male | 925 (41.9) | 734 (41.6) | 191 (43.2) | |
| Female | 1282 (58.1) | 1031 (58.4) | 251 (56.8) | |
| Stem cell source | | | | .279 |
| BM | 1349 (61.1) | 1061 (60.1) | 288 (65.2) | |
| Peripheral blood | 356 (16.1) | 292 (16.5) | 64 (14.5) | |
| BM + peripheral blood | 7 (0.3) | 6 (0.3) | 1 (0.2) | |
| Cord blood | 495 (22.4) | 406 (23.0) | 89 (20.1) | |
| | | | | .707 |
| ≤6 mo | 793 (35.9) | 638 (36.1) | 155 (35.1) | |
| >6 mo | 1392 (63.1) | 1108 (62.8) | 284 (64.3) | |
| Uncertain/missing | 22 (1.0) | 19 (1.1) | 3 (0.7) | |
| Year of transplantation | | | | .441 |
| 1996-2006 | 718 (32.5) | 581 (32.9) | 137 (31.0) | |
| 2007-2016 | 1489 (67.5) | 1184 (67.1) | 305 (69.0) | |
| Disease | | | | .652 |
| AML | 868 (39.3) | 703 (39.8) | 165 (37.3) | |
| ALL | 371 (16.8) | 296 (16.8) | 75 (17.0) | |
| ATL | 130 (5.9) | 102 (5.8) | 28 (6.3) | |
| CML | 124 (5.6) | 94 (5.3) | 30 (6.8) | |
| MDS | 342 (15.5) | 274 (15.5) | 68 (15.4) | |
| Other leukemia | 31 (1.4) | 23 (1.3) | 8 (1.8) | |
| MPN | 38 (1.7) | 28 (1.6) | 10 (2.3) | |
| NHL/HL/other lymphoma | 294 (13.3) | 236 (13.4) | 58 (13.1) | |
| MM/PCD | 9 (0.4) | 9 (0.5) | 0 (0.0) | |
| | | | | |
| Median time, month (range) | 52.5 (0.5-244.6) | 57.6 (0.5-244.6) | 58.6 (0.7-235.4) | .743 |
ALL, acute lymphoblastic leukemia; AML, acute myeloid leukemia; BM, bone marrow; CML, chronic myeloid leukemia; HL, Hodgkin lymphoma; MDS, myelodysplastic syndrome; MM, multiple myeloma; MPN, myeloproliferative neoplasm; NHL, non-Hodgkin lymphoma; PCD, plasma cell disease.
Figure 1. Stacked ensemble model of machine-learning algorithms. Scheme of meta-model construction using stacking as an ensemble method.
Performance of each prediction model according to C-index in the validation cohort
| Model | GRFS | OS | Relapse | NRM | aGVHD | cGVHD |
|---|---|---|---|---|---|---|
| Cox-PH | 0.668 | 0.740 | 0.770 | 0.664 | 0.651 | 0.564 |
| Fine-Gray competing risk model | NA | NA | 0.719 | 0.577 | 0.582 | 0.516 |
| Random Survival Forest | 0.660 | 0.745 | 0.788 | 0.761 | 0.580 | 0.577 |
| XGBoost | 0.602 | 0.712 | 0.756 | 0.543 | 0.540 | 0.573 |
| Gradient Boosting | 0.630 | 0.602 | 0.754 | 0.453 | 0.590 | 0.505 |
| Component-wise Gradient Boosting | 0.663 | 0.652 | 0.774 | 0.585 | 0.464 | 0.570 |
| Dynamic DeepHit | 0.646 | 0.710 | 0.730 | 0.691 | 0.537 | 0.555 |
| Stacked Ensemble Model | 0.670 | 0.763 | 0.793 | 0.777 | 0.656 | 0.583 |
aGVHD, grade II-IV acute GVHD; cGVHD, chronic GVHD; NA, not applicable.
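The C-index values in the table above measure how often a model ranks a pair of patients in the right order: the patient with the earlier observed event should receive the higher risk score. The following is a minimal sketch of Harrell's C-index for right-censored data, not the authors' implementation. It assumes that higher scores mean higher risk, skips pairs with tied times for simplicity, and uses made-up toy data; the function name `concordance_index` is hypothetical.

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Harrell's C: share of comparable pairs that are ranked correctly.

    times       -- observed follow-up times
    events      -- 1 if the event was observed, 0 if right-censored
    risk_scores -- model output, higher means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that `a` has the shorter observed time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # Simplification: skip tied times. A pair is also unusable when the
        # earlier subject was censored, because the true order is unknown.
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5  # tied scores count as half-concordant
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


if __name__ == "__main__":
    # Toy data (not from the study): four patients, one censored at t = 6.
    toy_times = [2, 4, 6, 8]
    toy_events = [1, 1, 0, 1]
    toy_scores = [0.9, 0.7, 0.8, 0.1]
    print(concordance_index(toy_times, toy_events, toy_scores))
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is the scale on which the tabulated values should be read.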
Comparison of the Integrated Calibration Index and the median of the absolute difference between the predicted survival probabilities and smoothed survival frequencies for each prediction model
| Model | GRFS | OS | Relapse | aGVHD | cGVHD |
|---|---|---|---|---|---|
| Cox-PH | 0.139 (0.151) | 0.283 (0.248) | 0.055 (0.029) | 0.218 (0.212) | 0.263 (0.208) |
| Random Survival Forest | 0.142 (0.147) | 0.365 (0.372) | 0.048 (0.029) | 0.173 (0.178) | 0.345 (0.346) |
| XGBoost | 0.027 (0.007) | 0.393 (0.381) | 0.176 (0.163) | 0.265 (0.264) | 0.306 (0.265) |
| Gradient Boosting | 0.050 (0.047) | 0.438 (0.449) | 0.159 (0.129) | 0.254 (0.256) | 0.309 (0.275) |
| Component-wise Gradient Boosting | 0.061 (0.068) | 0.397 (0.395) | 0.171 (0.145) | 0.261 (0.264) | 0.324 (0.318) |
| Dynamic DeepHit | 0.054 (0.059) | 0.405 (0.409) | 0.152 (0.153) | 0.106 (0.108) | 0.319 (0.320) |
| Stacked Ensemble Model | 0.023 (0.017) | 0.210 (0.194) | 0.044 (0.018) | 0.017 (0.018) | 0.258 (0.226) |
Values are shown as integrated calibration index (EC50). EC50, the median of the absolute difference between the predicted survival probabilities and smoothed survival frequencies.
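Both calibration summaries in this table come from the same per-patient quantity: the absolute difference between a predicted survival probability and the corresponding smoothed observed frequency. The integrated calibration index is the mean of those differences and EC50 is their median. The sketch below assumes the smoothing step has already been done elsewhere and that both inputs are plain lists of probabilities; `calibration_summary` and the toy numbers are hypothetical.

```python
from statistics import mean, median


def calibration_summary(predicted, smoothed_observed):
    """Summarise calibration error between two aligned probability lists.

    predicted         -- model-predicted survival probabilities
    smoothed_observed -- smoothed observed survival frequencies evaluated
                         at the same patients (computed upstream; assumed)
    """
    if len(predicted) != len(smoothed_observed) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    abs_diff = [abs(p - o) for p, o in zip(predicted, smoothed_observed)]
    return {
        "ici": mean(abs_diff),     # integrated calibration index (mean)
        "ec50": median(abs_diff),  # median absolute difference
    }


if __name__ == "__main__":
    # Toy values (not from the study).
    toy_predicted = [0.20, 0.50, 0.90, 0.40]
    toy_smoothed = [0.25, 0.45, 0.70, 0.40]
    print(calibration_summary(toy_predicted, toy_smoothed))
```

Lower values indicate better calibration for both summaries. The median is less sensitive than the mean to a few badly calibrated patients, which is why the two numbers in a cell can differ.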
Figure 2. SHapley Additive exPlanations (SHAP) feature importance values for GRFS and OS. (A) Representative patients in the GRFS (left) and OS (right) models. Red and blue bars indicate positive and negative feature contributions, respectively. (B) SHAP feature importance measured as the mean absolute Shapley values for GRFS (left) and OS (right). The 10 variables with the highest impact on model output are shown.
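The importance measure in panel B, the mean absolute Shapley value per feature, can be illustrated with a small exact computation. This is a sketch under strong assumptions and not the SHAP library or the authors' pipeline: it enumerates every feature subset, which is feasible only for a handful of features, and it represents an "absent" feature by a single baseline value. The names `shapley_values` and `mean_abs_importance` and the toy linear model are hypothetical.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, x, baseline):
    """Exact Shapley values of `model` at point `x` against `baseline`."""
    n = len(x)

    def value(present):
        # Features outside `present` are replaced by their baseline value.
        z = [x[k] if k in present else baseline[k] for k in range(n)]
        return model(z)

    phi = []
    for i in range(n):
        others = [k for k in range(n) if k != i]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


def mean_abs_importance(model, rows, baseline):
    """Mean absolute Shapley value per feature over several patients."""
    all_phi = [shapley_values(model, row, baseline) for row in rows]
    return [
        sum(abs(phi[k]) for phi in all_phi) / len(all_phi)
        for k in range(len(baseline))
    ]


if __name__ == "__main__":
    # Toy risk model with two made-up features.
    def toy_model(z):
        return 2 * z[0] - 3 * z[1] + 1

    print(shapley_values(toy_model, [1, 2], [0, 0]))
    print(mean_abs_importance(toy_model, [[1, 2], [3, 0]], [0, 0]))
```

For each patient the Shapley values sum to the difference between the model output for that patient and the output at the baseline, which is what makes the red and blue bars in panel A additive.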
Figure 3. Kaplan-Meier estimates for GRFS and OS in the validation set. Estimates for GRFS (A) and OS (B) are shown. Patients are stratified based on stacked ensemble meta-model score.
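The Kaplan-Meier estimate in Figure 3 is a product over event times of the fraction of at-risk patients who survive each time point. The following is a minimal sketch of the product-limit estimator for one group, using made-up data. It does not reproduce the stratification by meta-model score, and `kaplan_meier` is a hypothetical name.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time with an observed event.

    times  -- observed follow-up times
    events -- 1 if the event was observed, 0 if right-censored
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_removed = 0
        # Gather everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_removed += 1
            i += 1
        if n_events:
            survival *= 1 - n_events / n_at_risk
            curve.append((t, survival))
        # Events and censored subjects both leave the risk set after t.
        n_at_risk -= n_removed
    return curve


if __name__ == "__main__":
    # Toy data (not from the study): follow-up in months.
    for t, s in kaplan_meier([1, 2, 2, 3, 4], [1, 1, 0, 1, 0]):
        print(t, round(s, 3))
```

Running the function once per risk group, for example on patients above and below a score cut-off, gives one curve per group in the way the figure describes.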
Figure 4. Cumulative incidence of relapse, NRM, and GVHD. (A) Relapse. (B) NRM. (C) Grade II-IV acute GVHD. (D) Chronic GVHD. Patients are stratified based on stacked ensemble meta-model score.
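Relapse and NRM compete with each other, so their cumulative incidence cannot be read off as one minus a Kaplan-Meier curve. A standard nonparametric alternative is the Aalen-Johansen-type estimator, sketched below for one group with toy data. Whether the authors used exactly this estimator is an assumption, and the cause coding (0 = censored, 1 and 2 = competing events) and the name `cumulative_incidence` are hypothetical.

```python
def cumulative_incidence(times, causes, cause_of_interest):
    """Return [(t, CIF(t))] for one cause in the presence of competing risks.

    times  -- observed follow-up times
    causes -- 0 if censored, otherwise an integer code for the event type
    """
    data = sorted(zip(times, causes))
    n_at_risk = len(data)
    event_free = 1.0  # probability of no event of any cause before time t
    cif = 0.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        d_interest = d_any = n_removed = 0
        while i < len(data) and data[i][0] == t:
            cause = data[i][1]
            d_any += cause != 0
            d_interest += cause == cause_of_interest
            n_removed += 1
            i += 1
        if d_interest:
            # Increment: still event-free just before t, then fail from
            # the cause of interest at t.
            cif += event_free * d_interest / n_at_risk
            curve.append((t, cif))
        if d_any:
            event_free *= 1 - d_any / n_at_risk
        n_at_risk -= n_removed
    return curve


if __name__ == "__main__":
    # Toy data (not from the study): 1 = relapse, 2 = NRM, 0 = censored.
    toy_times = [1, 2, 3, 4, 5]
    toy_causes = [1, 2, 0, 1, 0]
    for t, c in cumulative_incidence(toy_times, toy_causes, 1):
        print(t, round(c, 3))
```

With this estimator the cumulative incidences of all causes plus the event-free probability sum to 1 at every time point, which a naive one-minus-Kaplan-Meier calculation does not guarantee.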