Hao Yu1, Fang Chen2,3, Ka-On Lam2,3, Li Yang2, Yang Wang4, Jian-Yue Jin5, Aya Ei Helali3, Feng-Ming Spring Kong2,3.
Abstract
Radiation-induced lymphopenia is known for its survival significance in patients with breast cancer treated with radiation therapy. This study aimed to evaluate the impact of radiotherapy on lymphocytes by applying machine learning strategies. We used Extreme Gradient Boosting (XGBoost) to predict the event of lymphopenia (grade ≥ 1) and conducted an independent validation. We then applied feature attribution analysis (SHapley Additive exPlanation, SHAP) to the XGBoost models to explore the directional contribution of each feature to lymphopenia. Finally, we implemented a proof-of-concept clinical validation. The XGBoost models generalized well in the independent cohort (accuracy 0.764 and ROC-AUC 0.841). Baseline lymphocyte count was the most protective feature (SHAP = 5.226, direction of SHAP = -0.964). Baseline platelets and monocytes also played important protective roles. Taxane-only chemotherapy carried a lower risk of lymphopenia than the combination of anthracycline and taxane. From the contribution analysis of dose, we identified that, first, lymphocytes were sensitive to a radiation dose of less than 4 Gy; second, the irradiation volume was more important in promoting lymphopenia than the irradiation dose; and third, the irradiation dose promoted lymphopenia when the irradiation volume was fixed. Overall, our findings pave the way to clarifying the radiation dose-volume effect. To avoid radiation-induced lymphopenia, irradiation volume should be kept to a minimum during the planning process, as long as target coverage is not compromised.
Keywords: SHapley Additive exPlanation; breast cancer; machine learning; radiation dose; radiation-induced lymphopenia
Year: 2022 PMID: 35799797 PMCID: PMC9253393 DOI: 10.3389/fimmu.2022.768811
Source DB: PubMed Journal: Front Immunol ISSN: 1664-3224 Impact factor: 8.786
Figure 1. The study flowchart. (A) The patient flowchart; (B) the machine learning flowchart; (C) the statistical verification flowchart.
Characteristics of breast cancer patients in the Testing cohort (589 patients in total). Continuous features are shown as median (1st to 3rd quartile) and categorical features as number (percentage).
| Feature | Subgroup | Without lymphopenia | With lymphopenia | Odds ratio (95% CI) | P value | Adjusted P value |
|---|---|---|---|---|---|---|
| | | 249 | 340 | | | |
| | | 5.76(4.82-6.95) | 4.9(3.87-5.96) | 0.85(0.78-0.92) | <0.001 | <0.001 |
| | | 120(110-128) | 117.5(110-124) | 0.98(0.97-1) | 0.024 | 1 |
| | | 244(212-282) | 227.5(185.8-271) | 0.99(0.99-1) | 0.003 | 0.14 |
| | | 3.49(2.75-4.38) | 2.98(2.22-3.98) | 0.92(0.85-0.99) | 0.036 | 1 |
| | | 1.73(1.45-2.12) | 1.34(1.09-1.66) | 0.14(0.09-0.21) | <0.001 | <0.001 |
| | | 0.34(0.25-0.42) | 0.28(0.23-0.36) | 0.12(0.04-0.36) | <0.001 | 0.012 |
| | RapidArc | 2 (0.8) | 98(28.82) | reference | | |
| | 2D-fields | 133 (53.41) | 78(22.94) | 0.012(0.001-0.039) | <0.001 | <0.001 |
| | 3D-fields | 114(45.78) | 164(48.24) | 0.029(0.004-0.095) | <0.001 | <0.001 |
| | Tangential breast only | 133(53.41) | 80(23.53) | reference | | |
| | Breast/chest wall with regional lymphatics | 116(46.59) | 260(76.47) | 3.73(2.62-5.32) | <0.001 | <0.001 |
| | 40.5Gy/15fx | 241(96.79) | 290(85.29) | reference | | |
| | more than 50Gy/25fx | 8(3.21) | 50(14.71) | 5.19(2.55-12) | <0.001 | 0.001 |
| | none | 92(36.95) | 196(57.65) | reference | | |
| | 10Gy/5fx | 137(55.02) | 129(37.94) | 0.44(0.31-0.62) | <0.001 | <0.001 |
| | 16Gy/8fx | 20(9.03) | 15(4.41) | 0.35(0.17-0.72) | 0.004 | 0.21 |
| | | 1.92(0.42-2.81) | 2.25(0.53-4.31) | 1.29(1.18-1.41) | <0.001 | <0.001 |
| | | 40.68(3.8-41.76) | 39.05(5.02-42.24) | 1.01(1-1.02) | 0.15 | 1 |
| | | 3.56(2.99-4.06) | 4.48(3.64-6.43) | 2.27(1.91-2.73) | <0.001 | <0.001 |
| | | 17.4(12.5-23.5) | 22.95(18.57-27.2) | 1.1(1.08-1.13) | <0.001 | <0.001 |
| | | 31.3(23.2-41.1) | 43.25(34.05-60.73) | 1.08(1.06-1.09) | <0.001 | <0.001 |
| | | 8.71(6.43-11.01) | 11.59(8.94-13.22) | 1.29(1.21-1.37) | <0.001 | <0.001 |
| | | 8.6(6-11.4) | 11.2(9-13.93) | 1.2(1.15-1.27) | <0.001 | <0.001 |
| | | 15.4(11.6-20) | 22.3(16.68-39.12) | 1.12(1.09-1.15) | <0.001 | <0.001 |
| | | 4.34(3.26-5.4) | 5.99(4.54-8.13) | 1.57(1.43-1.74) | <0.001 | <0.001 |
| | | 46(40-53) | 44.5(38-51) | 0.98(0.96-1) | 0.028 | 1 |
| | without | 168(67.47) | 217(63.82) | reference | | |
| | with | 81(32.53) | 123(36.18) | 1.18(0.83-1.66) | 0.36 | 1 |
| | without | 190(76.31) | 242(71.18) | reference | | |
| | with | 2(0.8) | 1(0.29) | 0.39(0.018-4.13) | 0.45 | 1 |
| | unknown | 57(22.89) | 97(28.53) | 1.34(0.92-1.96) | 0.13 | 1 |
| | without | 192(77.11) | 240(70.59) | reference | | |
| | with | 0(0) | 3(0.88) | 1.69E6(2.25e-23-NA) | 0.98 | 1 |
| | unknown | 57(22.89) | 97(28.53) | 1.36(0.94-1.99) | 0.11 | 1 |
| | premenopausal | 156(62.65) | 239(70.29) | reference | | |
| | perimenopausal | 18(7.23) | 21(6.18) | 0.76(0.39-1.49) | 0.42 | 1 |
| | postmenopausal | 75(30.12) | 80(28.53) | 0.7(0.48-1.01) | 0.057 | 1 |
| | 0 | 135(54.22) | 89(26.18) | reference | | |
| | more than 0 | 114(45.78) | 251(73.82) | 3.34(2.37-4.74) | <0.001 | <0.001 |
| | I | 96(38.55) | 55(16.18) | reference | | |
| | II | 117(46.99) | 149(43.82) | 2.22(1.48-3.36) | <0.001 | 0.007 |
| | III | 36(14.46) | 136(40) | 6.59(4.06-10.9) | <0.001 | <0.001 |
| | tumor side at left | 139(55.82) | 159(46.76) | reference | | |
| | tumor side at right | 110(44.18) | 181(53.23) | 1.42(1.02-1.97) | 0.038 | 1 |
| | | 2(1.3-2.5) | 2(1.38-2.8) | 1.13(1-1.28) | 0.047 | 1 |
| | - | 66(26.51) | 97(28.53) | reference | | |
| | + | 183(73.49) | 243(71.47) | 0.9(0.62-1.3) | 0.59 | 1 |
| | - | 85(34.14) | 128(37.65) | reference | | |
| | + | 164(65.86) | 212(62.35) | 0.86(0.61-1.21) | 0.38 | 1 |
| | - | 176(70.68) | 259(76.18) | reference | | |
| | + | 73(29.31) | 81(23.82) | 0.75(0.52-1.09) | 0.13 | 1 |
| | HR+/HER2- | 143(57.43) | 203(59.71) | reference | | |
| | HR-/HER2+ | 29(11.65) | 35(10.29) | 0.85(0.5-1.46) | 0.55 | 1 |
| | HR+/HER2+ | 43(17.27) | 48(14.12) | 0.79(0.49-1.25) | 0.31 | 1 |
| | HR-/HER2- | 34(13.65) | 54(15.88) | 1.12(0.69-1.82) | 0.65 | 1 |
| | | 30(15-40) | 30(15-40) | 1(0.99-1.01) | 0.98 | 1 |
| | BCT | 151(60.64) | 135(39.71) | reference | | |
| | MRM | 98(39.36) | 205(60.29) | 2.34(1.68-3.28) | <0.001 | <0.001 |
| | SLNB | 127(51) | 77(22.65) | reference | | |
| | ALND | 122(49) | 263(77.35) | 3.56(2.5-5.09) | <0.001 | <0.001 |
| | clear margin | 235(94.38) | 326(95.88) | reference | | |
| | close or positive margin | 14(5.62) | 14(4.12) | 0.72(0.34-1.55) | 0.39 | 1 |
| | none | 41(16.48) | 17(5) | reference | | |
| | neoadjuvant | 26(10.44) | 78(22.94) | 7.24(3.59-15.2) | <0.001 | <0.001 |
| | adjuvant | 177(71.08) | 236(64.41) | 3.22(1.8-5.99) | <0.001 | 0.007 |
| | neoadjuvant+adjuvant | 5(2) | 9(2.65) | 4.34(1.31-16) | 0.019 | 0.98 |
| | others | 45(18.07) | 28(8.24) | reference | | |
| | taxane | 82(32.93) | 61(17.94) | 1.2(0.67-2.14) | 0.54 | 1 |
| | anthracycline+taxane | 122(49) | 251(73.82) | 3.31(1.98-5.61) | <0.001 | <0.001 |
| | without | 182(73.09) | 260(76.47) | reference | | |
| | with | 67(26.91) | 80(23.53) | 0.84(0.57-1.22) | 0.35 | 1 |
| | without | 63(25.3) | 89(26.18) | reference | | |
| | with | 186(74.7) | 251(73.82) | 0.96(0.66-1.39) | 0.81 | 1 |
Odds ratios (95% confidence intervals) and corresponding P values are from logistic regression for the event of lymphopenia; the adjusted P values are P values after Bonferroni correction. RT, radiation treatment; ER, estrogen receptors; PR, progesterone receptors; IHC, immunohistochemistry; HR, hormone receptor; HER2, human epidermal growth factor receptor 2; BCT, breast-conserving therapy; MRM, modified radical mastectomy; SLNB, sentinel lymph node biopsy; ALND, axillary lymph node dissection.
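As a rough illustration of how the odds ratios and Bonferroni-adjusted P values in this table relate to the raw counts, the following Python sketch computes a crude odds ratio for one two-level feature and applies a Bonferroni correction. For a single binary predictor, the crude odds ratio equals the one from univariate logistic regression. The function names, the Wald interval, and the number of tests are assumptions made for illustration; the source does not state how its confidence intervals were computed or how many tests entered the correction.

```python
import math

def odds_ratio(event_sub, no_event_sub, event_ref, no_event_ref):
    """Crude odds ratio of the event for a subgroup versus the reference subgroup."""
    return (event_sub / no_event_sub) / (event_ref / no_event_ref)

def wald_ci(event_sub, no_event_sub, event_ref, no_event_ref, z=1.96):
    """Approximate 95% Wald interval (an assumption; published intervals may
    come from a different method and can differ slightly)."""
    log_or = math.log(odds_ratio(event_sub, no_event_sub, event_ref, no_event_ref))
    se = math.sqrt(1 / event_sub + 1 / no_event_sub + 1 / event_ref + 1 / no_event_ref)
    return math.exp(log_or - z * se), math.exp(log_or + z * se)

def bonferroni(p_values, n_tests=None):
    """Multiply each P value by the number of tests and cap the result at 1."""
    m = len(p_values) if n_tests is None else n_tests
    return [min(1.0, p * m) for p in p_values]

# Counts taken from the table: breast/chest wall with regional lymphatics
# (260 with lymphopenia, 116 without) versus tangential breast only
# (80 with, 133 without).
or_fields = odds_ratio(260, 116, 80, 133)
print(round(or_fields, 2))  # 3.73, the tabulated odds ratio for this row

# Hypothetical P values and a hypothetical number of tests, for illustration only.
print(bonferroni([0.5, 0.01], n_tests=10))
```

The same calculation reproduces the tabulated odds ratios for MRM versus BCT (2.34) and ALND versus SLNB (3.56).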
Figure 2. The XGBoost and Lasso regression models for predicting radiation-induced lymphopenia were trained in the Testing cohort and validated in the Validation cohort. (A) The top two trees in one example XGBoost model; (B) coefficients and P-values in one example Lasso regression; (C, D) the classification performance (accuracy and ROC-AUC, respectively) of the XGBoost models across all iterations in the Testing cohort and in the Validation cohort, compared with the Lasso regressions. In subplots (C, D), numerical labels are median values. Color represents the feature group: the full model (orange), dosimetrics (blue), baseline blood cells (maroon), tumor features (green), treatment regimens (khaki), and clinical features (yellow).
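The two performance measures named in this caption can be sketched in plain Python as follows. This is a minimal illustration, not the authors' evaluation code: the 0.5 decision threshold, the function names, and the toy labels and scores are all assumptions.

```python
def accuracy(y_true, y_prob, threshold=0.5):
    """Fraction of cases whose thresholded prediction matches the label.
    The 0.5 threshold is an assumption made for this sketch."""
    preds = [1 if p >= threshold else 0 for p in y_prob]
    return sum(int(p == t) for p, t in zip(preds, y_true)) / len(y_true)

def roc_auc(y_true, y_prob):
    """ROC-AUC as the probability that a randomly chosen positive case is
    scored above a randomly chosen negative case (ties count as one half)."""
    pos = [p for p, t in zip(y_prob, y_true) if t == 1]
    neg = [p for p, t in zip(y_prob, y_true) if t == 0]
    if not pos or not neg:
        raise ValueError("both classes are required to compute ROC-AUC")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy labels (1 = lymphopenia) and hypothetical predicted probabilities.
y_true = [1, 1, 0, 0, 1]
y_prob = [0.9, 0.4, 0.35, 0.6, 0.7]
print(accuracy(y_true, y_prob))  # 0.6
print(roc_auc(y_true, y_prob))   # 5 of 6 positive-negative pairs are ranked correctly
```

Accuracy depends on the chosen threshold, whereas ROC-AUC depends only on how the scores rank the cases, which is one reason the two are usually reported together.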
Figure 3. The graphic layout of the directional SHAP values of each feature, classified into five feature groups: (A) baseline blood cells group (maroon), (B) radiation dose group (blue), (C) treatment regimens group (khaki), (D) tumor characteristics group (green) and (E) clinical characteristics group (yellow). The event of lymphopenia is shown as a gray square. Each feature node is colored by group and sized by its SHAP value; each edge is colored from red (positive) to blue (negative) by the direction of SHAP, and its thickness is scaled by the absolute direction value.
The dummy features' contributions to lymphopenia in the Testing cohort (589 patients in total), including the mean SHAP value and direction of SHAP in the XGBoost models and the mean coefficients in the Lasso regressions across all iterations.
| Feature | SHAP value (XGBoost models) | Direction of SHAP (XGBoost models) | Coefficient (Lasso regressions) | P value (Lasso regressions) |
|---|---|---|---|---|
| baseline lymphocytes | 5.226 | -0.964 | -0.902 | <0.001 |
| integral dose of the total body | 4.021 | 0.858 | 0.487 | <0.001 |
| V5 of bilateral lungs | 1.451 | 0.886 | 0.282 | <0.001 |
| V5 of ipsilateral lung | 1.151 | 0.871 | 0.349 | <0.001 |
| baseline white blood cells | 0.990 | -0.838 | -0.069 | 0.106 |
| mean bilateral lungs dose | 0.905 | 0.788 | 0.171 | 0.076 |
| maximum heart dose | 0.783 | 0.187 | 0 | 1 |
| mean heart dose | 0.774 | -0.377 | 0 | 1 |
| baseline hemoglobin | 0.737 | 0.378 | 0.124 | <0.001 |
| baseline platelet | 0.579 | -0.681 | -0.104 | <0.001 |
| baseline monocytes | 0.533 | -0.295 | -0.063 | 0.048 |
| mean ipsilateral lung dose | 0.469 | 0.654 | 0.153 | 0.138 |
| chemotherapy regimens: taxane | 0.420 | -0.672 | -0.334 | <0.001 |
| V20 of ipsilateral lung | 0.383 | 0.308 | 0 | 1 |
| chemotherapy regimens: anthracycline+taxane | 0.368 | 0.714 | 0.238 | <0.001 |
| tumor size | 0.359 | -0.629 | 0 | 1 |
| Ki67 | 0.298 | -0.600 | -0.092 | 0.014 |
| baseline neutrophils | 0.283 | 0.053 | 0.061 | 1 |
| age | 0.254 | -0.282 | 0.016 | 1 |
| without HER2 | 0.160 | 0.470 | 0.290 | <0.001 |
| V20 of bilateral lungs | 0.150 | 0.126 | 0.118 | 1 |
| RT technology: 3D-fields | 0.083 | -0.477 | 0 | 1 |
| RT technology: RapidArc | 0.081 | 0.258 | 0.759 | <0.001 |
| with HER2 | 0.053 | -0.341 | 0 | 1 |
| neoadjuvant chemotherapy | 0.050 | 0.256 | 0.321 | <0.001 |
| HR+/HER2- | 0.045 | 0.321 | 0.159 | 1 |
| electron: 10Gy/5fx | 0.041 | 0.290 | 0.276 | 1 |
| HR+/HER2+ | 0.040 | -0.206 | -0.245 | 0.016 |
| modified stage II | 0.040 | 0.054 | -0.056 | 1 |
| without drinking history | 0.025 | -0.142 | -0.796 | 1 |
| tumor side at left | 0.024 | -0.187 | -0.103 | 0.306 |
| without antiHER2 therapy | 0.021 | 0.141 | 0 | 1 |
| with family history | 0.021 | 0.016 | 0.047 | 1 |
| modified stage III | 0.021 | 0.131 | 0.019 | 1 |
| modified stage I | 0.021 | -0.094 | 0 | 1 |
| without endocrine therapy | 0.019 | -0.082 | -0.288 | 1 |
| without PR | 0.018 | 0.052 | -0.195 | 1 |
| adjuvant chemotherapy | 0.016 | -0.116 | -0.105 | 1 |
| premenopausal | 0.015 | 0.109 | 0.187 | 1 |
| without ER | 0.015 | -0.016 | 0.097 | 1 |
| without smoking history | 0.014 | 0.068 | 0 | 1 |
| with antiHER2 therapy | 0.012 | -0.119 | 0 | 1 |
| unknown drinking history | 0.011 | 0.024 | 0 | 1 |
| SLNB | 0.011 | 0.001 | 0.150 | 1 |
| RT technology: 2D-fields | 0.010 | -0.068 | 0 | 1 |
| BCT | 0.010 | 0.009 | -0.300 | 1 |
| unknown smoking history | 0.009 | 0.048 | 0 | 1 |
| with PR | 0.008 | -0.078 | 0 | 1 |
| postmenopausal | 0.008 | -0.047 | -0.151 | 1 |
| ALND | 0.008 | 0.034 | 0 | 1 |
| with endocrine therapy | 0.007 | 0.032 | 0 | 1 |
| none chemotherapy | 0.007 | -0.043 | -0.311 | 0.771 |
| electron: none | 0.006 | 0.000 | 0 | 1 |
| HR-/HER2+ | 0.005 | -0.046 | -0.244 | 1 |
| with ER | 0.005 | -0.040 | 0 | 1 |
| without family history | 0.004 | -0.025 | 0 | 1 |
| HR-/HER2- | 0.004 | 0.026 | 0 | 1 |
| chemotherapy regimens: others | 0.003 | 0.001 | 0 | 1 |
| tumor side at right | 0.003 | 0.027 | 0 | 1 |
| MRM | 0.002 | -0.018 | 0 | 1 |
| modified N stage 0 | 0.001 | 0.000 | 0 | 1 |
| electron: 16Gy/8fx | 0.001 | -0.004 | -0.273 | <0.001 |
| RT fields: Breast/chest wall with regional lymphatics | 0.001 | 0.017 | 0 | 1 |
| neoadjuvant+adjuvant chemotherapy | 0 | 0 | -0.225 | 1 |
| with drinking history | 0 | 0 | 0 | 1 |
| clear margin | 0 | 0 | 0.391 | <0.001 |
| close or positive margin | 0 | 0 | 0 | 1 |
| perimenopausal | 0 | 0 | -0.281 | 0.76 |
| modified N stage more than 0 | 0 | 0 | 0 | 1 |
| RT Dose: 40.5Gy/15fx | 0 | 0 | -0.456 | 1 |
| RT Dose: more than 50Gy/25fx | 0 | 0 | 0 | 1 |
| RT fields: Tangential breast only | 0 | 0 | 0 | 1 |
| with smoking history | 0 | 0 | -0.600 | 0.017 |
SHAP values are non-negative; the higher the SHAP value, the greater the feature's contribution to lymphopenia. The direction of SHAP ranges from -1 to 1: values close to 1 are more promotive of lymphopenia, and values close to -1 are more protective. RT, radiation treatment; ER, estrogen receptors; PR, progesterone receptors; IHC, immunohistochemistry; HR, hormone receptor; HER2, human epidermal growth factor receptor 2; BCT, breast-conserving therapy; MRM, modified radical mastectomy; SLNB, sentinel lymph node biopsy; ALND, axillary lymph node dissection.
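The footnote does not give the formulas behind the two XGBoost columns. One common convention, assumed here purely for illustration, is to summarize a feature's importance as the mean absolute per-patient SHAP value and its direction as the Pearson correlation between the feature values and their SHAP values, which naturally falls between -1 and 1. The sketch below follows that assumption; the function names and toy numbers are hypothetical and do not come from the study.

```python
import math

def mean_abs(values):
    """Mean absolute value, used here as a non-negative importance score."""
    return sum(abs(v) for v in values) / len(values)

def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either input is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)

def summarize_feature(feature_values, shap_values):
    """Return (importance, direction) under the assumed definitions above."""
    return mean_abs(shap_values), pearson(feature_values, shap_values)

# Hypothetical per-patient values for a protective feature:
# higher feature values go with lower (more negative) SHAP values.
feature = [0.8, 1.2, 1.6, 2.0]
shap_vals = [1.5, 0.5, -0.5, -1.5]
importance, direction = summarize_feature(feature, shap_vals)
print(importance, direction)  # importance is 1.0; direction is close to -1
```

Under this reading, a large SHAP value with a direction near -1, as reported for baseline lymphocytes, describes a feature that strongly influences the prediction and lowers the predicted risk as its value rises.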
Figure 4. The relationships between SHAP value and irradiation volume in the Testing cohort (A) and in the Validation cohort (B); the relationships between SHAP value and irradiation dose in the Testing cohort (C) and in the Validation cohort (D). The regression relationships between irradiation volume and irradiation dose in the Testing cohort (E) and in the Validation cohort (F). Each radiation dose is colored differently and sorted by SHAP value.
Figure 5. Paired t-test analysis in matched patients with controlled discrepancy in important features. Subplots (A–C) show the matched patients in the Testing cohort, and subplots (D–F) the matched patients in the Validation cohort. In each cohort, the three boxplots (from left to right) compare feature levels between patients with (blue box) and without (red box) lymphopenia for (A or D) maximum heart dose, (B or E) mean heart dose, and (C or F) V20 of ipsilateral lung.
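The paired comparison described in this caption can be sketched as follows: each patient with lymphopenia is paired with a matched patient without it, and the test is run on the within-pair differences of one feature. The matching procedure itself is not shown, and the function name and dose values below are hypothetical; only the t statistic and degrees of freedom are computed, so a P value would still need a t distribution from a statistics library.

```python
import math
from statistics import mean, stdev

def paired_t(with_event, without_event):
    """Paired t statistic and degrees of freedom for matched pairs."""
    if len(with_event) != len(without_event):
        raise ValueError("matched samples must have the same length")
    diffs = [a - b for a, b in zip(with_event, without_event)]
    n = len(diffs)
    if n < 2:
        raise ValueError("at least two pairs are required")
    sd = stdev(diffs)  # sample standard deviation of the pair differences
    if sd == 0:
        raise ValueError("all pair differences are identical")
    return mean(diffs) / (sd / math.sqrt(n)), n - 1

# Hypothetical feature levels (for example, a heart dose in Gy) for four matched pairs.
with_lymphopenia = [5.1, 4.8, 6.0, 5.5]
without_lymphopenia = [4.9, 4.9, 5.2, 5.0]
t_stat, df = paired_t(with_lymphopenia, without_lymphopenia)
print(round(t_stat, 3), df)
```

Working on pair differences removes the between-pair variation in the matched features, which is the reason for matching before testing.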