Bum-Joo Cho, Kyoung Min Kim, Sanchir-Erdene Bilegsaikhan, Yong Joon Suh.
Abstract
Febrile neutropenia (FN) is one of the most concerning complications of chemotherapy, and its prediction remains difficult. This study aimed to identify risk factors for FN and to build prediction models using machine learning algorithms. Medical records of hospitalized patients who underwent chemotherapy after surgery for breast cancer between May 2002 and September 2018 were selectively reviewed for model development. Demographic, clinical, pathological, and therapeutic data were analyzed to identify risk factors for FN. Prediction models were then developed with machine learning algorithms and evaluated for performance. Of 933 selected inpatients with a mean age of 51.8 ± 10.7 years, FN developed in 409 (43.8%). FN incidence differed significantly according to age, staging, taxane-based regimen, and blood count 5 days after chemotherapy. A logistic regression model built on these findings achieved an area under the curve (AUC) of 0.870; machine learning improved the AUC to 0.908. Machine learning improves the prediction of FN in patients undergoing chemotherapy for breast cancer compared with the conventional statistical model. In high-risk patients, primary prophylaxis with granulocyte colony-stimulating factor could be considered.
Year: 2020 PMID: 32908182 PMCID: PMC7481240 DOI: 10.1038/s41598-020-71927-6
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Clinical demographic characteristics of patients with and without febrile neutropenia in the training dataset.
| Parameters | FN group (n = 366) | non-FN group (n = 477) | P value |
|---|---|---|---|
| Age (years), means ± SD | | | 0.004 |
| ≤ 50 | 157 (42.9) | 276 (57.9) | < 0.001 |
| > 50 | 209 (57.1) | 201 (42.1) | |
| Body surface area (m²), means ± SD | 1.58 ± 0.14 | 1.57 ± 0.13 | 0.498 |
| Hypertension, n (%) | 112 (30.6) | 113 (23.7) | 0.028 |
| Diabetes mellitus, n (%) | 42 (11.5) | 44 (9.2) | 0.303 |
| Tuberculosis, n (%) | 9 (2.5) | 14 (2.9) | 0.832 |
| Breast-conserving surgery, n (%) | 201 (55.1) | 345 (72.3) | < 0.001 |
| Tumor size (cm), mean ± SD | 2.7 ± 2.0 | 2.3 ± 1.4 | 0.002 |
| Positive lymph node, means ± SD | 2.6 ± 5.2 | 1.2 ± 3.4 | < 0.001 |
| ER, n (%) | 268 (74.0) | 312 (66.0) | 0.012 |
| PR, n (%) | 225 (62.2) | 280 (59.2) | 0.392 |
| Her-2, n (%) | 108 (29.5) | 133 (27.9) | 0.645 |
| CA 15–3, means ± SD | 56.3 ± 202.0 | 24.0 ± 136.5 | 0.009 |
| TNM staging, n (%) | | | < 0.001 |
| I/II | 248 (67.8) | 410 (86.0) | |
| III/IV | 118 (32.2) | 67 (14.0) | |
| Taxane-based regimen, n (%) | 245 (66.9) | 184 (38.6) | < 0.001 |
| Pretreatment CBC, means ± SD | | | |
| Hemoglobin (g/dL) | 12.9 ± 1.2 | 13.0 ± 1.4 | 0.426 |
| Platelet (× 10³/µL) | 262 ± 63 | 269 ± 65 | 0.168 |
| Neutrophil (× 10³/µL) | 3.725 ± 1.550 | 3.815 ± 1.420 | 0.383 |
| Lymphocyte (× 10³/µL) | 1.905 ± 0.603 | 1.971 ± 0.621 | 0.119 |
| CBC 5 days after chemotherapy, means ± SD | | | |
| Hemoglobin (g/dL) | 10.9 ± 1.7 | 11.5 ± 1.1 | < 0.001 |
| Platelet (× 10³/µL) | 221 ± 82 | 226 ± 62 | 0.378 |
| Neutrophil (× 10³/µL) | 3.329 ± 2.278 | 3.067 ± 1.343 | 0.052 |
| Lymphocyte (× 10³/µL) | 0.867 ± 0.374 | 1.561 ± 0.593 | < 0.001 |
| SERM, n (%) | 139 (38.0) | 221 (46.3) | 0.017 |
| LHRH, n (%) | 112 (30.6) | 190 (39.8) | 0.006 |
| Aromatase inhibitor, n (%) | 148 (40.4) | 121 (25.4) | < 0.001 |
| Radiation treatment, n (%) | 292 (79.8) | 401 (84.1) | 0.122 |
| Herceptin, n (%) | 101 (27.6) | 130 (27.3) | 0.938 |
FN febrile neutropenia, SD standard deviation, BSA body surface area, ER estrogen receptor, PR progesterone receptor, Her-2 human epidermal growth factor receptor 2, CA cancer antigen, CBC complete blood count, SERM selective estrogen receptor modulator, LHRH luteinizing hormone-releasing hormone, F/U follow-up.
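The P values for the categorical rows above come from comparing proportions between the FN and non-FN groups. The record does not state which test was used, so the following is only a minimal sketch, assuming an uncorrected Pearson chi-square test on a 2 × 2 table; the function name `chi_square_2x2` is hypothetical and the result need not match the tabulated values exactly.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test (no continuity correction) for the 2x2 table
    [[a, b], [c, d]]. Returns (statistic, p-value) with 1 degree of freedom.
    Assumption: the source does not say which test produced its P values."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    stat = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # With 1 degree of freedom, the chi-square survival function
    # reduces to erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Hypertension row of the training table:
# 112 of 366 patients (FN group) vs 113 of 477 patients (non-FN group).
stat, p = chi_square_2x2(112, 366 - 112, 113, 477 - 113)
print(f"chi2 = {stat:.3f}, p = {p:.3f}")
```

For the hypertension row this yields a P value of roughly 0.025, close to but not identical to the tabulated 0.028, which would be consistent with the authors having used a different test or a continuity correction.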
Clinical demographic characteristics of patients with and without febrile neutropenia in the testing dataset.
| Parameters | FN group (n = 43) | non-FN group (n = 47) | P value |
|---|---|---|---|
| Age (years), means ± SD | | | 0.050 |
| ≤ 50 | 13 (30.2) | 24 (51.1) | |
| > 50 | 30 (69.8) | 23 (48.9) | |
| Body surface area (m²), means ± SD | 1.61 ± 0.14 | 1.63 ± 0.14 | 0.383 |
| Hypertension, n (%) | 12 (27.9) | 11 (23.4) | 0.638 |
| Diabetes mellitus, n (%) | 5 (11.6) | 5 (10.6) | 1.000 |
| Tuberculosis, n (%) | 0 (0) | 0 (0) | 1.000 |
| Breast-conserving surgery, n (%) | 28 (65.1) | 37 (78.7) | 0.166 |
| Tumor size (cm), mean ± SD | 2.7 ± 1.3 | 2.5 ± 1.2 | 0.490 |
| Positive lymph node, means ± SD | 1.7 ± 4.0 | 0.3 ± 1.4 | 0.038 |
| ER, n (%) | 34 (79.1) | 26 (55.3) | 0.046 |
| PR, n (%) | 26 (60.5) | 18 (38.3) | 0.057 |
| Her-2, n (%) | 12 (27.9) | 7 (14.9) | 0.196 |
| CA 15–3, means ± SD | 15.2 ± 12.0 | 11.7 ± 7.8 | 0.140 |
| TNM staging, n (%) | | | 0.003 |
| I/II | 33 (76.7) | 46 (97.9) | |
| III/IV | 10 (23.3) | 1 (2.1) | |
| Taxane-based regimen, n (%) | 23 (53.5) | 14 (29.8) | 0.032 |
| Pretreatment CBC, means ± SD | | | |
| Hemoglobin (g/dL) | 13.0 ± 1.6 | 13.1 ± 1.2 | 0.940 |
| Platelet (× 10³/µL) | 263 ± 56 | 265 ± 77 | 0.590 |
| Neutrophil (× 10³/µL) | 3.661 ± 1.574 | 4.006 ± 1.683 | 0.550 |
| Lymphocyte (× 10³/µL) | 2.024 ± 0.575 | 2.193 ± 0.640 | 0.250 |
| CBC 5 days after chemotherapy, means ± SD | | | |
| Hemoglobin (g/dL) | 10.7 ± 1.1 | 5.314 ± 1.593 | 0.017 |
| Platelet (× 10³/µL) | 219 ± 99 | 11.3 ± 1.0 | 0.170 |
| Neutrophil (× 10³/µL) | 3.977 ± 3.468 | 3.348 ± 1.208 | 0.220 |
| Lymphocyte (× 10³/µL) | 0.838 ± 0.339 | 1.703 ± 0.683 | < 0.001 |
| SERM, n (%) | 13 (30.2) | 23 (48.9) | 0.087 |
| LHRH, n (%) | 11 (25.6) | 20 (42.6) | 0.121 |
| Aromatase inhibitor, n (%) | 22 (51.2) | 14 (29.8) | 0.053 |
| Radiation treatment, n (%) | 35 (81.4) | 41 (87.2) | 0.564 |
| Herceptin, n (%) | 13 (30.2) | 10 (21.3) | 0.346 |
FN febrile neutropenia, SD standard deviation, ER estrogen receptor, PR progesterone receptor, Her-2 human epidermal growth factor receptor 2, CA cancer antigen, CBC complete blood count, SERM selective estrogen receptor modulator, AI aromatase inhibitor, LHRH luteinizing hormone-releasing hormone, F/U follow-up.
Performance of machine learning algorithms for the prediction of febrile neutropenia.
| | LR | DT | XGboosting | LASSO | SVM | ANN |
|---|---|---|---|---|---|---|
| AUC | 0.870 | 0.855 | 0.908 | 0.862 | 0.880 | 0.865 |
| Accuracy | 0.781 | 0.759 | 0.816 | 0.805 | 0.782 | 0.782 |
| Sensitivity | 0.878 | 0.707 | 0.829 | 0.805 | 0.829 | 0.854 |
| Specificity | 0.696 | 0.804 | 0.804 | 0.804 | 0.739 | 0.717 |
| PPV | 0.720 | 0.763 | 0.791 | 0.786 | 0.739 | 0.729 |
| NPV | 0.865 | 0.755 | 0.841 | 0.822 | 0.829 | 0.846 |
LR logistic regression, DT decision tree, LASSO least absolute shrinkage and selection operator, SVM support vector machine, ANN artificial neural network, AUC area under the curve, PPV positive predictive value, NPV negative predictive value.
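Accuracy, sensitivity, specificity, PPV, and NPV in the table above are all ratios of confusion-matrix counts. The sketch below shows those definitions. The counts are hypothetical: the record does not report a confusion matrix, and the values used here (TP = 34, FN = 7, FP = 9, TN = 37) were chosen only because they are consistent with the XGboosting column. The function name `classification_metrics` is likewise an assumption.

```python
def classification_metrics(tp, fp, tn, fn):
    """Standard binary-classification metrics from confusion-matrix counts."""
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
        "ppv": tp / (tp + fp),          # positive predictive value
        "npv": tn / (tn + fn),          # negative predictive value
    }


# Hypothetical counts, NOT reported in the source; picked to be
# consistent with the XGboosting column of the performance table.
metrics = classification_metrics(tp=34, fp=9, tn=37, fn=7)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Because PPV and NPV depend on the proportion of FN cases in the evaluated sample, they would likely differ in a population with a different FN incidence even if sensitivity and specificity were unchanged.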
Figure 1. The AUC of each algorithm, shown using colored lines. The image was drawn in Python 3.6. AUC area under the curve, ROC receiver operating characteristic, TPR true positive rate, LASSO least absolute shrinkage and selection operator regression, SVM support vector machine, ANN artificial neural network.
Figure 2. Detailed cut-off values displayed in a decision tree model. The image was drawn in Python 3.6. 5D 5 days after chemotherapy, CEA carcinoembryonic antigen, BSA body surface area, WBC white blood cell, PRE pretreatment, PLT platelet.
Figure 3. Flow diagram depicting the study design. The image was drawn in Microsoft PowerPoint 2016. WHO World Health Organization, DCIS ductal carcinoma in situ, LCIS lobular carcinoma in situ, F/U follow-up, FN febrile neutropenia.
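The AUC values reported for the ROC curves in Figure 1 can be read as the probability that a randomly chosen patient who developed FN receives a higher predicted risk than a randomly chosen patient who did not. The following is a minimal sketch of that pairwise (Mann-Whitney) interpretation. The scores are hypothetical illustration values, not data from the study, and `auc_from_scores` is an assumed helper name rather than anything the authors describe.

```python
def auc_from_scores(pos_scores, neg_scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.
    Tied pairs count as half a correct ranking."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical predicted FN risks (not from the study).
fn_patients = [0.9, 0.8, 0.4]      # patients who developed FN
non_fn_patients = [0.7, 0.3, 0.2]  # patients who did not

print(auc_from_scores(fn_patients, non_fn_patients))
```

Under this reading, an AUC of 0.5 corresponds to ranking no better than chance and 1.0 to perfect separation of the two groups.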