Min Hyuk Choi, Dokyun Kim, Eui Jun Choi, Yeo Jin Jung, Yong Jun Choi, Jae Hwa Cho, Seok Hoon Jeong.
Abstract
Improving predictive models for intensive care unit (ICU) inpatients requires a new strategy that periodically incorporates the latest clinical data and can be updated to reflect local characteristics. We extracted data from all adult patients admitted to the ICUs of two university hospitals with different characteristics from 2006 to 2020; a total of 85,146 patients were included in this study. Machine learning (ML) algorithms were trained to predict in-hospital mortality. The predictive performance of conventional scoring models and ML algorithms was assessed by the area under the receiver operating characteristic curve (AUROC). The conventional scoring models varied in predictive power, with SAPS III (AUROC 0.773 [0.766–0.779] for hospital S) and APACHE III (AUROC 0.803 [0.795–0.810] for hospital G) showing the highest AUROC among them. The best-performing ML models achieved an AUROC of 0.977 (0.973–0.980) in hospital S and 0.955 (0.950–0.961) in hospital G. Using ML models in conjunction with conventional scoring systems can provide more useful information for predicting the prognosis of critically ill patients. We suggest that the predictive model can be made more robust by training it with the individual data of each hospital.
Year: 2022 PMID: 35505048 PMCID: PMC9065110 DOI: 10.1038/s41598-022-11226-4
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.996
Figure 1. Flowchart depicting steps in obtaining the dataset.
Baseline characteristics and physiological variables obtained within 24 h of ICU admission.
| Admission variables (obtained within 24 h of ICU admission) | Hospital S: overall | Hospital S: no hospital mortality | Hospital S: hospital mortality | P value* | Hospital G: overall | Hospital G: no hospital mortality | Hospital G: hospital mortality | P value* | P value† |
|---|---|---|---|---|---|---|---|---|---|
| Age, years | 67 [57–74] | 66 [57–74] | 67 [56–76] | < 0.001 | 65 [53–75] | 64 [52–74] | 70 [58–78] | < 0.001 | < 0.001 |
| Sex | | | | 0.004 | | | | 0.174 | < 0.001 |
| Female | 22,744 (36.9%) | 19,944 (36.7%) | 2800 (38.5%) | | 9034 (38.3%) | 7501 (38.5%) | 1533 (37.4%) | | |
| Male | 38,845 (63.1%) | 34,369 (63.3%) | 4476 (61.5%) | | 14,523 (61.7%) | 11,957 (61.5%) | 2566 (62.6%) | | |
| Types of admission | | | | < 0.001 | | | | < 0.001 | < 0.001 |
| Medical | 39,560 (64.2%) | 34,232 (63.0%) | 5328 (73.2%) | | 11,365 (48.2%) | 9073 (46.6%) | 2292 (55.9%) | | |
| Surgical | 22,029 (35.8%) | 20,081 (37.0%) | 1948 (26.8%) | | 12,192 (51.8%) | 10,385 (53.4%) | 1807 (44.1%) | | |
| Year of admission | | | | < 0.001 | | | | < 0.001 | 0.033 |
| 2006–2010 | 16,531 (26.8%) | 14,422 (26.6%) | 2109 (29.0%) | | 6379 (27.1%) | 5322 (27.4%) | 1057 (25.8%) | | |
| 2011–2015 | 20,945 (34.0%) | 18,371 (33.8%) | 2574 (35.4%) | | 8179 (34.7%) | 6637 (34.1%) | 1542 (37.6%) | | |
| 2016–2020 | 24,113 (39.2%) | 21,520 (39.6%) | 2593 (35.6%) | | 8999 (38.2%) | 7499 (38.5%) | 1500 (36.6%) | | |
| APACHE II | 13 [10–17] | 13 [10–16] | 19 [13–26] | < 0.001 | 16 [11–21] | 15 [10–19] | 22 [18–28] | < 0.001 | < 0.001 |
| APACHE III | 55 [47–65] | 54 [46–63] | 74 [57–96] | < 0.001 | 60 [50–74] | 57 [48–69] | 79 [66–95] | < 0.001 | < 0.001 |
| SAPS II | 34 [28–42] | 33 [27–40] | 48 [36–64] | < 0.001 | 37 [29–48] | 35 [27–44] | 52 [42–64] | < 0.001 | < 0.001 |
| SAPS III | 47 [40–55] | 46 [40–53] | 59 [50–70] | < 0.001 | 51 [43–61] | 49 [42–57] | 65 [56–74] | < 0.001 | < 0.001 |
| MPM0 II | 2 [2–2] | 2 [2–2] | 2 [1–3] | < 0.001 | 2 [1–2] | 1 [1–2] | 2 [1–3] | < 0.001 | < 0.001 |
| MPM0 III | 2 [2–3] | 2 [2–3] | 2 [1–3] | < 0.001 | 2 [1–2] | 2 [1–2] | 2 [2–4] | < 0.001 | < 0.001 |
| SOFA | 4 [2–8] | 4 [2–7] | 7 [4–11] | < 0.001 | 6 [3–9] | 6 [2–9] | 10 [7–12] | < 0.001 | < 0.001 |
| Pitt bacteremia score | 1 [0–4] | 1 [0–4] | 2 [0–5] | < 0.001 | 3 [1–5] | 3 [1–4] | 5 [3–7] | < 0.001 | < 0.001 |
| Underlying comorbidities | |||||||||
| Charlson comorbidity index | 5 [4–6] | 5 [4–6] | 5 [4–7] | < 0.001 | 5 [3–6] | 4 [3–6] | 5 [4–7] | < 0.001 | < 0.001 |
| Cancer | 5137 (8.3%) | 3101 (5.7%) | 2036 (28.0%) | < 0.001 | 4239 (18.0%) | 3002 (15.4%) | 1237 (30.2%) | < 0.001 | < 0.001 |
| Cerebrovascular diseases | 14,373 (23.3%) | 13,433 (24.7%) | 940 (12.9%) | < 0.001 | 4533 (19.2%) | 3716 (19.1%) | 817 (19.9%) | 0.226 | < 0.001 |
| Diabetes mellitus | 16,696 (27.1%) | 15,126 (27.8%) | 1570 (21.6%) | < 0.001 | 4086 (17.3%) | 3391 (17.4%) | 695 (17.0%) | 0.482 | < 0.001 |
| Hypertension | 28,407 (46.1%) | 26,538 (48.9%) | 1869 (25.7%) | < 0.001 | 5406 (22.9%) | 4677 (24.0%) | 729 (17.8%) | < 0.001 | < 0.001 |
| Chronic pulmonary diseases | 1832 (3.0%) | 1364 (2.5%) | 468 (6.4%) | < 0.001 | 541 (2.3%) | 332 (1.7%) | 209 (5.1%) | < 0.001 | < 0.001 |
| Hemiplegia | 2041 (3.3%) | 1900 (3.5%) | 141 (1.9%) | < 0.001 | 1650 (7.0%) | 1500 (7.7%) | 150 (3.7%) | < 0.001 | < 0.001 |
| Liver diseases | 2080 (3.4%) | 1235 (2.3%) | 845 (11.6%) | < 0.001 | 1248 (5.3%) | 922 (4.7%) | 326 (8.0%) | < 0.001 | < 0.001 |
| Myocardial infarction | 11,682 (19.0%) | 10,888 (20.0%) | 794 (10.9%) | < 0.001 | 2886 (12.3%) | 2482 (12.8%) | 404 (9.9%) | < 0.001 | < 0.001 |
| Renal diseases | 3462 (5.6%) | 2798 (5.2%) | 664 (9.1%) | < 0.001 | 1367 (5.8%) | 981 (5.0%) | 386 (9.4%) | < 0.001 | 0.313 |
| Ulcer | 1168 (1.9%) | 926 (1.7%) | 242 (3.3%) | < 0.001 | 485 (2.1%) | 370 (1.9%) | 115 (2.8%) | < 0.001 | 0.131 |
| Transplantation | 766 (1.2%) | 265 (0.5%) | 501 (6.9%) | < 0.001 | 203 (0.9%) | 153 (0.8%) | 50 (1.2%) | 0.008 | < 0.001 |
| Ventilator use | 10,537 (17.1%) | 8895 (16.4%) | 1642 (22.6%) | < 0.001 | 5957 (25.3%) | 4745 (24.4%) | 1212 (29.6%) | < 0.001 | < 0.001 |
| Vasopressor use | 21,448 (34.8%) | 18,089 (33.3%) | 3359 (46.2%) | < 0.001 | 11,708 (49.7%) | 8439 (43.4%) | 3269 (79.8%) | < 0.001 | < 0.001 |
| Cardiac arrest | 809 (1.3%) | 694 (1.3%) | 115 (1.6%) | 0.038 | 1228 (5.2%) | 414 (2.1%) | 814 (19.9%) | < 0.001 | < 0.001 |
| Site of infection | | | | < 0.001 | | | | < 0.001 | < 0.001 |
| Multiple sites | 843 (1.4%) | 108 (0.2%) | 735 (10.1%) | | 275 (1.2%) | 906 (4.7%) | 529 (12.9%) | | |
| Lungs | 428 (0.7%) | 283 (0.5%) | 145 (2.0%) | | 1435 (6.1%) | 383 (2.0%) | 238 (5.8%) | | |
| Bloodstream | 251 (0.4%) | 119 (0.2%) | 132 (1.8%) | | 183 (0.8%) | 110 (0.6%) | 73 (1.8%) | | |
| Urinary tract | 503 (0.8%) | 363 (0.7%) | 140 (1.9%) | | 621 (2.6%) | 383 (2.0%) | 238 (5.8%) | | |
| CNS | 3 (0.0%) | 0 (0.0%) | 3 (0.0%) | | 0 (0.0%) | 165 (0.8%) | 110 (2.7%) | | |
| Abdomen | 27 (0.0%) | 19 (0.0%) | 8 (0.1%) | | 156 (0.7%) | 103 (0.5%) | 53 (1.3%) | | |
| None | 59,521 (96.6%) | 53,418 (98.4%) | 6103 (83.9%) | | 20,887 (88.7%) | 17,791 (91.4%) | 3096 (75.5%) | | |
| Antibiotic use at ICU admission (may be multiple) | 28,255 (45.9%) | 22,754 (41.9%) | 5501 (75.6%) | < 0.001 | 16,867 (71.6%) | 13,110 (67.4%) | 3757 (91.7%) | < 0.001 | < 0.001 |
| 3rd-generation cephalosporins | 6788 (11.0%) | 4947 (9.1%) | 1841 (25.3%) | < 0.001 | 5649 (24.0%) | 4490 (23.1%) | 1159 (28.3%) | < 0.001 | < 0.001 |
| 4th-generation cephalosporins | 450 (0.7%) | 30 (0.1%) | 420 (5.8%) | < 0.001 | 591 (2.5%) | 296 (1.5%) | 295 (7.2%) | < 0.001 | < 0.001 |
| Beta lactam/beta lactamase inhibitors | 8744 (14.2%) | 5646 (10.4%) | 3098 (42.6%) | < 0.001 | 6362 (27.0%) | 4401 (22.6%) | 1961 (47.8%) | < 0.001 | < 0.001 |
| Carbapenems | 2300 (3.7%) | 587 (1.1%) | 1713 (23.5%) | < 0.001 | 2524 (10.7%) | 1409 (7.2%) | 1115 (27.2%) | < 0.001 | < 0.001 |
| Glycopeptides | 7144 (11.6%) | 4280 (7.9%) | 2864 (39.4%) | < 0.001 | 3526 (15.0%) | 2182 (11.2%) | 1344 (32.8%) | < 0.001 | < 0.001 |
| Penicillins | 3389 (5.5%) | 3082 (5.7%) | 307 (4.2%) | < 0.001 | 211 (0.9%) | 150 (0.8%) | 61 (1.5%) | < 0.001 | < 0.001 |
| Quinolones | 3926 (6.4%) | 1773 (3.3%) | 2153 (29.6%) | < 0.001 | 4369 (18.5%) | 2845 (14.6%) | 1524 (37.2%) | < 0.001 | < 0.001 |
Characteristics are summarized as the median [IQR], or n (%).
ICU intensive care unit, APACHE Acute Physiology and Chronic Health Evaluation, SAPS Simplified Acute Physiology Score, MPM Mortality Probability Model, SOFA Sequential Organ Failure Assessment.
*P value for difference between cases with and without in-hospital mortality.
†P value for difference between hospital S and hospital G.
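The record does not state which statistical test produced these P values. As a rough sketch only, assuming a Pearson chi-square test with Yates continuity correction for the categorical rows, the comparison of sex by in-hospital mortality in hospital S can be recomputed from the counts in the table. The function name `chi_square_2x2` is hypothetical and introduced for illustration.

```python
import math


def chi_square_2x2(a, b, c, d, yates=True):
    """Pearson chi-square test for the 2x2 table [[a, b], [c, d]].

    Returns (statistic, p_value) for 1 degree of freedom.
    Assumption: the source does not say which test was used.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    # |ad - bc| measures the departure from independence.
    diff = abs(a * d - b * c)
    if yates:
        # Yates continuity correction, clipped at zero.
        diff = max(0.0, diff - n / 2)
    stat = n * diff ** 2 / (row1 * row2 * col1 * col2)
    # With 1 degree of freedom the chi-square tail probability
    # equals erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Rows: female, male. Columns: no hospital mortality, hospital mortality.
# Counts are the hospital S values from the table above.
stat, p = chi_square_2x2(19944, 2800, 34369, 4476)
print(f"chi-square = {stat:.2f}, P = {p:.4f}")
```

Under these assumptions the P value falls below 0.01, the same order as the 0.004 reported for this row. Continuous variables summarized as median [IQR] would need a different test, which is not sketched here.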
Performance metrics for the conventional scoring systems by hospital and study period.
| Scoring system (Hospital S) | 2006–2010: AUROC (95% CI) | 2006–2010: Cutoff | 2006–2010: Accuracy | 2006–2010: F1 score | 2011–2015: AUROC (95% CI) | 2011–2015: Cutoff | 2011–2015: Accuracy | 2011–2015: F1 score | 2016–2020: AUROC (95% CI) | 2016–2020: Cutoff | 2016–2020: Accuracy | 2016–2020: F1 score | Total period: AUROC (95% CI) | Total period: Cutoff | Total period: Accuracy | Total period: F1 score |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| APACHE II | 0.691 (0.678–0.704) | 15 | 0.686 | 0.318 | 0.721 (0.709–0.732) | 19 | 0.794 | 0.376 | 0.808 (0.798–0.817) | 17 | 0.754 | 0.379 | 0.738 (0.731–0.745) | 17 | 0.747 | 0.356 |
| APACHE III | 0.720 (0.706–0.735) | 64 | 0.793 | 0.405 | 0.747 (0.735–0.759) | 70 | 0.821 | 0.441 | 0.798 (0.787–0.809) | 68 | 0.813 | 0.429 | 0.755 (0.747–0.762) | 69 | 0.826 | 0.436 |
| SAPS II | 0.757 (0.744–0.769) | 38 | 0.776 | 0.412 | 0.770 (0.759–0.781) | 44 | 0.799 | 0.427 | 0.793 (0.782–0.804) | 47 | 0.831 | 0.441 | 0.766 (0.759–0.772) | 45 | 0.822 | 0.430 |
| SAPS III | 0.756 (0.743–0.768) | 53 | 0.730 | 0.388 | 0.777 (0.767–0.788) | 56 | 0.793 | 0.423 | 0.781 (0.771–0.791) | 56 | 0.770 | 0.378 | 0.773 (0.766–0.779) | 56 | 0.793 | 0.401 |
| MPM0 II | 0.574 (0.560–0.588) | 1 | 0.719 | 0.290 | 0.547 (0.534–0.560) | 1 | 0.768 | 0.291 | 0.540 (0.526–0.553) | 3 | 0.791 | 0.267 | 0.477 (0.469–0.485) | 3 | 0.766 | 0.222 |
| MPM0 III | 0.612 (0.598–0.626) | 1 | 0.772 | 0.317 | 0.563 (0.550–0.577) | 1 | 0.795 | 0.297 | 0.504 (0.490–0.518) | 4 | 0.862 | 0.226 | 0.553 (0.545–0.561) | 1 | 0.787 | 0.285 |
| SOFA | 0.647 (0.634–0.660) | 4 | 0.600 | 0.296 | 0.693 (0.682–0.705) | 5 | 0.633 | 0.312 | 0.784 (0.774–0.793) | 6 | 0.681 | 0.343 | 0.706 (0.699–0.712) | 5 | 0.625 | 0.307 |
| Quick SOFA | 0.526 (0.513–0.539) | 0 | 0.754 | 0.216 | 0.513 (0.501–0.526) | 3 | 0.825 | 0.164 | 0.555 (0.542–0.567) | 3 | 0.770 | 0.229 | 0.512 (0.505–0.520) | 3 | 0.816 | 0.173 |
| Pitt bacteremia score | 0.537 (0.526–0.549) | 2 | 0.627 | 0.238 | 0.585 (0.573–0.596) | 2 | 0.613 | 0.260 | 0.607 (0.595–0.619) | 8 | 0.866 | 0.270 | 0.576 (0.569–0.583) | 2 | 0.603 | 0.240 |
AUROC area under the receiver operating characteristic curve, APACHE Acute Physiology and Chronic Health Evaluation, SAPS Simplified Acute Physiology Score, MPM Mortality Probability Model, SOFA Sequential Organ Failure Assessment.
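The Cutoff, Accuracy, and F1 score columns above can be read as the result of dichotomizing each score at a threshold. A minimal sketch follows, assuming that a score at or above the cutoff counts as a predicted death; the record does not state the direction of the comparison or how the cutoffs were selected. The function name `metrics_at_cutoff` and the toy data are hypothetical, not study data.

```python
def metrics_at_cutoff(scores, died, cutoff):
    """Accuracy and F1 score when score >= cutoff is read as predicted death.

    Assumption: the >= convention is illustrative; the source does not
    specify it.
    """
    tp = fp = tn = fn = 0
    for score, outcome in zip(scores, died):
        predicted = score >= cutoff
        if predicted and outcome:
            tp += 1
        elif predicted and not outcome:
            fp += 1
        elif not predicted and outcome:
            fn += 1
        else:
            tn += 1
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    # F1 is the harmonic mean of precision and recall.
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0
    return accuracy, f1


# Made-up severity scores and outcomes (1 = died in hospital).
toy_scores = [10, 12, 15, 18, 20, 25, 30, 14]
toy_died = [0, 0, 0, 1, 0, 1, 1, 0]
accuracy, f1 = metrics_at_cutoff(toy_scores, toy_died, cutoff=17)
print(f"accuracy = {accuracy:.3f}, F1 = {f1:.3f}")
```

Because deaths are the minority class in both hospitals, accuracy can stay high while F1 remains low, which matches the pattern in the table.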
Performance metrics for the machine learning algorithms with the test set for each cohort.
| Cohort C | AUROC (95% CI) | Accuracy | F1 score |
|---|---|---|---|
| K-nearest neighbor (KNN) | 0.873 (0.864–0.882) | 0.899 | 0.573 |
| Decision tree (DT) | 0.919 (0.911–0.926) | 0.922 | 0.731 |
| Random forest (RF) | 0.951 (0.946–0.956) | 0.923 | 0.731 |
| eXtreme gradient boosting (XGBoost) | 0.961 (0.957–0.965) | 0.932 | 0.758 |
| Light gradient boosting (LightGBM) | 0.961 (0.957–0.965) | 0.933 | 0.765 |
| Support vector machine (SVM) | 0.919 (0.911–0.928) | 0.921 | 0.690 |
| Artificial neural network (ANN) | 0.950 (0.941–0.959) | 0.931 | 0.751 |
AUROC area under the receiver operating characteristic curve, CI Confidence interval.
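For readers unfamiliar with the AUROC (95% CI) column, the sketch below computes AUROC as the probability that a randomly chosen non-survivor receives a higher predicted risk than a randomly chosen survivor, and attaches a percentile bootstrap interval. The bootstrap is an assumption made for illustration; the record does not say how the intervals were derived. The function names and the toy data are hypothetical.

```python
import random


def auroc(scores, labels):
    """AUROC via pairwise comparison; ties count as half a win."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_ci(scores, labels, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for AUROC (illustrative assumption)."""
    rng = random.Random(seed)
    n = len(scores)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        # Skip resamples that contain only one outcome class.
        if 0 < sum(ys) < n:
            stats.append(auroc([scores[i] for i in idx], ys))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Made-up predicted risks and outcomes (1 = died in hospital).
toy_risk = [0.1, 0.4, 0.35, 0.8, 0.7, 0.2, 0.9, 0.6]
toy_label = [0, 0, 1, 1, 1, 0, 1, 0]
point = auroc(toy_risk, toy_label)
lower, upper = bootstrap_ci(toy_risk, toy_label)
print(f"AUROC = {point:.3f} ({lower:.3f}-{upper:.3f})")
```

The pairwise loop is quadratic in the number of patients, so a rank-based formulation would be preferable for a cohort of the size analyzed here.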
Figure 2. Comparison of machine learning-based in-hospital mortality prediction models. AUROC area under the receiver operating characteristic curve, AUPRC area under the precision-recall curve, KNN K-nearest neighbor, DT decision tree, RF random forest, XGBoost eXtreme gradient boosting, LightGBM light gradient boosting, SVM support vector machine. The calibration plots show the agreement between predicted probability and observed in-hospital mortality. The black line at 45 degrees indicates perfect calibration, where the predicted and observed probabilities are equal.
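The points of a calibration plot like the one described in this caption can be obtained by grouping patients by predicted probability and comparing each group's mean prediction with its observed mortality rate. The sketch below assumes equal-width bins; the binning scheme used for the figure is not given in this record, and the function name `calibration_bins` and the toy data are hypothetical.

```python
def calibration_bins(probs, outcomes, n_bins=10):
    """Summarize calibration in equal-width probability bins.

    Returns (mean predicted probability, observed event rate, count)
    for every non-empty bin. Equal-width binning is an assumption.
    """
    # Per bin: [sum of predictions, number of events, number of patients].
    bins = [[0.0, 0, 0] for _ in range(n_bins)]
    for p, y in zip(probs, outcomes):
        i = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the last bin
        bins[i][0] += p
        bins[i][1] += y
        bins[i][2] += 1
    return [(s / n, e / n, n) for s, e, n in bins if n > 0]


# Made-up predicted probabilities and outcomes (1 = died in hospital).
toy_probs = [0.05, 0.10, 0.15, 0.55, 0.60, 0.90, 0.95]
toy_outcomes = [0, 0, 1, 1, 0, 1, 1]
curve = calibration_bins(toy_probs, toy_outcomes, n_bins=2)
for mean_pred, observed, count in curve:
    print(f"predicted {mean_pred:.2f}, observed {observed:.2f}, n = {count}")
```

A bin lying on the 45-degree line has a mean prediction equal to its observed rate; bins above the line indicate underestimated risk and bins below it indicate overestimated risk.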
Figure 3. Critical variables with feature importance plots and SHAP values for predicting in-hospital mortality. XGBoost eXtreme gradient boosting, LightGBM light gradient boosting machine, SHAP Shapley additive explanation. (a) Feature importance including feature weight, mean gain and coverage of XGBoost and (b) SHAP value summary dot plot of the LightGBM-based prediction model in cohort C. (c) Feature importance including feature weight, mean gain and coverage of XGBoost and (d) SHAP value summary dot plot of the LightGBM-based prediction model in cohort S. (e) Feature importance including feature weight, mean gain and coverage of XGBoost and (f) SHAP value summary dot plot of the LightGBM-based prediction model in cohort G. The color of each SHAP dot represents the value of the feature, and the location of the dot on the X axis represents its SHAP value. Red dots indicate higher values or affirmative responses (for binary features), and blue dots indicate the opposite. A positive SHAP value indicates that the variable increases the likelihood of in-hospital mortality.
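To make the sign convention of SHAP values concrete, the sketch below computes exact Shapley values by enumerating feature coalitions for a toy two-feature risk function. This is not the tree-based algorithm that SHAP tooling applies to gradient-boosted models, and the function `toy_risk_model`, its coefficients, and the baseline values are all made up for illustration.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for a single prediction.

    Features outside a coalition are replaced by their baseline value.
    Enumeration is exponential in the number of features, so this is
    only practical for toy examples.
    """
    n = len(x)

    def value(coalition):
        z = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return predict(z)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


def toy_risk_model(z):
    """Hypothetical risk function of [SOFA score, vasopressor use]."""
    sofa, vasopressor = z
    return 0.02 * sofa + 0.3 * vasopressor + 0.01 * sofa * vasopressor


patient = [10, 1]   # made-up patient: SOFA 10, on vasopressors
reference = [4, 0]  # made-up baseline: SOFA 4, no vasopressors
phi = shapley_values(toy_risk_model, patient, reference)
print(dict(zip(["SOFA", "vasopressor use"], phi)))
```

Both contributions are positive for this toy patient, meaning each feature pushes the predicted risk above the baseline, and together they sum to the difference between the patient's prediction and the baseline prediction.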