Nima Safaei, Babak Safaei, Seyedhouman Seyedekrami, Mojtaba Talafidaryani, Arezoo Masoud, Shaodong Wang, Qing Li, Mahdi Moqri
Abstract
Improving the Intensive Care Unit (ICU) management network and building cost-effective, well-managed healthcare systems are high priorities for healthcare units. Accurate and explainable mortality prediction models help identify the most critical risk factors for patients' survival or death and detect the most in-need patients early. This study proposes a highly accurate and efficient machine learning model for predicting ICU mortality status upon discharge using the information available during the first 24 hours of admission. The most important features in mortality prediction are identified, and the effects of changing each feature on the prediction are studied. We used supervised machine learning models and illness severity scoring systems to benchmark the mortality prediction. We also combined SHAP, LIME, partial dependence, and individual conditional expectation plots to explain the predictions made by the best-performing model (CatBoost). We propose E-CatBoost, an optimized and efficient patient mortality prediction model that can accurately predict patients' discharge status using only ten input features. We used eICU-CRD v2.0 to train and validate the models; the dataset contains information on over 200,000 ICU admissions. The patients were divided into twelve disease groups, and models were fitted and tuned for each group. The models' predictive performance was evaluated using the area under the receiver operating characteristic curve (AUROC). Across the defined disease groups, the AUROC scores ranged from 0.86 [std: 0.02] to 0.92 [std: 0.02] for CatBoost and from 0.83 [std: 0.02] to 0.91 [std: 0.03] for E-CatBoost; measured over the entire patient population, their AUROC scores were 7 to 18 percent and 2 to 12 percent higher than those of the baseline models, respectively.
Based on SHAP explanations, we found age, heart rate, respiratory rate, blood urea nitrogen, and creatinine level to be the most critical cross-disease features in mortality predictions.
Year: 2022 PMID: 35511882 PMCID: PMC9070907 DOI: 10.1371/journal.pone.0262895
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.752
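The abstract evaluates every model by AUROC. As a minimal sketch of what that metric measures, the snippet below computes it as the probability that a randomly chosen non-survivor receives a higher predicted risk than a randomly chosen survivor. The function name `auroc` and the toy labels and risks are illustrative assumptions, not the authors' code or data.

```python
def auroc(y_true, y_score):
    """AUROC as the chance that a random positive outranks a random negative.

    A tie between a positive and a negative score counts as half a win.
    """
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = died, 0 = survived; scores are predicted mortality risks.
labels = [0, 0, 1, 1, 0, 1]
risks = [0.10, 0.40, 0.35, 0.80, 0.20, 0.70]
toy_auroc = auroc(labels, risks)  # 8 of the 9 positive/negative pairs are ordered correctly
```

The pairwise loop is quadratic in the number of patients, so it is only meant to make the definition concrete; a rank-based implementation would be the usual choice for a dataset of this size.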
A methodological overview of the related works.
| Ref. | Algorithm(s) | A brief overview of the algorithm(s) | Limitations of the research methodology |
|---|---|---|---|
| [ | ISeeU | A visually interpretable deep learning framework based on CNNs, Shapley values, and coalitional game theory | Using a high number of input features in developing the models; considering the same performance for the models across various disease groups; using a limited number of baselines for assessing the performance of the models |
| [ | Logistic Regression; Gradient Boosting Machine; Support Vector Classifier; Artificial Neural Network; and tree-based ensembles | A set of machine learning models | Using a high number of input features in developing the models; using a limited number of baselines for assessing the performance of the models |
| [ | XGBoost | A gradient boosting machine learning model | Using a high number of input features in developing the models; using a limited number of baselines for assessing the performance of the models |
| [ | Logistic Regression; AdaBoost; GB trees; XGBoost; and CatBoost | A set of machine learning models | Using a limited number of baselines for assessing the performance of the models; using a small data sample for developing the models |
| [ | A deep rule-based fuzzy system | A modified supervised fuzzy k-prototype clustering model | Developing explainable but not highly accurate models; using a high number of input features in developing the models; considering the same performance for the models across various disease groups; using a limited number of baselines for assessing the performance of the models |
| [ | Attended multi-task recurrent neural networks | An interpretable deep learning model based on RNNs and the attention mechanism | Using a high number of input features in developing the models; considering the same performance for the models across various disease groups |
| [ | A community-based federated machine learning algorithm | A method for clustering the data into clinically significant groups based on diagnosis results and regional locations | Using a limited number of baselines for assessing the performance of the models |
| [ | DeepSOFA | A modified RNN with GRU units and temporal measurements | Using a high number of input features in developing the models; considering the same performance for the models across various disease groups; using a limited number of baselines for assessing the performance of the models |
| [ | XGBoost | A gradient boosting machine learning model | Using a high number of input features in developing the models; using a limited number of baselines for assessing the performance of the models; using a small data sample for developing the models |
| [ | LightGBM | A gradient boosting tree algorithm | Using a high number of input features in developing the models; using a limited number of baselines for assessing the performance of the models; using a small data sample for developing the models |
Source: elaborated by the authors.
Fig 1. The relative frequency of the twelve diagnosed disease groups in the eICU-CRD v2.0.
Cross-validation results (AUROC) of the non-GBT baseline models using 10-fold cross-validation.
| Method | NB | Logistic Regression | SVM | ANN | KNN | AdaBoost | Bagging | Random Forest | Decision Tree |
|---|---|---|---|---|---|---|---|---|---|
|  | 0.88 | 0.89 | 0.81 | 0.89 | 0.76 | 0.84 | 0.88 | 0.90 | 0.66 |
|  | 0.87 | 0.88 | 0.79 | 0.88 | 0.79 | 0.84 | 0.87 | 0.89 | 0.74 |
|  | 0.88 | 0.90 | 0.80 | 0.86 | 0.80 | 0.82 | 0.86 | 0.89 | 0.70 |
|  | 0.85 | 0.86 | 0.77 | 0.72 | 0.76 | 0.85 | 0.85 | 0.88 | 0.75 |
|  | 0.85 | 0.86 | 0.76 | 0.74 | 0.76 | 0.79 | 0.75 | 0.88 | 0.70 |
|  | 0.81 | 0.83 | 0.73 | 0.71 | 0.70 | 0.76 | 0.81 | 0.83 | 0.69 |
|  | 0.87 | 0.88 | 0.79 | 0.78 | 0.80 | 0.86 | 0.83 | 0.90 | 0.70 |
|  | 0.82 | 0.83 | 0.73 | 0.81 | 0.71 | 0.79 | 0.86 | 0.89 | 0.70 |
|  | 0.80 | 0.83 | 0.74 | 0.79 | 0.69 | 0.77 | 0.80 | 0.84 | 0.69 |
|  | 0.84 | 0.86 | 0.75 | 0.79 | 0.73 | 0.79 | 0.82 | 0.86 | 0.70 |
|  | 0.87 | 0.76 | 0.77 | 0.76 | 0.81 | 0.81 | 0.80 | 0.88 | 0.65 |
|  | 0.91 | 0.90 | 0.79 | 0.80 | 0.84 | 0.79 | 0.78 | 0.90 | 0.53 |
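The tables in this record report AUROC from 10-fold cross-validation, in several cases as mean [standard deviation]. The sketch below shows one way such a figure could be produced: stratified fold assignment, a per-fold fit and score, then a summary string. Everything here is an assumption for illustration: the helper names, the synthetic one-feature cohort, the toy "model" that only learns the direction of the feature, and the use of the sample standard deviation (the source does not say which estimator was used).

```python
import random
import statistics


def auroc(y_true, y_score):
    """Probability that a random positive outranks a random negative (ties count half)."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def stratified_folds(labels, k=10, seed=0):
    """Assign each sample index to one of k folds, keeping class ratios similar."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        for position, i in enumerate(idx):
            folds[position % k].append(i)
    return folds


def fit_direction(x_train, y_train):
    """Toy stand-in for a model: learn whether the feature rises or falls with mortality."""
    pos = [x for x, y in zip(x_train, y_train) if y == 1]
    neg = [x for x, y in zip(x_train, y_train) if y == 0]
    return 1.0 if statistics.mean(pos) >= statistics.mean(neg) else -1.0


def cross_validated_auroc(x, y, k=10, seed=0):
    """Fit on k-1 folds, score the held-out fold, and return one AUROC per fold."""
    scores = []
    for held_out in stratified_folds(y, k, seed):
        held = set(held_out)
        train = [i for i in range(len(y)) if i not in held]
        sign = fit_direction([x[i] for i in train], [y[i] for i in train])
        scores.append(auroc([y[i] for i in held_out], [sign * x[i] for i in held_out]))
    return scores


def summarize(scores):
    """Format fold scores as 'mean [std]' (sample standard deviation is an assumption)."""
    return f"{statistics.mean(scores):.2f} [{statistics.stdev(scores):.2f}]"


# Synthetic cohort (not eICU data): 40 deaths, 160 survivors, one noisy feature.
rng = random.Random(42)
y = [1] * 40 + [0] * 160
x = [rng.gauss(1.5 if label == 1 else 0.0, 1.0) for label in y]
fold_scores = cross_validated_auroc(x, y)
summary = summarize(fold_scores)
```

Stratifying the folds keeps a few non-survivors in every held-out fold, which matters because AUROC is undefined when a fold contains only one class.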
Cross-validation results (mean AUROC [standard deviation AUROC]) of GBT models using 10-fold cross-validation.
| Disease group | Model | AUROC | Disease group | Model | AUROC |
|---|---|---|---|---|---|
|  | LightGBM | 0.91 [0.03] |  | LightGBM | 0.90 [0.04] |
|  | XGBoost | 0.93 [0.02] |  | XGBoost | 0.90 [0.03] |
|  | CatBoost | 0.92 [0.02] |  | CatBoost | 0.91 [0.01] |
|  | LightGBM | 0.90 [0.04] |  | LightGBM | 0.86 [0.04] |
|  | XGBoost | 0.90 [0.03] |  | XGBoost | 0.87 [0.02] |
|  | CatBoost | 0.91 [0.01] |  | CatBoost | 0.87 [0.03] |
|  | LightGBM | 0.89 [0.03] |  | LightGBM | 0.85 [0.03] |
|  | XGBoost | 0.91 [0.01] |  | XGBoost | 0.84 [0.04] |
|  | CatBoost | 0.90 [0.01] |  | CatBoost | 0.86 [0.02] |
|  | LightGBM | 0.89 [0.05] |  | LightGBM | 0.87 [0.02] |
|  | XGBoost | 0.89 [0.02] |  | XGBoost | 0.88 [0.03] |
|  | CatBoost | 0.89 [0.01] |  | CatBoost | 0.88 [0.01] |
|  | LightGBM | 0.88 [0.04] |  | LightGBM | 0.88 [0.05] |
|  | XGBoost | 0.88 [0.03] |  | XGBoost | 0.86 [0.03] |
|  | CatBoost | 0.89 [0.02] |  | CatBoost | 0.89 [0.04] |
|  | LightGBM | 0.84 [0.05] |  | LightGBM | 0.94 [0.09] |
|  | XGBoost | 0.85 [0.02] |  | XGBoost | 0.92 [0.08] |
|  | CatBoost | 0.86 [0.02] |  | CatBoost | 0.92 [0.07] |
Cross-validation results (AUROC) of CatBoost and ICU illness severity scoring systems using 10-fold cross-validation.
| Disease group | Model | AUROC | Disease group | Model | AUROC |
|---|---|---|---|---|---|
|  | APACHE IVa | 0.88 [0.04] |  | APACHE IVa | 0.88 [0.02] |
|  | SAPS II | 0.87 [0.05] |  | SAPS II | 0.83 [0.02] |
|  | OASIS | 0.87 [0.04] |  | OASIS | 0.83 [0.03] |
|  | MODS | 0.80 [0.08] |  | MODS | 0.77 [0.03] |
|  | SOFA | 0.84 [0.07] |  | SOFA | 0.81 [0.02] |
|  | LODS | 0.80 [0.09] |  | LODS | 0.78 [0.03] |
|  |  |  |  |  |  |
|  | APACHE IVa | 0.87 [0.03] |  | APACHE IVa | 0.83 [0.02] |
|  | SAPS II | 0.83 [0.06] |  | SAPS II | 0.77 [0.03] |
|  | OASIS | 0.82 [0.06] |  | OASIS | 0.78 [0.03] |
|  | MODS | 0.78 [0.10] |  | MODS | 0.75 [0.04] |
|  | SOFA | 0.80 [0.08] |  | SOFA | 0.77 [0.04] |
|  | LODS | 0.80 [0.07] |  | LODS | 0.76 [0.05] |
|  |  |  |  |  |  |
|  | APACHE IVa | 0.88 [0.02] |  | APACHE IVa | 0.88 [0.02] |
|  | SAPS II | 0.84 [0.03] |  | SAPS II | 0.78 [0.03] |
|  | OASIS | 0.84 [0.03] |  | OASIS | 0.75 [0.02] |
|  | MODS | 0.80 [0.04] |  | MODS | 0.73 [0.03] |
|  | SOFA | 0.82 [0.04] |  | SOFA | 0.74 [0.03] |
|  | LODS | 0.80 [0.05] |  | LODS | 0.75 [0.04] |
|  |  |  |  |  |  |
|  | APACHE IVa | 0.85 [0.02] |  | APACHE IVa | 0.83 [0.01] |
|  | SAPS II | 0.82 [0.03] |  | SAPS II | 0.80 [0.02] |
|  | OASIS | 0.80 [0.03] |  | OASIS | 0.79 [0.01] |
|  | MODS | 0.78 [0.04] |  | MODS | 0.75 [0.02] |
|  | SOFA | 0.79 [0.04] |  | SOFA | 0.77 [0.03] |
|  | LODS | 0.79 [0.05] |  | LODS | 0.77 [0.03] |
|  |  |  |  |  |  |
|  | APACHE IVa | 0.86 [0.03] |  | APACHE IVa | 0.88 [0.05] |
|  | SAPS II | 0.81 [0.04] |  | SAPS II | 0.82 [0.05] |
|  | OASIS | 0.80 [0.05] |  | OASIS | 0.82 [0.06] |
|  | MODS | 0.78 [0.05] |  | MODS | 0.76 [0.06] |
|  | SOFA | 0.80 [0.04] |  | SOFA | 0.77 [0.05] |
|  | LODS | 0.77 [0.06] |  | LODS | 0.74 [0.06] |
|  |  |  |  |  |  |
|  | APACHE IVa | 0.82 [0.02] |  | APACHE IVa | 0.92 [0.06] |
|  | SAPS II | 0.79 [0.04] |  | SAPS II | 0.86 [0.08] |
|  | OASIS | 0.78 [0.03] |  | OASIS | 0.87 [0.08] |
|  | MODS | 0.74 [0.03] |  | MODS | 0.82 [0.09] |
|  | SOFA | 0.76 [0.04] |  | SOFA | 0.85 [0.09] |
|  | LODS | 0.75 [0.05] |  | LODS | 0.78 [0.10] |
|  |  |  |  |  |  |
Fig 2. Feature importance plots based on Shapley values for burns/trauma, cardiovascular, neurologic, and oncology disease groups.
Top three most important features in mortality predictions across various disease categories based on Shapley values.
| Disease group | Three most important features |
|---|---|
| endocrine | diabetes, age, respiratoryrate |
| gastrointestinal | meanbp, respiratoryrate, creatinine |
| hematology | platelets × 1000, heartrate, creatinine |
| infectious diseases | age, oobventday1, heartrate |
| neurologic | age, respiratoryrate, heartrate |
| oncology | respiratoryrate, BUN, heartrate |
| pulmonary | age, heartrate, BUN |
| renal | heartrate, age, WBC × 1000 |
| surgery | age, admissionweight, verbal |
| toxicology | glucose, hospitaladmitoffset, age |
| burns-trauma | age, verbal, creatinine |
| cardiovascular | heartrate, age, BUN |
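The rankings in the table above come from Shapley values. As a minimal sketch of the underlying idea, the code below computes exact Shapley values for one patient by enumerating every feature coalition, replacing absent features with a reference value. The risk function, its coefficients, the patient, and the single-vector baseline are all hypothetical; the SHAP library used in the study relies on faster, model-specific estimators and a background dataset, so this is not a reproduction of the authors' pipeline.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, x, baseline):
    """Exact Shapley values for one instance, using a single reference vector.

    A feature outside the coalition takes its value from `baseline`.
    """
    d = len(x)

    def value(subset):
        mixed = [x[j] if j in subset else baseline[j] for j in range(d)]
        return model(mixed)

    phi = [0.0] * d
    for i in range(d):
        others = [j for j in range(d) if j != i]
        for size in range(d):
            # Weight of a coalition of this size in the Shapley average.
            weight = factorial(size) * factorial(d - size - 1) / factorial(d)
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


# Hypothetical risk score over (age, heartrate, BUN); coefficients are made up.
def toy_risk(v):
    age, heartrate, bun = v
    return 0.02 * age + 0.01 * heartrate + 0.0005 * age * bun


names = ["age", "heartrate", "BUN"]
patient = [80, 110, 40]    # illustrative patient
reference = [60, 80, 15]   # illustrative baseline values
phi = shapley_values(toy_risk, patient, reference)

# Rank features by absolute contribution, as a feature importance plot would.
ranked = [n for n, _ in sorted(zip(names, phi), key=lambda p: abs(p[1]), reverse=True)]
```

The contributions sum to the difference between the patient's score and the baseline score, which is the property that lets force plots decompose a single prediction. Exact enumeration costs on the order of 2^d model calls per feature, so it is only practical for a handful of features.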
Fig 3. Force plots of the most important features in mortality prediction of patients in endocrine and gastrointestinal disease groups.
Fig 4. A detailed explanation of patient features with the highest mortality probability in endocrine and gastrointestinal disease groups using Shapley values (force plots).
Fig 5. Feature importance of individual patients calculated using LIME in surgery, toxicology, burns-trauma, and cardiovascular disease groups.
Fig 6. Partial dependence (PD) and individual conditional expectation (ICE) plots for the most important features in the prediction of ICU discharge status in surgery, toxicology, burns-trauma, and cardiovascular disease groups.
Fig 7. Bar and KDE plots for the most important features in predicting ICU discharge status for surgery, toxicology, burns-trauma, and cardiovascular disease groups.
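Fig 6 refers to partial dependence (PD) and individual conditional expectation (ICE) plots. The sketch below shows how the two curves relate: an ICE curve sweeps one feature for a single patient while holding that patient's other features fixed, and the PD curve is the pointwise average of the ICE curves. The logistic risk function, its coefficients, and the three example patients are hypothetical and stand in for the fitted CatBoost model.

```python
import math


def ice_curve(model, row, feature, grid):
    """Predictions for one patient as a single feature sweeps over `grid`."""
    curve = []
    for value in grid:
        modified = list(row)
        modified[feature] = value  # overwrite only the feature of interest
        curve.append(model(modified))
    return curve


def partial_dependence(model, rows, feature, grid):
    """Average the ICE curves of all patients at each grid point."""
    curves = [ice_curve(model, row, feature, grid) for row in rows]
    return [sum(column) / len(column) for column in zip(*curves)]


# Hypothetical mortality model over (age, heartrate); coefficients are made up.
def toy_risk(v):
    age, heartrate = v
    return 1.0 / (1.0 + math.exp(-(0.04 * (age - 65) + 0.03 * (heartrate - 90))))


patients = [[55, 70], [68, 95], [82, 120]]  # illustrative cohort
age_grid = [40, 50, 60, 70, 80, 90]
ice = [ice_curve(toy_risk, p, 0, age_grid) for p in patients]
pd_age = partial_dependence(toy_risk, patients, 0, age_grid)
```

Plotting the individual ICE curves alongside the PD curve shows whether the average effect hides patients whose risk responds differently to the same change in a feature.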
Cross-validation (mean AUROC [standard deviation AUROC]) results of E-CatBoost model using 10-fold cross-validation.
| Disease group | AUROC | Disease group | AUROC |
|---|---|---|---|
|  | 0.91 [0.03] |  | 0.89 [0.01] |
|  | 0.88 [0.01] |  | 0.86 [0.03] |
|  | 0.88 [0.01] |  | 0.83 [0.02] |
|  | 0.86 [0.02] |  | 0.84 [0.01] |
|  | 0.86 [0.03] |  | 0.88 [0.04] |
|  | 0.84 [0.02] |  | 0.91 [0.06] |