Cristina Barboi, Andreas Tzavelis, Lutfiyya NaQiyba Muhammad.
Abstract
BACKGROUND: Severity of illness scores (Acute Physiology and Chronic Health Evaluation, Simplified Acute Physiology Score, and Sequential Organ Failure Assessment) are current risk stratification and mortality prediction tools used in intensive care units (ICUs) worldwide. Developers of artificial intelligence or machine learning (ML) models predictive of ICU mortality use the severity of illness scores as a reference point when reporting the performance of these computational constructs.
Keywords: artificial intelligence; intensive care unit mortality; machine learning; severity of illness models
Year: 2022 PMID: 35639445 PMCID: PMC9198821 DOI: 10.2196/35293
Source DB: PubMed Journal: JMIR Med Inform
Recommended structure for reporting MLa models.
| Research question and ML justification | Data sources and preprocessing (feature selection) | Model training and validation |
| Clinical question | Population | Hardware, software, and packages used |
| Intended use of the result | Sample record and measurement characteristics | Evaluation (calibration and discrimination) |
| Defined problem type | Data collection and quality | Configuration (parameters and hyperparameters) |
| Available data | Data structure and types | Model optimization and generalization (hyperparameter tuning and parameter limits) |
| Defined ML method and rationale | Differences between evaluation and validation sets | Validation method and data split and cross-validation |
| Defined evaluation measures, training protocols, and validation | Data preprocessing (data aggregation, missing data, transformation, and label source) | Validation method performance metrics on an external data set |
| N/Ab | Input configuration | Reproducibility, code reuse, and explainability |
aML: machine learning.
bN/A: not applicable.
CHARMS (Checklist for Critical Appraisal and Data Extraction for Systematic Reviews of Prediction Modeling Studies) checklist.
Items A to N are grouped under the headings data preparation, model training, predictive performance, and generalizability.
| Author | Data source (description) | Outcome mortality | Aa | Bb | Cc | Dd | Ee | Ff | Gg | Hh | Ii | Jj | Kk | Ll | Mm | Nn |
| Pirracchio et al [ | MIMICo 2 | Hospital | | | | ✓ | ✓ | ✓ | ✓ | | | ✓ | | ✓ | | ✓ |
| Nielsen et al [ | Danish ICUp | Hospital 30/90 days | ✓ | ✓ | ✓ | ✓ | ✓ | | ✓ | ✓ | | ✓ | | | ✓ | |
| Nimgaonkar et al [ | ICU India | Hospital | | | | ✓ | ✓ | ✓ | ✓ | | | | | | | |
| Xia et al [ | MIMIC 3 | 28 days/hospital | ✓ | ✓ | | ✓ | ✓ | | ✓ | ✓ | | | | | ✓ | ✓ |
| Purushotham et al [ | MIMIC 3 | Hospital, 2 days, 3 days, 30 days, 1 year | ✓ | ✓ | | ✓ | ✓ | | ✓ | ✓ | | ✓ | | | ✓ | ✓ |
| Nanayakkara et al [ | ANZICSq | Hospital | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | | | ✓ | ✓ | ✓ | ✓ |
| Meyer et al [ | Germany | Hospital | | ✓ | | ✓ | ✓ | | ✓ | ✓ | | ✓ | | ✓ | | ✓ |
| Meiring et al [ | CCHICr United Kingdom | Hospital | ✓ | ✓ | ✓ | ✓ | ✓ | | ✓ | | | | | | ✓ | ✓ |
| Lin et al [ | MIMIC 3 | Hospital | ✓ | ✓ | | ✓ | ✓ | ✓ | ✓ | ✓ | | | | | | |
| Krishnan et al [ | MIMIC 3 | ICU | ✓ | ✓ | | ✓ | ✓ | | ✓ | ✓ | | | | ✓ | | |
| Kang et al [ | Korea | Hospital | ✓ | | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | | | ✓ | | | |
| Johnson et al [ | United Kingdom | ICU and hospital | ✓ | ✓ | | ✓ | ✓ | ✓ | ✓ | | ✓ | ✓ | | ✓ | | |
| Holmgren et al [ | Sweden | Hospital and 30 days | | | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | | | | | ✓ | ✓ |
| Garcia-Gallo et al [ | MIMIC 3 | Hospital and 1 year | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | | | | | ✓ | ✓ |
| El-Rashidy et al [ | MIMIC 3 | ICU and hospital | ✓ | ✓ | | ✓ | ✓ | | ✓ | ✓ | | ✓ | | | ✓ | ✓ |
| Silva et al [ | EURICUSs 2 | ICU | ✓ | ✓ | ✓ | ✓ | ✓ | | ✓ | ✓ | ✓ | | ✓ | ✓ | | |
| Caicedo-Torres et al [ | MIMIC 3 | ICU | ✓ | ✓ | | ✓ | ✓ | | ✓ | ✓ | | | ✓ | | | |
| Deshmukh et al [ | eICU-CRDt | ICU | ✓ | ✓ | ✓ | ✓ | ✓ | | ✓ | ✓ | | | ✓ | | ✓ | |
| Ryan et al [ | MIMIC 2 | ICU and hospital | ✓ | ✓ | ✓ | ✓ | ✓ | | ✓ | ✓ | | ✓ | | | ✓ | ✓ |
| Mayaud et al [ | MIMIC 2 | Hospital | ✓ | ✓ | | ✓ | ✓ | ✓ | ✓ | | ✓ | | | | | |
aData normalization/outlier addressed.
bMissing data addressed.
cHyperparameter optimization addressed.
dOverfitting/shrinkage and cross-validation addressed.
ePredictor selection, full model versus backward elimination.
fCalibration assessed (Brier, Hosmer-Lemeshow, and calibration plot).
gDiscrimination/reclassification performed (net reclassification improvement/integrated discrimination improvement).
hClassification reported.
iRecalibration performed.
jExternally validated.
kExplainability addressed/decision curve analysis.
lClinical applicability addressed.
mPrediction span defined.
nIntended moment of use reported.
oMIMIC: Medical Information Mart for Intensive Care.
pICU: intensive care unit.
qANZICS: Australian and New Zealand Intensive Care Society.
rCCHIC: Critical Care Health Informatics Collaborative.
sEURICUS: European ICU studies.
teICU CRD: Electronic ICU Collaborative Research Database.
Figure 1. Search strategy and selection process. AUROC: area under the receiver operating characteristic curve; ICU: intensive care unit.
Figure 2. Frequency and type of ML model input variables (x-axis: number of studies using the input variables; y-axis: input variable). ASA: American Society of Anesthesiologists; COPD: chronic obstructive pulmonary disease; CVA: cerebrovascular accident; FIO2: fraction of inspired oxygen; ICU: intensive care unit; LFT: liver function test; ML: machine learning; RBC: red blood cell; SpO2: oxygen saturation; PaO2: arterial oxygen pressure; PaCO2: arterial CO2 pressure.
Information on the MLa prediction model development, validation, and performance, and on the severity of illness score performance.
| Author | ML model type (AUROCb test) | Data training/test (split %) | Features | K-fold/validation | External validation data set | ML AUROC external | Severity of illness score model type (AUROC) |
| Pirracchio et al, 2015 [ | Ensemble SICULAc (0.85) | 24,508 | 17 | 5-fold cross-validation | 200 | 0.94 | SAPSd-II (0.78); APACHEe-II (0.83); SOFAf (0.71) |
| Nielsen et al, 2019 [ | NNg (0.792) | 10,368 | 44 | 5-fold cross-validation | 1528 | 0.773 | SAPS-II (0.74); APACHE-II (0.72) |
| Nimgaonkar et al, 2004 [ | NN (0.88) | 2962 | 15 | N/Ah | N/A | N/A | APACHE-II (0.77) |
| Xia et al, 2019 [ | Ensemble-LSTMi (0.85); LSTM (0.83); DTj (0.82) | 18,415 | 50 | Bootstrap and RSMk | N/A | N/A | SAPS-II (0.77); SOFA (0.73); APACHE-II (0.74) |
| Purushotham et al, 2018 [ | NN (0.87); Ensemble (0.84) | 35,627 | 17/22/ | 5-fold cross-validation | External benchmark | N/A | SAPS-II (0.80); SOFA (0.73) |
| Nanayakkara et al, 2018 [ | DT (0.86); SVMl (0.86); NN (0.85); Ensemble (0.87); GBMm (0.87) | 39,560 | 29 | 5-fold cross-validation | N/A | N/A | APACHE-III (0.8) |
| Meyer et al, 2018 [ | NN (0.95) | 5898 | 52 | 10-fold cross-validation | 5989 | 0.81 | SAPS-II (0.71) |
| Meiring et al, 2018 [ | DT (0.85); NN (0.86); SVM (0.86) | 80/20 | 25 | 21,911 | N/A | N/A | APACHE-II (0.83) |
| Lin et al, 2019 | DT (0.86); NN (0.83); SVM (0.86) | 19,044 | 15 | 5-fold cross-validation | N/A | N/A | SAPS-II (0.79) |
| Krishnan et al, 2018 [ | NN-ELMo (0.99) | 10,155 | 1 | 10-fold cross-validation | N/A | N/A | SAPS (0.80); SOFA (0.73); APSp-III (0.79) |
| Kang et al, 2020 [ | SVM (0.77); DT (0.78); NN (0.776); k-NNq (0.76) | 1571 | 33 | 10-fold cross-validation | N/A | N/A | SOFA (0.66); APACHE-II (0.59) |
| Johnson et al, 2013 [ | LRr univariate (0.902); LR multivariate (0.876) | 39,070 | 10 | 10-fold cross-validation | 23,618 | 0.837 (univariate); 0.868 (multivariate) | APS-III (0.86) |
| Holmgren et al, 2019 [ | NN (0.89) | 217,289 | 8 | 5-fold cross-validation | N/A | N/A | SAPS-III (0.85) |
| Garcia-Gallo et al, 2020 [ | SGB-LASSOs (0.803) | 5650 | 18 | 10-fold cross-validation | N/A | N/A | SOFA (0.58); SAPS (0.70) |
| El-Rashidy et al, 2020 [ | Ensemble (0.93) | 10,664 | 80 | 10-fold cross-validation | External benchmark | N/A | APACHE-II (0.73); SAPS-II (0.81); SOFA-II (0.78) |
| Silva et al, 2006 [ | NN (0.85) | 13,164 | 12 | Hold out | N/A | N/A | SAPS-II (0.8) |
| Caicedo-Torres et al, 2019 [ | NN (0.87) | 22,413 | 22 | 5-fold cross-validation | N/A | N/A | SAPS-II (0.73) |
| Deshmukh et al, 2020 [ | XGBt (0.85) | 5691 | 34 | 5-fold cross-validation | N/A | N/A | APACHE-IV (0.8) |
| Ryan et al, 2020 [ | DT (0.86) | 35,061 | 12 | 5-fold cross-validation | 114 | 0.91 | qSOFAu (0.76) |
| Mayaud et al, 2013 [ | GAv+LR (0.82) | 2113 | 25 | BBCCVw | N/A | N/A | APACHE-III (0.68) |
aML: machine learning.
bAUROC: area under the receiver operating characteristic curve.
cSICULA: Super ICU Learner Algorithm.
dSAPS: Simplified Acute Physiology Score.
eAPACHE: Acute Physiology and Chronic Health Evaluation.
fSOFA: Sequential Organ Failure Assessment.
gNN: neural network.
hN/A: not applicable.
iLSTM: long short-term memory.
jDT: decision tree.
kRSM: random subspace method.
lSVM: support vector machine.
mGBM: gradient boosting machine.
nLOO: leave one out.
oELM: extreme learning machine.
pAPS: Acute Physiology Score.
qk-NN: k-nearest neighbor.
rLR: logistic regression.
sSGB-LASSO: stochastic gradient boosting least absolute shrinkage and selection operator.
tXGB: extreme gradient boosting.
uqSOFA: Quick Sequential Organ Failure Assessment.
vGA: genetic algorithm.
wBBCCV: bootstrap bias-corrected cross-validation.
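The AUROC values in the table above summarize discrimination: the probability that a randomly chosen patient who died receives a higher predicted risk than a randomly chosen survivor. The following is a minimal sketch of that definition, assuming binary labels (1 = died, 0 = survived) and one predicted risk per patient. The function name `auroc` and the toy data are hypothetical and are not taken from any of the reviewed studies.

```python
from typing import Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair (Mann-Whitney U divided by n_pos * n_neg).
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data, for illustration only (not from the review).
toy_labels = [1, 0, 1, 0, 0, 1]
toy_scores = [0.9, 0.2, 0.6, 0.7, 0.1, 0.8]
toy_auroc = auroc(toy_labels, toy_scores)
print(f"toy AUROC = {toy_auroc:.3f}")
```

The pairwise loop is quadratic in the number of patients, which is acceptable for a sketch; a rank-based formulation would be the usual choice for data sets of the sizes listed in the table.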
Reported performance measures of the MLa models.
Specificity through accuracy are classification measurements; HL score, Brier score, and calibration curve are calibration measurements.
| Author and ML model | Specificity | PPVb/precision | Recall/sensitivity | F1 score | Accuracy | HLc score | Brier score | Calibration curve | Other |
| | | | | | | | | | |
| Ensemble SLd-1 | N/Ae | N/A | N/A | N/A | N/A | N/A | 0.079 | DSg=0.21 | |
| Ensemble SL-2 | N/A | N/A | N/A | N/A | N/A | N/A | 0.079 | DS=0.26 | |
| | | | | | | | | | |
| NNh | N/A | 0.388 | N/A | N/A | N/A | N/A | N/A | N/A | Matthews correlation coefficient |
| | | | | | | | | | |
| NN | N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A | 0.491 (AUPRCi) |
| Ensemble | N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A | 0.435 (AUPRC) |
| | | | | | | | | | |
| NN-15 features | N/A | N/A | N/A | N/A | N/A | 27.7 | N/A | Calibration plot | N/A |
| NN-22 features | N/A | N/A | N/A | N/A | N/A | 22.4 | N/A | Calibration plot | N/A |
| | | | | | | | | | |
| Ensemble-LSTMj | 0.7503 | 0.294 | 0.7758 | 0.4262 | 0.7533 | N/A | N/A | N/A | N/A |
| LSTM | 0.7746 | 0.305 | 0.7384 | 0.4317 | 0.7703 | N/A | N/A | N/A | N/A |
| RFk | 0.7807 | 0.306 | 0.71197 | 0.4290 | 0.7734 | N/A | N/A | N/A | N/A |
| | | | | | | | | | |
| RF | 0.79 | 0.75 | 0.76 | N/A | 0.78 | N/A | 0.156 | Calibration plot | 0.47 (log loss) |
| SVCl | 0.81 | 0.77 | 0.75 | N/A | 0.78 | N/A | 0.153 | Calibration plot | 0.47 (log loss) |
| GBMm | 0.78 | 0.75 | 0.8 | N/A | 0.79 | N/A | 0.147 | Calibration plot | 0.45 (log loss) |
| NN | 0.72 | 0.71 | 0.82 | N/A | 0.77 | N/A | 0.158 | Calibration plot | 0.48 (log loss) |
| Ensemble | 0.81 | 0.77 | 0.77 | N/A | 0.79 | N/A | 0.148 | Calibration plot | 0.45 (log loss) |
| | | | | | | | | | |
| RNNn | 0.91 | 0.9 | 0.85 | 0.88 | 0.88 | N/A | N/A | N/A | N/A |
| | | | | | | | | | |
| DTo, NN, SVMp | N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A |
| | | | | | | | | | |
| RF | N/A | N/A | N/A | 0.459 | 0.728 | N/A | 0.085 | Calibration plot | N/A |
| NN | N/A | N/A | N/A | 0.406 | 0.666 | N/A | 0.091 | Calibration plot | N/A |
| SVM | N/A | N/A | N/A | 0.460 | 0.729 | N/A | 0.086 | Calibration plot | N/A |
| | | | | | | | | | |
| ANN-ELMq | N/A | N/A | 0.98 | 0.98 | 0.98 | N/A | N/A | N/A | Matthews correlation coefficient |
| | | | | | | | | | |
| k-NNr | N/A | N/A | N/A | 0.745 | 0.673 | N/A | N/A | Calibration plot | N/A |
| SVM | N/A | N/A | N/A | 0.752 | 0.696 | N/A | N/A | Calibration plot | N/A |
| RF | N/A | N/A | N/A | 0.762 | 0.69 | N/A | N/A | Calibration plot | N/A |
| XGBs | N/A | N/A | N/A | 0.763 | 0.711 | N/A | N/A | Calibration plot | N/A |
| NN | N/A | N/A | N/A | 0.749 | | N/A | N/A | Calibration plot | N/A |
| | | | | | | | | | |
| LRt univariate | N/A | N/A | N/A | N/A | N/A | 22 | 0.051 | N/A | N/A |
| LR multivariate | N/A | N/A | N/A | N/A | N/A | 19.6 | 0.048 | N/A | N/A |
| | | | | | | | | | |
| NN | N/A | N/A | N/A | N/A | N/A | N/A | 0.106 | Calibration plot | N/A |
| | | | | | | | | | |
| SGBu | N/A | N/A | N/A | N/A | 0.725 | 0.0916 | N/A | Calibration plot | N/A |
| SGB-LASSOv | N/A | N/A | N/A | N/A | 0.712 | 0.0916 | N/A | Calibration plot | N/A |
| | | | | | | | | | |
| Ensemble | 0.94 | N/A | 0.911 | 0.937 | 0.944 | N/A | N/A | N/A | N/A |
| | | | | | | | | | |
| NN | 0.79 | N/A | 0.78 | N/A | 0.7921 | N/A | N/A | N/A | N/A |
| | | | | | | | | | |
| NN | 0.827 | N/A | 0.75 | N/A | N/A | N/A | N/A | N/A | N/A |
| | | | | | | | | | |
| XGB | 0.27 | N/A | 1 | N/A | N/A | N/A | N/A | N/A | N/A |
| | | | | | | | | | |
| XGB | 0.75 | N/A | 0.801 | 0.378 | 0.75 | N/A | N/A | N/A | N/A |
| | | | | | | | | | |
| GAw+LR | N/A | N/A | N/A | N/A | N/A | 10.43 | N/A | Calibration plot | N/A |
aML: machine learning.
bPPV: positive predictive value.
cHL: Hosmer-Lemeshow.
dSL: super learner.
eN/A: not available.
fU statistics.
gDS: discrimination slope.
hNN: neural network.
iAUPRC: area under the precision-recall curve.
jLSTM: long short-term memory.
kRF: random forest.
lSVC: support vector classifier.
mGBM: gradient boosting machine.
nRNN: recurrent neural network.
oDT: decision tree.
pSVM: support vector machine.
qANN-ELM: artificial neural network extreme learning machine.
rk-NN: k-nearest neighbor.
sXGB: extreme gradient boosting.
tLR: logistic regression.
uSGB: stochastic gradient boosting.
vLASSO: least absolute shrinkage and selection operator.
wGA: genetic algorithm.
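The classification columns above (specificity, PPV/precision, recall/sensitivity, F1 score, accuracy) all derive from a confusion matrix at a chosen probability threshold, whereas the Brier score is computed from the predicted probabilities directly. The sketch below shows one way these quantities could be computed. It is an illustration only: the function name `classification_report`, the 0.5 default threshold, and the toy data are assumptions and do not come from the reviewed studies.

```python
from typing import Dict, Sequence


def classification_report(
    labels: Sequence[int], probs: Sequence[float], threshold: float = 0.5
) -> Dict[str, float]:
    """Threshold-based classification measures plus the Brier score."""
    if len(labels) != len(probs) or not labels:
        raise ValueError("labels and probs must be non-empty and equal length")
    tp = fp = tn = fn = 0
    for y, p in zip(labels, probs):
        predicted = 1 if p >= threshold else 0
        if predicted == 1 and y == 1:
            tp += 1
        elif predicted == 1 and y == 0:
            fp += 1
        elif predicted == 0 and y == 0:
            tn += 1
        else:
            fn += 1

    def ratio(num: float, den: float) -> float:
        # Undefined ratios (zero denominator) are reported as NaN.
        return num / den if den else float("nan")

    return {
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "recall": ratio(tp, tp + fn),
        "f1": ratio(2 * tp, 2 * tp + fp + fn),
        "accuracy": ratio(tp + tn, len(labels)),
        # Brier score: mean squared difference between probability and outcome.
        "brier": sum((p - y) ** 2 for y, p in zip(labels, probs)) / len(labels),
    }


# Hypothetical toy data, for illustration only (1 = died, 0 = survived).
labels_demo = [1, 0, 1, 0, 0, 1, 0, 0]
probs_demo = [0.9, 0.2, 0.4, 0.7, 0.1, 0.8, 0.3, 0.2]
report = classification_report(labels_demo, probs_demo)
report_low_threshold = classification_report(labels_demo, probs_demo, threshold=0.15)
```

Lowering the threshold in this toy example raises recall and lowers specificity while leaving the Brier score unchanged, which is the general trade-off to keep in mind when comparing threshold-dependent measures across studies.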
Assessment for ROBa and applicability for prognostic models with the Prediction model ROB Assessment Tool checklist.
| Authors | Participants: ROB | Participants: applicability | Predictors: ROB | Predictors: applicability | Outcome: ROB | Outcome: applicability | Analysis: ROB | Overall judgment: ROB | Overall judgment: applicability |
| Pirracchio et al [ | Lowb | Low | Low | Low | Low | Low | Low | Low | Low |
| Nielsen et al [ | Low | Low | Unclearc | Low | Unclear | Low | Low | Unclear | Low |
| Nimgaonkar et al [ | Low | Unclear | Low | Low | Low | Low | Highd | High | Unclear |
| Xia et al [ | Low | Low | Low | Low | Unclear | Low | High | High | Low |
| Purushotham et al [ | Low | Low | Low | Low | Low | Low | Low | Low | Low |
| Nanayakkara et al [ | Low | Unclear | Low | Low | Low | Low | High | High | Unclear |
| Meyer et al [ | Low | Unclear | Low | Low | Low | Low | Low | Low | Unclear |
| Meiring et al [ | Low | Low | Low | Low | Low | Low | High | High | Low |
| Lin et al [ | Low | Unclear | Low | Low | Low | Low | High | High | Unclear |
| Krishnan et al [ | Low | Low | Low | Low | Low | Low | High | High | Low |
| Kang et al [ | Low | Unclear | Low | Low | Low | Low | High | High | Unclear |
| Johnson et al [ | Low | Low | Low | Low | Low | Low | Low | Low | Low |
| Holmgren et al [ | Low | Low | Low | Low | Unclear | Low | High | High | Low |
| Garcia-Gallo et al [ | Low | Unclear | Low | Low | Low | Low | High | High | Unclear |
| El-Rashidy et al [ | Low | Low | Low | Low | Low | Low | Low | Low | Low |
| Silva et al [ | Low | Low | Low | Low | Low | Low | High | High | Low |
| Caicedo-Torres et al [ | Low | Low | Low | Low | Low | Low | High | High | Low |
| Deshmukh et al [ | Low | Unclear | Low | Low | Low | Low | High | High | Unclear |
| Ryan et al [ | Low | Low | Unclear | Low | Low | Low | Low | Unclear | Low |
| Mayaud et al [ | Low | Unclear | Low | Unclear | Low | Low | High | High | Unclear |
aROB: risk of bias.
bLow risk: no relevant shortcomings in ROB assessment.
cUnclear risk: unclear ROB in at least one domain and all other domains at low ROB.
dHigh risk: relevant shortcomings in the ROB assessment, at least one domain with high ROB, or model developed without external validation.
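Footnotes b to d can be read as a decision rule that combines the four ROB domain ratings (participants, predictors, outcome, analysis) into the overall ROB judgment. The sketch below encodes that reading. It is an interpretation of the footnotes rather than an official implementation of the assessment tool, and the function name `overall_rob` and its `externally_validated` parameter are hypothetical.

```python
from typing import Iterable

VALID_RATINGS = {"low", "unclear", "high"}


def overall_rob(domain_ratings: Iterable[str], externally_validated: bool = True) -> str:
    """Combine per-domain ROB ratings into an overall judgment.

    High    : any domain rated high, or model developed without external validation.
    Unclear : at least one domain unclear and every other domain low.
    Low     : every domain low.
    """
    ratings = [r.strip().lower() for r in domain_ratings]
    if not ratings:
        raise ValueError("at least one domain rating is required")
    unknown = set(ratings) - VALID_RATINGS
    if unknown:
        raise ValueError(f"unknown ratings: {sorted(unknown)}")
    if "high" in ratings or not externally_validated:
        return "High"
    if "unclear" in ratings:
        return "Unclear"
    return "Low"


# Illustrative calls using rating patterns of the kind shown in the table.
print(overall_rob(["Low", "Low", "Low", "Low"]))
print(overall_rob(["Low", "Unclear", "Unclear", "Low"]))
print(overall_rob(["Low", "Low", "Unclear", "High"]))
```

Checking "high" before "unclear" reflects the footnote wording that a single high-risk domain is enough for an overall high rating, regardless of the other domains.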
Figure 3. Meta-analysis results: pooled AUROC for externally validated Ensemble models. Gray boxes represent the fixed weight estimates of the AUROC value from each study. Larger gray boxes represent larger fixed weight estimates of the AUROC values. The horizontal line through each gray box illustrates the 95% CI of the AUROC value from that study. Black horizontal lines through a gray box indicate that the CI limits exceed the length of the gray box. White horizontal lines represent CI limits that are within the length of the gray box. The vertical dashed lines in the forest plot are the estimated random pooled effect of the AUROC value from the random-effects meta-analysis. The gray diamonds illustrate the 95% CI for the random pooled effects. Tests of heterogeneity included I², τ², and Cochran Q P value (denoted as P). AUROC: area under the receiver operating characteristic curve.
Figure 7. Meta-analysis results: pooled AUROC for APACHE-II. Gray boxes represent the fixed weight estimates of the AUROC value from each study. Larger gray boxes represent larger fixed weight estimates of the AUROC values. The horizontal line through each gray box illustrates the 95% CI of the AUROC value from that study. Black horizontal lines through a gray box indicate that the CI limits exceed the length of the gray box. White horizontal lines represent CI limits that are within the length of the gray box. The vertical dashed lines in the forest plot are the estimated random pooled effect of the AUROC value from the random-effects meta-analysis. The gray diamonds illustrate the 95% CI for the random pooled effects. Tests of heterogeneity included I², τ², and Cochran Q P value (denoted as P). APACHE-II: Acute Physiology and Chronic Health Evaluation-II; AUROC: area under the receiver operating characteristic curve.
Figure 4. Meta-analysis results: pooled AUROC for externally validated NN models. Gray boxes represent the fixed weight estimates of the AUROC value from each study. Larger gray boxes represent larger fixed weight estimates of the AUROC values. The horizontal line through each gray box illustrates the 95% CI of the AUROC value from that study. Black horizontal lines through a gray box indicate that the CI limits exceed the length of the gray box. White horizontal lines represent CI limits that are within the length of the gray box. The vertical dashed lines in the forest plot are the estimated random pooled effect of the AUROC value from the random-effects meta-analysis. The gray diamonds illustrate the 95% CI for the random pooled effects. Tests of heterogeneity included I², τ², and Cochran Q P value (denoted as P). AUROC: area under the receiver operating characteristic curve; NN: neural network.
Figure 5. Meta-analysis results: pooled AUROC for SAPS-II. Gray boxes represent the fixed weight estimates of the AUROC value from each study. Larger gray boxes represent larger fixed weight estimates of the AUROC values. The horizontal line through each gray box illustrates the 95% CI of the AUROC value from that study. Black horizontal lines through a gray box indicate that the CI limits exceed the length of the gray box. White horizontal lines represent CI limits that are within the length of the gray box. The vertical dashed lines in the forest plot are the estimated random pooled effect of the AUROC value from the random-effects meta-analysis. The gray diamonds illustrate the 95% CI for the random pooled effects. Tests of heterogeneity included I², τ², and Cochran Q P value (denoted as P). AUROC: area under the receiver operating characteristic curve; SAPS-II: Simplified Acute Physiology Score II.
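The forest plot captions refer to fixed weights, a random pooled effect, and the heterogeneity statistics I², τ², and Cochran Q. The sketch below shows how such quantities could be obtained with the DerSimonian-Laird estimator. This is an assumption: the captions do not state which estimator was used, nor whether AUROC values were pooled on the raw or a transformed scale (the sketch pools on the raw scale). The function name `random_effects_pool` and all input numbers are hypothetical and are not results from the review.

```python
import math
from typing import Dict, Sequence


def random_effects_pool(
    estimates: Sequence[float], std_errors: Sequence[float]
) -> Dict[str, float]:
    """DerSimonian-Laird random-effects pooling with Q, tau^2, and I^2."""
    k = len(estimates)
    if k < 2 or k != len(std_errors):
        raise ValueError("need at least two studies with matching standard errors")
    # Fixed-effect (inverse-variance) weights and pooled estimate.
    w = [1.0 / se ** 2 for se in std_errors]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, estimates)) / sum_w
    # Cochran Q: weighted squared deviations from the fixed-effect estimate.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, estimates))
    df = k - 1
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    # Random-effects weights add the between-study variance tau^2.
    w_re = [1.0 / (se ** 2 + tau2) for se in std_errors]
    pooled = sum(wi * yi for wi, yi in zip(w_re, estimates)) / sum(w_re)
    se_pooled = math.sqrt(1.0 / sum(w_re))
    return {
        "pooled": pooled,
        "ci_low": pooled - 1.96 * se_pooled,
        "ci_high": pooled + 1.96 * se_pooled,
        "q": q,
        "tau2": tau2,
        "i2": i2,
    }


# Hypothetical AUROC estimates and standard errors, for illustration only.
hetero = random_effects_pool([0.70, 0.90], [0.02, 0.02])
homog = random_effects_pool([0.80, 0.80, 0.80], [0.02, 0.03, 0.04])
```

The P value for Cochran Q would come from a chi-square distribution with k - 1 degrees of freedom; it is omitted here to keep the sketch free of third-party dependencies.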