Suru Yue1,2, Shasha Li1,2, Xueying Huang1,2, Jie Liu1,2, Xuefei Hou1,2, Yumei Zhao1, Dongdong Niu1, Yufeng Wang1,2, Wenkai Tan3, Jiayuan Wu4,5.
Abstract
BACKGROUND: Acute kidney injury (AKI) is the most common and serious complication of sepsis, accompanied by high mortality and disease burden. Early prediction of AKI is critical for timely intervention and can ultimately improve prognosis. This study aims to establish and validate predictive models based on novel machine learning (ML) algorithms for AKI in critically ill patients with sepsis.
Keywords: Acute kidney injury; MIMIC-III database; Machine learning; Prediction model; Sepsis
Year: 2022 PMID: 35562803 PMCID: PMC9101823 DOI: 10.1186/s12967-022-03364-0
Source DB: PubMed Journal: J Transl Med ISSN: 1479-5876 Impact factor: 8.440
Fig. 1 The flowchart of patient selection. MIMIC: Medical Information Mart for Intensive Care; ICU: intensive care unit
Baseline characteristics of the patients with sepsis
| Characteristics | Total (n = 3176) | Non-AKI (n = 779) | AKI (n = 2397) | P value |
|---|---|---|---|---|
| Male, n (%) | 1756 (55.3%) | 465 (59.7%) | 1291 (53.9%) | 0.004 |
| Age (years) | 66 (54–77) | 62 (52–73) | 67 (55–78) | < 0.001 |
| Ethnicity, n (%) | | | | 0.147 |
| White | 2391 (75.3%) | 566 (72.7%) | 1825 (76.1%) | |
| Black | 260 (8.2%) | 71 (9.1%) | 189 (7.9%) | |
| Other | 525 (16.5%) | 142 (18.2%) | 383 (16.0%) | |
| BMI (kg/m²) | 27 (23–32) | 26 (23–30) | 28 (23–32) | < 0.001 |
| Congestive heart failure, n (%) | 1034 (32.6%) | 193 (24.8%) | 841 (35.1%) | < 0.001 |
| Cardiac arrhythmias, n (%) | 1256 (39.5%) | 222 (28.5%) | 1034 (43.1%) | < 0.001 |
| Hypertension, n (%) | 1422 (44.8%) | 317 (40.7%) | 1105 (46.1%) | 0.008 |
| Paralysis, n (%) | 125 (3.9%) | 45 (5.8%) | 80 (3.3%) | 0.002 |
| Chronic pulmonary, n (%) | 766 (24.1%) | 163 (20.9%) | 603 (25.2%) | 0.016 |
| Diabetes, n (%) | 865 (27.2%) | 187 (24.0%) | 678 (28.3%) | 0.020 |
| Liver disease, n (%) | 626 (19.7%) | 110 (14.1%) | 516 (21.5%) | < 0.001 |
| Coagulopathy, n (%) | 830 (26.1%) | 173 (22.2%) | 657 (27.4%) | 0.004 |
| Mechanical ventilation, n (%) | 1840 (57.9%) | 262 (33.6%) | 1578 (65.8%) | < 0.001 |
| Vasopressor, n (%) | 1990 (62.7%) | 352 (45.2%) | 1638 (68.3%) | < 0.001 |
| Anion gap_min (mmol/l) | 12 (11–14) | 12 (10–14) | 13 (11–14) | < 0.001 |
| Anion gap_max (mmol/l) | 16 (14–19) | 15 (13–18) | 16 (14–19) | < 0.001 |
| Albumin_min (g/dl) | 2.7 (2.3–3.1) | 2.8 (2.4–3.1) | 2.6 (2.2–3.1) | < 0.001 |
| Albumin_max (g/dl) | 2.8 (2.4–3.2) | 2.8 (2.5–3.3) | 2.8 (2.3–3.2) | 0.002 |
| Bilirubin_min (mg/dl) | 0.7 (0.4–1.7) | 0.6 (0.4–1.3) | 0.7 (0.4–1.7) | < 0.001 |
| Bilirubin_max (mg/dl) | 0.8 (0.5–2.1) | 0.7 (0.4–1.5) | 0.9 (0.5–2.3) | < 0.001 |
| Creatinine_min (mg/dl) | 0.9 (0.7–1.4) | 0.9 (0.7–1.2) | 1 (0.7–1.4) | < 0.001 |
| Creatinine_max (mg/dl) | 1.2 (0.8–1.8) | 1.1 (0.8–1.55) | 1.2 (0.8–1.8) | < 0.001 |
| Chloride_min (mEq/l) | 103 (98–108) | 103 (99–108) | 103 (98–107) | 0.003 |
| Chloride_max (mEq/l) | 109 (104–113) | 109 (105–113) | 108 (104–113) | 0.004 |
| Glucose_min (mg/dl) | 104 (87–125) | 104 (88–121.5) | 104 (87–127) | 0.76 |
| Glucose_max (mg/dl) | 156 (124–205) | 146 (118.5–194) | 158 (126–209) | < 0.001 |
| Lactate_min (mmol/l) | 1.5 (1.1–2.1) | 1.4 (1–1.8) | 1.5 (1.1–2.1) | < 0.001 |
| Lactate_max (mmol/l) | 2.4 (1.6–4.2) | 2.1 (1.5–3.5) | 2.5 (1.6–4.4) | < 0.001 |
| Platelet_min (K/UL) | 183 (109–278) | 188 (120–283) | 182 (106–276) | 0.103 |
| Platelet_max (K/UL) | 229.5 (148.3–339) | 229 (149–339) | 230 (148–340) | 0.673 |
| Potassium_min (mEq/l) | 3.6 (3.3–4) | 3.6 (3.3–3.9) | 3.6 (3.3–4) | 0.005 |
| Potassium_max (mEq/l) | 4.4 (4–4.9) | 4.2 (3.9–4.7) | 4.4 (4–4.9) | < 0.001 |
| PTT_min (seconds) | 31.1 (27.1–36.8) | 30.2 (26.8–35.3) | 31.3 (27.2–37.3) | < 0.001 |
| PTT_max (seconds) | 35.7 (29.7–46.9) | 34 (28.85–41.85) | 36.3 (30–48.6) | < 0.001 |
| INR_min | 1.3 (1.2–1.6) | 1.3 (1.1–1.5) | 1.3 (1.2–1.6) | < 0.001 |
| INR_max | 1.4 (1.2–1.9) | 1.4 (1.2–1.7) | 1.5 (1.3–2) | < 0.001 |
| PT_min (seconds) | 14.7 (13.4–17) | 14.3 (13.2–16.2) | 14.8 (13.4–17.3) | < 0.001 |
| PT_max (seconds) | 15.8 (14–19.4) | 15.2 (13.8–17.8) | 16 (14.1–19.9) | < 0.001 |
| Sodium_min (mEq/l) | 136 (133–140) | 137 (133–140) | 136 (133–140) | 0.028 |
| Sodium_max (mEq/l) | 140 (137–143) | 140 (137–143) | 140 (137–143) | 0.216 |
| BUN_min (mg/dl) | 21 (14–35) | 18 (12–30) | 23 (14–37) | < 0.001 |
| BUN_max (mg/dl) | 26 (17–42) | 23 (15–38) | 27 (18–44) | < 0.001 |
| WBC_min (K/UL) | 10.6 (6.6–15.7) | 10.9 (6.6–15.7) | 10.6 (6.6–15.7) | 0.647 |
| WBC_max (K/UL) | 15 (9.7–21.5) | 14.9 (9.5–21.25) | 15 (9.8–21.6) | 0.403 |
| HeartRate_min (beats/minute) | 78 (67–90) | 77 (65–89) | 78 (67–90) | 0.02 |
| HeartRate_max (beats/minute) | 114 (99–128) | 110 (97–125.5) | 115 (100–129) | < 0.001 |
| SysBP_min (mmHg) | 83 (75–92) | 86 (78–95) | 82 (73–91) | < 0.001 |
| SysBP_max (mmHg) | 140 (126–154) | 138 (126–153) | 140 (127–155) | 0.06 |
| DiasBP_min (mmHg) | 41 (34–48) | 42 (35–49) | 40 (34–47) | < 0.001 |
| DiasBP_max (mmHg) | 80 (70.3–91) | 81 (72–90) | 80 (70–91) | 0.706 |
| Temperature_min (℃) | 36.1 (35.6–36.7) | 36.2 (35.8–36.7) | 36.1 (35.6–36.6) | < 0.001 |
| Temperature_max (℃) | 37.7 (37.1–38.4) | 37.7 (37.1–38.6) | 37.6 (37.1–38.3) | 0.006 |
| SpO2_Min (%) | 92 (89–95) | 93 (90–95) | 92 (89–95) | 0.002 |
| SpO2_Max (%) | 100 (100–100) | 100 (100–100) | 100 (100–100) | 0.003 |
| Urine output (ml) | 790 (422.3–1380) | 1005 (512.5–1627.5) | 745 (410–1280) | < 0.001 |
| eGFR (ml/min/1.73 m²) | 75.2 (47.4–107.8) | 87.4 (58.0–118.2) | 70.4 (43.7–105.4) | < 0.001 |
AKI acute kidney injury, BMI body mass index, PT prothrombin time, PTT partial thromboplastin time, INR international normalized ratio, BUN blood urea nitrogen, WBC white blood cell, SpO2 oxygen saturation, SysBP systolic blood pressure, DiasBP diastolic blood pressure, eGFR estimated glomerular filtration rate
Fig. 2 Feature selection based on the Boruta algorithm. The horizontal axis shows the name of each variable, and the vertical axis shows the Z-value of each variable. The box plot shows the Z-value of each variable during model calculation. The green boxes represent the first 35 important variables, the yellow boxes represent tentative attributes, and the red boxes represent unimportant variables. BMI: body mass index; eGFR: estimated glomerular filtration rate; PT: prothrombin time; PTT: partial thromboplastin time; INR: international normalized ratio; BUN: blood urea nitrogen; SysBP: systolic blood pressure; DiasBP: diastolic blood pressure
Fig. 3 Receiver operating characteristic curve of the seven models. LR: logistic regression; KNN: k-nearest neighbors; SVM: support vector machine; XGBoost: Extreme Gradient Boosting; ANN: artificial neural network; SOFA: sequential organ failure assessment; SAPS II: the customized simplified acute physiology score; AUC: area under the curve
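As a reading aid for the AUC values summarized by the ROC curves, the following is a minimal sketch of how AUC can be computed as a rank statistic: the probability that a randomly chosen AKI patient receives a higher predicted risk than a randomly chosen non-AKI patient. The function name `auc_from_scores` and the labels and scores are hypothetical illustrations, not data or code from the study.

```python
def auc_from_scores(y_true, y_score):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a correct pair.
    """
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("Both classes must be present to compute AUC.")
    correct = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(pos) * len(neg))


# Hypothetical labels (1 = AKI) and predicted risks, for illustration only.
labels = [1, 1, 1, 0, 0, 1, 0, 1]
scores = [0.9, 0.8, 0.6, 0.4, 0.7, 0.3, 0.2, 0.85]
auc = auc_from_scores(labels, scores)
print(round(auc, 3))
```

The pairwise loop is quadratic in the number of patients, which is acceptable for a small illustration; library routines use a sort-based equivalent.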
Model performance metrics
| Models | AUC | Recall | Accuracy | F1 score | Sensitivity | Specificity |
|---|---|---|---|---|---|---|
| LR | 0.737 | 0.796 | 0.765 | 0.858 | 0.834 | 0.878 |
| KNN | 0.664 | 0.798 | 0.742 | 0.840 | 0.886 | 0.857 |
| SVM | 0.735 | 0.797 | 0.788 | 0.874 | 0.833 | 0.926 |
| Decision tree | 0.749 | 0.834 | 0.793 | 0.870 | 0.910 | 0.882 |
| Random forest | 0.779 | 0.809 | 0.794 | 0.876 | 0.935 | 0.923 |
| XGBoost | 0.817 | 0.852 | 0.832 | 0.895 | 0.943 | 0.913 |
| ANN | 0.755 | 0.778 | 0.783 | 0.875 | 0.824 | 0.899 |
| SOFA | 0.646 | 0.755 | 0.723 | 0.781 | 0.633 | 0.712 |
| SAPS II | 0.702 | 0.774 | 0.762 | 0.814 | 0.811 | 0.845 |
AUC area under curve, LR logistic regression, KNN k-nearest neighbors, SVM support vector machine, XGBoost extreme gradient boosting, ANN artificial neural network, SOFA Sequential Organ Failure Assessment, SAPS II the Simplified Acute Physiology Score II
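For readers unfamiliar with the threshold-based metrics in the table, the sketch below shows their standard textbook definitions in terms of confusion-matrix counts. The function name `classification_metrics` and the counts are hypothetical and are not taken from the study. Under these standard definitions recall and sensitivity are the same quantity; how the table's separate Recall and Sensitivity columns were derived is not stated here.

```python
def classification_metrics(tp, fp, tn, fn):
    """Standard binary-classification metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)   # true positive rate, also called recall
    specificity = tn / (tn + fp)   # true negative rate
    precision = tp / (tp + fp)     # positive predictive value
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "accuracy": accuracy,
        "f1": f1,
    }


# Hypothetical counts for illustration only (not results from the study).
metrics = classification_metrics(tp=80, fp=10, tn=30, fn=20)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```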
Fig. 4 Decision curve analyses of the seven models. The horizontal line represents the assumption that no patients develop AKI, and the gray oblique line represents the assumption that all patients develop AKI. LR logistic regression, KNN k-nearest neighbors, SVM support vector machine, XGBoost Extreme Gradient Boosting, ANN artificial neural network, AKI acute kidney injury
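A decision curve plots net benefit against the threshold probability. The following is a minimal sketch of the standard net-benefit formula, assuming the usual definition NB = TP/n - (FP/n) * pt/(1 - pt), together with the "all patients develop AKI" and "no patients develop AKI" reference strategies. The function names and the toy data are hypothetical and do not reproduce the study's analysis.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of acting on patients whose predicted risk is >= threshold."""
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    weight = threshold / (1 - threshold)  # harm of a false positive relative to a true positive
    return tp / n - (fp / n) * weight


def net_benefit_treat_all(y_true, threshold):
    """Reference strategy: assume every patient develops AKI (oblique line)."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


def net_benefit_treat_none():
    """Reference strategy: assume no patient develops AKI (horizontal line)."""
    return 0.0


# Hypothetical labels (1 = AKI) and predicted risks, for illustration only.
labels = [1, 1, 1, 0, 0, 1, 0, 1]
probs = [0.9, 0.8, 0.6, 0.4, 0.7, 0.3, 0.2, 0.85]
nb_model = net_benefit(labels, probs, threshold=0.5)
nb_all = net_benefit_treat_all(labels, threshold=0.5)
print(nb_model, nb_all, net_benefit_treat_none())
```

A model is useful at a given threshold when its net benefit exceeds both reference strategies; repeating the calculation over a grid of thresholds yields the full curve.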
Fig. 5 Feature importance derived from the XGBoost model. BMI body mass index, PTT partial thromboplastin time, BUN blood urea nitrogen, eGFR estimated glomerular filtration rate, Max maximum, Min minimum, AKI acute kidney injury