Qiangrong Zhai, Zi Lin, Hongxia Ge, Yang Liang, Nan Li, Qingbian Ma, Chuyang Ye.
Abstract
The number of critically ill patients has increased globally along with the rise in emergency visits. Mortality prediction for critically ill patients is vital in emergency care because it affects how emergency resources are distributed. Traditional scoring systems are designed for all emergency patients and use classic mathematical methods, but risk factors in critically ill patients interact in complex ways, so traditional scores do not apply to them as readily. Because an accurate model for predicting the mortality of critically ill patients in the emergency department is lacking, this study aimed to develop a machine learning scoring system optimized for this population. We conducted a retrospective cohort study in a tertiary medical center in Beijing, China. Patients over 16 years old were included if they were alive when they entered the emergency department intensive care unit system between February 2015 and December 2015. The primary outcome was mortality up to 7 days after admission to the emergency department, and 1624 cases were included to derive the models. Candidate predictive factors included previous diseases, physiologic parameters, and laboratory results. Several machine learning models were built to predict 7-day mortality from these factors, and their predictive accuracy (sensitivity and specificity) was evaluated by the area under the curve (AUC). The AUCs were 0.794, 0.840, 0.849 and 0.822 for the SVM, GBDT, XGBoost and logistic regression models, respectively. Compared with the SAPS 3 model (AUC = 0.826), the newer machine learning methods, XGBoost in particular, showed better discriminatory capability for predicting outcomes in emergency department intensive care unit patients.
Year: 2020 PMID: 33262471 PMCID: PMC7708467 DOI: 10.1038/s41598-020-77548-3
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Figure 1. Schematic diagram of the present study.
Baseline characteristics of study population.
| Characteristic | Total (n = 1624) | Survivors (n = 1413) | Non-survivors (n = 211) | P value |
|---|---|---|---|---|
| Male, n (%) | 969 (60.0) | 844 (59.7) | 125 (59.2) | 0.91 |
| Age, years | 64.7 ± 18.1 | 63.7 ± 18.3 | 71.2 ± 14.9 | < 0.001* |
| Glasgow coma scale | 15 (15–15) | 15 (15–15) | 15 (7–15) | < 0.001* |
| Respiratory rate (breaths/min) | 22.0 ± 7.0 | 21.6 ± 6.6 | 24.3 ± 6.6 | < 0.001* |
| Heart rate (beats/min) | 96.8 ± 30.5 | 96.4 ± 30.6 | 99.7 ± 32.5 | 0.147 |
| Systolic blood pressure (mm Hg) | 133.4 ± 31.1 | 134.4 ± 29.6 | 127.1 ± 39.8 | 0.002* |
| White cell count (10⁹/L) | 9.4 (7.0–13.0) | 9.1 (6.9–12.3) | 12.6 (8.8–17.0) | 0.007* |
| Platelet count (10⁹/L) | 213.7 ± 93.5 | 217.2 ± 91.5 | 189.9 ± 102.7 | < 0.001* |
| Hemoglobin (g/L) | 124.7 ± 31.3 | 126.5 ± 30.6 | 112.1 ± 33.3 | < 0.001* |
| Serum potassium | 4.2 ± 0.8 | 4.1 ± 1.8 | 4.4 ± 1.1 | < 0.001* |
*The difference between the survivor and non-survivor groups was statistically significant.
Figure 2. Feature selection steps adopted by machine learning.
Importance ranking of all features.
| Rank | Feature name | Total score |
|---|---|---|
| 1 | GCS score first | 683 |
| 2 | Hemoglobin | 670 |
| 3 | Glucose | 639 |
| 4 | FiO2 first | 622 |
| 5 | Planned admission | 600 |
| 6 | BUN | 591 |
| 7 | Septic shock | 584 |
| 8 | Shock first | 577 |
| 9 | White Blood Cell | 569 |
| 10 | Systolic blood pressure first | 553 |
| 11 | Cancer therapy | 551 |
| 12 | Respiratory rate first | 543 |
| 13 | Disch dx cerebrovas | 534 |
| 14 | Metastatic cancer | 505 |
| 15 | Sodium | 505 |
| 16 | SpO2 first | 503 |
| 17 | pO2 first | 501 |
| 18 | Platelet | 498 |
| 19 | Disch dx neoplasms | 471 |
| 20 | Tbil | 471 |
| 21 | Agitation | 470 |
| 22 | pH first | 467 |
| 23 | Coma | 463 |
| 24 | Disch dx digestive disease | 450 |
| 25 | Palpitation | 445 |
| 26 | Altered mental status | 434 |
| 27 | Potassium | 434 |
| 28 | Acute abdomen | 425 |
| 29 | Disch dx circulatory disease | 422 |
| 30 | Creatinine | 421 |
| 31 | Heart rate first | 420 |
| 32 | Active malignancy | 403 |
| 33 | Dyspnea | 393 |
| 34 | Severe acute pancreatitis | 392 |
| 35 | O2 flow rate first | 391 |
| 36 | Fever | 389 |
| 37 | Age | 385 |
| 38 | Hypovolemic hemorrhagic shock | 367 |
| 39 | Focal neurologic deficit | 356 |
| 40 | Intracranial effect | 356 |
| 41 | Obtunded | 350 |
| 42 | Disch dx flu pneumonia | 347 |
| 43 | Disch dx gu disease | 347 |
| 44 | Confusion | 342 |
| 45 | Arrhythmia | 337 |
| 46 | Cirrhosis | 328 |
| 47 | Stupor | 325 |
| 48 | Vigilance disturbance | 325 |
| 49 | Anaphylactic shock | 317 |
| 50 | Disch dx resp | 312 |
| 51 | Disch dx other disease | 307 |
| 52 | Hypovolemic non-hemorrhagic shock | 305 |
| 53 | Disch dx chronic lower resp | 304 |
| 54 | Seizures | 303 |
| 55 | Liver failure | 299 |
| 56 | O2 device first | 299 |
| 57 | Vomiting | 293 |
| 58 | Mix shock | 289 |
| 59 | Use vasoactive drugs | 278 |
| 60 | Hematemesis | 272 |
| 61 | Infection | 268 |
| 62 | Disch dx aids | 258 |
| 63 | Steroid therapy | 248 |
| 64 | Chronic heart failure iv | 231 |
| 65 | Chest pain | 208 |
| 66 | Fatigue | 192 |
| 67 | Disch dx abnormal nos | 187 |
| 68 | Hematologic cancer | 174 |
| 69 | Disch dx injury | 169 |
| 70 | Trauma | 147 |
| 71 | Syncope | 143 |
| 72 | Bloody stools | 138 |
| 73 | Headache | 124 |
| 74 | Abdominal pain | 120 |
| 75 | Chest tightness | 86 |
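A ranking like the one above lends itself to building nested feature subsets (top 1, top 2, and so on) for a lighter model. The sketch below illustrates only that idea and is not the authors' pipeline. The helper names `top_k_features` and `nested_subsets` are hypothetical, and only the first four rows of the table are copied in.

```python
# First four rows of the importance ranking table: (feature name, total score).
RANKING = [
    ("GCS score first", 683),
    ("Hemoglobin", 670),
    ("Glucose", 639),
    ("FiO2 first", 622),
]


def top_k_features(ranking, k):
    """Return the names of the k highest-scoring features (hypothetical helper)."""
    ordered = sorted(ranking, key=lambda item: item[1], reverse=True)
    return [name for name, _ in ordered[:k]]


def nested_subsets(ranking):
    """Yield (k, feature list) pairs for k = 1 .. len(ranking)."""
    for k in range(1, len(ranking) + 1):
        yield k, top_k_features(ranking, k)


for k, features in nested_subsets(RANKING):
    print(k, features)
```

Each subset could then be used to fit and score a model, so that performance can be plotted against the number of features retained.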
Figure 3. AUC curve for feature selection. The test-set curve indicates that the feature importance ranking can help optimize the results of a lightweight model.
Figure 4. ROC curves for the three methods. The ROC curves show that the machine learning method performs better than the traditional SAPS-3 scoring method, and that the machine learning model after feature selection performs best.
Figure 5. ROC curves for the four predictive models. The ROC curve of XGBoost is superior to those of the other methods, with the highest AUC.
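For readers unfamiliar with the AUC used to compare these ROC curves, it can be read as the probability that a randomly chosen non-survivor receives a higher predicted risk than a randomly chosen survivor. The minimal sketch below computes it by direct pairwise comparison. The function name `auc_score` and the toy data are assumptions made for illustration and are not taken from the study.

```python
def auc_score(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    labels: 1 for the event (e.g. death within 7 days), 0 otherwise.
    scores: predicted risk, higher meaning more likely to be an event.
    Tied pairs count as half a correct ranking.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data, made up for illustration only.
labels = [1, 0, 1, 0, 0, 1]
scores = [0.9, 0.2, 0.6, 0.7, 0.1, 0.8]
print(auc_score(labels, scores))
```

The pairwise loop is quadratic in the number of cases, which is fine for a cohort of this size. Library implementations typically use a rank-based formula instead.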
Predictive accuracy for the four predictive models.
| Method | Sensitivity | Specificity | Overall accuracy | AUC (interval as reported) | Youden index (Se + Sp - 1) |
|---|---|---|---|---|---|
| LR | 0.742 | 0.772 | 0.832 | 0.822 (0.72–0.85) | 0.514 |
| SVM | 0.712 | 0.777 | 0.802 | 0.794 (0.72–0.86) | 0.489 |
| GBDT | 0.790 | 0.754 | 0.834 | 0.840 (0.76–0.88) | 0.543 |
| XGBoost | 0.756 | 0.806 | 0.837 | 0.849 (0.81–0.89) | 0.562 |
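The Youden index column follows directly from the sensitivity and specificity columns. As a quick check, the sketch below recomputes it from the rounded values in the table. The function name `youden_index` and the dictionary layout are illustrative choices.

```python
def youden_index(sensitivity, specificity):
    """Youden index J = Se + Sp - 1."""
    return sensitivity + specificity - 1.0


# (sensitivity, specificity) pairs copied from the table above.
REPORTED = {
    "LR": (0.742, 0.772),
    "SVM": (0.712, 0.777),
    "GBDT": (0.790, 0.754),
    "XGBoost": (0.756, 0.806),
}

for method, (se, sp) in REPORTED.items():
    print(method, round(youden_index(se, sp), 3))

# Method with the largest Youden index.
best = max(REPORTED, key=lambda m: youden_index(*REPORTED[m]))
print("highest Youden index:", best)
```

Recomputing from the rounded inputs gives about 0.544 for GBDT against the tabulated 0.543. The gap is presumably a rounding effect, since the published index was likely derived from unrounded sensitivity and specificity.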