Yuzhuo Zhao, Lijing Jia, Ruiqi Jia, Hui Han, Cong Feng, Xueyan Li, Zijian Wei, Hongxin Wang, Heng Zhang, Shuxiao Pan, Jiaming Wang, Xin Guo, Zheyuan Yu, Xiucheng Li, Zhaohong Wang, Wei Chen, Jing Li, Tanshi Li.
Abstract
Early warning prediction of traumatic hemorrhagic shock (THS) can greatly reduce patient mortality and morbidity. We aimed to develop and validate models with different stepped feature sets to predict THS in advance. From the PLA General Hospital Emergency Rescue Database and the Medical Information Mart for Intensive Care III, we identified 604 and 1,614 patients, respectively. Two popular machine learning algorithms (i.e., extreme gradient boosting [XGBoost] and logistic regression) were applied. The area under the receiver operating characteristic curve (AUROC) was used to evaluate the performance of the models. By analyzing the feature importance based on XGBoost, we found that features in vital signs (VS), routine blood (RB), and blood gas analysis (BG) were the most relevant to THS (0.292, 0.249, and 0.225, respectively), which revealed a stepped relationship among these three feature groups. The three stepped feature sets (i.e., VS, VS + RB, and VS + RB + BG) were then passed to the two machine learning algorithms to predict THS in the subsequent T hours (where T = 3, 2, 1, or 0.5). The XGBoost models performed significantly better than the logistic regression models. The model using vital signs alone achieved good performance at the half-hour time window (AUROC = 0.935), and performance increased when laboratory results were added, especially at the 1 h time window (AUROC = 0.950 and 0.968 for VS + RB and VS + RB + BG, respectively). These well-performing, interpretable models demonstrated acceptable generalization ability in external validation and could flexibly predict THS on a rolling basis T hours (where T = 0.5 or 1) prior to clinical recognition. A prospective study is necessary to determine the clinical utility of the proposed THS prediction models.
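The abstract describes predicting THS "in the subsequent T hours" without spelling out how samples are labelled. The following is a minimal sketch of one plausible labelling scheme, not the authors' procedure: the function name `label_windows`, the observation times, and the shock onset time of 5.0 h are all hypothetical. An observation is marked positive when shock onset lies within the next `horizon` hours.

```python
from typing import List, Optional


def label_windows(obs_hours: List[float],
                  shock_hour: Optional[float],
                  horizon: float) -> List[int]:
    """Return 1 for each observation followed by shock within `horizon` hours.

    Assumption (not from the paper): times are hours since admission and
    `shock_hour` is None for patients who never develop shock.
    """
    labels = []
    for t in obs_hours:
        if shock_hour is None:
            labels.append(0)
            continue
        lead = shock_hour - t  # hours remaining until shock onset
        labels.append(1 if 0 < lead <= horizon else 0)
    return labels


# Hypothetical patient: observations at these hours, shock onset at 5.0 h.
obs = [0.0, 1.0, 2.0, 3.0, 4.0, 4.5]
for T in (3, 2, 1, 0.5):
    print(T, label_windows(obs, shock_hour=5.0, horizon=T))
```

Shorter horizons yield fewer positive observations per patient, which is one reason the same model can behave differently across time windows.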
Year: 2022 PMID: 34905530 PMCID: PMC8663521 DOI: 10.1097/SHK.0000000000001842
Source DB: PubMed Journal: Shock ISSN: 1073-2322 Impact factor: 3.454
Fig. 1 Model development overview. (A) Data extraction and processing. Data including admission diagnosis, demographic information (e.g., age and sex), vital signs, and laboratory results were extracted from PLAGH-ERD. Patients were divided into THS and non-THS groups. (B1) Imputation for the time-series data of vital signs based on cluster. (B2) Imputation for the time-series data of vital signs based on multivariate imputation via chained equations. Features with missing rates greater than 50% were removed. (C) Feature importance was calculated based on the average gain of XGBoost to analyze the relationship between the features and THS. (D) Training time-window prediction models. (i) The data set was divided into 10 groups using 10-fold cross-validation, with nine of the groups serving as training data and one as test data. (ii) The construction and tuning of time-window prediction. (iii) Evaluation. The AUROC, AUPRC, F1.5, precision, recall, accuracy, and 95% confidence interval (CI) values were utilized to evaluate the performance of each model for different stepped feature sets and time windows. (iv) Comparison of results from XGBoost and logistic regression. AUPRC, area under the precision-recall curve; AUROC, area under the receiver operating characteristic curve; THS, traumatic hemorrhagic shock; XGBoost, extreme gradient boosting.
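Two steps in the Fig. 1 caption lend themselves to a short illustration: removing features whose missing rate exceeds 50%, and splitting the data into 10 folds with nine used for training and one for testing. The sketch below uses only the Python standard library; the function names, the `None`-for-missing convention, and the random seed are assumptions made for illustration and do not reproduce the authors' pipeline.

```python
import random
from typing import Dict, List, Optional, Tuple


def drop_sparse_features(table: Dict[str, List[Optional[float]]],
                         max_missing: float = 0.5) -> Dict[str, List[Optional[float]]]:
    """Keep only columns whose missing rate is at most `max_missing`."""
    kept = {}
    for name, values in table.items():
        missing_rate = sum(v is None for v in values) / len(values)
        if missing_rate <= max_missing:  # rates greater than 50% are removed
            kept[name] = values
    return kept


def k_fold_indices(n_samples: int, k: int = 10,
                   seed: int = 0) -> List[Tuple[List[int], List[int]]]:
    """Return k (train, test) index pairs; each sample is tested exactly once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k nearly equal groups
    splits = []
    for i in range(k):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        splits.append((train, folds[i]))
    return splits


# Hypothetical toy table: Lac is missing in 3 of 4 rows and is dropped.
toy = {"HR": [88.0, None, 102.0, 95.0], "Lac": [None, None, None, 2.1]}
print(list(drop_sparse_features(toy)))
print(len(k_fold_indices(604)))
```

In practice the split would more likely be made per patient rather than per sample, so that observations from one patient never appear in both training and test folds; the caption does not say which was used.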
Fig. 2 Panel (A) shows the extraction process for the study cohort in the PLAGH-ERD. Panel (B) shows the extraction process for the study cohort in the MIMIC III. MIMIC III, the Medical Information Mart for Intensive Care III.
Baseline statistical characteristics of the study population
| | The PLAGH-ERD | | The MIMIC III | |
| Characteristics | THS (n = 102) | Non-THS (n = 502) | THS (n = 244) | Non-THS (n = 1,370) |
| Age (years; median, IQR) | 43.98 (31.92–58.87) | 49.41∗ (35.56–63.01) | 52.10 (36.13–72.16) | 50.11∗ (31.61–73.36) |
| Gender, n (%) | ||||
| Female | 20 (19.6) | 110 (21.7) | 82 (33.6) | 438 (32.0) |
| Male | 82 (80.4) | 396 (78.3) | 162 (66.4) | 932 (68.0) |
| Injured body part, n (%) | ||||
| Head | 32 (31.4) | 239 (47.6) | 156 (63.93) | 897 (65.5) |
| Chest | 12 (11.8) | 63 (12.5) | 72 (29.51) | 241 (17.6) |
| Abdomen | 22 (21.6) | 64 (12.7) | 55 (22.54) | 254 (18.5) |
| Pelvis | 8 (7.8) | 26 (5.2) | 38 (15.57) | 78 (5.7) |
| Limbs | 6 (5.9) | 15 (3.0) | 60 (24.59) | 204 (14.9) |
| Other | 64 (63.7) | 164 (32.7) | 19 (7.79) | 152 (11.1) |
| Hospital LOS (days; median, IQR) | 1.03 (0.44–1.90) | 0.38∗ (0.18–0.82) | 11.01 (5.77–20.18) | 4.54∗ (2.58–7.75) |
| Time interval from admission to shock (h; median, IQR) | 4.91 (1.97–12.14) | – | 13.35 (4.66–50.84) | – |
IQR, interquartile range; LOS, length of stay; THS, traumatic hemorrhagic shock.
∗Statistically significant difference between the experimental (THS) and control (non-THS) groups.
The stepped feature sets used for prediction
| Forecast indicator dataset – 1 (vital signs) | Forecast indicator dataset – 2 (vital signs + routine blood) | Forecast indicator dataset – 3 (vital signs + routine blood + blood gas analysis) |
| HR | HR | HR |
| SBP | SBP | SBP |
| DBP | DBP | DBP |
| RESP | RESP | RESP |
| TEMP | TEMP | TEMP |
| | PLT | PLT |
| | WBC | WBC |
| | Hb | Hb |
| | RBC | RBC |
| | Hct | Hct |
| | | BE |
| | | Lac |
| | | pH |
| | | TCO2 |
| | | PaCO2 |
| | | PaO2 |
BE, base excess; DBP, diastolic blood pressure; Hb, hemoglobin; Hct, hematocrit; HR, heart rate; Lac, lactate; PaCO2, partial pressure of carbon dioxide; PaO2, partial pressure of oxygen; PLT, platelets; RBC, red blood cell count; RESP, respiration rate; SBP, systolic blood pressure; TCO2, total carbon dioxide; TEMP, temperature; WBC, white blood cell count.
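The three feature sets in this table are nested: each step adds one group of measurements to the previous set. A minimal sketch of that structure follows, using the abbreviations from the table; the dictionary layout, the `select_features` helper, and the sample record are illustrative assumptions rather than the authors' code.

```python
from typing import Dict, List, Optional

VITAL_SIGNS = ["HR", "SBP", "DBP", "RESP", "TEMP"]
ROUTINE_BLOOD = ["PLT", "WBC", "Hb", "RBC", "Hct"]
BLOOD_GAS = ["BE", "Lac", "pH", "TCO2", "PaCO2", "PaO2"]

# Each step extends the previous one, so the sets are strictly nested.
STEPPED_FEATURE_SETS: Dict[str, List[str]] = {
    "VS": VITAL_SIGNS,
    "VS + RB": VITAL_SIGNS + ROUTINE_BLOOD,
    "VS + RB + BG": VITAL_SIGNS + ROUTINE_BLOOD + BLOOD_GAS,
}


def select_features(record: Dict[str, float], set_name: str) -> List[Optional[float]]:
    """Return the record's values in feature-set order (None if not measured)."""
    return [record.get(name) for name in STEPPED_FEATURE_SETS[set_name]]


# Hypothetical record with vital signs only; laboratory values are absent.
record = {"HR": 112.0, "SBP": 92.0, "DBP": 58.0, "RESP": 24.0, "TEMP": 36.4}
for name, features in STEPPED_FEATURE_SETS.items():
    print(name, len(features), select_features(record, name))
```

Organizing the sets this way makes it straightforward to fall back to the vital-signs model when laboratory results are not yet available.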
Fig. 3 Partial SHAP dependence plots for features of vital signs, routine blood, and blood gas analysis. SHAP, SHapley Additive exPlanations.
P-values of DeLong's test comparing the AUROC of the XGBoost and LR models for different stepped feature sets and time windows in the PLAGH-ERD
| Feature set | XGBoost (AUROC) | LR (AUROC) | P-value |
| 0.5 h in advance | |||
| VS | 0.935 | 0.875 | <0.001 |
| 1 h in advance | |||
| VS | 0.927 | 0.878 | 0.028 |
| VS + RB | 0.95 | 0.889 | <0.001 |
| VS + RB + BG | 0.968 | 0.93 | <0.001 |
| 2 h in advance | |||
| VS | 0.937 | 0.866 | <0.001 |
| VS + RB | 0.957 | 0.905 | <0.001 |
| VS + RB + BG | 0.934 | 0.912 | <0.001 |
| 3 h in advance | |||
| VS + RB | 0.946 | 0.919 | 0.016 |
| VS + RB + BG | 0.957 | 0.905 | <0.001 |
AUROC, area under the receiver operating characteristic curve; BG, blood gas; LR, logistic regression; PLAGH-ERD, the PLA General Hospital Emergency Rescue Database; RB, routine blood; VS, vital signs.
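DeLong's test compares the AUROCs of two models scored on the same cases while accounting for the correlation between them. The sketch below is a plain-Python rendering of the standard two-model form of the test; it is not the authors' implementation, the function names and toy scores are hypothetical, and it assumes binary labels with at least two positive and two negative cases.

```python
import math
from typing import List, Sequence, Tuple


def _psi(pos: float, neg: float) -> float:
    """Pairwise kernel: 1 if the positive outranks the negative, 0.5 on ties."""
    return 1.0 if pos > neg else 0.5 if pos == neg else 0.0


def _components(labels: Sequence[int], scores: Sequence[float]):
    """Return the AUROC and its per-positive / per-negative components."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    v10 = [sum(_psi(p, q) for q in neg) / len(neg) for p in pos]
    v01 = [sum(_psi(p, q) for p in pos) / len(pos) for q in neg]
    return sum(v10) / len(v10), v10, v01


def _cov(a: List[float], b: List[float]) -> float:
    """Sample covariance (n - 1 denominator)."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / (len(a) - 1)


def delong_test(labels: Sequence[int], scores_a: Sequence[float],
                scores_b: Sequence[float]) -> Tuple[float, float, float]:
    """Return (auc_a, auc_b, two-sided p-value) for two correlated AUROCs."""
    auc_a, v10_a, v01_a = _components(labels, scores_a)
    auc_b, v10_b, v01_b = _components(labels, scores_b)
    m, n = len(v10_a), len(v01_a)  # numbers of positives and negatives
    var_a = _cov(v10_a, v10_a) / m + _cov(v01_a, v01_a) / n
    var_b = _cov(v10_b, v10_b) / m + _cov(v01_b, v01_b) / n
    cov_ab = _cov(v10_a, v10_b) / m + _cov(v01_a, v01_b) / n
    var_diff = var_a + var_b - 2 * cov_ab
    if var_diff <= 0:  # degenerate case, e.g. identical score vectors
        return auc_a, auc_b, 1.0 if auc_a == auc_b else 0.0
    z = (auc_a - auc_b) / math.sqrt(var_diff)
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail
    return auc_a, auc_b, p_value


# Hypothetical toy scores from two models on the same six cases.
y = [1, 1, 1, 0, 0, 0]
model_a = [0.9, 0.8, 0.7, 0.3, 0.2, 0.1]
model_b = [0.6, 0.4, 0.5, 0.5, 0.3, 0.7]
print(delong_test(y, model_a, model_b))
```

The pairwise computation is quadratic in the number of cases, which is adequate for cohorts of this size; faster rank-based variants exist for larger data.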
Validation of time-window prediction models for traumatic hemorrhagic shock
| | | Internal validation | | | | | | External validation | | | | | |
| Machine learning model | Prediction dataset | F1.5 | acc | pre | rec | AUROC | AUPRC | F1.5 | acc | pre | rec | AUROC | AUPRC |
| 0.5 h in advance | |||||||||||||
| XGBoost | VS | 0.849 | 0.865 | 0.866 | 0.847 | 0.935 | 0.943 | 0.704 | 0.665 | 0.571 | 0.804 | 0.785 | 0.720 |
| LR | VS | 0.794 | 0.835 | 0.853 | 0.778 | 0.875 | 0.887 | 0.701 | 0.661 | 0.558 | 0.801 | 0.797 | 0.773 |
| 1 h in advance | |||||||||||||
| XGBoost | VS | 0.793 | 0.833 | 0.853 | 0.773 | 0.927 | 0.920 | 0.695 | 0.649 | 0.551 | 0.804 | 0.769 | 0.697 |
| XGBoost | VS + RB | 0.866 | 0.883 | 0.889 | 0.860 | 0.950 | 0.943 | 0.834 | 0.841 | 0.851 | 0.827 | 0.913 | 0.915 |
| XGBoost | VS + RB + BG | 0.900 | 0.903 | 0.898 | 0.903 | 0.968 | 0.962 | 0.804 | 0.822 | 0.847 | 0.787 | 0.901 | 0.908 |
| LR | VS | 0.775 | 0.819 | 0.836 | 0.755 | 0.878 | 0.897 | 0.703 | 0.652 | 0.549 | 0.815 | 0.792 | 0.761 |
| LR | VS + RB | 0.863 | 0.875 | 0.867 | 0.863 | 0.889 | 0.883 | 0.834 | 0.846 | 0.863 | 0.822 | 0.916 | 0.928 |
| LR | VS + RB + BG | 0.872 | 0.875 | 0.872 | 0.875 | 0.930 | 0.932 | 0.841 | 0.856 | 0.882 | 0.824 | 0.916 | 0.929 |
| 2 h in advance | |||||||||||||
| XGBoost | VS | 0.781 | 0.859 | 0.873 | 0.760 | 0.937 | 0.905 | 0.679 | 0.653 | 0.558 | 0.777 | 0.772 | 0.699 |
| XGBoost | VS + RB | 0.863 | 0.873 | 0.886 | 0.858 | 0.947 | 0.950 | 0.807 | 0.836 | 0.881 | 0.778 | 0.924 | 0.911 |
| XGBoost | VS + RB + BG | 0.869 | 0.870 | 0.856 | 0.880 | 0.934 | 0.914 | 0.798 | 0.830 | 0.876 | 0.770 | 0.922 | 0.914 |
| LR | VS | 0.730 | 0.806 | 0.779 | 0.730 | 0.866 | 0.849 | 0.687 | 0.659 | 0.555 | 0.780 | 0.783 | 0.747 |
| LR | VS + RB | 0.835 | 0.847 | 0.875 | 0.828 | 0.905 | 0.898 | 0.835 | 0.846 | 0.862 | 0.823 | 0.909 | 0.924 |
| LR | VS + RB + BG | 0.860 | 0.847 | 0.813 | 0.887 | 0.912 | 0.891 | 0.832 | 0.847 | 0.870 | 0.817 | 0.913 | 0.926 |
| 3 h in advance | |||||||||||||
| XGBoost | VS + RB | 0.857 | 0.888 | 0.935 | 0.828 | 0.946 | 0.959 | 0.703 | 0.766 | 0.842 | 0.658 | 0.842 | 0.840 |
| XGBoost | VS + RB + BG | 0.863 | 0.873 | 0.886 | 0.858 | 0.957 | 0.950 | 0.807 | 0.836 | 0.881 | 0.778 | 0.924 | 0.911 |
| LR | VS + RB | 0.838 | 0.864 | 0.905 | 0.815 | 0.919 | 0.943 | 0.775 | 0.793 | 0.815 | 0.758 | 0.856 | 0.876 |
| LR | VS + RB + BG | 0.835 | 0.847 | 0.875 | 0.828 | 0.905 | 0.898 | 0.835 | 0.846 | 0.862 | 0.823 | 0.909 | 0.924 |
acc, accuracy; AUPRC, area under the precision-recall curve; AUROC, area under the receiver operating characteristic curve; BG, blood gas analysis; F1.5, F-score with β = 1.5; LR, logistic regression; pre, precision; RB, routine blood; rec, recall; VS, vital signs; XGBoost, extreme gradient boosting.
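Fig. 1 lists F1.5 among the evaluation metrics. Assuming this denotes the F-beta score with β = 1.5, which weights recall more heavily than precision, it can be computed from a precision and recall pair as sketched below. The function name is hypothetical, and the example takes 0.851 and 0.827 as the external-validation precision and recall of the XGBoost VS + RB model at 1 h.

```python
def f_beta(precision: float, recall: float, beta: float = 1.5) -> float:
    """F-beta score: weighted harmonic mean of precision and recall.

    beta > 1 favours recall; beta = 1 gives the usual F1 score.
    """
    if precision == 0 and recall == 0:
        return 0.0
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Precision and recall taken from the 1 h XGBoost VS + RB external row.
print(round(f_beta(0.851, 0.827), 3))  # 0.834
```

The result is consistent with the 0.834 reported in the first external-validation metric column of that row. Values averaged over cross-validation folds need not satisfy the formula exactly, so small deviations in other rows are expected.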
Fig. 4 (A) Receiver operating characteristic (ROC) curve of the prediction model for the vital signs dataset. (B) ROC curve of the prediction model for the vital signs + routine blood dataset. (C) ROC curve of the prediction model for the vital signs + routine blood + blood gas analysis dataset.