Yan Wang1, Songqiao Feng. 1. Department of Critical Care Medicine, The Second Affiliated Hospital of Dalian Medical University, Dalian, China.
Abstract
This study aimed to establish a prediction model for 30-day mortality in sepsis patients. The data of 1185 sepsis patients were extracted from the Medical Information Mart for Intensive Care III (MIMIC-III), and all participants were randomly divided into the training set (n = 829) and the testing set (n = 356). The model was established in the training set and verified in the testing set. After standardization of the data, age, gender, input, output, and variables with a statistically significant difference between the survival group and the death group in the training set were included in the extreme gradient boosting (XGBoost) model. Subgroup analysis was performed by age and gender in the testing set. In the XGBoost model with variables related to intravenous (IV) fluid management and electrolytes for the 30-day mortality of sepsis patients, the area under the curve (AUC) was 0.868 (95% confidence interval [CI]: 0.867-0.869) in the training set and 0.781 (95% CI: 0.779-0.782) in the testing set. The sensitivity was 0.815 (95% CI: 0.774-0.857) in the training set and 0.755 (95% CI: 0.686-0.825) in the testing set. The specificity was 0.761 (95% CI: 0.723-0.798) in the training set and 0.737 (95% CI: 0.677-0.797) in the testing set. In the XGBoost model without variables related to IV fluid management and electrolytes, in the training set, the AUC was 0.830 (95% CI: 0.829-0.831), the sensitivity was 0.717 (95% CI: 0.669-0.765), the specificity was 0.797 (95% CI: 0.762-0.833), and the accuracy was 0.765 (95% CI: 0.736-0.794). In the testing set, the AUC was 0.751 (95% CI: 0.750-0.753), the sensitivity was 0.612 (95% CI: 0.533-0.691), the specificity was 0.756 (95% CI: 0.698-0.814), and the accuracy was 0.697 (95% CI: 0.649-0.744). The prediction model including variables associated with IV fluids and electrolytes had good predictive value for the 30-day mortality of sepsis patients.
Sepsis, a life-threatening condition leading to organ dysfunction, is a leading cause of mortality among hospitalized patients in emergency medicine and critical care. A previous study estimated that there were over 31 million cases of sepsis and about 5 million in-hospital deaths annually worldwide. Despite the increasing use of advanced technology for its treatment, the prognosis of sepsis remains poor. Sepsis places a substantial burden on the health system and society; it has become a public health problem and was reported to cost USD 16–25 billion per year in the United States. Early recognition of sepsis patients at high risk of mortality, followed by appropriate interventions, could improve their outcomes.
Fluid resuscitation is a common intervention in the management of patients in intensive care units (ICUs), and more than 1/3 of these patients receive intravenous (IV) fluids on any given day. The 2012 “Surviving Sepsis Campaign” guidelines recommend IV fluids along with other treatments in the early management of patients with septic shock in ICUs. Fluid strategy is essential for the successful management of patients with sepsis and is associated with their mortality. Previously, the quick Sequential Organ Failure Assessment (qSOFA) score was recommended as a tool for identifying patients at high risk of mortality in ICUs according to the third international consensus on the definition of sepsis and septic shock (Sepsis-3). However, several studies have indicated that the performance of the qSOFA score for predicting mortality in sepsis patients was not ideal. Other prediction models for mortality in sepsis patients have also been established, with areas under the curve (AUCs) ranging from 0.69 to 0.88. These models were constructed based on metabolite biomarkers in the blood or other clinical data of patients. IV fluid and electrolyte status has rarely been used as a predictor of mortality in sepsis patients.
The purpose of this study was to evaluate the predictive value of IV fluids and electrolytes for the mortality of sepsis patients. Two prediction models were established, with and without variables related to IV fluids and electrolytes. We compared the predictive performance of the 2 models and identified the better model for predicting 30-day mortality in these patients.
2. Methods
2.1. Study population
In the present case-control study, the data of 1185 sepsis patients were extracted from the Medical Information Mart for Intensive Care III version 1.4 (MIMIC-III v1.4). MIMIC-III is a freely available database containing information on 46,520 patients who were admitted to various ICUs of Beth Israel Deaconess Medical Center (BIDMC) in Boston, Massachusetts, from 2001 to 2012. The database includes demographics, vital signs, laboratory tests, fluid balance, and vital status; documents International Classification of Diseases, Ninth Revision (ICD-9) codes; records hourly physiologic data from bedside monitors validated by ICU nurses; and stores written evaluations of radiologic films by specialists for the corresponding time period. The diagnosis of sepsis was based on ICD-9 codes (99591, 99592, and 78552) according to Sepsis-3. According to the relevant ethical policies and regulations of China on medical scientific research, our study was exempted from ethical review, and the project was approved to carry out the relevant clinical research by the Institutional Review Board of the Second Affiliated Hospital of Dalian Medical University.
2.2. Potential predictors
The potential predictors for the 30-day mortality of sepsis patients in this study included age (years), gender, ethnicity (Hispanic, White, Black, Asian, or Others), and data from the initial 24 hours (“day 1”) of the ICU stay: respiratory rate (breaths/min), heart rate (beats/min), mean arterial pressure (MAP, mm Hg), peripheral oxygen saturation (SpO2, %), sodium (mEq/L), potassium (mEq/L), phosphate (mEq/L), calcium (mEq/L), magnesium (mEq/L), international normalized ratio (INR), bicarbonate (mEq/L), chronic obstructive pulmonary disease (COPD, Yes or No), heart failure (Yes or No), diabetes mellitus (Yes or No), renal failure (Yes or No), malignant cancer (Yes or No), the simplified acute physiology score-II (SAPS-II), SOFA, dialysis (Yes or No), input (mL), and output (mL).
2.3. Measurement of variables
Inputs are any fluids that have been administered to the patient, such as oral or tube feedings or intravenous solutions containing medications. Outputs are fluids that have been excreted by the patient, such as urine (https://mimic.mit.edu/docs/iii/about/io/). Fluid inputs exist in 2 separate tables: INPUTEVENTS_CV and INPUTEVENTS_MV. INPUTEVENTS_CV contains CareVue inputs, while INPUTEVENTS_MV contains Metavision inputs. In the present study, total fluid input was calculated as the sum of fluid administered over a 24-hour interval. The fluid strategy was determined at the beginning of each interval. Fluid output was recorded as the urine output in the first 24 hours in the ICU.
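The 24-hour input total described above can be sketched as follows. This is a minimal illustration rather than the study's extraction code: it assumes each input record has already been reduced to a (chart time, amount in mL) pair pooled from both input tables, and the function name `total_input_first_24h` and the sample records are hypothetical.

```python
from datetime import datetime, timedelta


def total_input_first_24h(icu_intime, records):
    """Sum fluid amounts (mL) charted within 24 hours of ICU admission.

    `records` is an iterable of (charttime, amount_ml) pairs. This layout is
    an assumption made for illustration, not the MIMIC-III table schema.
    """
    window_end = icu_intime + timedelta(hours=24)
    return sum(
        amount_ml
        for charttime, amount_ml in records
        if amount_ml is not None and icu_intime <= charttime < window_end
    )


# Hypothetical records for a single ICU stay
intime = datetime(2101, 1, 1, 8, 0)
records = [
    (datetime(2101, 1, 1, 9, 0), 500.0),     # inside the 24-hour window
    (datetime(2101, 1, 1, 20, 30), 1000.0),  # inside the 24-hour window
    (datetime(2101, 1, 2, 9, 0), 250.0),     # after 24 hours, excluded
]
day1_input_ml = total_input_first_24h(intime, records)
print(day1_input_ml)
```

The same windowing could be applied to urine output records to obtain the day-1 output.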
2.4. Statistical analysis
Normally distributed measurement data were described as mean ± standard deviation (mean ± SD), and comparisons between groups were performed by the independent samples t test. Non-normally distributed measurement data were described as M (Q1, Q3), and differences between groups were compared by the Mann–Whitney U rank sum test. Enumeration data were described as n (%), and the χ2 test or Fisher exact probability method was used for comparisons between groups. Multiple imputation was performed for the missing data, and sensitivity analysis before and after imputation was conducted. All participants were randomly divided into the training set (n = 829) and the testing set (n = 356) at a ratio of 7:3, and the equilibrium between the training set and the testing set was tested. The prediction model was established in the training set, and validation was performed in the testing set. The participants in the training set were divided into the survival group (n = 493) and the death group (n = 336), and differences between the 2 groups were compared. After standardization of the data, age, gender, input, output, and variables with a statistically significant difference between the survival group and the death group were included in the XGBoost prediction model. A prediction model without variables related to IV fluid management and electrolytes was also constructed, and its performance was compared with that of the model including these variables. Subgroup analysis was performed by age and gender in the testing set. The AUC, positive predictive value (PPV), negative predictive value (NPV), sensitivity, specificity, and accuracy were applied to evaluate the predictive performance of the models. The DeLong method was applied to compare the AUCs of the prediction models with and without variables related to IV fluids and electrolytes.
All statistical tests were 2-sided, and P < .05 was considered statistically significant. Statistical analysis was completed with Python 3.7.4 (Python Software Foundation, Delaware, USA) and SAS 9 (SAS Institute Inc., Cary, NC).
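The random 7:3 split and the standardization step can be sketched with the standard library alone. This is an illustrative sketch under stated assumptions: the seed, the helper names `split_train_test` and `standardize`, and the choice to scale with training-set statistics are not taken from the paper. Flooring 70% of 1185 gives 829, which matches the reported training-set size.

```python
import random
import statistics


def split_train_test(ids, train_fraction=0.7, seed=0):
    """Randomly split ids into training and testing lists (seed is arbitrary)."""
    shuffled = list(ids)
    random.Random(seed).shuffle(shuffled)
    n_train = int(len(shuffled) * train_fraction)  # floor of the training share
    return shuffled[:n_train], shuffled[n_train:]


def standardize(train_values, values):
    """Z-score values using the mean and SD of the training set (an assumption)."""
    mean = statistics.mean(train_values)
    sd = statistics.stdev(train_values)
    return [(v - mean) / sd for v in values]


train_ids, test_ids = split_train_test(range(1185))
print(len(train_ids), len(test_ids))

# Toy values: training mean is 2.0 and training SD is 1.0
z_scores = standardize([1.0, 2.0, 3.0], [1.0, 2.0, 3.0, 4.0])
print(z_scores)
```

Scaling the testing set with training-set statistics keeps information from the testing set out of model fitting; whether the authors did this is not stated.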
3. Results
3.1. The characteristics of participants
This study collected the data of 1185 sepsis patients from MIMIC-III. All patients were divided into the training set (n = 829) and the testing set (n = 356). Among all the participants, the average age was 67.40 years. There were 651 male patients (54.94%) and 534 female patients (45.06%). The median survival time was 30.00 days, and the longest follow-up time was 30.00 days. The median fluid input was 7398.00 mL and the median fluid output was 1488.00 mL in all patients. A total of 702 patients (59.24%) survived to 30 days, and 483 patients (40.76%) died within 30 days (Table 1).
Table 1
The equilibrium test between the characteristics of subjects in the training set and the testing set.

| Variable | Total (n = 1185) | Training set (n = 829) | Testing set (n = 356) | Statistics | P |
|---|---|---|---|---|---|
| Age, mean ± SD | 67.40 ± 16.57 | 67.64 ± 16.55 | 66.83 ± 16.59 | t = 0.772 | .440 |
| Gender, n (%) | | | | χ2 = 0.095 | .757 |
| Male | 651 (54.94) | 453 (54.64) | 198 (55.62) | | |
| Female | 534 (45.06) | 376 (45.36) | 158 (44.38) | | |
| Ethnicity, n (%) | | | | χ2 = 0.559 | .968 |
| Hispanic | 36 (3.04) | 24 (2.90) | 12 (3.37) | | |
| White | 981 (82.78) | 684 (82.51) | 297 (83.43) | | |
| Black | 115 (9.70) | 83 (10.01) | 32 (8.99) | | |
| Asian | 35 (2.95) | 25 (3.02) | 10 (2.81) | | |
| Others | 18 (1.52) | 13 (1.57) | 5 (1.40) | | |
| Respiratory rate, M (Q1, Q3) | 22.00 (18.00, 26.00) | 22.00 (18.00, 26.00) | 22.00 (18.00, 27.00) | Z = 1.441 | .150 |
| Heart rate, mean ± SD | 98.52 ± 21.24 | 98.69 ± 21.37 | 98.13 ± 20.94 | t = 0.417 | .677 |
| MAP, mean ± SD | 75.97 ± 16.58 | 76.45 ± 16.75 | 74.86 ± 16.13 | t = 1.507 | .132 |
| SpO2, mean ± SD | 95.83 ± 7.39 | 95.88 ± 7.57 | 95.70 ± 6.96 | t = 0.391 | .696 |
| Sodium, mean ± SD | 138.21 ± 6.81 | 138.13 ± 6.77 | 138.39 ± 6.90 | t = −0.615 | .539 |
| Potassium, mean ± SD | 4.36 ± 0.99 | 4.32 ± 0.96 | 4.44 ± 1.05 | t = −1.869 | .062 |
| Phosphate, M (Q1, Q3) | 3.50 (2.70, 4.60) | 3.40 (2.60, 4.60) | 3.50 (2.70, 4.80) | Z = 1.293 | .196 |
| Calcium, mean ± SD | 8.08 ± 1.06 | 8.07 ± 1.06 | 8.10 ± 1.04 | t = −0.456 | .649 |
| Magnesium, mean ± SD | 1.91 ± 0.49 | 1.91 ± 0.47 | 1.92 ± 0.53 | t = −0.315 | .753 |
| INR, M (Q1, Q3) | 1.48 (1.20, 2.00) | 1.40 (1.20, 1.96) | 1.50 (1.20, 2.00) | Z = 0.097 | .923 |
| Bicarbonate, mean ± SD | 21.63 ± 5.84 | 21.58 ± 5.74 | 21.76 ± 6.07 | t = −0.476 | .634 |
| COPD, n (%) | | | | χ2 = 0.237 | .626 |
| No | 1024 (86.41) | 719 (86.73) | 305 (85.67) | | |
| Yes | 161 (13.59) | 110 (13.27) | 51 (14.33) | | |
| Heart failure, n (%) | | | | χ2 = 1.192 | .275 |
| No | 651 (54.94) | 464 (55.97) | 187 (52.53) | | |
| Yes | 534 (45.06) | 365 (44.03) | 169 (47.47) | | |
| Diabetes mellitus, n (%) | | | | χ2 = 1.383 | .239 |
| No | 850 (71.73) | 603 (72.74) | 247 (69.38) | | |
| Yes | 335 (28.27) | 226 (27.26) | 109 (30.62) | | |
| Renal failure, n (%) | | | | χ2 = 0.083 | .773 |
| No | 389 (32.83) | 270 (32.57) | 119 (33.43) | | |
| Yes | 796 (67.17) | 559 (67.43) | 237 (66.57) | | |
| Malignant cancer, n (%) | | | | χ2 = 1.552 | .213 |
| No | 918 (77.47) | 634 (76.48) | 284 (79.78) | | |
| Yes | 267 (22.53) | 195 (23.52) | 72 (20.22) | | |
| SAPS-II, M (Q1, Q3) | 47.00 (37.00, 57.00) | 47.00 (37.00, 57.00) | 46.00 (37.00, 56.00) | Z = −1.097 | .273 |
| SOFA, M (Q1, Q3) | 7.00 (4.00, 9.00) | 7.00 (4.00, 9.00) | 6.00 (4.00, 9.00) | Z = −0.633 | .527 |
| Dialysis, n (%) | | | | χ2 = 0.017 | .895 |
| No | 1053 (88.86) | 736 (88.78) | 317 (89.04) | | |
| Yes | 132 (11.14) | 93 (11.22) | 39 (10.96) | | |
| Input (mL), M (Q1, Q3) | 7398.00 (4147.00, 13376.88) | 7402.68 (4062.50, 13305.00) | 7360.44 (4271.67, 13523.67) | Z = −0.475 | .634 |
| Output (mL), M (Q1, Q3) | 1488.00 (685.00, 2746.00) | 1455.00 (660.00, 2670.00) | 1596.50 (718.75, 3007.50) | Z = 1.398 | .162 |
| Expire flag, n (%) | | | | χ2 = 0.060 | .807 |
| Survival | 702 (59.24) | 493 (59.47) | 209 (58.71) | | |
| Death | 483 (40.76) | 336 (40.53) | 147 (41.29) | | |

COPD = chronic obstructive pulmonary disease, INR = international normalized ratio, MAP = mean arterial pressure, SAPS-II = the simplified acute physiology score-II, SOFA = the Sequential Organ Failure Assessment.
3.2. The equilibrium test between the training set and the testing set
All patients were divided into the training set (n = 829) and the testing set (n = 356). The equilibrium test was performed between the data in the training set and the testing set. The results showed that there was no statistical difference between the data in the training set and the testing set in terms of demographic data, laboratory examination indexes, clinical data, and variables related to IV fluid management and electrolytes (All P > .05) (Table 1).
3.3. Comparisons between the survival group and the death group in the training set
The characteristics were compared between the survival group and the death group in the training set. The mean age (69.56 years vs 66.33 years, t = −2.760, P = .006), mean potassium level (4.46 mEq/L vs 4.23 mEq/L, t = −3.346, P < .001), mean magnesium level (2.01 mEq/L vs 1.84 mEq/L, t = −5.159, P < .001), mean SAPS-II (56.96 vs 42.62, t = −13.270, P < .001), median phosphate level (4.00 mEq/L vs 3.20 mEq/L, Z = 6.870, P < .001), median INR (1.60 vs 1.40, Z = 4.785, P < .001), and median SOFA score (8.00 vs 6.00, Z = 10.300, P < .001) were higher in the death group than in the survival group. The mean MAP (74.94 mm Hg vs 77.47 mm Hg, t = 2.133, P = .033) and median output (976.50 mL vs 1750.00 mL, Z = −8.309, P < .001) were lower in the death group than in the survival group (Table 2).
Table 2
Comparisons between the survival group and the death group in the training set.

| Variable | Total (n = 829) | Survival (n = 493) | Death (n = 336) | Statistics | P |
|---|---|---|---|---|---|
| Age, mean ± SD | 67.64 ± 16.55 | 66.33 ± 17.11 | 69.56 ± 15.50 | t = −2.760 | .006 |
| Gender, n (%) | | | | χ2 = 0.390 | .532 |
| Male | 453 (54.64) | 265 (53.75) | 188 (55.95) | | |
| Female | 376 (45.36) | 228 (46.25) | 148 (44.05) | | |
| Ethnicity, n (%) | | | | χ2 = 9.603 | .048 |
| Hispanic | 24 (2.90) | 16 (3.25) | 8 (2.38) | | |
| White | 684 (82.51) | 402 (81.54) | 282 (83.93) | | |
| Black | 83 (10.01) | 51 (10.34) | 32 (9.52) | | |
| Asian | 25 (3.02) | 20 (4.06) | 5 (1.49) | | |
| Others | 13 (1.57) | 4 (0.81) | 9 (2.68) | | |
| Respiratory rate, mean ± SD | 22.53 ± 7.16 | 22.38 ± 6.90 | 22.76 ± 7.53 | t = −0.736 | .462 |
| Heart rate, mean ± SD | 98.69 ± 21.37 | 97.79 ± 20.65 | 100.02 ± 22.32 | t = −1.458 | .145 |
| MAP, mean ± SD | 76.45 ± 16.75 | 77.47 ± 16.83 | 74.94 ± 16.51 | t = 2.133 | .033 |
| SpO2, mean ± SD | 95.88 ± 7.57 | 96.17 ± 7.25 | 95.46 ± 7.99 | t = 1.339 | .181 |
| Sodium, mean ± SD | 138.13 ± 6.77 | 138.22 ± 5.70 | 137.99 ± 8.09 | t = 0.444 | .657 |
| Potassium, mean ± SD | 4.32 ± 0.96 | 4.23 ± 0.91 | 4.46 ± 1.01 | t = −3.346 | <.001 |
| Phosphate, M (Q1, Q3) | 3.40 (2.60, 4.60) | 3.20 (2.50, 4.00) | 4.00 (2.98, 5.20) | Z = 6.870 | <.001 |
| Calcium, mean ± SD | 8.07 ± 1.06 | 8.05 ± 1.01 | 8.11 ± 1.13 | t = −0.882 | .378 |
| Magnesium, mean ± SD | 1.91 ± 0.47 | 1.84 ± 0.45 | 2.01 ± 0.48 | t = −5.159 | <.001 |
| INR, M (Q1, Q3) | 1.40 (1.20, 1.96) | 1.40 (1.20, 1.80) | 1.60 (1.30, 2.31) | Z = 4.785 | <.001 |
| Bicarbonate, mean ± SD | 21.58 ± 5.74 | 21.88 ± 5.15 | 21.14 ± 6.49 | t = 1.742 | .082 |
| COPD, n (%) | | | | χ2 = 0.015 | .903 |
| No | 719 (86.73) | 427 (86.61) | 292 (86.90) | | |
| Yes | 110 (13.27) | 66 (13.39) | 44 (13.10) | | |
| Heart failure, n (%) | | | | χ2 = 0.018 | .894 |
| No | 464 (55.97) | 275 (55.78) | 189 (56.25) | | |
| Yes | 365 (44.03) | 218 (44.22) | 147 (43.75) | | |
| Diabetes mellitus, n (%) | | | | χ2 = 0.736 | .391 |
| No | 603 (72.74) | 364 (73.83) | 239 (71.13) | | |
| Yes | 226 (27.26) | 129 (26.17) | 97 (28.87) | | |
| Renal failure, n (%) | | | | χ2 = 12.513 | <.001 |
| No | 270 (32.57) | 184 (37.32) | 86 (25.60) | | |
| Yes | 559 (67.43) | 309 (62.68) | 250 (74.40) | | |
| Malignant cancer, n (%) | | | | χ2 = 8.007 | .005 |
| No | 634 (76.48) | 394 (79.92) | 240 (71.43) | | |
| Yes | 195 (23.52) | 99 (20.08) | 96 (28.57) | | |
| SAPS-II, mean ± SD | 48.43 ± 16.33 | 42.62 ± 13.57 | 56.96 ± 16.29 | t = −13.270 | <.001 |
| SOFA, M (Q1, Q3) | 7.00 (4.00, 9.00) | 6.00 (4.00, 8.00) | 8.00 (6.00, 11.00) | Z = 10.300 | <.001 |
| Dialysis, n (%) | | | | χ2 = 0.144 | .704 |
| No | 736 (88.78) | 436 (88.44) | 300 (89.29) | | |
| Yes | 93 (11.22) | 57 (11.56) | 36 (10.71) | | |
| Input (mL), M (Q1, Q3) | 7402.68 (4062.50, 13305.00) | 7361.80 (3930.00, 12761.19) | 7538.39 (4135.25, 14657.98) | Z = 1.305 | .192 |
| Output (mL), M (Q1, Q3) | 1455.00 (660.00, 2670.00) | 1750.00 (1000.00, 3150.00) | 976.50 (325.00, 1982.75) | Z = −8.309 | <.001 |

COPD = chronic obstructive pulmonary disease, INR = international normalized ratio, MAP = mean arterial pressure, SAPS-II = the simplified acute physiology score-II, SOFA = the Sequential Organ Failure Assessment.
Construction of the prediction model for 30-day mortality of sepsis patients with variables related to IV fluid management and electrolytes
Variables with a statistically significant difference between the survival group and the death group, together with gender and input, were included in the XGBoost model. After tuning with a GridSearchCV grid search, the optimal model had the following hyperparameters: number of trees, 50; tree depth, 3; learning rate, 0.1; subsample, 0.2; colsample_bytree, 0.3. The AUC was 0.868 (95% CI: 0.867–0.869) in the training set and 0.781 (95% CI: 0.779–0.782) in the testing set (Fig. 1). According to the Youden index, the cut-off value was 0.365. The sensitivity was 0.815 (95% CI: 0.774–0.857) in the training set and 0.755 (95% CI: 0.686–0.825) in the testing set. The NPV was 0.858 (95% CI: 0.825–0.891) in the training set and 0.811 (95% CI: 0.755–0.866) in the testing set. The accuracy was 0.783 (95% CI: 0.755–0.811) in the training set and 0.744 (95% CI: 0.699–0.790) in the testing set (Table 3). These data suggested that the prediction model had good predictive performance. The calibration curve of the model is shown in Figure 2. The predicted values in the training set and the testing set deviated slightly from the ideal model but were close to matching it, indicating good agreement between the predicted probability and the actual probability. The feature importance diagram of the model revealed that malignant cancer, SAPS-II, potassium, SOFA, MAP, and fluid input and output were important variables for the 30-day mortality of sepsis patients (Fig. 3).
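The Youden-index cut-off mentioned above can be illustrated with a short, self-contained sketch. It picks the probability threshold that maximizes J = sensitivity + specificity − 1. The function name `youden_cutoff` and the toy labels and probabilities are hypothetical and do not reproduce the study's cut-off of 0.365.

```python
def youden_cutoff(y_true, y_prob):
    """Return (cutoff, J) maximizing J = sensitivity + specificity - 1.

    A case is classified as positive when its predicted probability is
    greater than or equal to the cutoff. Ties keep the lowest cutoff.
    """
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    best_cutoff, best_j = None, -1.0
    for cutoff in sorted(set(y_prob)):
        tp = sum(1 for y, p in zip(y_true, y_prob) if y == 1 and p >= cutoff)
        tn = sum(1 for y, p in zip(y_true, y_prob) if y == 0 and p < cutoff)
        j = tp / n_pos + tn / n_neg - 1.0
        if j > best_j:
            best_cutoff, best_j = cutoff, j
    return best_cutoff, best_j


# Toy data: 1 = died within 30 days, 0 = survived
y_true = [0, 0, 0, 0, 1, 0, 1, 1, 1, 1]
y_prob = [0.05, 0.10, 0.20, 0.30, 0.35, 0.40, 0.60, 0.70, 0.80, 0.90]
cutoff, j = youden_cutoff(y_true, y_prob)
print(cutoff, round(j, 3))
```

In this toy example the chosen threshold captures all five deaths while misclassifying one survivor.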
Figure 1.
The ROC curve of the XGBoost model with variables related to IV fluid management and electrolytes. ROC = receiver operating characteristic curve, IV = intravenous, XGBoost = extreme gradient boosting.
Table 3
Construction of the XGBoost prediction models for the 30-day mortality of sepsis patients.

| Parameter | With IV fluid management and electrolytes: training set | With IV fluid management and electrolytes: testing set | Without IV fluid management and electrolytes: training set | Without IV fluid management and electrolytes: testing set |
|---|---|---|---|---|
| AUC (95% CI) | 0.868 (0.867–0.869) | 0.781 (0.779–0.782) | 0.830 (0.829–0.831) | 0.751 (0.750–0.753) |
| Sensitivity (95% CI) | 0.815 (0.774–0.857) | 0.755 (0.686–0.825) | 0.717 (0.669–0.765) | 0.612 (0.533–0.691) |
| Specificity (95% CI) | 0.761 (0.723–0.798) | 0.737 (0.677–0.797) | 0.797 (0.762–0.833) | 0.756 (0.698–0.814) |
| PPV (95% CI) | 0.699 (0.654–0.744) | 0.669 (0.597–0.740) | 0.707 (0.658–0.755) | 0.638 (0.559–0.718) |
| NPV (95% CI) | 0.858 (0.825–0.891) | 0.811 (0.755–0.866) | 0.805 (0.770–0.840) | 0.735 (0.676–0.794) |
| Accuracy (95% CI) | 0.783 (0.755–0.811) | 0.744 (0.699–0.790) | 0.765 (0.736–0.794) | 0.697 (0.649–0.744) |

AUC = area under the curve, CI = confidence interval, IV = intravenous, NPV = negative predictive value, PPV = positive predictive value.
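The threshold-based metrics in Table 3 can all be derived from a 2 × 2 confusion matrix. The sketch below computes each metric with a normal-approximation (Wald) 95% CI. The counts used (274 true positives, 118 false positives, 375 true negatives, 62 false negatives) are back-calculated from the reported training-set rates and group sizes and are an assumption; the paper reports neither its confusion matrix nor its CI method. The helper names are hypothetical.

```python
import math


def wald_ci(successes, total, z=1.96):
    """Return (proportion, lower, upper) using the normal approximation."""
    p = successes / total
    half_width = z * math.sqrt(p * (1.0 - p) / total)
    return p, max(0.0, p - half_width), min(1.0, p + half_width)


def classification_metrics(tp, fp, tn, fn):
    """Compute threshold-based metrics, each with a Wald 95% CI."""
    return {
        "sensitivity": wald_ci(tp, tp + fn),
        "specificity": wald_ci(tn, tn + fp),
        "PPV": wald_ci(tp, tp + fp),
        "NPV": wald_ci(tn, tn + fn),
        "accuracy": wald_ci(tp + tn, tp + fp + tn + fn),
    }


# Back-calculated (assumed) counts for the training set: 336 deaths, 493 survivors
metrics = classification_metrics(tp=274, fp=118, tn=375, fn=62)
for name, (estimate, lower, upper) in metrics.items():
    print(f"{name}: {estimate:.3f} ({lower:.3f}-{upper:.3f})")
```

Under these assumed counts, the estimates and intervals agree to three decimals with the training-set column for the model with IV fluid and electrolyte variables, which suggests, but does not confirm, that a Wald-type interval was used.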
Figure 2.
The calibration curve of the XGBoost model with variables related to IV fluid management and electrolytes. IV = intravenous, XGBoost = extreme gradient boosting.
Figure 3.
The feature importance diagram of the XGBoost model with variables related to IV fluid management and electrolytes. INR = international normalized ratio, IV = intravenous, MAP = mean arterial pressure, SAPS-II = the simplified acute physiology score-II, SOFA = Sequential Organ Failure Assessment, XGBoost = extreme gradient boosting.
Construction of the prediction model without variables related to IV fluid management and electrolytes for 30-day mortality of sepsis patients
After removing the variables related to IV fluid management and electrolytes, another prediction model was established. In the training set, the AUC was 0.830 (95% CI: 0.829–0.831), the specificity was 0.797 (95% CI: 0.762–0.833), the NPV was 0.805 (95% CI: 0.770–0.840), and the accuracy was 0.765 (95% CI: 0.736–0.794). In the testing set, the AUC was 0.751 (95% CI: 0.750–0.753) (Fig. 4; Table 3). The AUC of the model including variables related to IV fluid management and electrolytes was significantly higher than that of the model without these variables (training set: Z = 51.279, P < .001; testing set: Z = 27.719, P < .001). Therefore, the prediction model including variables related to IV fluid management and electrolytes might be better than the model without them, and it was selected as the final model. The calibration curve of the model without these variables showed that the predicted values in the training set and the testing set were close to the ideal model, suggesting good agreement between the predicted probability and the actual probability (Fig. 5). The feature importance diagram of this model is shown in Figure 6; SOFA, SAPS-II, and age were the top 3 important variables.
Figure 4.
The ROC curve of the XGBoost model without variables related to IV fluid management and electrolytes. ROC = receiver operating characteristic curve, IV = intravenous, XGBoost = extreme gradient boosting.
Figure 5.
The calibration curve of the XGBoost model without variables related to IV fluid management and electrolytes. IV = intravenous, XGBoost = extreme gradient boosting.
Figure 6.
The feature importance diagram of the XGBoost model without variables related to IV fluid management and electrolytes. IV = intravenous, XGBoost = extreme gradient boosting.
3.4. Subgroup analysis of the predictive value of the prediction model
3.4.1. Age.
Patients in the testing set were divided into a ≤65 years group (n = 200) and a >65 years group (n = 156). In the ≤65 years group, the AUC was 0.821 (95% CI: 0.818–0.823), the sensitivity was 0.792 (95% CI: 0.683–0.902), the specificity was 0.728 (95% CI: 0.642–0.814), the PPV was 0.600 (95% CI: 0.485–0.715), the NPV was 0.872 (95% CI: 0.802–0.943), and the accuracy was 0.750 (95% CI: 0.682–0.818). In the >65 years group, the AUC was 0.758 (95% CI: 0.756–0.760), the specificity was 0.745 (95% CI: 0.662–0.828), the NPV was 0.760 (95% CI: 0.677–0.842), and the accuracy was 0.740 (95% CI: 0.679–0.801) (Table 4). The AUC in the ≤65 years group was higher than that in the >65 years group (Z = 38.57, P < .001), indicating that the predictive ability of the model might be better in patients ≤65 years than in patients >65 years (Fig. 7). The calibration curve revealed that the model had a good overall fit (Fig. 8).
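A subgroup AUC of this kind can be computed directly from predicted probabilities. The sketch below uses the rank-based (Mann–Whitney) definition of the AUC, that is, the probability that a randomly chosen death receives a higher predicted risk than a randomly chosen survivor, with ties counted as one half. The function names, the age threshold handling, and the toy data are hypothetical; the comparison of AUCs between groups (the Z test reported above) is not implemented here.

```python
def auc_mann_whitney(y_true, y_prob):
    """AUC as the share of (death, survivor) pairs ranked correctly."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a correctly ranked pair
    return wins / (len(pos) * len(neg))


def auc_by_age_group(ages, y_true, y_prob, threshold=65):
    """Compute the AUC separately for patients <= threshold and > threshold."""
    groups = {"<=65": ([], []), ">65": ([], [])}
    for age, y, p in zip(ages, y_true, y_prob):
        labels, probs = groups["<=65" if age <= threshold else ">65"]
        labels.append(y)
        probs.append(p)
    return {name: auc_mann_whitney(labels, probs)
            for name, (labels, probs) in groups.items()}


# Toy data: 1 = died within 30 days, 0 = survived
ages = [45, 50, 60, 65, 70, 75, 80, 90]
y_true = [0, 1, 0, 1, 0, 1, 0, 1]
y_prob = [0.2, 0.7, 0.3, 0.9, 0.6, 0.5, 0.2, 0.8]
subgroup_auc = auc_by_age_group(ages, y_true, y_prob)
print(subgroup_auc)
```

The same helper could be applied with gender labels instead of an age threshold.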
Table 4
Subgroup analysis of the predictive value of the prediction model.

| Parameter | Age ≤65 yr | Age >65 yr | Male | Female |
|---|---|---|---|---|
| AUC (95% CI) | 0.821 (0.818–0.823) | 0.758 (0.756–0.760) | 0.751 (0.682–0.820) | 0.820 (0.752–0.887) |
| Sensitivity (95% CI) | 0.792 (0.683–0.902) | 0.734 (0.645–0.823) | 0.687 (0.587–0.787) | 0.844 (0.755–0.933) |
| Specificity (95% CI) | 0.728 (0.642–0.814) | 0.745 (0.662–0.828) | 0.730 (0.649–0.812) | 0.745 (0.657–0.833) |
| PPV (95% CI) | 0.600 (0.485–0.715) | 0.719 (0.629–0.809) | 0.648 (0.548–0.748) | 0.692 (0.590–0.795) |
| NPV (95% CI) | 0.872 (0.802–0.943) | 0.760 (0.677–0.842) | 0.764 (0.684–0.843) | 0.875 (0.803–0.947) |
| Accuracy (95% CI) | 0.750 (0.682–0.818) | 0.740 (0.679–0.801) | 0.712 (0.649–0.775) | 0.785 (0.721–0.849) |

AUC = area under the curve, CI = confidence interval, NPV = negative predictive value, PPV = positive predictive value.
Figure 7.
The ROC curve of the XGBoost model with variables related to IV fluid management and electrolytes in different age groups. ROC = receiver operating characteristic curve, IV = intravenous, XGBoost = extreme gradient boosting.
Figure 8.
The calibration curve of the XGBoost model with variables related to IV fluid management and electrolytes in different age groups. IV = intravenous, XGBoost = extreme gradient boosting.
3.4.2. Gender.
Patients in the testing set were divided into a male group (n = 198) and a female group (n = 158). In the male group, the AUC was 0.751 (95% CI: 0.682–0.820), the specificity was 0.730 (95% CI: 0.649–0.812), the NPV was 0.764 (95% CI: 0.684–0.843), and the accuracy was 0.712 (95% CI: 0.649–0.775). In the female group, the AUC was 0.820 (95% CI: 0.752–0.887), the sensitivity was 0.844 (95% CI: 0.755–0.933), the specificity was 0.745 (95% CI: 0.657–0.833), the NPV was 0.875 (95% CI: 0.803–0.947), and the accuracy was 0.785 (95% CI: 0.721–0.849) (Fig. 9; Table 4). The model showed good predictive ability in both males and females. There was no statistically significant difference in the AUC between the male and female groups (Z = 1.401, P = .161), indicating that the predictive ability of the model was similar in both genders. The calibration curve indicated that the model had a good overall fit (Fig. 10).
Figure 9.
The ROC curve of the XGBoost model with variables related to IV fluid management and electrolytes in different gender groups. ROC = receiver operating characteristic curve, IV = intravenous, XGBoost = extreme gradient boosting.
Figure 10.
The calibration curve of the XGBoost model with variables related to IV fluid management and electrolytes in different gender groups. IV = intravenous, XGBoost = extreme gradient boosting.
4. Discussion
This study evaluated the predictive values of variables related to IV fluid management and electrolytes for 30-day mortality of sepsis patients and constructed 2 prediction models with or without the variables related to IV fluid management and electrolytes based on the data of 1185 sepsis patients MIMIC-III database. The results depicted that the model including variables related to IV fluid management and electrolytes had better predictive value for the 30-day mortality of sepsis patients. Malignant cancer, SAPS-II, potassium, SOFA, MAP, input and output of fluid were important variables associated with 30-day mortality of sepsis patients.The fluid and electrolyte balance are essential for regulating body functions and sustaining health and a slight deviation from average electrolyte concentrations can result in various problems or even increase the risk of death.[ Dysfunction of kidney leads to the disturbances of fluids and electrolytes including sodium, potassium, chlorine, and calcium imbalances, which are prevalent in acute kidney injury (AKI) patients in ICUs.[ Immediate and decisive treatment is required for fluid and electrolyte disturbances.[ IV fluids are commonly applied in ICUs due to their low-risk, go-to interventions for patients with fluid deficits and electrolyte imbalances.[ AKI patients receive high amounts of fluid due to their severe acute illness state and impaired hemodynamics.[ AKI patients also frequently suffered from oliguria which might result in impaired fluid output.[ Thus, the input and output of fluids and electrolytes in AKI patients were important factors associated with the outcomes of these patients. These were allied with the findings in the present study, which depicted that input and output of fluid and electrolytes were important predictors for the mortality of AKI patients. 
Malignant cancers were risk factors associated with poor prognosis of AKI patients, as cancer patients were at increased risk of infection, sepsis, tumor lysis syndrome, drug-related toxicity, and other comorbidities.[ SAPS-II and SOFA are important severity scoring systems for AKI patients in clinical practice, and higher SAPS-II or SOFA scores were associated with poor prognosis.[ Herein, SAPS-II and SOFA scores were important variables associated with the 30-day mortality of sepsis patients. Previously, low MAP was reported to increase short-term mortality in patients with cardiac surgery-associated AKI.[ This supports the result of our study, which showed that MAP was a vital predictor of the 30-day mortality of sepsis patients.
Herein, the predictive values of IV fluids and electrolytes for the 30-day mortality of sepsis patients were evaluated. A prediction model for the mortality of sepsis patients was established based on age, gender, variables with statistical difference between the death and survival groups, and variables associated with the input and output of IV fluids and electrolytes. Meanwhile, another prediction model for the mortality of sepsis patients was constructed without the variables associated with the input and output of IV fluids and electrolytes. The predictive performances of the models with and without these variables were compared to select the best model for predicting 30-day mortality in these patients.
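The comparison of two models on the same patients can be illustrated with a minimal Python sketch. It is not the authors' code: the labels and predicted risks below are made-up toy values, the names `y_true`, `scores_with`, and `scores_without` are hypothetical, and the 0.5 classification threshold is an assumption, since the cut-off used in the study is not stated here. The sketch computes the AUC as the fraction of correctly ranked death/survivor pairs and derives sensitivity, specificity, NPV, and accuracy from the confusion matrix.

```python
from typing import Dict, Sequence


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (death, survivor) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one death and one survivor")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def _ratio(numerator: int, denominator: int) -> float:
    """Safe division that returns NaN when a metric is undefined."""
    return numerator / denominator if denominator else float("nan")


def threshold_metrics(
    y_true: Sequence[int], scores: Sequence[float], threshold: float = 0.5
) -> Dict[str, float]:
    """Confusion-matrix metrics at an assumed risk threshold (1 = death, 0 = survival)."""
    tp = fp = tn = fn = 0
    for y, s in zip(y_true, scores):
        predicted_death = s >= threshold
        if y == 1 and predicted_death:
            tp += 1
        elif y == 1:
            fn += 1
        elif predicted_death:
            fp += 1
        else:
            tn += 1
    return {
        "sensitivity": _ratio(tp, tp + fn),
        "specificity": _ratio(tn, tn + fp),
        "npv": _ratio(tn, tn + fn),
        "accuracy": _ratio(tp + tn, tp + fp + tn + fn),
    }


# Toy data only (not from MIMIC-III): 1 = died within 30 days, 0 = survived.
y_true = [1, 1, 1, 0, 0, 0]
scores_with = [0.9, 0.8, 0.4, 0.6, 0.2, 0.1]     # hypothetical model with fluid/electrolyte variables
scores_without = [0.7, 0.3, 0.4, 0.6, 0.5, 0.1]  # hypothetical model without them

for name, scores in [("with", scores_with), ("without", scores_without)]:
    print(name, round(auc(y_true, scores), 3), threshold_metrics(y_true, scores))
```

In practice both models would be scored on the same held-out testing set, so that differences in AUC and in the threshold-based metrics reflect the added variables rather than differences between patient samples.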
At present, although the qSOFA score is widely applied in clinical practice for predicting the mortality of sepsis patients, several studies have indicated that its predictive value for mortality is not good.[ Previous studies also evaluated the predictive values of SOFA, the systemic inflammatory response syndrome (SIRS) scoring system, the National Early Warning Score (NEWS), and the modified early warning score (MEWS).[ SOFA had a prognostic accuracy with an AUC around 0.70, and its predictive value was not ideal.[ SIRS had low specificity for mortality prediction, which might cause overdiagnosis and unnecessary hospitalization and drug use, while NEWS is not primarily a diagnostic tool and is instead recommended for evaluating the clinical status and follow-up frequency of patients during hospitalization.[ Tirotta et al[ demonstrated that MEWS might not be able to predict the in-hospital mortality risk of sepsis. Other prediction models for mortality in sepsis patients were not ideal and the performance was low (AUC: 0.69–0.88).[ In the current study, the model including variables associated with input and output of IV fluids and electrolytes showed high AUC, sensitivity, specificity, NPV, and accuracy, indicating that the model had good ability to identify sepsis patients at high risk of 30-day mortality. Subgroup analysis by age and gender showed that the model had better predictive performance in people ≤ 65 years than in people > 65 years, and had similar predictive value for 30-day mortality in female and male sepsis patients. This prediction model can help quickly identify sepsis patients at high risk of 30-day mortality and remind clinicians to provide timely interventions to improve their outcomes.
There were several limitations in this study.
Firstly, this was a retrospective study with all data collected from MIMIC-III, in which the variables were not comprehensive and some important data were not included. For example, the components of the output of IV fluids and electrolytes were not clear, and the detailed treatments of patients during the ICU stay were not analyzed. Secondly, the results of this study were not validated in external cohorts. Thirdly, this study used an older version of the Surviving Sepsis Campaign guideline for patient care. Fourthly, the severity of sepsis in patients was not clear to the outcome assessor, which might cause bias. In the future, prospective studies with large sample sizes are required to validate the findings of the current study.
5. Conclusion
In this study, 2 prediction models, with and without the variables related to IV fluid management and electrolytes, were constructed for predicting the 30-day mortality of sepsis patients based on the data of 1185 sepsis patients from the MIMIC-III database. The prediction model including variables related to IV fluids and electrolytes had good predictive value for the 30-day mortality of sepsis patients. The prediction model might help quickly identify sepsis patients at high risk of mortality and remind clinicians to provide timely interventions to improve their outcomes.
Author contributions
YW and SF designed the study. YW wrote the manuscript. YW and SF collected, analyzed and interpreted the data. SF critically reviewed, edited and approved the manuscript. All authors read and approved the final manuscript.
Conceptualization: Yan Wang, Songqiao Feng.
Data curation: Yan Wang.
Formal analysis: Yan Wang.
Investigation: Yan Wang.
Methodology: Yan Wang.
Supervision: Yan Wang.
Validation: Yan Wang.
Writing – original draft: Yan Wang, Songqiao Feng.
Writing – review & editing: Yan Wang, Songqiao Feng.
Authors: Maurizio Cecconi; Christoph Hofer; Jean-Louis Teboul; Ville Pettila; Erika Wilkman; Zsolt Molnar; Giorgio Della Rocca; Cesar Aldecoa; Antonio Artigas; Sameer Jog; Michael Sander; Claudia Spies; Jean-Yves Lefrant; Daniel De Backer Journal: Intensive Care Med Date: 2015-07-11 Impact factor: 17.440
Authors: Stefan Hagel; Sandra Fiedler; Andreas Hohn; Alexander Brinkmann; Otto R Frey; Heike Hoyer; Peter Schlattmann; Michael Kiehntopf; Jason A Roberts; Mathias W Pletz Journal: Trials Date: 2019-06-06 Impact factor: 2.279