Nicola Altini, Antonio Brunetti, Stefano Mazzoleni, Fabrizio Moncelli, Ilenia Zagaria, Berardino Prencipe, Erika Lorusso, Enrico Buonamico, Giovanna Elisiana Carpagnano, Davide Fiore Bavaro, Mariacristina Poliseno, Annalisa Saracino, Annalisa Schirinzi, Riccardo Laterza, Francesca Di Serio, Alessia D'Introno, Francesco Pesce, Vitoantonio Bevilacqua.
Abstract
The coronavirus disease 2019 (COVID-19) pandemic has affected hundreds of millions of individuals and caused millions of deaths worldwide. Predicting the clinical course of the disease is of pivotal importance for managing patients. Several studies have found hematochemical alterations in COVID-19 patients, for example in inflammatory markers. We retrospectively analyzed the anamnestic data and laboratory parameters of 303 patients diagnosed with COVID-19 who were admitted to the Polyclinic Hospital of Bari during the first phase of the COVID-19 global pandemic. After pre-processing, we performed a survival analysis with Kaplan–Meier curves and Cox regression to identify the most unfavorable predictors. The target outcomes were mortality and admission to the intensive care unit (ICU). We also compared different machine learning models to build a robust classifier that relies on a small number of strongly significant factors to estimate the risk of death or admission to ICU. The survival analysis showed that the most significant laboratory parameter for both outcomes was C-reactive protein min: HR = 17.963 (95% CI 6.548–49.277, p < 0.001) for death and HR = 1.789 (95% CI 1.000–3.200, p = 0.050) for admission to ICU. The second most important parameter was Erythrocytes max: HR = 1.765 (95% CI 1.141–2.729, p < 0.05) for death and HR = 1.481 (95% CI 0.895–2.452, p = 0.127) for admission to ICU. The best model for predicting the risk of death was the decision tree, with a ROC-AUC of 89.66%, whereas the best model for predicting admission to ICU was the support vector machine, with a ROC-AUC of 95.07%. The hematochemical predictors identified in this study can be used as a strong prognostic signature to characterize the severity of the disease in COVID-19 patients.
Keywords: COVID-19; Cox regression; Kaplan–Meier; hematochemical parameters; machine learning; prognostic markers
Year: 2021 PMID: 34960595 PMCID: PMC8705488 DOI: 10.3390/s21248503
Source DB: PubMed Journal: Sensors (Basel) ISSN: 1424-8220 Impact factor: 3.576
Summary of materials and methods exploited in related works.
| Authors | Sample Size | Location | Period | Predictors | Outcomes | Techniques |
|---|---|---|---|---|---|---|
| Yoshida et al. | 776 patients | New Orleans, LA | 27 February– | Demographics, comorbidities, | ICU admission, invasive mechanical | Chi-square test, Fisher’s exact |
| Nachtigall et al. | 1904 patients | Network of | 12 February– | Demographics, comorbidities | ICU admission, invasive mechanical | Descriptive statistics; survival |
| Banoei et al. | 250 patients | Miami, FL, USA | since June 2020 | Clinical features, comorbidities, | In-hospital death | SIMPLS (statistically inspired |
| Zuccaro et al. | 426 patients | Lombardy, Italy | 21 February– | Demographics, comorbidities, | In-hospital death, discharge | Student |
| Zhou et al. | 116 patients | Chongqing, China | 24 January– | Demographics, epidemiological | Disease progression from milder | Chi-square test, Fisher’s exact test, |
| Niu et al. | 150 patients | Huanggang, China | 23 January– | Epidemiological and demographic | In-hospital death | Chi-square test, Fisher’s exact test, |
Figure 1. Data Processing Workflow. The figure shows the study workflow, from the data collection step to the development and assessment of the different predictive models. ML stands for machine learning. Considered ML classifiers include decision trees, random forests, support vector machines, Gaussian naive Bayes, AdaBoost, and K-nearest neighbors.
Demographic characteristics of the patient cohort. The table displays the demographic characteristics presented as absolute frequency (percentage frequency) of all the patients enrolled in the study.
| Characteristic | Total | Deceased | Survived | p-value (death) | Admitted to ICU | p-value (ICU admission) |
|---|---|---|---|---|---|---|
| All patients | 303 | 85 (28.1) | 218 (71.9) | | 74 (24.4) | |
| Sex | | | | 0.6220 | | 0.0384 |
| Male | 184 (60.7) | 54 (29.3) | 130 (70.7) | | 53 (28.8) | |
| Female | 119 (39.3) | 31 (26.1) | 88 (73.9) | | 21 (17.6) | |
| Age | | | | <0.001 | | <0.001 |
| Under 55 | 90 (29.7) | 10 (11.1) | 80 (88.9) | | 13 (14.4) | |
| 55–65 | 72 (23.8) | 10 (13.9) | 62 (86.1) | | 19 (26.4) | |
| 65–80 | 74 (24.4) | 36 (48.6) | 38 (51.4) | | 34 (45.9) | |
| Over 80 | 67 (22.1) | 29 (43.3) | 38 (56.7) | | 8 (11.9) | |
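The record does not say which test produced the p-values for sex. As a rough check, the sketch below applies a chi-square test with Yates continuity correction to the 2×2 counts in the table; this choice of test is an assumption, and the "not admitted" counts (131 and 98) are derived here by subtracting the admitted counts from the row totals. The function name `yates_chi2_2x2` is illustrative.

```python
import math

def yates_chi2_2x2(a, b, c, d):
    """Yates-corrected chi-square test for the 2x2 table [[a, b], [c, d]].

    Returns (chi2, p_value) for 1 degree of freedom.
    """
    n = a + b + c + d
    # Continuity correction: shrink |ad - bc| by n/2, never below zero.
    diff = max(abs(a * d - b * c) - n / 2, 0.0)
    chi2 = n * diff ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # With 1 degree of freedom the upper tail is erfc(sqrt(chi2 / 2)).
    return chi2, math.erfc(math.sqrt(chi2 / 2))

# Rows are Male / Female; columns are (outcome yes, outcome no).
chi2_death, p_death = yates_chi2_2x2(54, 130, 31, 88)  # deceased vs. survived
chi2_icu, p_icu = yates_chi2_2x2(53, 131, 21, 98)      # admitted vs. not admitted (derived)

print(f"death by sex: chi2 = {chi2_death:.3f}, p = {p_death:.4f}")
print(f"ICU by sex:   chi2 = {chi2_icu:.3f}, p = {p_icu:.4f}")
```

Under this assumption the two p-values agree with the reported 0.6220 and 0.0384 after rounding, which suggests, but does not prove, that a corrected chi-square test was used.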
Figure 2. Kaplan–Meier survival curves. (A) Kaplan–Meier curves for death as a function of hospitalization days stratified by sex. (B) Kaplan–Meier curves for the admission to ICU as a function of hospitalization days before the admission stratified by sex. (C) Kaplan–Meier curves for death as a function of hospitalization days stratified by age. (D) Kaplan–Meier curves for the admission to ICU as a function of hospitalization days before the admission stratified by age.
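Kaplan–Meier curves such as these are built with the product-limit estimator, S(t) = Π (1 − d_i / n_i), where d_i is the number of events at time t_i and n_i the number of patients still at risk. The following is a minimal sketch of that estimator; the durations and event flags are made-up toy values, not data from the study, and the function name `kaplan_meier` is illustrative.

```python
from collections import Counter

def kaplan_meier(durations, events):
    """Product-limit estimate of the survival function.

    durations: follow-up time per patient (e.g. hospitalization days)
    events:    1 if the outcome occurred at that time, 0 if censored
    Returns a list of (time, survival probability) at each event time.
    """
    n_at_risk = len(durations)
    event_counts = Counter(t for t, e in zip(durations, events) if e)
    leaving = Counter(durations)  # events and censored patients leave the risk set
    survival = 1.0
    curve = []
    for t in sorted(leaving):
        d = event_counts.get(t, 0)
        if d:
            survival *= 1 - d / n_at_risk
            curve.append((t, survival))
        n_at_risk -= leaving[t]
    return curve

# Hypothetical toy cohort: five patients, two of them censored.
toy_days = [2, 3, 3, 5, 8]
toy_event = [1, 1, 0, 1, 0]
curve = kaplan_meier(toy_days, toy_event)
for day, s in curve:
    print(f"day {day}: S = {s:.2f}")
```

Stratified curves, as in panels A to D, would come from running the same estimator separately on each subgroup (for example, each sex or age band).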
Blood parameters. Data are reported as absolute frequency (percentage frequency).
| Hematochemical Test | Range | Survived | Deceased | Not Admitted to ICU | Admitted to ICU |
|---|---|---|---|---|---|
| Ionized calcium max | <4.6 mg/dL | 170 (90.4) | 66 (82.5) | 185 (94.9) | 51 (69.9) |
| | 4.6–5.3 mg/dL | 17 (9.0) | 13 (16.2) | 9 (4.6) | 21 (28.8) |
| | >5.3 mg/dL | 1 (0.5) | 1 (1.2) | 1 (0.5) | 1 (1.4) |
| | Total | 188 | 80 | 195 | 73 |
| CRP mean | ≤2.9 mg/L | 18 (8.3) | 0 (0.0) | 17 (7.5) | 1 (1.4) |
| | >2.9 mg/L | 199 (91.7) | 84 (100.0) | 211 (92.5) | 72 (98.6) |
| | Total | 217 | 84 | 228 | 73 |
| CRP min | ≤2.9 mg/L | 127 (58.5) | 3 (3.6) | 113 (49.6) | 17 (23.3) |
| | >2.9 mg/L | 90 (41.5) | 81 (96.4) | 115 (50.4) | 56 (76.7) |
| | Total | 217 | 84 | 228 | 73 |
| Total bilirubin min | <0.20 mg/dL | 4 (1.9) | 0 (0.0) | 4 (1.8) | 0 (0.0) |
| | 0.20–1.00 mg/dL | 206 (97.2) | 76 (90.5) | 213 (95.5) | 69 (94.5) |
| | >1.00 mg/dL | 2 (0.9) | 8 (9.5) | 6 (2.7) | 4 (5.5) |
| | Total | 212 | 84 | 223 | 73 |
| Erythrocytes max | <4.54 | 52 (23.9) | 38 (45.2) | 60 (26.2) | 30 (41.1) |
| | 4.54–5.78 | 155 (71.1) | 39 (46.4) | 154 (67.2) | 40 (54.8) |
| | >5.78 | 11 (5.0) | 7 (8.3) | 15 (6.6) | 3 (4.1) |
| | Total | 218 | 84 | 229 | 73 |
| AST min | <15 U/L | 37 (17.1) | 7 (8.3) | 31 (13.7) | 13 (17.8) |
| | 15–37 U/L | 160 (74.1) | 47 (56.0) | 164 (72.2) | 43 (58.9) |
| | >37 U/L | 19 (8.8) | 30 (35.7) | 32 (14.1) | 17 (23.3) |
| | Total | 216 | 84 | 227 | 73 |
Feature selection results for death and admission to ICU. The table displays the statistical information of the different features filtered according to the logit coefficient shown in the last column, and the p-value for both outcomes.
| Hematochemical Test | Group | Mean ± Std | Median ± IQR | Min–Max | N | p-value | Logit Coeff |
|---|---|---|---|---|---|---|---|
| Ionized calcium max (mg/dL) | Overall | 4.2 ± 0.4 | 4.1 ± 0.3 | 3.2–7.7 | 268 | | |
| | Survived | 4.2 ± 0.3 | 4.1 ± 0.3 | 3.2–5.4 | 188 | 0.304 | −3.178 |
| | Deceased | 4.2 ± 0.5 | 4.2 ± 0.5 | 3.5–7.7 | 80 | | |
| | Not admitted to ICU | 4.1 ± 0.3 | 4.1 ± 0.2 | 3.2–5.4 | 195 | 0.003 | 5.629 |
| | Admitted to ICU | 4.4 ± 0.5 | 4.3 ± 0.4 | 3.6–7.7 | 73 | | |
| CRP mean (mg/L) | Overall | 66.9 ± 69.7 | 42.5 ± 76.4 | 2.9–332.0 | 301 | | |
| | Survived | 36.8 ± 32.9 | 30.2 ± 38.8 | 2.9–169.4 | 217 | <0.001 | 4.670 |
| | Deceased | 144.7 ± 79.0 | 137.0 ± 94.9 | 3.9–332.0 | 84 | | |
| | Not admitted to ICU | 47.3 ± 53.0 | 31.4 ± 49.8 | 2.9–332.0 | 228 | <0.001 | 4.169 |
| | Admitted to ICU | 128.1 ± 79.9 | 119.5 ± 92.3 | 2.9–330.2 | 73 | | |
| CRP min (mg/L) | Overall | 29.1 ± 52.5 | 4.6 ± 19.9 | 2.9–301.0 | 301 | | |
| | Survived | 8.0 ± 15.2 | 2.9 ± 3.9 | 2.9–142.0 | 217 | <0.001 | 3.252 |
| | Deceased | 83.4 ± 72.2 | 63.8 ± 119.2 | 2.9–301.0 | 84 | | |
| | Not admitted to ICU | 19.4 ± 41.2 | 3.1 ± 7.8 | 2.9–301.0 | 228 | <0.001 | 7.854 |
| | Admitted to ICU | 59.2 ± 70.2 | 19.8 ± 93.5 | 2.9–295.0 | 73 | | |
| Total bilirubin min (mg/dL) | Overall | 0.47 ± 0.40 | 0.40 ± 0.20 | 0.10–5.90 | 296 | | |
| | Survived | 0.41 ± 0.20 | 0.40 ± 0.20 | 0.10–1.60 | 212 | <0.001 | 2.999 |
| | Deceased | 0.62 ± 0.66 | 0.50 ± 0.30 | 0.20–5.90 | 84 | | |
| | Not admitted to ICU | 0.43 ± 0.24 | 0.40 ± 0.20 | 0.10–1.60 | 223 | 0.009 | 4.104 |
| | Admitted to ICU | 0.58 ± 0.69 | 0.40 ± 0.20 | 0.20–5.90 | 73 | | |
| Erythrocytes max | Overall | 4.5 ± 0.6 | 4.6 ± 0.8 | 2.6–6.8 | 302 | | |
| | Survived | 4.6 ± 0.5 | 4.6 ± 0.6 | 3.1–6.6 | 218 | 0.005 | 2.908 |
| | Deceased | 4.4 ± 0.8 | 4.3 ± 0.9 | 2.6–6.8 | 84 | | |
| | Not admitted to ICU | 4.6 ± 0.6 | 4.6 ± 0.7 | 2.6–6.8 | 229 | 0.588 | 4.105 |
| | Admitted to ICU | 4.5 ± 0.6 | 4.5 ± 0.8 | 3.3–6.2 | 73 | | |
| AST min (U/L) | Overall | 26.8 ± 15.0 | 23.0 ± 15.0 | 7.0–115.0 | 300 | | |
| | Survived | 23.5 ± 10.5 | 21.0 ± 11.3 | 7.0–74.0 | 216 | <0.001 | 3.313 |
| | Deceased | 35.3 ± 20.7 | 31.0 ± 22.3 | 8.0–115.0 | 84 | | |
| | Not admitted to ICU | 25.9 ± 14.0 | 22.0 ± 14.0 | 9.0–115.0 | 227 | 0.279 | 7.477 |
| | Admitted to ICU | 29.4 ± 17.6 | 24.0 ± 20.0 | 7.0–89.0 | 73 | | |
Figure 3. Scatter plot of low-dimensional feature embeddings (death outcome). A 2D visualization of hematochemical parameters with PCA and t-SNE. Different colors are used for survived and deceased patients. (Top left) PCA starting from the selected features; (top right) t-SNE starting from the selected features; (bottom left) PCA starting from all features; (bottom right) t-SNE starting from all features.
Figure 4. Scatter plot of low-dimensional feature embeddings (admission to ICU outcome). A 2D visualization of hematochemical parameters with PCA and t-SNE. Different colors distinguish patients who were transferred to the ICU from those who were not. (Top left) PCA starting from the selected features; (top right) t-SNE starting from the selected features; (bottom left) PCA starting from all features; (bottom right) t-SNE starting from all features.
Figure 5. Violin plots of the distribution of the selected laboratory features considering mortality as outcome. C-reactive protein (CRP) mean, CRP min, Total bilirubin min, Erythrocytes max, and AST min proved to be statistically significant according to the Mann–Whitney U test.
Figure 6. Violin plots of the distribution of the selected laboratory features considering the admission to ICU as outcome. Ionized calcium max, CRP mean, CRP min, and Total bilirubin min proved to be statistically significant according to the Mann–Whitney U test.
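The Mann–Whitney U test named in these captions compares two groups by ranking all values together. The sketch below is a simplified version that uses the normal approximation without tie or continuity corrections, which is an assumption and may differ from the exact procedure the authors used. The two samples are made-up toy values, not measurements from the cohort, and `mann_whitney_u` is an illustrative name.

```python
import math

def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test via the normal approximation.

    Simplified: no tie correction of the variance, no continuity correction.
    Returns (U statistic of sample x, p_value).
    """
    pooled = sorted((value, group) for group, sample in enumerate((x, y)) for value in sample)
    rank_sum_x = 0.0
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2  # mean of the 1-based ranks i+1 .. j shared by ties
        rank_sum_x += avg_rank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    n1, n2 = len(x), len(y)
    u = rank_sum_x - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mean_u) / sd_u
    return u, math.erfc(abs(z) / math.sqrt(2))

# Hypothetical marker values for two outcome groups (toy data).
toy_survived = [3.1, 12.0, 25.4, 30.2]
toy_deceased = [41.0, 88.5, 137.0, 150.3, 210.9]
u_stat, p_value = mann_whitney_u(toy_survived, toy_deceased)
print(f"U = {u_stat:.1f}, p = {p_value:.4f}")
```

With samples this small an exact test would normally be preferred; the approximation is shown only to make the ranking logic visible.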
Risk factors for both outcomes: Cox regression analysis. For each feature, the first row refers to the mortality risk, whereas the second row refers to the admission to ICU.
| Hematochemical Test | Normality Range | Outcome | log(HR) | 95% CI log(HR) | HR | 95% CI HR | p-value |
|---|---|---|---|---|---|---|---|
| CRP mean | <2.9 mg/L | Death | Not significant | | | | |
| | | ICU admission | 1.061 | [−0.957, 3.080] | 2.890 | [0.384, 21.757] | 0.303 |
| CRP min | <2.9 mg/L | Death | 2.888 | [1.879, 3.897] | 17.963 | [6.548, 49.277] | <0.001 |
| | | ICU admission | 0.582 | [0.000, 1.163] | 1.789 | [1.000, 3.200] | 0.050 |
| Erythrocytes max | 4.54–5.78 | Death | 0.568 | [0.132, 1.004] | 1.765 | [1.141, 2.729] | 0.011 |
| | | ICU admission | 0.393 | [−0.111, 0.897] | 1.481 | [0.895, 2.452] | 0.127 |
| Total bilirubin min | 0.20–1.00 mg/dL | Death | 0.435 | [−0.317, 1.188] | 1.545 | [0.728, 3.279] | 0.257 |
| | | ICU admission | 0.321 | [−0.712, 1.355] | 1.379 | [0.491, 3.876] | 0.542 |
| AST min | 15–37 U/L | Death | 0.281 | [−0.161, 0.722] | 1.324 | [0.851, 2.059] | 0.213 |
| | | ICU admission | 0.192 | [−0.290, 0.674] | 1.211 | [0.748, 1.962] | 0.436 |
| Ionized calcium max | 4.6–5.3 mg/dL | Death | 0.098 | [−0.497, 0.692] | 1.103 | [0.609, 1.998] | 0.747 |
| | | ICU admission | −1.293 | [−1.843, −0.744] | 0.274 | [0.158, 0.475] | <0.001 |
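The columns of this table are linked by simple identities: HR = exp(log(HR)), the CI on the HR scale is the exponentiated CI on the log scale, and a Wald p-value follows from the standard error implied by the CI width. The sketch below applies these identities to three rows of the table. It assumes the intervals are symmetric Wald intervals on the log scale, which the record does not state, and the helper name `summarize` is illustrative.

```python
import math

Z_95 = 1.959964  # two-sided 95% quantile of the standard normal

def summarize(log_hr, ci_low, ci_high):
    """Convert a log hazard ratio and its 95% CI to HR, HR-scale CI, and a Wald p-value."""
    se = (ci_high - ci_low) / (2 * Z_95)  # standard error recovered from the CI width
    z = log_hr / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail probability
    return math.exp(log_hr), (math.exp(ci_low), math.exp(ci_high)), p

# log(HR) and 95% CI copied from the table rows with HR 17.963, 1.765 and 0.274.
hr_a, ci_a, p_a = summarize(2.888, 1.879, 3.897)
hr_b, ci_b, p_b = summarize(0.568, 0.132, 1.004)
hr_c, ci_c, p_c = summarize(-1.293, -1.843, -0.744)

for hr, ci, p in [(hr_a, ci_a, p_a), (hr_b, ci_b, p_b), (hr_c, ci_c, p_c)]:
    print(f"HR = {hr:.3f}, 95% CI = [{ci[0]:.3f}, {ci[1]:.3f}], p = {p:.3g}")
```

Small differences from the tabulated HR values are expected because the log(HR) entries are rounded to three decimals. An HR above 1 means a higher hazard for the outcome; an HR below 1, as in the last row, means a lower one.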
Figure 7. Cox regression coefficients for mortality risk (top) and risk of admission to ICU (bottom). Hazard ratio (HR) is plotted with the 95% confidence interval (CI).
Figure 8. Predictive model performances for mortality prediction. Model performances for the mortality prediction displayed as bar plots for accuracy, precision, recall, and ROC-AUC.
Figure 9. Predictive model performances for ICU prediction. Model performances for the ICU admission prediction displayed as bar plots for accuracy, precision, recall, and ROC-AUC.
Figure 10. ROC curve of the decision tree for mortality prediction.
Figure 11. ROC curve of the support vector machine for ICU admission prediction.
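The four metrics reported in Figures 8 and 9 can be computed directly from true labels and classifier scores. The sketch below shows one common way to do so, with ROC-AUC obtained as the fraction of positive/negative pairs ranked correctly (ties counted as one half). The labels, scores, and the 0.5 decision threshold are made-up toy values, not outputs of the models in the study, and the function names are illustrative.

```python
def classification_metrics(labels, scores, threshold=0.5):
    """Accuracy, precision and recall after thresholding the scores."""
    preds = [int(s >= threshold) for s in scores]
    tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 0)
    tn = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 0)
    accuracy = (tp + tn) / len(labels)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return accuracy, precision, recall

def roc_auc(labels, scores):
    """ROC-AUC as the share of (positive, negative) pairs ranked correctly."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical labels (1 = outcome occurred) and classifier scores.
toy_labels = [1, 1, 0, 1, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.3, 0.2]

accuracy, precision, recall = classification_metrics(toy_labels, toy_scores)
auc = roc_auc(toy_labels, toy_scores)
print(f"accuracy={accuracy:.3f} precision={precision:.3f} recall={recall:.3f} auc={auc:.3f}")
```

Unlike accuracy, precision, and recall, ROC-AUC does not depend on the threshold, which is one reason it is often used to compare classifiers with different score scales.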