Susana Garcia-Gutiérrez1,2,3, Cristobal Esteban-Aizpiri4, Iratxe Lafuente5,6, Irantzu Barrio7,8,6, Raul Quiros9,6, Jose Maria Quintana5,8,6, Ane Uranga10,11.
Abstract
Despite the publication of a great number of tools to aid decisions in COVID-19 patients, there is a lack of good instruments to predict clinical deterioration. COVID19-Osakidetza is a prospective cohort study recruiting COVID-19 patients. We collected information from baseline to discharge on sociodemographic characteristics, comorbidities and associated medications, vital signs, treatment received and lab test results. The outcome was need for intensive ventilatory support (with at least a standard high-flow oxygen face mask with a reservoir bag for at least 6 h and need for more intensive therapy afterwards, or Optiflow high-flow nasal cannula, or noninvasive or invasive mechanical ventilation) and/or admission to a critical care unit and/or death during hospitalization. We developed a CatBoost model and summarized the findings using Shapley Additive Explanations. Performance of the model was assessed using the areas under the receiver operating characteristic and precision-recall curves (AUROC and AUPRC, respectively), and calibration was assessed using the Hosmer-Lemeshow test. Overall, 1568 patients were included in the derivation cohort and 956 in the (external) validation cohort. The percentages of patients who reached the composite endpoint were 23.3% and 20%, respectively. The strongest predictor of clinical deterioration was arterial blood oxygen pressure, followed by age, levels of several markers of inflammation (procalcitonin, LDH, CRP) and alterations in blood count and coagulation. Some medications, namely ATC A02 (antacids) and N05 (neuroleptics), were also among the main predictors, together with C03 (diuretics). In the validation set, the CatBoost AUROC was 0.79, the AUPRC 0.21 and the Hosmer-Lemeshow test statistic 0.36. We present a machine learning-based prediction model with excellent performance properties for implementation in electronic health records (EHRs).
Our main goal was to predict progression to a score of 5 or higher on the WHO Clinical Progression Scale before patients required mechanical ventilation. Future steps are to externally validate the model in other settings and in a cohort from a different period and to apply the algorithm in clinical practice.
Registration: ClinicalTrials.gov Identifier: NCT04463706.
Year: 2022 PMID: 35501359 PMCID: PMC9059444 DOI: 10.1038/s41598-022-09771-z
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.996
Figure 1. Flow chart.
Characteristics of the patients included in derivation and validation cohorts.
| Variable | Development and internal validation (n = 1568) | External validation (n = 956) | p-value | Missing (development/external validation) |
|---|---|---|---|---|
| Sex (male) | | | 0.01 | |
| Age* | 67.42 (16) | 65.75 (20) | 0.03 | |
| Temperature MAX | 37.28 (0.92) | 37.07 (0.87) | < 0.0001 | 193/160 |
| SpO2 MIN | 94.27 (2.87) | 95.18 (2.42) | < 0.0001 | 815/458 |
| Glucose | 122 (44.29) | 131.6 (59.17) | < 0.0001 | 3/12 |
| Urea | 45.55 (35) | 45 (34.2) | 0.60 | 4/15 |
| Sodium | 137.5 (4.27) | 138.3 (4.21) | < 0.0001 | 10/14 |
| Potassium | 4.13 (0.51) | 4.06 (0.51) | 0.0017 | 46/36 |
| D-dimer | 1777 (4793) | 1317 (2590) | 0.0049 | 316/153 |
| Prothrombin time | 83.7 (21.07) | 85.33 (19.64) | 0.05 | 106/23 |
| LDH | 308 (127) | 277 (269) | < 0.0001 | 237/170 |
| C-reactive protein | 83 (76) | 73.6 (69.35) | 0.0022 | 13/27 |
| Procalcitonin | 0.37 (2.58) | 0.52 (4.12) | 0.37 | 230/189 |
| Composite outcome (clinical deterioration) | 365 (23.28) | 191 (19.98) | 0.05 | |
| VMK 100% | 255 (16.26) | 79 (8.26) | < 0.0001 | |
| Optiflow | 87 (5.55) | 73 (7.64) | 0.04 | |
| NIMV | 45 (2.87) | 15 (1.57) | 0.0374 | |
| ICU admission | 78 (4.97) | 54 (5.65) | 0.46 | |
| Death | 180 (11.48) | 110 (11.51) | 0.98 | |
NSAIDS non-steroidal anti-inflammatory drugs, MAX maximum value, MIN minimum value, SpO2 pulse oximetric saturation, PaO2 partial arterial oxygen concentration, ALT alanine aminotransferase, LDH lactate dehydrogenase, hs-cTnT high-sensitivity cardiac troponin T, RDW red blood cell distribution width, VMK100% standard high-flow oxygen face mask with reservoir bag for at least 6 h and need for more intensive therapy afterwards, Optiflow (TM) high-flow nasal cannula, NIMV non-invasive mechanical ventilation, ICU intensive care unit.
Data are given as frequencies and percentages except for *, expressed as means and standard deviations.
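As a rough cross-check of the categorical comparisons in the table above, the sketch below applies a pooled two-proportion z-test to three of the reported outcome counts. The source does not state which statistical test produced the p-values, so the choice of test is an assumption, and the function name `two_proportion_z_test` is purely illustrative.

```python
import math


def two_proportion_z_test(x1, n1, x2, n2):
    """Two-sided z-test for equality of two proportions (pooled variance)."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-sided tail probability of the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Cohort sizes and event counts copied from the table above.
N_DEV, N_VAL = 1568, 956
outcomes = [("Optiflow", 87, 73), ("ICU admission", 78, 54), ("Death", 180, 110)]

for name, x_dev, x_val in outcomes:
    z, p = two_proportion_z_test(x_dev, N_DEV, x_val, N_VAL)
    print(f"{name}: z = {z:.2f}, p = {p:.2f}")
```

Under this assumption, the computed p-values land close to the reported ones for these three rows (0.04, 0.46 and 0.98); other rows may have been tested differently.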
Univariate analysis, relationship between predictors and outcome in development-internal validation and external validation datasets.
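| Variable | Development: missing (no deterioration/deterioration) | Development: no deterioration | Development: deterioration | Development: p-value | External validation: missing (no deterioration/deterioration) | External validation: no deterioration | External validation: deterioration | External validation: p-value |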
|---|---|---|---|---|---|---|---|---|
| Sex (male) | | 681 (56.61) | 252 (69.04) | < 0.0001 | | 406 (53.07) | 114 (59.7) | 0.10 |
| Age* | | 65.29 (16) | 74.53 (14) | < 0.0001 | | 62.94 (19.8) | 77 (14.7) | < 0.0001 |
| Temperature MAX | 153/40 | 37.23 (0.9) | 37.46 (1) | 0.0002 | 127/33 | 37.07 (0.83) | 37.04 (1.02) | 0.75 |
| SpO2 MIN | 539/276 | 94.53 (2.5) | 92.3 (4.33) | < 0.0001 | 322/136 | 95.35 (2.3) | 93.70 (2.62) | < 0.0001 |
| Glucose | 2/1 | 118.1 (40.33) | 134.3 (53.56) | < 0.0001 | 12/0 | 128.1 (56.41) | 145.5 (67.32) | 0.001 |
| Urea | 10/4 | 41 (28) | 60.65 (47.33) | < 0.001 | 3/0 | 40.21 (27.53) | 62.85 (48.89) | < 0.0001 |
| Sodium | 7/4 | 137.4 (3.70) | 137.5 (5.77) | 0.79 | 2/0 | 138.3 (3.91) | 138.2 (5.24) | 0.73 |
| Potassium | 33/13 | 4.11 (0.50) | 4.19 (0.55) | 0.01 | 40/6 | 4.06 (0.49) | 4.07 (0.57) | 0.70 |
| D-dimer | 226/90 | 1516.2 (3814.6) | 2703.2 (7207) | 0.0090 | 133/20 | 1185.5 (2264.3) | 1803.2 (3509.5) | 0.03 |
| Prothrombin time | 79/27 | 85 (20.31) | 79.33 (23) | < 0.0001 | 21/2 | 86.39 (18.94) | 81.18 (21.72) | 0.0028 |
| LDH | 176/61 | 291 (95.61) | 365 (187) | < 0.0001 | 148/22 | 269.1 (96.66) | 304.6 (130.3) | 0.001 |
| C-reactive protein | 12/1 | 71.07 | 120.6 (87.3) | < 0.0001 | 15/2 | 66.90 (65.69) | 99.64 (76.93) | < 0.0001 |
| Procalcitonin | 182/48 | 0.30 (2.74) | 0.60 (1.95) | 0.03 | 165/24 | 0.51 (4.56) | 0.56 (1.76) | 0.81 |
| Red blood cells | 3/0 | 4.63 (0.61) | 4.5 (0.68) | 0.0017 | 7/1 | 4.6 (0.64) | 4.42 (0.73) | 0.0018 |
| Haemoglobin | 3/0 | 13.74 (1.79) | 13.5 (2.02) | 0.0467 | 7/9 | 13.56 (1.9) | 13.11 (2.02) | 0.0038 |
| Haematocrit | 3/0 | 42.13 (5.21) | 41.79 (6.03) | 0.33 | 7/1 | 41.21 (5.41) | 40.13 (6.03) | 0.01 |
| Mean corpuscular volume | 3/0 | 91.26 (5.96) | 93.11 (6.69) | < 0.0001 | 7/1 | 89.94 (6.30) | 91.39 (6.6) | 0.0052 |
| RDW | 3/0 | 13.18 (1.60) | 13.91 (1.89) | < 0.0001 | 7/1 | 13.49 (1.73) | 14.18 (2.13) | < 0.0001 |
| Platelets | 3/0 | 201.5 (82.43) | 171.6 (64.93) | < 0.0001 | 7/1 | 206.3 (82.70) | 174.3 (72.93) | < 0.0001 |
| Leucocytes | 3/0 | 6.71 (2.88) | 7.7 (6) | 0.0023 | 7/0 | 6.81 (3.43) | 7.15 (3.57) | 0.24 |
| Lymphocytes | 6/2 | 1.18 (0.62) | 1.33 (4.81) | < 0.0001 | 22/6 | 1.22 (0.65) | 0.95 (0.60) | < 0.0001 |
| Neutrophils | 6/2 | 4.92 (2.59) | 5.85 (3.33) | < 0.0001 | 22/6 | 4.86 (2.85) | 5.48 (3.16) | 0.0098 |
| Basophils | 6/2 | 0.02 (0.02) | 0.02 (0.04) | 0.86 | 22/6 | 0.02 (0.0006) | 0.02 (0.00148) | 0.17 |
| Monocytes | 6/2 | 0.54 (0.31) | 0.48 (0.47) | 0.03 | 22/6 | 0.52 (0.28) | 0.50 (0.47) | 0.67 |
| Eosinophils | 6/2 | 0.04 (0.14) | 0.02 (0.04) | < 0.0001 | 22/6 | 0.05 (0.27) | 0.02 (0.03) | 0.0009 |
NSAIDS non-steroidal anti-inflammatory drugs, MAX maximum value, MIN minimum value, SpO2 pulse oximetric saturation, PaO2 partial arterial oxygen concentration, ALT alanine aminotransferase, LDH lactate dehydrogenase, hs-cTnT high-sensitivity cardiac troponin T, RDW red blood cell distribution width, VMK100% standard high-flow oxygen face mask with reservoir bag for at least 6 h and need for more intensive therapy afterwards, Optiflow (TM) high-flow nasal cannula, NIMV non-invasive mechanical ventilation, ICU intensive care unit.
Figure 2. Main predictors in the CatBoost model in (a) derivation-internal validation and (b) external validation datasets. PO2-A partial arterial oxygen concentration, PCR C-reactive protein, PCT procalcitonin, Edad age, LDH lactate dehydrogenase, PLT platelets, ADE red blood cell distribution width, CREA creatinine, Mon%A percentage of monocytes, CK creatine kinase, Dimer D-dimer, EOS%A percentage of eosinophils, AO2_12 antacids in the last 12 months, N05_12 neuroleptics in the last 12 months, MON#A total count of monocytes, GLU glucose, EOS#A total count of eosinophils, lin#A total count of lymphocytes, Neu%A percentage of neutrophils, CO3_12 diuretics in the last 12 months.
Figure 3. Predictive performance of the CatBoost model in (a) derivation and (b) validation sets.
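The two discrimination measures reported for the model, AUROC and AUPRC, can be illustrated with a minimal from-scratch sketch. The labels and scores below are hypothetical toy values, not study data, and the step-wise average precision used here is only one common estimator of the area under the precision-recall curve; the authors' exact computation is not described in the source.

```python
def auroc(y_true, scores):
    """AUROC as the probability that a random positive outranks a random
    negative, counting ties as one half."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(y_true, scores):
    """Step-wise area under the precision-recall curve: the mean of the
    precision values observed at each true positive, in score order."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    true_positives, total = 0, 0.0
    for rank, i in enumerate(order, start=1):
        if y_true[i] == 1:
            true_positives += 1
            total += true_positives / rank
    return total / sum(y_true)


# Hypothetical toy example: 1 = deteriorated, 0 = did not deteriorate.
y_toy = [0, 0, 1, 0, 1, 0, 0, 1, 0, 0]
score_toy = [0.05, 0.10, 0.35, 0.20, 0.80, 0.15, 0.40, 0.60, 0.08, 0.30]

print(f"AUROC = {auroc(y_toy, score_toy):.3f}")
print(f"AUPRC (average precision) = {average_precision(y_toy, score_toy):.3f}")
```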
Figure 4. Calibration performance in (a) derivation and (b) validation sets.
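Calibration was assessed with the Hosmer-Lemeshow test. The sketch below shows one conventional form of that statistic, assuming ten equal-sized risk groups and a chi-square reference distribution with (groups - 2) degrees of freedom; the grouping used by the authors is not stated in the source. The data are simulated, and `chi2_sf_even_df` is a hypothetical helper that only handles even degrees of freedom, which is enough for the ten-group case.

```python
import math
import random


def chi2_sf_even_df(x, df):
    """Chi-square upper-tail probability, closed form valid for even df only."""
    if df <= 0 or df % 2:
        raise ValueError("this helper only supports positive even df")
    half = x / 2
    term, total = 1.0, 1.0
    for k in range(1, df // 2):
        term *= half / k
        total += term
    return math.exp(-half) * total


def hosmer_lemeshow(y_true, probs, groups=10):
    """Hosmer-Lemeshow statistic and p-value over equal-sized risk groups."""
    pairs = sorted(zip(probs, y_true))
    n = len(pairs)
    stat = 0.0
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        n_g = len(chunk)
        observed = sum(y for _, y in chunk)
        expected = sum(p for p, _ in chunk)
        # Events and non-events both contribute to the statistic.
        stat += (observed - expected) ** 2 / expected
        stat += (expected - observed) ** 2 / (n_g - expected)
    return stat, chi2_sf_even_df(stat, groups - 2)


# Simulated, well-calibrated predictions (not study data).
rng = random.Random(0)
probs_sim = [rng.uniform(0.02, 0.9) for _ in range(500)]
y_sim = [1 if rng.random() < p else 0 for p in probs_sim]

hl_stat, hl_p = hosmer_lemeshow(y_sim, probs_sim)
print(f"Hosmer-Lemeshow statistic = {hl_stat:.2f}, p = {hl_p:.2f}")
```

A large p-value means the test found no evidence of miscalibration; it does not by itself demonstrate good calibration.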
Sensitivity, specificity, and positive and negative predictive values at different cut-off points in both the (a) derivation-internal validation and (b) external validation datasets.
Derivation cohort

| Risk groups | No. of patients | Sensitivity | Specificity | Positive predictive value | Negative predictive value | No. (%) deteriorated in complementary risk groups |
|---|---|---|---|---|---|---|
| Score > 0.04 | 1424 | 1 | 0.12 | 0.26 | 0.99 | 1 (0.69%) |
| Score > 0.13 | 851 | 0.98 | 0.59 | 0.42 | 0.99 | 9 (1.26%) |
| Score > 0.23 | 530 | 0.90 | 0.83 | 0.62 | 0.96 | 38 (3.66%) |
| Score > 0.47 | 227 | 0.59 | 0.99 | 0.94 | 0.89 | 151 (11.26%) |
| Score > 0.82 | 31 | 0.08 | 1 | 1 | 0.78 | 334 (21.73%) |
Risk cut-off values were defined by the total point score for an individual, which represented low (< 2% mortality rate), intermediate (2–14.9%), or high risk (≥ 15%) groups, similar to commonly used pneumonia risk stratification scores.
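The derivation-cohort figures in the table above can be recovered from its counts. The sketch below assumes that the "complementary risk group" is the set of patients scoring at or below each cut-off, and takes the cohort size (1568) and the number of patients reaching the composite outcome (365) from the cohort characteristics table; the function and variable names are illustrative.

```python
TOTAL_PATIENTS = 1568      # derivation cohort size
TOTAL_DETERIORATED = 365   # patients reaching the composite outcome

# (cut-off, patients above the cut-off, deteriorated patients at or below it)
ROWS = [
    (0.04, 1424, 1),
    (0.13, 851, 9),
    (0.23, 530, 38),
    (0.47, 227, 151),
    (0.82, 31, 334),
]


def metrics(n_above, missed):
    """Confusion-matrix metrics when 'score > cut-off' is the positive test."""
    tp = TOTAL_DETERIORATED - missed            # deteriorated and flagged
    fn = missed                                 # deteriorated but not flagged
    fp = n_above - tp                           # flagged, did not deteriorate
    tn = (TOTAL_PATIENTS - n_above) - fn        # not flagged, did not deteriorate
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


for cutoff, n_above, missed in ROWS:
    m = metrics(n_above, missed)
    print(
        f"Score > {cutoff:.2f}: "
        + ", ".join(f"{name} = {value:.2f}" for name, value in m.items())
    )
```

With these assumptions the recomputed values agree with the tabulated ones to two decimal places, which suggests the table is internally consistent.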
Figure 5. Differences between traditional development and machine learning-based prediction models.