Stanislas Werfel, Carolin E M Jakob, Stefan Borgmann, Jochen Schneider, Christoph Spinner, Maximilian Schons, Martin Hower, Kai Wille, Martina Haselberger, Hanno Heuzeroth, Maria M Rüthrich, Sebastian Dolff, Johanna Kessel, Uwe Heemann, Jörg J Vehreschild, Siegbert Rieg, Christoph Schmaderer.
Abstract
Scores to identify patients at high risk of progression of coronavirus disease (COVID-19), caused by the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), may become instrumental for clinical decision-making and patient management. We used patient data from the multicentre Lean European Open Survey on SARS-CoV-2-Infected Patients (LEOSS) and applied variable selection to develop a simplified scoring system to identify patients at increased risk of critical illness or death. A total of 1946 patients who tested positive for SARS-CoV-2 were included in the initial analysis and assigned to derivation and validation cohorts (n = 1297 and n = 649, respectively). Stability selection from over 100 baseline predictors for the combined endpoint of progression to the critical phase or COVID-19-related death enabled the development of a simplified score consisting of five predictors: C-reactive protein (CRP), age, clinical disease phase (uncomplicated vs. complicated), serum urea, and D-dimer (abbreviated as CAPS-D score). This score yielded an area under the curve (AUC) of 0.81 (95% confidence interval [CI]: 0.77-0.85) in the validation cohort for predicting the combined endpoint within 7 days of diagnosis and 0.81 (95% CI: 0.77-0.85) during full follow-up. We used an additional prospective cohort of 682 patients, diagnosed largely after the "first wave" of the pandemic, to validate the predictive accuracy of the score and observed similar results (AUC for the event within 7 days: 0.83 [95% CI: 0.78-0.87]; for full follow-up: 0.82 [95% CI: 0.78-0.86]). An easily applicable score to calculate the risk of COVID-19 progression to critical illness or death was thus established and validated.
Keywords: COVID-19; logistic models; machine learning; risk factors
Year: 2021 PMID: 34331717 PMCID: PMC8426905 DOI: 10.1002/jmv.27252
Source DB: PubMed Journal: J Med Virol ISSN: 0146-6615 Impact factor: 20.693
Figure 2. Patient flow diagram (A) and months of COVID‐19 diagnosis (B) for the different data sets.
Characteristics of patients in the derivation and validation data sets
| Predictor | Deriv. | Valid. | Test, f. | Test, l. | Predictor | Deriv. | Valid. | Test, f. | Test, l. |
|---|---|---|---|---|---|---|---|---|---|
| Total patients | | | | | CRP (mg/L) | | | | |
| | 1297 | 649 | 682 | 219 | <3 | 181 (14%) | 101 (16%) | 97 (14%) | 37 (17%) |
| Event during follow‐up (7d/all) | | | | | 3–29 | 454 (35%) | 222 (34%) | 250 (37%) | 80 (37%) |
| No | 1095/1036 | 555/522 | 613/597 | 198/190 | 30–69 | 266 (21%) | 132 (20%) | 140 (21%) | 47 (21%) |
| | (84%/80%) | (86%/80%) | (90%/88%) | (90%/87%) | 70–119 | 166 (13%) | 85 (13%) | 92 (13%) | 28 (13%) |
| Yes | 202/261 | 94/127 | 69/85 | 21/29 | 120–179 | 124 (10%) | 55 (8%) | 67 (10%) | 18 (8%) |
| | (16%/20%) | (14%/20%) | (10%/12%) | (10%/13%) | 180–249 | 52 (4%) | 26 (4%) | 18 (3%) | 6 (3%) |
| Type of patient care (not used for analyses) | | | | | >249 | 32 (2%) | 17 (3%) | 6 (1%) | 0 (0%) |
| Outpatient | 16 (1%) | 11 (2%) | 9 (1%) | 1 (0%) | Missing | 22 (2%) | 11 (2%) | 12 (2%) | 3 (1%) |
| Inpatient | 1255 (97%) | 627 (97%) | 648 (95%) | 207 (95%) | PCT (ng/ml) | | | | |
| Missing | 26 (2%) | 11 (2%) | 25 (4%) | 11 (5%) | <0.005 | 78 (6%) | 28 (4%) | 27 (4%) | 12 (5%) |
| Age (year) | | | | | 0.005–0.5 | 562 (43%) | 282 (43%) | 367 (54%) | 161 (74%) |
| ≤25 | 22 (2%) | 17 (3%) | 36 (5%) | 9 (4%) | 0.51–2 | 58 (4%) | 35 (5%) | 28 (4%) | 10 (5%) |
| 26–35 | 78 (6%) | 42 (6%) | 64 (9%) | 29 (13%) | 2.1–10 | 0 (0%) | 0 (0%) | 13 (2%) | 5 (2%) |
| 36–45 | 105 (8%) | 50 (8%) | 86 (13%) | 29 (13%) | >10 | 10 (1%) | 6 (1%) | 4 (1%) | 1 (0%) |
| 46–55 | 189 (15%) | 98 (15%) | 104 (15%) | 38 (17%) | Missing | 589 (45%) | 298 (46%) | 243 (36%) | 30 (14%) |
| 56–65 | 244 (19%) | 117 (18%) | 120 (18%) | 45 (21%) | D‐dimer (LN) | | | | |
| 66–75 | 214 (16%) | 118 (18%) | 89 (13%) | 25 (11%) | Normal | 232 (18%) | 123 (19%) | 158 (23%) | 83 (38%) |
| 76–85 | 317 (24%) | 140 (22%) | 133 (20%) | 30 (14%) | >1x, ≤2x | 211 (16%) | 109 (17%) | 126 (18%) | 72 (33%) |
| >85 | 110 (8%) | 59 (9%) | 47 (7%) | 13 (6%) | >2x, ≤5x | 159 (12%) | 69 (11%) | 72 (11%) | 34 (16%) |
| Missing | 18 (1%) | 8 (1%) | 3 (0%) | 1 (0%) | >5x, ≤10x | 39 (3%) | 27 (4%) | 24 (4%) | 9 (4%) |
| Sex | | | | | >10x, ≤20x | 20 (2%) | 11 (2%) | 8 (1%) | 2 (1%) |
| Male | 768 (59%) | 360 (55%) | 390 (57%) | 133 (61%) | >20x | 21 (2%) | 12 (2%) | 6 (1%) | 4 (2%) |
| Female | 529 (41%) | 289 (45%) | 292 (43%) | 86 (39%) | Missing | 615 (47%) | 298 (46%) | 288 (42%) | 15 (7%) |
| Disease phase | | | | | Neutrophils (×1000/μl) | | | | |
| Uncompl. | 876 (68%) | 430 (66%) | 488 (72%) | 162 (74%) | <0.1 | 11 (1%) | 3 (0%) | 4 (1%) | 1 (0%) |
| Compl. | 421 (32%) | 219 (34%) | 194 (28%) | 57 (26%) | 0.1 to <0.3 | 14 (1%) | 3 (0%) | 2 (0%) | 0 (0%) |
| Any cardiovascular comorbidity | | | | | 0.3 to <0.5 | 22 (2%) | 10 (2%) | 2 (0%) | 0 (0%) |
| Yes | 727 (56%) | 370 (57%) | 346 (51%) | 104 (47%) | 0.5 to <2 | 118 (9%) | 62 (10%) | 47 (7%) | 15 (7%) |
| No | 545 (42%) | 262 (40%) | 326 (48%) | 113 (52%) | 2 to <5 | 524 (40%) | 262 (40%) | 275 (40%) | 105 (48%) |
| Missing | 25 (2%) | 17 (3%) | 10 (1%) | 2 (1%) | 5 to <9 | 262 (20%) | 139 (21%) | 144 (21%) | 54 (25%) |
| Malignant neoplasia | | | | | ≥9 | 71 (5%) | 40 (6%) | 39 (6%) | 6 (3%) |
| No | 1263 (97%) | 635 (98%) | 678 (99%) | 218 (100%) | Missing | 275 (21%) | 130 (20%) | 169 (25%) | 38 (17%) |
| Yes | 34 (3%) | 14 (2%) | 4 (1%) | 1 (0%) | Lymphocytes (×1000/μl) | | | | |
| LDH (LN) | | | | | <0.1 | 16 (1%) | 8 (1%) | 7 (1%) | 1 (0%) |
| <Normal | 0 (0%) | 0 (0%) | 8 (1%) | 2 (1%) | 0.1 to <0.3 | 56 (4%) | 30 (5%) | 18 (3%) | 1 (0%) |
| Normal | 439 (34%) | 218 (34%) | 249 (37%) | 98 (45%) | 0.3 to <0.5 | 95 (7%) | 43 (7%) | 33 (5%) | 9 (4%) |
| >1x, ≤2x | 596 (46%) | 312 (48%) | 305 (45%) | 95 (43%) | 0.5 to <0.8 | 230 (18%) | 124 (19%) | 118 (17%) | 39 (18%) |
| >2x, ≤5x | 87 (7%) | 51 (8%) | 38 (6%) | 11 (5%) | 0.8 to <1.5 | 421 (32%) | 212 (33%) | 231 (34%) | 94 (43%) |
| >5x | 4 (0%) | 1 (0%) | 3 (0%) | 2 (1%) | 1.5 to <3 | 198 (15%) | 104 (16%) | 100 (15%) | 34 (16%) |
| Missing | 171 (13%) | 67 (10%) | 79 (12%) | 11 (5%) | ≥3 | 15 (1%) | 13 (2%) | 17 (2%) | 4 (2%) |
| Urea (LN) | | | | | Missing | 266 (21%) | 115 (18%) | 158 (23%) | 37 (17%) |
| <Normal | 8 (1%) | 9 (1%) | 33 (5%) | 8 (4%) | | | | | |
| Normal | 846 (65%) | 408 (63%) | 445 (65%) | 173 (79%) | | | | | |
| >1x, ≤2x | 195 (15%) | 106 (16%) | 89 (13%) | 26 (12%) | | | | | |
| >2x | 63 (5%) | 32 (5%) | 30 (4%) | 8 (4%) | | | | | |
| Missing | 185 (14%) | 94 (14%) | 85 (12%) | 4 (2%) | | | | | |
Abbreviations: 7d, event (critical phase or COVID‐19‐related death) within 7 days of diagnosis; CRP, C‐reactive protein; LDH, lactate dehydrogenase; LN, laboratory normal range, “x” indicates multiples of the upper limit of the normal range; PCT, procalcitonin; Test, f., full test set (as shown in Figure 2); Test, l., limited test set (as shown in Figure 2).
Figure 1. Definition of COVID‐19 disease phases in the LEOSS registry. Patients were assigned to the highest phase for which at least one characteristic was fulfilled. ALT, alanine transaminase; AST, aspartate transaminase; INR, international normalized ratio of prothrombin time; PaO2, partial pressure of oxygen in arterial blood; qSOFA, quick sequential organ failure assessment score; sO2, blood oxygen saturation; ULN, upper limit of normal.
Summary of the predictive performances of the analyzed models
| Selection | Model | N pr. | AUC, 7d (imp. range): Derivation | AUC, 7d (imp. range): Validation | AUC, all (imp. range): Derivation | AUC, all (imp. range): Validation |
|---|---|---|---|---|---|---|
| All pr. | RF | 104 | 0.83 (0.82–0.83) | 0.83 (0.82–0.83) | 0.83 (0.82–0.83) | 0.83 (0.82–0.83) |
| All pr. | Binomial ridge | 104 | 0.88 (0.87–0.89) | 0.81 (0.80–0.81) | 0.86 (0.86–0.87) | 0.81 (0.80–0.82) |
| RF Boruta | RF | 5 | 0.74 (0.72–0.75) | 0.73 (0.71–0.76) | 0.73 (0.72–0.75) | 0.74 (0.73–0.77) |
| RF Boruta | Binomial ridge | 5 | 0.80 (0.80–0.80) | 0.81 (0.81–0.81) | 0.80 (0.80–0.80) | 0.81 (0.81–0.81) |
| RF Boruta | Score | 5 | 0.80 (0.80–0.80); 95% CI, 0.77–0.83 | | 0.80 (0.80–0.80); 95% CI, 0.77–0.83 | |
| Validation on the full test set | | | | | | |
| Validation on the limited test set | | | | | | |
Note: Initial derivation and validation analyses were performed on the respective data sets (n = 1297 and 649, respectively) as summarized in Figure 2. As indicated, the final score was additionally independently validated on the full and the limited test sets (n = 682 and 219, as described in Figure 2). Indicated are the median values and the full range for the imputed data sets (in brackets). AUC values were calculated for an event within 7 days of diagnosis (“7d”) and for all time points (“all”). 95% confidence intervals (95% CI) were calculated for score predictions using bootstrapping with equal contributions of the imputed data sets. Results of the performance of the final score (median AUC and 95% CI) in the respective validation and test data sets are highlighted in bold.
Abbreviations: AUC, area under the receiver operating characteristic (ROC) curve; imp., imputation; N pr., number of predictors in the model; pr., predictors; RF, random forest for classification.
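The note above describes AUC values with bootstrapped 95% confidence intervals pooled over imputed data sets. The following is a minimal sketch of that idea, not the authors' code: the function names `auc` and `bootstrap_auc_ci` are hypothetical, the percentile method is an assumption, and "equal contributions" is interpreted here as drawing the same number of resamples from each imputed data set.

```python
import random


def auc(scores, events):
    """AUC as the probability that a patient with the event has a higher
    score than a patient without it (ties count as one half)."""
    pos = [s for s, e in zip(scores, events) if e]
    neg = [s for s, e in zip(scores, events) if not e]
    if not pos or not neg:
        raise ValueError("need at least one event and one non-event")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(imputed_sets, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the AUC.

    imputed_sets: list of (scores, events) pairs, one per imputed data set,
    with events coded as 0/1. Each set receives the same number of resamples
    (an assumed reading of "equal contributions").
    """
    rng = random.Random(seed)
    per_set = n_boot // len(imputed_sets)
    stats = []
    for scores, events in imputed_sets:
        n = len(scores)
        for _ in range(per_set):
            idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
            ev = [events[i] for i in idx]
            if 0 < sum(ev) < n:  # skip resamples that contain only one class
                stats.append(auc([scores[i] for i in idx], ev))
    stats.sort()
    lower = stats[int((alpha / 2) * (len(stats) - 1))]
    upper = stats[int((1 - alpha / 2) * (len(stats) - 1))]
    return lower, upper
```

The pairwise AUC is quadratic in the number of patients, which is acceptable for a sketch; a rank-based formula would scale better on cohorts of this size.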
Results of the ridge‐penalized binomial regression on the five variables selected by RF Boruta
| Predictor | Ridge β | p value | Weight |
|---|---|---|---|
| Age | 0.07 | 0.024 | 1 |
| Disease phase | 0.40 | 0.003 | 5 |
| Urea | 0.26 | 0.013 | 3 |
| CRP | 0.14 | 0.002 | 2 |
| D‐dimer | 0.09 | 0.041 | 1 |
Note: Indicated are β coefficients from binomial ridge regression (outcome: event within 7 days) and the resulting weights per step increase in the respective predictor group (all groups are listed in Table 4). p values were calculated using ridge regression on the derivation data set with permutations of the outcome.
Abbreviation: CRP, C‐reactive protein.
Table 4. Calculation of the CAPS‐D score
| Predictor | Score | Predictor | Score |
|---|---|---|---|
| Age (year) | | CRP (mg/L) | |
| ≤25 | ‐ | <3 | ‐ |
| 26–35 | +1 | 3–29 | +2 |
| 36–45 | +2 | 30–69 | +4 |
| 46–55 | +3 | 70–119 | +6 |
| 56–65 | +4 | 120–179 | +8 |
| 66–75 | +5 | 180–249 | +10 |
| 76–85 | +6 | >249 | +12 |
| >85 | +7 | Disease phase | |
| D‐dimer (LN) | | Uncomplicated | ‐ |
| Normal | ‐ | Complicated | +5 |
| >1x, ≤2x | +1 | Urea (LN) | |
| >2x, ≤5x | +2 | <Normal | ‐ |
| >5x, ≤10x | +3 | Normal | +3 |
| >10x, ≤20x | +4 | >1x, ≤2x | +6 |
| >20x | +5 | >2x | +9 |
| Maximum score | 38 | | |
Abbreviations: CRP, C‐reactive protein; LN, laboratory normal range, “x” indicates multiples of the upper limit of the normal range.
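The score table above can be read as a lookup: each predictor contributes its weight times the number of steps above the lowest group. The sketch below illustrates this under stated assumptions. The function name `caps_d_score` and all parameter names are hypothetical, the handling of non-integer values that fall between listed groups (for example a CRP of 29.5 mg/L) is an assumption, and missing values, which the study handled by imputation, are not supported.

```python
from bisect import bisect_left, bisect_right

# Group boundaries copied from the score table.
AGE_LOWER = [26, 36, 46, 56, 66, 76, 86]  # years; lower bounds, 1 point per step
CRP_LOWER = [3, 30, 70, 120, 180, 250]    # mg/L; lower bounds, 2 points per step
D_DIMER_UPPER = [1, 2, 5, 10, 20]         # multiples of ULN; inclusive upper bounds


def caps_d_score(age, crp_mg_l, complicated, urea, urea_lln, urea_uln,
                 d_dimer, d_dimer_uln):
    """Return the CAPS-D score (0 to 38) for one patient with complete data.

    urea_lln/urea_uln and d_dimer_uln are the local laboratory's lower and
    upper limits of normal, in the same units as the measured values.
    """
    age_pts = bisect_right(AGE_LOWER, age)           # number of lower bounds reached
    crp_pts = 2 * bisect_right(CRP_LOWER, crp_mg_l)
    phase_pts = 5 if complicated else 0

    # Urea groups: below normal, normal, >1x to <=2x ULN, >2x ULN.
    if urea < urea_lln:
        urea_steps = 0
    elif urea <= urea_uln:
        urea_steps = 1
    elif urea <= 2 * urea_uln:
        urea_steps = 2
    else:
        urea_steps = 3
    urea_pts = 3 * urea_steps

    # D-dimer: count the upper bounds strictly exceeded (">1x, <=2x" is 1 point).
    d_dimer_pts = bisect_left(D_DIMER_UPPER, d_dimer / d_dimer_uln)

    return age_pts + crp_pts + phase_pts + urea_pts + d_dimer_pts
```

For example, a 70-year-old patient in the complicated phase with a CRP of 80 mg/L, urea within the normal range, and D-dimer at 1.5 times the upper limit of normal would receive 5 + 6 + 5 + 3 + 1 = 20 points, which is above the cut-off of ≥17 reported for the score.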
Figure 3. Summary of key characteristics of the score for predicting the combined endpoint of critical phase or COVID‐19‐related death (A) within 7 days of the diagnosis or (B) at any time point during follow‐up in the validation and test cohorts. Color codes distinguish the different data sets as indicated. Sensitivity and NPV are indicated by continuous lines and the corresponding y axis scaling on the left, while specificity and PPV are indicated by dashed lines and y axis scaling on the right side of the respective panels. Bottom panels show cumulative fractions of patients meeting respective score cut‐offs for the combined validation and full test set (combined n = 1331). For all panels, the median score (rounded to the next whole integer) of the imputations was calculated for patients with missing values.
Score characteristics at the selected cut‐off of ≥17
| | Validation set (7d/all) | Full test set (7d/all) | Combined (7d/all) |
|---|---|---|---|
| Sensitivity | 0.73/0.73 | 0.74/0.72 | 0.74/0.73 |
| Specificity | 0.72/0.75 | 0.77/0.79 | 0.75/0.77 |
| PPV | 0.31/0.41 | 0.27/0.32 | 0.29/0.37 |
| NPV | 0.94/0.92 | 0.96/0.95 | 0.95/0.94 |
| LR+ | 2.6/2.9 | 3.3/3.3 | 2.9/3.1 |
| LR− | 0.37/0.36 | 0.34/0.36 | 0.35/0.36 |
| %score < cut‐off | 65% | 72% | 69% |
Abbreviations: 7d, event (critical disease or COVID‐19‐related death) within 7d of diagnosis; all, all events during follow‐up; LR+/−, positive/negative likelihood ratio; NPV, negative predictive value; PPV, positive predictive value; %score < cut‐off, percentage of patients with scores below the cut‐off value (≤16).
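The likelihood ratios and predictive values in the table above follow from sensitivity, specificity, and event prevalence by standard formulas. The sketch below shows the arithmetic; the function names are hypothetical, and because it starts from the rounded table values, results can differ from the reported figures in the last digit.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-) from sensitivity and specificity."""
    lr_pos = sensitivity / (1 - specificity)
    lr_neg = (1 - sensitivity) / specificity
    return lr_pos, lr_neg


def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a given event prevalence via Bayes' rule."""
    tp = sensitivity * prevalence              # true-positive fraction
    fp = (1 - specificity) * (1 - prevalence)  # false-positive fraction
    tn = specificity * (1 - prevalence)        # true-negative fraction
    fn = (1 - sensitivity) * prevalence        # false-negative fraction
    return tp / (tp + fp), tn / (tn + fn)


# Validation set, event within 7 days: sensitivity 0.73, specificity 0.72,
# and 94 events among 649 patients (from the patient characteristics table).
lr_pos, lr_neg = likelihood_ratios(0.73, 0.72)
ppv, npv = predictive_values(0.73, 0.72, 94 / 649)
print(f"LR+ {lr_pos:.1f}, LR- {lr_neg:.3f}, PPV {ppv:.2f}, NPV {npv:.2f}")
```

With these inputs the sketch gives an LR+ of about 2.6, a PPV of about 0.31, and an NPV of about 0.94, in line with the validation-set column; the LR− comes out at 0.375, consistent with the reported 0.37 once rounding of the inputs is taken into account.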