Unbiased identification of clinical characteristics predictive of COVID-19 severity.

Elliot H Akama-Garren¹, Jonathan X Li²,³.

Abstract

There is currently limited clinical ability to identify COVID-19 patients at risk for severe outcomes. To unbiasedly identify metrics associated with severe outcomes in COVID-19 patients, we conducted a retrospective study of 835 COVID-19 positive patients at a single academic medical center between March 10, 2020 and October 13, 2020. As of December 1, 2020, 656 (79%) patients required hospitalization and 149 (18%) died. Unbiased comparisons of all clinical characteristics and mortality revealed that abnormal pH (OR 8.54, 95% CI 5.34-13.6), abnormal creatinine (OR 6.94, 95% CI 4.22-11.4), and abnormal PTT (OR 4.78, 95% CI 3.11-7.33) were most significantly associated with mortality. Correlation with ordinal severity scores confirmed these associations, in addition to associations between respiratory rate (Spearman's rho = -0.56), absolute neutrophil count (Spearman's rho = -0.5), and C-reactive protein (Spearman's rho = -0.52) with disease severity. Unsupervised principal component analysis and machine learning model classification of patient demographics, laboratory results, medications, comorbidities, signs and symptoms, and vitals are capable of separating patients on the basis of COVID-19 mortality (AUC 0.82). This retrospective analysis identifies laboratory and clinical metrics most relevant to predict COVID-19 severity.

Keywords: COVID-19; Laboratory results; Machine learning; Prediction

Year: 2021 PMID: 34089403 PMCID: PMC8178667 DOI: 10.1007/s10238-021-00730-y

Source DB: PubMed Journal: Clin Exp Med ISSN: 1591-8890 Impact factor: 5.057

Introduction

As the number of COVID-19 deaths approaches 3.5 million worldwide as of May 11, 2021, there is increasing need to better understand what disease mechanisms and clinical correlates lead to poor outcomes. SARS-CoV-2 infection may result in a spectrum of severity ranging from asymptomatic disease to hospitalization requiring mechanical ventilation [1-7], making identification of patients at risk for severe COVID-19 at initial presentation imperative yet complex. Case series of hospitalized COVID-19 patients during the early pandemic identified key risk groups of severe COVID-19 [8-17], including patients with diabetes, obesity, chronic kidney disease, liver disease, and patients above 65 years old. Cytokine profiling [18] and multi-dimensional flow cytometry [19-22] have identified hematologic profiles associated with severe COVID-19. Over the course of the pandemic, these advances along with improvements in supportive care such as prone positioning [23-25] have led to reductions in disease mortality [26, 27]. Despite these advances, clinical prediction of COVID-19 prognosis at the time of initial presentation remains imperfect [28]. A better understanding of the clinical correlates of COVID-19 severity would improve prognostic and therapeutic approaches to disease assessment. With an accumulating number of SARS-CoV-2 positive patients with a range of clinical outcomes, we are increasingly able to perform unbiased analyses across more diverse multi-dimensional clinical metrics, in order to identify novel associations with COVID-19 severity. We sought to leverage these data to determine which clinical characteristics are most useful to predict COVID-19 severity. Here, we perform analyses of over 1,700 clinical metrics including laboratory results, vitals, demographics, medications, and disease outcomes in 835 COVID-19 positive patients to identify correlates of disease severity.

Methods

Study design

This study was conducted at the Beth Israel Deaconess Medical Center (BIDMC) in Boston. The BIDMC Institutional Review Board approved this retrospective cohort study (2020P000699) as minimal risk using data collected during routine clinical care and waived the requirement for informed consent. BIDMC patients who presented for care and with confirmed SARS-CoV-2 infection by positive result of nasopharyngeal sample polymerase chain reaction between March 10, 2020 and October 13, 2020, and who had available past medical history, were included. Data were obtained from the BIDMC COVID-19 Observational Research Effort (CORE) Data Registry REDCap database and BIDMC InSIGHT CORE service. Laboratory values were obtained from inpatient data acquired over the course of an individual patient’s admission. When multiple laboratory draws were present over the course of a patient’s admission, mean, maximum, and minimum laboratory values for each test collected were calculated for each patient. Time to follow-up was determined by the number of days between the earliest COVID-19 test date and date of death or December 1, 2020, the final date of follow-up, if still alive. COVID-19 severity was graded by the NIH Ordinal Severity Scale. Patients were stratified into eight groups with lower scores corresponding to greater severity: (1) death, (2) invasive mechanical ventilation, (3) noninvasive ventilation, (4) supplemental oxygen, (5) no supplemental oxygen but requiring medical care, (6) no supplemental oxygen and not requiring medical care, (7) limitation in activities, or (8) no limitation in activities.
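The per-patient summary of repeated laboratory draws described above can be sketched as follows. This is a minimal Python illustration, not the authors' R pipeline; the function name `summarize_labs`, the tuple layout, and the example values are all hypothetical.

```python
from collections import defaultdict
from statistics import mean

def summarize_labs(draws):
    """Collapse repeated draws into per-patient, per-test mean, max, and min.

    `draws` is an iterable of (patient_id, test_name, value) tuples.
    """
    grouped = defaultdict(list)
    for patient_id, test_name, value in draws:
        if value is not None:  # skip missing results
            grouped[(patient_id, test_name)].append(value)
    return {
        key: {"mean": mean(vals), "max": max(vals), "min": min(vals)}
        for key, vals in grouped.items()
    }

# Hypothetical draws, for illustration only
draws = [
    ("P1", "creatinine", 0.9),
    ("P1", "creatinine", 1.5),
    ("P1", "creatinine", 1.2),
    ("P2", "creatinine", 2.0),
    ("P2", "albumin", None),
]
summary = summarize_labs(draws)
```

Each (patient, test) pair ends up with three derived metrics, which is one way a few hundred laboratory tests could expand into the much larger number of clinical metrics analyzed.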

Principal component analysis (PCA)

Outcome metrics including mortality, hospitalization length and status, ICU length and status, ventilation and renal replacement therapy requirement, NIH Ordinal Severity Score, pathology results, and medications prescribed after COVID-19 diagnosis were excluded to allow for unsupervised PCA. Patients and metrics with missing data were excluded from analysis, and categorical factor variables were converted to dummy numerical variables. Data were scaled to unit variance and principal component analysis was performed using factoextra (version 1.0.7). The top two principal components were used for two-dimensional mapping of patient data and variable eigenvectors.
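To make the scaling and decomposition steps concrete, the following is a minimal pure-Python sketch of unit-variance scaling followed by extraction of the first principal component by power iteration. It is an illustration under stated assumptions, not the factoextra-based analysis used in the study: the helper names, the choice of power iteration, and the toy data (AST, ALT, and a dummy-coded immunosuppression flag) are hypothetical.

```python
import math

def standardize(rows):
    """Center each column and scale it to unit variance (sample SD)."""
    n = len(rows)
    columns = list(zip(*rows))
    means = [sum(col) / n for col in columns]
    sds = [
        math.sqrt(sum((x - m) ** 2 for x in col) / (n - 1))
        for col, m in zip(columns, means)
    ]
    return [[(x - m) / s for x, m, s in zip(row, means, sds)] for row in rows]

def covariance(z):
    """Sample covariance matrix of already-centered rows."""
    n, p = len(z), len(z[0])
    return [[sum(r[i] * r[j] for r in z) / (n - 1) for j in range(p)] for i in range(p)]

def first_component(cov, iterations=500):
    """Leading eigenvector (loadings) and eigenvalue by power iteration."""
    p = len(cov)
    v = [float(i + 1) for i in range(p)]  # arbitrary non-zero starting vector
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    cv = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
    eigenvalue = sum(a * b for a, b in zip(v, cv))
    return v, eigenvalue

# Hypothetical columns: AST, ALT, immunosuppression (dummy-coded 0/1)
rows = [[20, 41, 0], [35, 71, 1], [50, 101, 0], [80, 161, 1], [120, 241, 0]]
loadings, eigenvalue = first_component(covariance(standardize(rows)))
explained = eigenvalue / len(loadings)  # share of total variance on PC1
```

Because every scaled column has variance 1, the total variance equals the number of columns, so dividing the leading eigenvalue by that number gives the fraction of variance captured by the first component.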

Machine learning classification

Mortality status was added to the data set used for PCA to allow for construction of a supervised machine learning classifier. All machine learning analyses were performed in R (version 3.6.1). Training and test data sets were created using the createDataPartition function in caret (version 6.0), with 75% of patients allocated to the training data set. Training data were preprocessed by centering and scaling and training was performed using ten separate tenfold repeated cross-validations for resampling. A gradient boosting machine model [29, 30] was built using 100 trees, a tree complexity of 2, and a learning rate of 0.1 using the train function in caret. Training performance was measured using area under the ROC curve, and variable importance was calculated using the varImp function in caret. Model performance was tested on the test data set and evaluated using MLeval (version 0.3).

Statistical analysis

All statistical analyses were performed in R (version 3.6.1). Bar graphs and violin plots were created using ggpubr (version 0.4.0), correlation plots were created using corrplot (version 0.84), Kaplan–Meier plots were created using survminer (version 0.4.8) and survival (version 3.2–7), and scatter plots and forest plots were created using ggplot2 (version 3.3.0). Heatmaps and hierarchical clustering were performed using pheatmap (version 1.0.12). Volcano plots were generated using EnhancedVolcano (version 1.4.0), and significant differences (absolute logFC > 0.2 and P-val < 0.05) were highlighted in red. When data were missing, these patients were not included in a given univariate analysis, eliminating potential confounding due to the presence or absence of a given clinical metric. When multiple comparisons were made, p values were corrected by the Benjamini–Hochberg procedure and a false discovery rate < 0.05 was considered significant.
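The Benjamini–Hochberg correction mentioned above can be sketched in a few lines. This is a minimal Python illustration of the standard step-up procedure, not the authors' R code; the p values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

p_values = [0.001, 0.008, 0.039, 0.041, 0.60]  # hypothetical raw p values
adjusted = benjamini_hochberg(p_values)
significant = [q < 0.05 for q in adjusted]  # false discovery rate threshold
```

In this toy example the third and fourth tests have raw p values below 0.05 but adjusted values just above it, which shows why the correction matters when many clinical metrics are screened at once.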

Results

Demographics, comorbidities, and outcomes of COVID-19 patients

A total of 835 patients with PCR confirmed SARS-CoV-2 infection were included (Table 1). The median age was 64 years (IQR, 50–76 years; range, 17–102 years) and 438 (52%) were female. Of these patients, 363 (43%) were white and 253 (30%) were black. Past medical history was available for 549 patients and among these patients, common comorbidities included hypertension (347; 63%), diabetes (224; 41%), obesity (167; 30%), chronic kidney disease (144; 26%), and cancer (131; 24%). Active prescriptions at time of COVID-19 diagnosis were available for 697 patients, and among these the most common categories of prescribed drugs included antihypertensive drugs (500; 72%), antihistamines (324; 46%), and antiglycemic drugs (241; 35%). Most patients had an elevated temperature (median Tmax 100; IQR 99–100) and were tachypneic (median 19; IQR 18–21) but had normal heart rates (median 85; IQR 76–94). As of December 1, 2020, 656 (79%) patients required hospitalization, 336 (40%) required supplemental oxygen, 133 (16%) required intensive care unit (ICU) stays, and 196 (23%) required mechanical ventilation. Among patients who were hospitalized the median total length of stay was 9 days (IQR, 5–18 days) and among patients treated in the ICU the median length of stay in the ICU was 8 days (IQR, 3–17 days). NIH Ordinal Scoring was available for 322 patients, and mean ordinal score was 3.7 (SD 1.7). Overall, 149 (18%) patients had died at the time of censoring.

Table 1

Demographics, comorbidities, and outcomes of COVID-19 patients

	Overall (N = 835)	Alive (N = 686)	Dead (N = 149)	P value
Gender				0.117
Female	438 (52%)	369 (54%)	69 (46%)
Male	397 (48%)	317 (46%)	80 (54%)
Age	64 (50–76)	61 (47–73)	73 (63–84)	< 0.001
Race				0.0919
Native American	1 (0%)	1 (0%)	0 (0%)
Asian	31 (4%)	23 (3%)	8 (5%)
Black	253 (30%)	206 (30%)	47 (32%)
Declined	1 (0%)	1 (0%)	0 (0%)
Native Hawaiian	2 (0%)	1 (0%)	1 (1%)
Other	90 (11%)	84 (12%)	6 (4%)
Unknown	94 (11%)	73 (11%)	21 (14%)
White	363 (43%)	297 (43%)	66 (44%)
ABO Type				0.487
A	71 (9%)	43 (6%)	28 (19%)
AB	11 (1%)	8 (1%)	3 (2%)
B	40 (5%)	29 (4%)	11 (7%)
O	104 (12%)	63 (9%)	41 (28%)
Missing	609 (72.9%)	543 (79.2%)	66 (44.3%)
BMI	29 (25–34)	29 (25–34)	30 (24–36)	0.905
Comorbidities available	549 (66%)	459 (67%)	90 (60%)
Hypertension	347 (63%)	278 (61%)	69 (77%)	0.00549
Chronic kidney disease	144 (26%)	107 (23%)	37 (41%)	< 0.001
Diabetes	224 (41%)	172 (37%)	52 (58%)	< 0.001
Obesity	167 (30%)	136 (30%)	31 (34%)	0.434
Rheumatologic disease	127 (23%)	100 (22%)	27 (30%)	0.12
Autoimmune disease	49 (9%)	43 (9%)	6 (7%)	0.535
Cancer	131 (24%)	98 (21%)	33 (37%)	0.00287
Immunosuppressive Disease	128 (23%)	103 (22%)	25 (28%)	0.338
COPD	72 (13%)	54 (12%)	18 (20%)	0.0517
Asthma	81 (15%)	66 (14%)	15 (17%)	0.691
Coronary artery disease	130 (24%)	97 (21%)	33 (37%)	0.00241
Cerebrovascular disease	67 (12%)	46 (10%)	21 (23%)	< 0.001
Medications available	697 (83%)	568 (83%)	129 (87%)
Corticosteroid	179 (26%)	140 (25%)	39 (30%)	0.231
Calcineurin inhibitors	16 (2%)	12 (2%)	4 (3%)	0.726
Antirheumatic therapy	9 (1%)	7 (1%)	2 (2%)	1
Immunosuppressive therapy	46 (7%)	32 (6%)	14 (11%)	0.0361
Chemotherapy	26 (4%)	19 (3%)	7 (5%)	0.385
Antiglycemic therapy	241 (35%)	192 (34%)	49 (38%)	0.424
Asthma therapy	227 (33%)	178 (31%)	49 (38%)	0.177
Biologics	1 (0%)	1 (0%)	0 (0%)	1
Osteoporosis therapy	13 (2%)	9 (2%)	4 (3%)	0.43
Antihypertensive therapy	500 (72%)	392 (69%)	108 (84%)	0.00119
Labs
Absolute lymphocyte count (10⁶/mL)	1.2 (0.83–1.6)	1.2 (0.89–1.6)	0.97 (0.67–1.3)	< 0.001
C-Reactive protein (mg/L)	94 (52–150)	82 (42–130)	140 (100–180)	< 0.001
Creatinine (mg/dL)	1.0 (0.73–1.7)	0.91 (0.70–1.3)	1.8 (1.1–2.8)	< 0.001
Ferritin (ng/mL)	680 (300–1500)	570 (260–1200)	1400 (570–2900)	< 0.001
D-Dimer (ng/mL FEU)	1300 (720–2700)	1100 (650–2200)	2400 (1200–4800)	< 0.001
Creatine kinase (IU/L)	150 (69–380)	140 (65–360)	170 (82–530)	0.0455
INR	1.2 (1.1–1.4)	1.2 (1.1–1.3)	1.3 (1.2–1.5)	< 0.001
Lactate dehydrogenase (IU/L)	330 (260–430)	320 (240–400)	420 (310–560)	< 0.001
pH	7.1 (6.7–7.3)	7.0 (6.5–7.3)	7.2 (7.0–7.3)	0.00105
Platelet count (10⁶/mL)	230 (180–310)	250 (190–320)	190 (140–260)	< 0.001
PT (s)	13 (12–15)	13 (12–15)	14 (13–17)	< 0.001
PTT (s)	35 (30–55)	33 (29–47)	53 (35–70)	< 0.001
Absolute neutrophil count (10⁶/mL)	5.5 (3.8–8.2)	5.0 (3.6–7.3)	7.8 (5.1–12)	< 0.001
A1c (%)	7.7 (6.4–9.3)	7.6 (6.3–9.3)	7.8 (7.2–8.8)	0.629
Vitals
Respiratory rate	19 (18–21)	19 (18–20)	23 (20–25)	< 0.001
Heart rate	85 (76–94)	84 (74–93)	89 (82–98)	< 0.001
Tmax	100 (99–100)	100 (99–100)	100 (100–100)	< 0.001
SBP (minimum)	99 (91–110)	99 (92–110)	95 (84–110)	0.0275
DBP (minimum)	57 (49–65)	57 (50–65)	54 (44–63)	0.00543
Status				< 0.001
Inpatient	656 (79%)	510 (74%)	146 (98%)
Outpatient	179 (21%)	176 (26%)	3 (2%)
Outcomes
Supplemental O₂	336 (40%)	263 (38%)	73 (49%)	0.0208
Mechanical ventilation	196 (23%)	106 (15%)	90 (60%)	< 0.001
Total encounters	1.0 (1.0–1.0)	1.0 (1.0–1.0)	1.0 (1.0–1.0)	0.261
Length admission	9.0 (5.0–18)	8.0 (4.0–19)	12 (7.0–17)	< 0.001
Ordinal score	4.0 (2.0–5.0)	4.0 (4.0–5.0)	1.0 (1.0–1.0)	< 0.001
ICU admission	133 (16%)	87 (13%)	46 (31%)	< 0.001
ICU days	8.0 (3.0–17)	8.0 (2.0–19)	9.0 (4.0–14)	< 0.001

Continuous data presented as median (IQR). P values computed by Chi-squared test for categorical data and Wilcoxon signed-rank test for continuous data

Clinical predictors of COVID-19 outcomes

To validate our ability to identify risk factors for COVID-19 severity, we compared mortality rates among currently recognized comorbidities for COVID-19 (Fig. 1A). In our cohort, hypertension (OR 2.14, 95% CI 1.27–3.60), chronic kidney disease (OR 2.30, 95% CI 1.44–3.68), cerebrovascular disease (OR 2.73, 95% CI 1.54–4.84), diabetes (OR 2.28, 95% CI 1.44–3.60), coronary artery disease (OR 2.16, 95% CI 1.33–3.50), and cancer (OR 2.13, 95% CI 1.32–3.45) were associated with COVID-19 mortality. Risks for hospitalization included hypertension (OR 2.42, 95% CI 1.64–3.57), male gender (OR 1.69, 95% CI 1.21–2.38), diabetes (OR 2.17, 95% CI 1.43–3.29), chronic kidney disease (OR 2.42, 95% CI 1.44–3.68), coronary artery disease (OR 2.59, 95% CI 1.51–4.42), and COPD (OR 3.63, 95% CI 1.65–7.96), whereas risks for ICU admission only included male gender (OR 2.17, 95% CI 1.42–3.31) and diabetes (OR 2.27, 95% CI 1.35–3.81). Notably, male gender was not significantly associated with mortality among COVID-19 patients in our cohort (OR 1.35, 95% CI 0.95–1.93).
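As a worked check on how such odds ratios arise, the sketch below recomputes the hypertension and mortality association from the Table 1 counts. The source does not state how the confidence intervals were obtained, so the Wald interval on the log odds ratio used here is an assumption, and the function name is hypothetical.

```python
import math

def odds_ratio_wald(exposed_dead, exposed_alive, unexposed_dead, unexposed_alive, z=1.959964):
    """Odds ratio with a Wald 95% CI computed on the log scale from a 2x2 table."""
    odds_ratio = (exposed_dead * unexposed_alive) / (exposed_alive * unexposed_dead)
    se_log = math.sqrt(
        1 / exposed_dead + 1 / exposed_alive + 1 / unexposed_dead + 1 / unexposed_alive
    )
    log_or = math.log(odds_ratio)
    return odds_ratio, math.exp(log_or - z * se_log), math.exp(log_or + z * se_log)

# Table 1, patients with comorbidity data available:
# 69 of 90 who died and 278 of 459 who survived had hypertension.
or_htn, lo_htn, hi_htn = odds_ratio_wald(69, 278, 90 - 69, 459 - 278)
```

Under this assumption the point estimate rounds to 2.14 with a lower bound of 1.27 and an upper bound near 3.6, close to the values reported in the text; small differences in the last digit would be expected if a different interval method was used.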

Fig. 1

Univariate analyses identify key laboratory parameters associated with mortality in COVID-19 patients. A Forest plot comparing odds ratios of selected comorbidities with mortality, hospitalization, and ICU admission in COVID-19 patients. Horizontal lines indicate 95% CI. B Volcano plots of odds ratios of laboratory results, demographics, medications, comorbidities, and signs and symptoms with mortality, hospitalization, and ICU admission in COVID-19 patients. P values corrected for multiple comparisons by Benjamini–Hochberg procedure and significant metrics (P-adj < 0.05) indicated in red. C Heatmap of adjusted p values from Mann–Whitney U tests for continuous laboratory values and demographic information between patients requiring or not requiring ICU admission, supplemental oxygen, mechanical ventilation, hospitalization, and death. Metrics significantly altered between alive and dead patient cohorts are shown and arranged by increasing adjusted p value. D Violin plots of the most significantly altered clinical metrics between alive and dead patient cohorts. Mann–Whitney U test p value shown

In order to unbiasedly compare the relative association of clinical characteristics with COVID-19 outcomes, we calculated the odds ratios among binary categorical clinical metrics measured, including laboratory results, demographics, medications, comorbidities, and signs and symptoms (Fig. 1B). Mortality was most significantly associated with abnormal pH (OR 8.54, 95% CI 5.34–13.6), abnormal creatinine (OR 6.94, 95% CI 4.22–11.4), and abnormal PTT (OR 4.78, 95% CI 3.11–7.33). Hospitalization was most significantly associated with abnormal D-dimer (OR 8.87, 95% CI 4.18–18.8), NSAID use (OR 0.24, 95% CI 0.15–0.38), and abnormal C-reactive protein (OR 6.43, 95% CI 3.30–12.5), and ICU admission was associated with requiring supplemental oxygen at admission (OR 8.34, 95% CI 4.91–14.1), abnormal pH (OR 13.1, 95% CI 7.71–22.5), and abnormal PTT (OR 7.36, 95% CI 4.42–12.2).

We next sought to compare the relative association between continuous variables and COVID-19 outcomes. Mann–Whitney U tests between mortality and laboratory values and demographic information revealed that elevated creatinine was most significantly associated with mortality (average maximum creatinine 3.97 in dead vs 1.97 in alive, adjusted P-val < 2 × 10⁻¹⁶) (Fig. 1C). Other significant associations with mortality included decreased albumin (average minimum albumin 2.50 in dead vs 3.22 in alive, adjusted P-val < 2 × 10⁻¹⁶), decreased lymphocyte count (average minimum lymphocytes 7.53 in dead vs 13.56 in alive, adjusted P-val < 2 × 10⁻¹⁶), elevated phosphate (average maximum phosphate 6.50 in dead vs 4.61 in alive, adjusted P-val < 2 × 10⁻¹⁶), and older age (average age 71.9 years in dead vs 59.5 in alive, adjusted P-val = 8.6 × 10⁻¹⁶) (Fig. 1D). Comparisons in hospitalization, ventilation, oxygen requirement, and ICU admission patient groups revealed similar associations between abnormal creatinine, albumin, lymphocytes, and phosphate and COVID-19 outcomes (Fig. 1C). These results suggest that laboratory abnormalities might be more informative in predicting outcomes from COVID-19 than patient demographic information including comorbidities.

To quantify and rank the effects of clinical metrics on time to death following COVID-19 diagnosis, we performed Kaplan–Meier analysis of patient survival using positive COVID-19 test date and date of death. Among the 149 (18%) patients who died, the median survival time after COVID-19 diagnosis was 13 days (IQR, 7–28 days) (Fig. 2A). Cox regression analysis of demographics, laboratory results, medications, comorbidities, and vitals against survival revealed that abnormal pH (HR 6.5, 95% CI 4.2–10), stratified age groups (HR 1.5, 95% CI 1.3–1.7), abnormal albumin (HR 3.6, 95% CI 2.4–5.5), and abnormal phosphate (HR 4.7, 95% CI 2.7–8.1) were most significantly associated with increased risk of COVID-19 death (Fig. 2B). The risks associated with these laboratory abnormalities are greater than those associated with currently accepted comorbidities for severe COVID-19 in our cohort, such as hypertension (HR 2.0, 95% CI 1.2–3.3), diabetes (HR 2.1, 95% CI 1.4–3.3), and chronic kidney disease (HR 2.2, 95% CI 1.4–3.3) (Fig. 2C). Neither race (HR 0.99, 95% CI 0.92–1.1) nor gender (HR 1.3, 95% CI 0.91–1.7) was significantly associated with decreased survival following COVID-19 diagnosis in our cohort.
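For readers unfamiliar with the Kaplan–Meier estimator used for the survival curves, the following minimal Python sketch shows the product-limit calculation. It is not the survival/survminer code used in the study, and the follow-up times and event flags are hypothetical.

```python
def kaplan_meier(times, events):
    """Return [(time, survival)] at each distinct death time.

    times: follow-up in days; events: 1 = death observed, 0 = censored.
    """
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        deaths = sum(1 for ti, e in zip(times, events) if ti == t and e == 1)
        leaving = sum(1 for ti in times if ti == t)  # deaths plus censored at t
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve

times = [5, 7, 7, 13, 28, 40, 40, 60]   # hypothetical days from first positive test
events = [1, 1, 0, 1, 1, 0, 0, 0]       # hypothetical outcomes
curve = kaplan_meier(times, events)
```

Censored patients (here, those still alive at the final follow-up date) leave the risk set without lowering the survival estimate, which is why the curve only steps down at observed deaths.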

Fig. 2

Unbiased identification of metrics most associated with increased risk of dying following COVID-19 diagnosis. A Kaplan–Meier plot of patient survival following COVID-19 diagnosis. B Volcano plot of hazard ratios (HR) calculated from unbiased Cox regression analysis between all measured patient metrics and patient survival following COVID-19 diagnosis. P values were calculated using the Wald test statistic and corrected for multiple comparisons by Benjamini–Hochberg procedure. Significant metrics (P-adj < 0.05) indicated in red. C Kaplan–Meier plots of patient survival following COVID-19 diagnosis stratified by indicated patient demographic or laboratory result. Log rank test p value indicated on plots and 95% CI indicated by shading

Clinical correlates of COVID-19 severity

To examine associations between clinical metrics and COVID-19 severity beyond binary categorical outcomes, we measured the correlation of each metric with NIH ordinal severity scores and total length of stay per patient (Fig. 3A). Ordinal score was most significantly correlated with maximum respiratory rate (Spearman’s rho = −0.56), maximum absolute neutrophil count (Spearman’s rho = −0.5), maximum C-reactive protein (Spearman’s rho = −0.52), and minimum albumin (Spearman’s rho = 0.5) (Fig. 3B). The total length of admission was most significantly correlated with maximum temperature (Spearman’s rho = 0.62), maximum phosphate (Spearman’s rho = 0.60), minimum hemoglobin (Spearman’s rho = −0.58), and minimum systolic blood pressure (Spearman’s rho = −0.53) (Fig. 3C). These results confirm our previous findings, suggesting that hematologic laboratory results are not only indicative of mortality in COVID-19 patients, but are also correlated with disease severity. These results also quantify the relative association of vitals such as respiratory rate and temperature with COVID-19 severity.
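Spearman's rho, the statistic reported above, is the Pearson correlation of rank-transformed values. The sketch below is a minimal Python illustration with average ranks for ties; the function names and the example values are hypothetical and are not data from the study.

```python
def average_ranks(values):
    """Ranks starting at 1, with tied values given their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Pearson correlation computed on the ranks of x and y."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5

# Hypothetical values; a lower ordinal score means more severe disease
resp_rate = [18, 20, 22, 26, 30, 34]
ordinal = [6, 5, 4, 4, 2, 1]
rho = spearman_rho(resp_rate, ordinal)
```

Because lower ordinal scores correspond to greater severity, a metric that rises with severity, such as respiratory rate, yields a negative rho, matching the sign convention of the coefficients reported in the text.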

Fig. 3

Correlation between continuous clinical metrics and COVID-19 severity. A Ranked order plots of Spearman correlation coefficients between all clinical metrics and NIH ordinal score and total length of admission. Selected significant associations indicated on plot. B–C Scatter plots of correlation of selected clinical metrics and NIH ordinal score (B) or length of admission (C). Spearman correlation coefficient and p value indicated on plot, and regression line and 95% confidence interval indicated in blue

To determine relationships between multiple categorical and numerical outcomes and metrics, we performed correlation analysis across patient demographics, selected laboratory results, medications, comorbidities, vitals, and outcomes including continuous metrics of COVID-19 severity (Fig. 4). In addition to the associations noted previously, this analysis revealed significant correlations between COVID-19 outcomes and clinical interventions such as ICU admission and mechanical ventilation. As expected, comorbidities were highly correlated with prescriptions for appropriate medications (e.g., diabetes and antiglycemic drugs) as well as corresponding laboratory results (e.g., chronic kidney disease and mean creatinine). Notably, comorbidities were more closely associated with corresponding medications than COVID-19 outcomes, whereas laboratory values and vitals were more closely associated with COVID-19 outcomes than corresponding comorbidities. Overall, this correlation analysis revealed the heterogeneity of COVID-19 patient presentation, and the relative utility of a spectrum of patient information in predicting COVID-19 severity.

Fig. 4

Correlation analysis reveals heterogeneity and associations among COVID-19 patient characteristics and outcomes. Correlation plot of Spearman correlation coefficients between indicated clinical metrics and measures of disease outcomes among COVID-19 patients. Matrix display order was determined by angular order of eigenvectors. *P < 0.05, **P < 0.01, ***P < 0.001

Principal component analysis and machine learning classification segregate COVID-19 patients by mortality

To determine whether COVID-19 patients can be stratified by severity based on clinical metrics typically present at admission to the emergency department, we performed unsupervised principal component analysis (PCA). We excluded metrics of COVID-19 outcomes and severity and metrics that would not be known at admission, such as pathology results and medications placed after COVID-19 diagnosis. Only patients for whom full demographic, laboratory, medication history, comorbidities, past medical history, and vitals were available were included, leaving 237 metrics across 209 patients. PCA distilled these 237 metrics into two dimensions, which were most defined by immunosuppression and anemia in Dimension 1, and by AST, LDH, ALT, and ferritin in Dimension 2 (Fig. 5A). The eigenvectors for mean AST and maximum ferritin were orthogonal to the eigenvector for immunosuppression (Fig. 5B), suggesting that these metrics capture independent meta-characteristics of COVID-19 patients.

Fig. 5

Multivariate analyses segregate COVID-19 patients by disease severity. A Bar plot indicating contributions of the top ten metrics to the top two principal components identified by unsupervised principal component analysis (PCA) of COVID-19 patients. B Biplot of principal component scores of COVID-19 patients (dots) and variable loadings (vectors). The top four metrics with the greatest contribution to variability are shown. C PCA plots of COVID-19 patients according to the top two principal components and colored according to the indicated metric. D Receiver operating characteristic (ROC) curve (left) and calibration plot (right) to assess ability of a supervised gradient boosting machine model to classify COVID-19 patient mortality using demographic, laboratory, medication history, comorbidities, past medical history, and vitals. Classification performance assessed by area under the curve (AUC). E Ranked plot of the importance scores of top 20 clinical metrics in the machine learning classifier constructed in (D)

We next plotted the 209 patients present in our PCA in two-dimensional space. There was no clear distribution of COVID-19 patients in PCA space on the basis of demographic information such as gender, race, and age (Fig. 5C). However, when we visualized mortality, which was not a variable included in our PCA, there was a separation among COVID-19 patients in PCA space. Similar trajectories could be appreciated for COVID-19 severity and outcomes metrics, such as length of stay, mechanical ventilation requirement, and ordinal score (Fig. 5C). Trajectories of COVID-19 severity in PCA space were orthogonal to the eigenvector for immunosuppression, suggesting that although immunosuppression contributes to variability among COVID-19 patients, it likely does not contribute to disease severity.

Given our ability to segregate patients by COVID-19 severity using unsupervised PCA, we next sought to design a machine learning classifier to predict patient mortality. Using mortality in addition to the 237 variables used for PCA above, we partitioned our COVID-19 patient cohort into a training set of 157 patients and a test set of 52 patients. The training set of patients was used to build a supervised gradient boosting machine model to classify patient mortality. Our model achieved a sensitivity of 0.53 (95% CI 0.39–0.67), specificity of 0.88 (95% CI 0.81–0.93), and area under curve (AUC) for the ROC curve of 0.87 (95% CI 0.80–0.94) based on the training data (Fig. 5D). When applied to the test set, our model correctly identified 6 of 15 patients who died following COVID-19 diagnosis, achieving an accuracy of 0.77 (95% CI 0.63–0.87), a sensitivity of 0.40, specificity of 0.92, and AUC ROC of 0.82. Variable importance scores extracted from the gradient boosting machine model revealed that absolute neutrophil count, PTT, and patient age were the most contributory to model prediction (Fig. 5E). Together, our PCA and machine learning classifier suggest that COVID-19 severity and outcomes can be correlated with clinical characteristics known at the time of admission and confirm the importance of laboratory data over demographic information in predicting disease outcome.
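The relationship between the test-set counts and the summary metrics can be checked with a short calculation. In the sketch below, the 52-patient test set, the 15 deaths, and the 6 correctly identified deaths come from the text; the split of the 37 survivors into 34 correctly and 3 incorrectly classified is an assumption chosen to reproduce the reported accuracy of 0.77 and is not stated in the source.

```python
def classification_metrics(tp, fn, tn, fp):
    """Sensitivity, specificity, and accuracy from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
    }

# tp and fn follow from "6 of 15 patients who died" in the text.
# tn = 34 and fp = 3 are assumed values (not given in the source) that
# are consistent with a 52-patient test set and an accuracy of 0.77.
metrics = classification_metrics(tp=6, fn=9, tn=34, fp=3)
```

With these counts, identifying 6 of 15 deaths corresponds to a sensitivity of 0.40, and the assumed survivor split gives a specificity of about 0.92 and an accuracy of about 0.77, so the classifier is better at ruling in survival than at flagging patients who will die.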

Discussion

Here, we unbiasedly profile over 1700 unique clinical metrics in 835 COVID-19 patients to identify correlates of disease outcomes and severity. We observed similar odds ratios for COVID-19 mortality risk from comorbidities previously reported, such as increased age [11, 17, 31–33], hypertension [12], diabetes [8, 11–13], and chronic kidney disease [16]. Univariate, correlation, and multivariate analyses revealed strong associations between key laboratory parameters and COVID-19 severity. Several of these associations have been previously reported, such as elevated creatinine [34], decreased lymphocyte count [19, 20], elevated CRP [34], decreased hemoglobin [20], abnormal pH [35], decreased albumin [36], and elevated PTT [20]. Notably, through unbiased comparisons across all clinical metrics, we observed that these laboratory abnormalities are more strongly associated with mortality in COVID-19 patients than patient age, gender, comorbidities, or prescribed medications.

As this was a retrospective cohort study of associations with COVID-19 outcomes, it remains unclear whether the metrics identified here predispose patients to worse outcomes or are a consequence of severe COVID-19 itself. Abnormal pH and increased respiratory rate in patients with severe COVID-19 are likely reflective of the eventual acute respiratory distress syndrome and tissue malperfusion experienced by these patients [5], whereas the elevated inflammatory markers we observed are characteristic of the systemic inflammation observed in some cases of severe COVID-19 [3, 37, 38]. Some laboratory perturbations such as prolonged PTT might reflect interventions employed preferentially in COVID-19 patients such as anticoagulants. Other laboratory parameters such as decreased lymphocytes and albumin might represent a unique inflammatory phenotype that predisposes patients to severe COVID-19 [19]. Regardless of the root cause of the clinical associations we describe, we have identified key clinical metrics that may be obtained at emergency department admission to identify overall risk for COVID-19 mortality.

We observed a mortality rate of 18% and hospitalization rate of 79%, in contrast to currently estimated case fatality rates of 0.9–7.2% [17, 33, 39, 40] for SARS-CoV-2. This is likely due to sampling bias, as only patients who sought care at an academic medical center, obtained a laboratory confirmed COVID-19 diagnosis, and had available medication or past medical history were included. Alternatively, this might reflect the evolving mortality rate over the course of this pandemic, as our ability to diagnose and treat COVID-19 has improved over the past year [41]. Nevertheless, a range of clinical presentations and disease severity scores are represented in our patient cohort, including outpatients and patients with asymptomatic disease.

COVID-19 remains a great threat to society relative to other respiratory viral diseases due to its case fatality rate and its striking range of clinical presentations and severity [17, 42, 43]. This study offers an unbiased retrospective approach to identify potential associations with this fatality rate and spectrum of disease severity. Our data suggest that increased absolute neutrophil count, decreased albumin, and decreased lymphocytes are key correlates of severe COVID-19 and are clinical characteristics available at initial admission that might be informative of disease prognosis. By identifying which COVID-19 patients are most at risk for severe disease, we may be better able to provide early and targeted therapeutic interventions, thereby combatting the current pandemic in an orthogonal but complementary approach to the preventative approaches currently being pursued across the world.
