
Quantitative chest computed tomography combined with plasma cytokines predict outcomes in COVID-19 patients.

Guillermo Carbonell^1,2,3,4, Diane Marie Del Valle^5,6,7, Edgar Gonzalez-Kozlova^5,6,7,8, Brett Marinelli^1, Emma Klein^2, Maria El Homsi^1,9, Daniel Stocker^2,10, Michael Chung^1, Adam Bernheim^1, Nicole W Simons^8, Jiani Xiang^11, Sharon Nirenberg^8,11, Patricia Kovatch^8,11, Sara Lewis^1,2, Miriam Merad^5,6,7, Sacha Gnjatic^5,6,7,12, Bachir Taouli^1,2,7.

Abstract

Despite extraordinary international efforts to dampen the spread and understand the mechanisms behind SARS-CoV-2 infections, accessible predictive biomarkers directly applicable in the clinic are yet to be discovered. Recent studies have revealed that diverse types of assays bear limited predictive power for COVID-19 outcomes. Here, we harness the predictive power of chest computed tomography (CT) in combination with plasma cytokines, using a machine learning and k-fold cross-validation approach, to predict death during hospitalization and maximum severity degree in COVID-19 patients. Patients (n = 152) from the Mount Sinai Health System in New York with plasma cytokine assessment and a chest CT within five days of admission were included. Demographic, clinical, and laboratory variables, including plasma cytokines (IL-6, IL-8, and TNF-α), were collected from the electronic medical record. We found that quantitative CT alone was better at predicting severity (AUC 0.81) than death (AUC 0.70), while cytokine measurements alone better predicted death (AUC 0.70) than severity (AUC 0.66). When combined, chest CT and plasma cytokines were good predictors of death (AUC 0.78) and maximum severity (AUC 0.82). Finally, we provide a simple scoring system (nomogram) using plasma IL-6, IL-8, TNF-α, ground-glass opacities (GGO) to aerated lung ratio, and age as new metrics that may be used to monitor patients upon hospitalization and help physicians make critical decisions and considerations for patients at high risk of death from COVID-19.


Keywords: COVID-19; Chest CT; Cytokines; Radiology; SARS-CoV-2

Year: 2022 PMID: 35958514 PMCID: PMC9356575 DOI: 10.1016/j.heliyon.2022.e10166

Source DB: PubMed Journal: Heliyon ISSN: 2405-8440

Introduction

Despite extraordinary international efforts to reduce the spread of SARS-CoV-2, new variants continue to appear and outbreaks persist. Vaccine coverage remains uneven, and the virus is expected to circulate throughout the global population, possibly for years to come [1, 2]. The exceptional burden placed on healthcare resources by the SARS-CoV-2 pandemic has highlighted the need for accurate predictors of disease outcomes to enable effective patient and resource management. Early identification of patients at risk for severe disease may ensure healthcare resources are properly allocated to provide maximal benefit to the population.

The World Health Organization (WHO) clinical management guidelines suggest that imaging, including computed tomography (CT), may help diagnose and assess complications in COVID-19 patients. CT imaging provides a noninvasive, rapid method for assessing COVID-19 disease severity and diagnosing complications such as pulmonary embolism (using CT angiogram) [3, 4]. Patients infected with SARS-CoV-2 may present abnormal chest CT findings, including ground-glass opacity (GGO) and local or bilateral consolidation, within 1–3 weeks of infection [5]. Chest CT has played a critical role in the clinical care of COVID-19 patients, specifically in patient stratification and prognosis when discrepancies between clinical findings and chest X-rays are noted [6]. CT qualitative scores, based on the radiologist's evaluation of the images, and CT quantitative methods have been used to calculate pneumonia lesion burden and degree of lung involvement as potential imaging biomarkers [7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19]. Furthermore, pneumonia lesion assessment on CT images has been shown to be predictive of severity and outcomes, potentially increasing patient stratification accuracy [4, 8, 9, 10, 11, 12, 13, 15, 16, 17].

The hyperinflammatory response associated with COVID-19 is known to be a major contributor to disease severity and mortality [20]. 
We previously established the prognostic and predictive value of measuring cytokine levels (IL-6, IL-8, and TNF-α) in the blood of COVID-19 patients upon presentation [21]. These cytokines are independently predictive of survival and of serious disease outcomes, including acute respiratory distress syndrome (ARDS) and multi-organ failure linked to cytokine release storms [21, 22]. Integrating chest CT findings with clinical data and laboratory tests, including CRP, procalcitonin, lymphocyte, and neutrophil counts, has shown promising results in predicting COVID-19 outcomes [10, 14, 15, 16, 19]. However, the AUCs reported for these predictors range widely, and the models may also be limited by biases in patient selection, model overfitting, and sometimes unclear methods [23].

In an effort to develop a robust tool for patient risk stratification and care prioritization, we selected maximum disease severity during hospitalization and hospital death as appropriate outcomes. We hypothesized that a combination of plasma cytokines and CT measurements would have higher predictive power for COVID-19 outcomes than either method independently. To test this hypothesis, we used our previously developed cytokine panels [21] in combination with CT measurements [10, 11, 12, 13, 14, 15, 16, 17, 18, 19] to compare and evaluate the performance of predictive models for death and maximum severity. Here, we applied a data-driven machine learning approach to test whether a combination of plasma cytokines and CT measurements outperforms each method alone. We also built a nomogram predicting the risk of COVID-19-related death using a combination of plasma cytokines and CT variables.

Methods

Cohort design

This retrospective single-center study was approved by the Mount Sinai Health System Institutional Review Board (IRB STUDY: 20–00429). A waiver of informed consent was granted by the IRB to query patients’ electronic medical records (EMRs). All research methods were carried out in accordance with relevant human subject research guidelines and regulations. Between March and September 2020, 207 patients who presented to the Emergency Department of the Mount Sinai Health System with suspicion of COVID-19 underwent PCR testing for SARS-CoV-2 infection, an ELLA panel for plasma cytokines, and a chest CT to evaluate for pneumonia lesions and/or to rule out pulmonary embolism. These 207 patients were identified by querying the EPIC EMR in collaboration with the Mount Sinai Data Warehouse. Samples for the RT–PCR SARS-CoV-2 lab test were collected via nasopharyngeal or oropharyngeal swab at one of 53 different Icahn School of Medicine at Mount Sinai (ISMMS) locations, representing outpatient, urgent care, emergency, and inpatient facilities. Blood specimens for ELLA were collected via venipuncture. Chest CT scans were all performed within the Mount Sinai Health System. All specimens and imaging were collected and tested as part of standard of care. The initial cohort of 207 patients underwent further evaluation by manual chart review. Inclusion criteria for this study were: (1) hospitalization for COVID-19, (2) plasma cytokine assessment within 48 h of hospital admission, and (3) chest CT within five days of plasma cytokine assessment. We excluded patients with (1) a time gap between plasma cytokine assessment and chest CT greater than five days (n = 41), (2) CT scans with severe artifacts (n = 12), or (3) acute conditions overlapping with COVID-19 that may have affected the cytokine assessment (n = 2: one patient with acute pancreatitis and one patient with acute cholecystitis who underwent laparoscopic cholecystectomy). 
Thus, our final cohort included 152 patients (Figure 1).

Figure 1

Flowchart of the cohort design. From the initial population of 207 patients who met our inclusion criteria, 152 were used in the following analysis.

Clinical and laboratory data

Demographic and clinical data were extracted from the Epic electronic health record (Verona, WI) for the identified patients using the Epic Hyperspace (August 2019), Epic Clarity (February 2020), and Epic Caboodle (February 2020) databases via connections to Oracle (18c Enterprise Edition Release 18.0.0.0.0) and SQL Server (Microsoft SQL Server 2016 (SP2-CU11) (KB4527378) - 13.0.5598.27 (X64)) databases. Additional data elements included lab results, vital signs, need for O2 therapy, O2 saturation, chest imaging reports, clinical outcomes, and medications administered during hospitalization. Data were merged from the various data sources using R version 3.6.1 (Vienna, Austria). The tables were read in and written using the R packages tidyverse (v 1.3), reshape2 (v 1.4), rms (v 6.1), glmnet (v 4.1), ggplot2 (v 2.0) (29), and readxl (v 1.3.1). The date of first symptoms was extracted from the electronic records via manual chart review by two independent investigators (G.C. and D.M.D.V.).

CT image acquisition

Chest CT studies were performed using a variety of vendors and systems: SOMATOM Definition AS (Siemens Healthineers, Erlangen, Germany [n = 50]); SOMATOM Edge Plus (Siemens Healthineers, Erlangen, Germany [n = 35]); SOMATOM Perspective (Siemens Healthineers, Erlangen, Germany [n = 2]); LightSpeed VCT (GE Healthcare, Chicago, United States [n = 33]); Revolution HD (GE Healthcare, Chicago, United States [n = 13]); Revolution EVO (GE Healthcare, Chicago, United States [n = 6]); and Aquilion Prime (Canon Medical Systems, Otawara, Japan [n = 13]). A non-contrast chest CT was performed on patients with COVID-19 symptoms to evaluate for potential pneumonia lesions (n = 46), and a chest CT angiogram with iodine contrast [100–200 mL of Iopamidol (Isovue, Bracco Diagnostics), depending on patient's weight, administered by bolus injection] was performed in patients in whom pulmonary embolic disease was suspected (n = 106). CT acquisition parameters are listed in Supplementary Table 1.

CT qualitative score

Image analysis was performed by two independent experienced readers (M.C. and A.B., fellowship-trained cardio-thoracic radiologists, both with five years of experience) who were blinded to the clinical and laboratory data but aware of the COVID-19 diagnosis. Each reader assessed half of the cohort, with an additional overlap of 40 cases to assess inter-observer reproducibility. A CT qualitative score was calculated according to the percentage of lung parenchyma affected by ground-glass opacities (GGO) and/or consolidations [24, 25]. Each lobe was classified as having none (0%), minimal (1–25%), mild (26–50%), moderate (51–75%), or severe (76–100%) involvement. Lung lobes with no involvement were scored as 0, minimal involvement as 1, mild involvement as 2, moderate involvement as 3, and severe involvement as 4. An overall CT qualitative score was obtained by summing the ordinal values of the five lung lobes, yielding a final score of 0–20. Additionally, the readers assessed several qualitative variables in accordance with the Fleischner Society definitions [25] (Supplementary Table 2).
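The scoring rule described above can be sketched in a few lines of Python. This is a minimal illustration rather than the readers' actual workflow: the function names, the lobe ordering, and the handling of fractional percentages between 0% and 1% (scored here as minimal) are assumptions that do not come from the text.

```python
from typing import Sequence

# Lobe order is an illustrative assumption; the text only states that
# each of the five lung lobes receives an ordinal score.
LOBES = ("RUL", "ML", "RLL", "LUL", "LLL")


def lobe_score(percent_involved: float) -> int:
    """Map the percentage of a lobe affected by GGO/consolidation to 0-4."""
    if not 0 <= percent_involved <= 100:
        raise ValueError("percentage must be between 0 and 100")
    if percent_involved == 0:
        return 0  # none
    if percent_involved <= 25:
        return 1  # minimal (1-25%); values in (0, 1) also land here (assumption)
    if percent_involved <= 50:
        return 2  # mild (26-50%)
    if percent_involved <= 75:
        return 3  # moderate (51-75%)
    return 4  # severe (76-100%)


def ct_qualitative_score(percent_by_lobe: Sequence[float]) -> int:
    """Sum the five per-lobe ordinal scores into an overall score of 0-20."""
    if len(percent_by_lobe) != len(LOBES):
        raise ValueError("expected one percentage per lobe (five values)")
    return sum(lobe_score(p) for p in percent_by_lobe)
```

For example, a patient with 0%, 10%, 30%, 60%, and 90% involvement across the five lobes would receive per-lobe scores of 0, 1, 2, 3, and 4.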

CT quantitative assessment

Using the open-source software 3D Slicer (www.slicer.org) [26] and the Chest Imaging Platform plug-in (chestimagingplatform.org), semiautomated segmentation of the lungs and intensity thresholding were performed to define the following lung regions: (1) aerated lung: < -500 HU (Hounsfield units); (2) GGOs: -500 to -100 HU; and (3) consolidations: -100 to 100 HU. Mediastinum, hilar structures, and pleural effusions were not included in the segmentations. All segmented images were reviewed by a single observer (G.C., a post-doctoral fellow with eight years of experience) to evaluate the segmentation task, and manual corrections were performed if needed. A second reader (B.M., PGY-5 radiology resident) performed manual corrections on 15 randomly selected patients to assess inter-observer variability. Both observers were blinded to outcome. This quantitative analysis yielded the following variables: (1) total lung volume (mL); (2) well-aerated lung volume (mL); (3) GGO volume (mL); (4) consolidation volume (mL); and (5) GGO to aerated lung ratio.
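The thresholding step can be illustrated with a short Python sketch that classifies voxels inside an already segmented lung mask and converts voxel counts to volumes. It is not the 3D Slicer or Chest Imaging Platform implementation. Several details are assumptions made only for illustration: the function names, the treatment of the shared boundary values (-500 HU is counted as GGO and -100 HU as consolidation), the "other" class for voxels above 100 HU, and the definition of total lung volume as all voxels in the mask.

```python
from typing import Dict, Iterable


def classify_hu(hu: float) -> str:
    """Assign a lung voxel to a region from its attenuation in Hounsfield units."""
    if hu < -500:
        return "aerated"
    if hu < -100:  # -500 <= hu < -100 (boundary handling is an assumption)
        return "ggo"
    if hu <= 100:  # -100 <= hu <= 100
        return "consolidation"
    return "other"  # above 100 HU: not one of the three regions in the text


def lung_volumes(hu_values: Iterable[float], voxel_volume_ml: float) -> Dict[str, float]:
    """Compute region volumes (mL) and the GGO to aerated lung ratio."""
    counts = {"aerated": 0, "ggo": 0, "consolidation": 0, "other": 0}
    for hu in hu_values:
        counts[classify_hu(hu)] += 1
    volumes = {name: n * voxel_volume_ml for name, n in counts.items()}
    aerated = volumes["aerated"]
    return {
        # Assumption: total lung volume is every voxel inside the lung mask.
        "total_lung_ml": sum(volumes.values()),
        "aerated_ml": aerated,
        "ggo_ml": volumes["ggo"],
        "consolidation_ml": volumes["consolidation"],
        "ggo_to_aerated_ratio": volumes["ggo"] / aerated if aerated > 0 else float("nan"),
    }
```

In practice the HU values would come from the voxels of the corrected lung segmentation, and the voxel volume from the pixel spacing and slice thickness of each scan.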

ELLA cytokine platform

The ELLA platform is a rapid cytokine detection system based on four parallel singleplex microfluidics ELISA assays run in triplicate within cartridges following the manufacturer's instructions. In March 2020, as the number of COVID-19 cases was increasing in New York City, we transferred the ELLA methodology to the Clinical Laboratories at Mount Sinai Hospital, which allowed the ELLA cytokine test to be coded into our electronic health record ordering system as part of a COVID-19 diagnostic panel. This panel measures the key cytokines IL-6, IL-8, and TNF-α, which are used for predicting patient outcomes in the setting of COVID-19 [21]. When binarizing each cytokine level as high vs. low, we applied previously established cutoffs, above which a cytokine was considered high (in pg mL−1): >70 for IL-6, >50 for IL-8, and >35 for TNF-α [21]. All patients who died (n = 26) had at least one cytokine level above the threshold to be considered “high”, while 85.71% of individuals who survived had at least one cytokine level above the threshold (Table 1).
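The binarization rule can be written as a small Python helper. This is a sketch under stated assumptions: the cutoffs are taken from the text, but the dictionary keys and function names are hypothetical and chosen only for illustration.

```python
from typing import Dict

# Cutoffs from the text (pg/mL); a level strictly above the cutoff is "high".
CUTOFFS_PG_ML: Dict[str, float] = {"IL-6": 70.0, "IL-8": 50.0, "TNF-alpha": 35.0}


def binarize_cytokines(levels_pg_ml: Dict[str, float]) -> Dict[str, bool]:
    """Return True (high) or False (low) for each cytokine in the panel."""
    return {
        name: levels_pg_ml[name] > cutoff
        for name, cutoff in CUTOFFS_PG_ML.items()
    }


def any_high(levels_pg_ml: Dict[str, float]) -> bool:
    """True when at least one cytokine exceeds its cutoff."""
    return any(binarize_cytokines(levels_pg_ml).values())
```

Because the comparison is strict, a value exactly equal to a cutoff (for example, IL-6 of 70 pg/mL) is classified as low, matching the ">" and "≤" rows of Table 1.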

Table 1

Number of patients above and below cytokine cut-off value by survival status.

Cytokine	Cutoff (pg mL−1)	Alive	Deceased
IL-6	>70	50	19
IL-6	≤70	76	7
IL-8	>50	36	18
IL-8	≤50	90	8
TNF-α	>35	11	10
TNF-α	≤35	115	16
IL-1β	>0.5	42	11
IL-1β	≤0.5	40	8


Study endpoints

The endpoints were: (1) COVID-19-related death during hospitalization (hospital death) and (2) maximum severity score attained during hospitalization. We applied the WHO ordinal scale (from 0 to 7 points) to assess disease severity (prior to death) [27].

Statistical analysis

To test associations with outcomes, we performed Wilcoxon rank sum, Fisher exact, Spearman correlation [28], and Fisher independence tests [29] for each variable. Next, to assess the probability of survival, we fit univariate and multivariate Cox proportional hazard models for cytokines (IL-6, IL-8, TNF-α), demographic variables (age, sex, race/ethnicity), BMI, minimum O2 saturation upon presentation, CT qualitative score, and CT quantitative variables using the coxph function and the survminer package in R [30]. The variables IL-6, IL-8, and TNF-α were converted to binary variables based on the thresholds described in the ELLA cytokine platform section [21]. The threshold for rejecting the null hypothesis was set to an adjusted p-value <0.05 after false discovery rate correction for multiple comparisons. Additionally, for the Fisher independence test, we binned numeric variables into quartiles. To further simplify the severity endpoint, we binned patients into mild (3–4) and severe (5–7) groups according to the WHO ordinal scale. Maximum severity was set as the highest degree of severity at any point during hospitalization. Severity was capped at 7, prior to death when applicable.
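Two of the steps above, the false discovery rate correction and the binning of the WHO score, can be sketched in Python. The text does not name the specific FDR procedure; the sketch assumes the Benjamini-Hochberg method, and the function names are hypothetical.

```python
from typing import List, Sequence


def benjamini_hochberg(p_values: Sequence[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    Assumption: the text says only "false discovery rate correction";
    Benjamini-Hochberg is used here as a common choice.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def severity_group(max_who_score: int) -> str:
    """Bin the maximum WHO ordinal score into the two groups used in the text."""
    if max_who_score in (3, 4):
        return "mild"
    if max_who_score in (5, 6, 7):
        return "severe"
    raise ValueError("expected a WHO score between 3 and 7 for hospitalized patients")
```

An association would then be called significant when its adjusted p-value falls below 0.05.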

Model building, prediction performance and k-fold cross-validation

We used elastic net regression, available in the glmnet package [31], to build predictive models of two outcomes: (1) death during hospitalization and (2) maximum severity score. To validate the predictions, we used a combination of k-fold cross-validation and repeated training/testing splits. Briefly, we maintained the proportions of cases and controls while randomly sampling patients 100 times for each of the following training/testing ratios: 9/1, 8/2, 7/3, 6/4, and 5/5. Next, within each randomly selected subset, we tested one model for each mixing coefficient between ridge and lasso regression (alpha values from 0 [ridge] to 1 [lasso]), producing 500 significant (p-value <0.05) models per scenario. Additionally, we filtered for variables that had a statistically significant (p-value <0.05) effect using elastic net regression coefficient selection. Model performance was evaluated using ROC analysis [32]. The comparisons between different models’ AUCs were done using the Wilcoxon rank sum test, bootstrap, and DeLong statistics with the default settings of the pROC package [33]. Finally, the variables defined by the optimized model were used to build a nomogram by translating the model statistics into probabilities using the rms package [34, 35].
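The resampling scheme described above (100 random splits at each of five training/testing ratios, preserving the case/control proportions) can be sketched in plain Python. This is an illustration of the splitting logic only, not the authors' R code: the function names, the seed, and the rounding of per-class training sizes are assumptions, and the model-fitting step (elastic net in glmnet) is deliberately left out.

```python
import random
from typing import Iterator, List, Sequence, Tuple

TRAIN_FRACTIONS = (0.9, 0.8, 0.7, 0.6, 0.5)  # 9/1, 8/2, 7/3, 6/4, 5/5
N_REPEATS = 100  # random splits per ratio


def stratified_split(
    labels: Sequence[int], train_fraction: float, rng: random.Random
) -> Tuple[List[int], List[int]]:
    """Split sample indices so each class keeps roughly the same proportion."""
    train: List[int] = []
    test: List[int] = []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        n_train = round(len(idx) * train_fraction)  # rounding is an assumption
        train.extend(idx[:n_train])
        test.extend(idx[n_train:])
    return sorted(train), sorted(test)


def repeated_splits(
    labels: Sequence[int], seed: int = 0
) -> Iterator[Tuple[float, List[int], List[int]]]:
    """Yield (train_fraction, train_indices, test_indices) for every split."""
    rng = random.Random(seed)
    for fraction in TRAIN_FRACTIONS:
        for _ in range(N_REPEATS):
            train, test = stratified_split(labels, fraction, rng)
            yield fraction, train, test
```

With five ratios and 100 repeats, the generator yields 500 splits per scenario, which is consistent with the 500 models per scenario mentioned in the text; a model would be fit on each training set and its AUC computed on the matching test set.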

Results

Cohort characteristics

We obtained health information, imaging, and laboratory results as part of standard clinical care from 152 patients with confirmed COVID-19 diagnosis as seen at the Mount Sinai Health System. The median time from hospital admission to chest CT was 0.72 days (IQR 0.0–1.0) and to cytokine testing was 0.28 days (IQR 0.0–1.1). The median time between chest CT and cytokine testing was 1.0 days (IQR 0.0–2.0). The hospital mortality rate for our cohort was 17.1% (Table 2). Additional patient characteristics and clinical features are listed in Table 2. The median levels for IL-6, IL-8, and TNF-α, upon presentation, were 61 pg/mL, 35.0 pg/mL, and 18.5 pg/mL, respectively (Table 3). Patients who died had a higher maximum severity score (WHO ordinal scale), and lower O2 saturation at presentation compared to those who survived (adjusted p-value <0.0001) (Figure 4 & Suppl. Table 4). There were no significant differences in sex (adjusted p-value = 0.91), race (adjusted p-value = 0.39), ethnicity (adjusted p-value = 0.97), or age (adjusted p-value = 0.32) between patients who died vs. those who survived (by Wilcoxon rank sum test) (Suppl. Figure 1, Suppl. Table 3).

Table 2

Patient demographics, clinical and outcome data (n=152). Data are numbers of patients with percentages between parentheses.

Median age (IQR) - years	61 (48–71)
Sex (Male)	90 (59.2%)
Race/ethnicity
Hispanic	37/152 (24.3%)
African American	35/152 (23.0%)
Asian	12/152 (8.0%)
White	35/152 (23.0%)
Other	70/152 (46.1%)
Obesity (BMI ≥30)	51 (33.6%)
Oxygen saturation at presentation
Normal (≥95%)	50/152 (32.9%)
Abnormal (<95%)	102/152 (67.1%)
Comorbidities
Asthma	16/151 (10.6%)
Atrial Fibrillation	12/151 (7.94%)
Cancer (active)	26/151 (17.2%)
Chronic Kidney Disease	16/151 (10.6%)
Congestive Heart Failure	14/151 (9.27%)
COPD	13/151 (8.61%)
Diabetes	43/151 (28.5%)
HIV	6/151 (4.00%)
Hypertension	52/151 (34.4%)
Obstructive Sleep Apnea	5/151 (3.31%)
Smoking
Current	12 (9.09%)
History	39 (61.4%)
Symptoms
Anosmia/Ageusia	1/152 (0.658%)
Congestion/Runny Nose	10/152 (6.58%)
Cough	78/152 (51.3%)
Diarrhea	26/152 (17.1%)
Fatigue	49/152 (32.2%)
Fever	83/152 (55%)
Headache	9/152 (5.92%)
Myalgias	20/152 (13.2%)
Nausea/Vomiting	42/152 (27.6%)
Shortness of breath	105/152 (69.1%)
Sore throat	4/152 (2.63%)
Worst WHO score achieved (capped at 7)
Mild (3–4)	78/150 (52.0%)
Severe (5–7)	72/150 (48%)
Clinical Outcomes
ICU admission	43/152 (65.4%)
Acute Respiratory Distress Syndrome	10/151 (15.1%)
Died during hospital admission	26/152 (17.1%)

Table 3

Cytokine assessment, CT qualitative score and CT quantitative analysis. Data are medians with interquartile ranges (IQR) in parentheses. GGO: ground-glass opacities.

Median and IQR
Cytokine assessment
IL-6 (pg/mL)	61.0 (22.8–146.3)
IL-8 (pg/mL)	35.0 (20.0–59.9)
TNF-α (pg/mL)	18.5 (13.0–27.4)
CT qualitative score	9 (5–14)
CT quantitative analysis
Total lung volume (mL)	2713 (2081–3505)
Well aerated lung volume (mL)	1982 (1420–2903)
GGO volume (mL)	344.5 (208.2–509.6)
Consolidation volume (mL)	98.20 (49.68–252.6)
GGO/well-aerated lung ratio	0.1774 (0.0767–0.3673)

Figure 4

Overview of the computational analysis. (A) Starting point composed of four scenarios (cytokines, CT qualitative, CT quantitative, combined) with the endpoints of death and maximum severity. (B) Statistical approaches: correlations, Fisher exact, independence, Wilcoxon rank test, and Cox proportional hazard models. (C) Prediction of whether patients survive, per scenario, using elastic net regression with a combination of random testing/training sets and k-fold cross-validation to identify the predictive value of each scenario.


CT qualitative score and CT quantitative analysis

CT qualitative scores were higher in patients who died vs. those who survived, but this difference did not reach significance after correction for multiple comparisons (adjusted p-value = 0.07) (Figure 3 & Suppl. Table 3). The CT quantitative variables GGO volume (adjusted p-value = 0.01), consolidation volume (adjusted p-value = 0.01), and GGO to aerated lung ratio (adjusted p-value = 0.005) were all significantly higher in patients who died (Figure 3 and Suppl. Table 3), while well-aerated lung volume was significantly lower (adjusted p-value = 0.035). A representative example of CT qualitative score and CT quantitative assessment is shown in Figure 2. Both CT qualitative score and CT quantitative analysis had excellent inter-observer reproducibility, with ICCs of 0.998 (95% confidence interval: 0.996–0.999) and 0.986 (95% confidence interval: 0.960–0.995), respectively. Additionally, CT measurements and features were significantly correlated with each other and with severity (Rho values between 0.4 and 0.7, adjusted p-values <0.05).

Figure 3

Key differences between patients who died and patients who survived. We conducted a Wilcoxon rank test for the variables collected. The number of ∗ indicates significance (∗ <0.05, ∗∗ <0.01, ∗∗∗ <0.001, ∗∗∗∗ <0.0001).

Figure 2

CT qualitative score and CT quantitative analysis. 29-year-old male patient with COVID-19. (A) CT demonstrates multifocal ground-glass opacities and regions of consolidation in the right lower lobe. The qualitative score established by a radiologist is based on the percentage of lung involvement per lobe (shown on the right, range 0–20). (B) CT quantitative analysis using segmentation software. Quantitative analysis extracts volumetric measurements (shown on the right) representing the aerated lung, the ground-glass opacities (GGO) volume, the consolidation volume, and the GGO to aerated lung ratio. RUL: right upper lobe; RLL: right lower lobe; ML: middle lobe; LUL: left upper lobe; LLL: left lower lobe.

Cytokine analysis

IL-6 (adjusted p-value = 0.005) and IL-8 (adjusted p-value = 0.006) levels were significantly higher in patients who died, while there was no significant difference for TNF-α (adjusted p-value = 0.119) (Figure 3 and Suppl. Table 3). In addition, cytokines (IL-6, TNF-α, IL-8) were correlated with severity (Rho values between 0.3 and 0.6, adjusted p-values <0.05).

Association between variables and risk of death

We investigated the association between clinically relevant variables and the probability of death using univariate and multivariate Cox proportional hazard models and identified significant associations with O2 saturation, IL-6, IL-8, TNF-α, neutrophils, monocytes, D-dimer, and GGO to aerated lung ratio, among others (Suppl. Tables 4, 5, and 6, Suppl. Figures 2, 3, 4, 5, 6, 7, 8, 9, and 10). These results also indicate that oxygen saturation and demographic variables had poor power in predicting death, despite producing significant prognostic models when assessing risk of death.

Prediction of COVID-19-related death

To test whether cytokines or CT measurements were predictive of clinical outcomes, we used elastic net regression and k-fold cross-validation. We developed four “scenarios” to cover all variables within a category (cytokines or CT). The variable selection for each of these scenarios was informed by clinical expertise, previous research, and data availability. The variables in each scenario were: (1) “Cytokines” (IL-6, IL-8, TNF-α, and age); (2) “CT-Qualitative” (CT qualitative score and age); (3) “CT-Quantitative” (GGO volume (mL), well-aerated lung volume (mL), total lung volume (mL), and age); and (4) “Combined” (all features from scenarios 1, 2, and 3). The workflow chart highlighting the steps in the construction of the predictive models is shown in Figure 4. In parallel, to identify the best predictors of outcomes, we developed a fifth “Optimized” scenario with the aim of selecting the minimum number of variables. To do so, we used all available variables and performed a stepwise coefficient selection to choose the variables that had the largest effect on the model. Age was the only demographic variable that survived the coefficient selection process, whereas race/ethnicity, BMI, and sex were not selected because they did not have a significant effect on the predictive model (Figure 4). The models built with cytokines showed an average AUC of 0.70 with 95% CI (65–75), while the CT qualitative, CT quantitative, and combined models had average AUCs of 0.61 with 95% CI (57–62), 0.66 with 95% CI (60–70), and 0.75 with 95% CI (69–80), respectively (Figures 5A, 5B). All models differed significantly from each other (adjusted p-values <0.05) (Figure 5B). These results show that adding information from cytokine assays to CT features increases the power to predict death.

Figure 5

Power of chest CT and cytokines for prediction of death and maximum severity score. We tested the 5 scenarios to evaluate their relevance for prediction of death and maximum severity. (A) Average ROC curves derived using risk of death are shown for each scenario. (B) Boxplot showing the AUC values for all significant (p-value <0.05) models built per scenario for risk of death. (C) Average ROC curves derived for maximum severity per scenario. (D) Boxplot showing the AUC values for all significant (p-value <0.05) models built per scenario for severity.

The optimized prediction scenario contained IL-6, IL-8, TNF-α, GGO to aerated lung ratio, and age. This model, which combines cytokines (IL-6, IL-8, TNF-α) with a CT quantitative measurement (GGO to aerated lung ratio), showed an AUC of ∼0.78 with 95% CI (72–84). The AUC of the optimized scenario was significantly higher (adjusted p-values <0.05) than those of the previous models using the Wilcoxon rank test (Figures 5B, 5D).

Prediction of COVID-19 maximum severity score

To test the predictive power of these scenarios for maximum COVID-19 severity (according to the WHO scale), we used elastic net regression and k-fold cross-validation. The combined scenario (AUC: 0.84 with 95% CI [80–88]) performed better (adjusted p-values <0.05) than the optimized scenario (AUC: 0.82 with 95% CI [78–86]), CT quantitative (AUC: 0.81 with 95% CI [77–86]), CT qualitative (AUC: 0.74 with 95% CI [70–78]), and cytokines (AUC: 0.70 with 95% CI [66–75]) (Figures 5C, 5D). Of note, CT quantitative performed better (adjusted p-values <0.05) than CT qualitative and cytokines (Figure 5D).

Finally, to simplify our findings, we took advantage of the interpretability of elastic net regression to distill a probabilistic risk-scoring model, or nomogram. The nomogram uses the glmnet-selected variables GGO to aerated lung ratio, age, TNF-α, IL-6, and IL-8 to provide a score for risk of death (Suppl. Figure 11) [19, 20, 21].

Discussion

Previous studies predicting COVID-19 patient outcomes using clinical, laboratory, or radiologic findings have shown overly optimistic predictive performance, partially due to risk of bias [23]. Moreover, these models did not utilize internationally validated classifications, such as the WHO scale, MODS, APACHE II, or SOFA. Thus, accurate prognostic models aimed at predicting COVID-19 outcomes are necessary. In our study, we evaluated the performance of chest CT features and plasma cytokines, alone and in combination, in predicting death and maximum severity degree of hospitalized COVID-19 patients. These predictive models are important because they inform physicians of potential outcomes and thus help with risk stratification, clinical decision-making, and the selection of treatment options. While cytokine assessment was more useful in predicting death, CT features showed higher predictive performance for COVID-19 disease severity. In addition, the CT quantitative method, using volumetric variables extracted by semiautomated segmentation, outperformed the CT qualitative score based on the radiologists' assessment. Our findings reveal that a combined model using a CT quantitative feature (GGO to aerated lung ratio), demographics (age), and plasma cytokines (IL-6, IL-8, and TNF-α) represents an accurate noninvasive tool for predicting risk of death and severity degree in hospitalized COVID-19 patients.

We investigated the probability of death using demographics, plasma cytokines (IL-6, IL-8, TNF-α), and CT measurements by Cox proportional hazard models. Although demographic variables showed significance (adjusted p-value <0.05) in univariate and multivariate analysis, they failed to predict outcomes with an AUC above 0.6 using elastic net regression. Further, we found that the models based on a combination of cytokines (IL-6, IL-8, and TNF-α) and age had fair predictive power for death (AUC of 0.70) and maximum COVID-19 severity degree (AUC of 0.70). 
These results are concordant with the study carried out by Del Valle et al. [21], which showed IL-6, IL-8, and TNF-α to be strong and independent predictors of patient death.

We assessed two approaches to evaluate CT images: one based on the radiologist's qualitative assessment and another based on quantitative measurement of lesion burden using semiautomated segmentation based on HU thresholding. Previous studies using either qualitative or quantitative image analysis methods have shown high performance in predicting adverse outcomes, such as mechanical ventilation, ICU admission, death, or severity degree [8, 9, 10, 11, 12, 14, 15, 16, 17, 18, 36]. However, several of these studies used severity scales that are not internationally validated. Our models based on the CT qualitative and quantitative methods had AUCs of 0.61 and 0.66, respectively, for predicting death, underperforming when compared to cytokines (AUC 0.71). However, both CT models outperformed the cytokines model in predicting maximum severity score, with the CT quantitative model performing better than the CT qualitative model (AUCs of 0.81 and 0.74, respectively). These findings suggest that a quantitative assessment may more accurately capture lung involvement, as stated by Avila et al. [7]. A model based on CT analyses could be a potential approach for COVID-19 stratification. In addition, both CT evaluation methods showed excellent inter-observer agreement between two independent readers, with ICC >0.9 in our study, concordant with other studies [9, 12, 17].

The potential of combining CT features with clinical and laboratory data (CRP, lymphocyte count, and lymphocyte to neutrophil ratio, among others) has been previously described [10, 14, 16, 19]. However, the combination of CT features with plasma cytokines had not yet been evaluated. 
Our combined and optimized models, using both plasma cytokines and CT features, showed the best performance in predicting death (AUC 0.74–0.78) and maximum severity degree (AUC 0.82–0.84). This noninvasive approach may help clinicians with risk stratification and with choosing individualized therapeutic strategies.

Our study has limitations that prevent overly broad conclusions. First, the sample size is too small to control for all types of potential bias; it was dictated by our strict inclusion and exclusion criteria. Second, the CT quantitative analysis could overestimate the GGO and consolidation volumes because of the partial volume effect and the inclusion of small vascular structures. However, this error affects every study and can be considered a systematic error that does not affect the final output. Third, we did not assess patients' immunosuppression status or immunosuppressive treatments because of treatment heterogeneity at the beginning of the surge. Finally, our population consists mainly of moderate to severe cases, as mild cases were not admitted.

Our study demonstrates that a combination of chest CT imaging analysis and plasma cytokine assessment can be used to profile COVID-19 patients and potentially triage them into risk groups. We show this combination to be a powerful tool for predicting COVID-19 mortality and severity degree. More importantly, we show that robust and simple predictive models can support clinical decision-making by stratifying patients according to risk and aiding the development of individualized therapeutic strategies. Plasma cytokine assessment alone may help to predict survival, while CT analysis may aid in stratifying patients by severity degree. Additionally, CT quantitative analysis represents a potential tool for evaluating pneumonia lesions. Combining selected cytokine and CT quantitative features in an optimized model outperformed either measurement alone.
Ultimately, these data and methods provide novel metrics that may be used to monitor patients upon hospitalization and to help physicians make critical decisions for patients at high risk of death from COVID-19. These markers should be analyzed prospectively in relation to therapy choice, in particular for costly treatments such as monoclonal antibodies, which could potentially be targeted to patients with unfavorable scores at presentation.
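For readers less familiar with the AUC values reported throughout this discussion, the sketch below shows one standard way to read them: the AUC equals the fraction of (patient with outcome, patient without outcome) pairs in which the model assigns the higher score to the patient with the outcome, with ties counted as one half. The function name and the toy scores are hypothetical and serve only as an illustration; they are not data from this study, nor the code used to produce its results.

```python
from typing import Sequence


def roc_auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Pairwise (Mann-Whitney) estimate of the area under the ROC curve.

    labels: 1 for patients with the outcome, 0 for patients without it.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one patient in each outcome group")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # correctly ranked pair
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(positives) * len(negatives))


# Toy example with made-up risk scores (not study data):
# 3 patients with the outcome, 4 without.
toy_scores = [0.9, 0.7, 0.3, 0.6, 0.4, 0.2, 0.1]
toy_labels = [1, 1, 1, 0, 0, 0, 0]
toy_auc = roc_auc(toy_scores, toy_labels)  # 10 of 12 pairs ranked correctly
```

An AUC of 0.5 corresponds to ranking pairs no better than chance, and 1.0 to perfect separation of the two groups; the toy example above yields 10/12, or about 0.83.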

Declarations

Author contribution statement

Guillermo Carbonell, Diane Marie Del Valle, Edgar Gonzalez-Kozlova, Sacha Gnjatic, Bachir Taouli: Conceived and designed the experiments; Performed the experiments; Analyzed and interpreted the data; Contributed reagents, materials, analysis tools or data; Wrote the paper. Brett Marinelli: Conceived and designed the experiments; Performed the experiments; Analyzed and interpreted the data; Contributed reagents, materials, analysis tools or data. Emma Klein, Daniel Stocker, Michael Chung, Adam Bernheim, Miriam Merad: Conceived and designed the experiments; Contributed reagents, materials, analysis tools or data. Maria El Homsi: Conceived and designed the experiments; Analyzed and interpreted the data; Contributed reagents, materials, analysis tools or data. Nicole W. Simons: Contributed reagents, materials, analysis tools or data; Wrote the paper. Jiani Xiang, Sharon Nirenberg, Patricia Kovatch: Contributed reagents, materials, analysis tools or data. Sara Lewis: Conceived and designed the experiments; Analyzed and interpreted the data.

Funding statement

Sacha Gnjatic, Edgar Gonzalez-Kozlova, Diane Marie Del Valle and Miriam Merad were supported by grants U24 (CA224319) and U01 (DK124165).

Data availability statement

We are in the process of uploading the data to import now that the manuscript has been accepted for publication.

Declaration of interests statement

The authors declare no conflict of interest.

Additional information

No additional information is available for this paper.

29 in total

1. Fleischner Society: glossary of terms for thoracic imaging.

Authors: David M Hansell; Alexander A Bankier; Heber MacMahon; Theresa C McLoud; Nestor L Müller; Jacques Remy
Journal: Radiology Date: 2008-01-14 Impact factor: 11.105

2. Regularization Paths for Generalized Linear Models via Coordinate Descent.

Authors: Jerome Friedman; Trevor Hastie; Rob Tibshirani
Journal: J Stat Softw Date: 2010 Impact factor: 6.440

3. An inflammatory cytokine signature predicts COVID-19 severity and survival.

Authors: Diane Marie Del Valle; Seunghee Kim-Schulze; Hsin-Hui Huang; Noam D Beckmann; Sharon Nirenberg; Bo Wang; Yonit Lavin; Talia H Swartz; Deepu Madduri; Aryeh Stock; Thomas U Marron; Hui Xie; Manishkumar Patel; Kevin Tuballes; Oliver Van Oekelen; Adeeb Rahman; Patricia Kovatch; Judith A Aberg; Eric Schadt; Sundar Jagannath; Madhu Mazumdar; Alexander W Charney; Adolfo Firpo-Betancourt; Damodara Rao Mendu; Jeffrey Jhang; David Reich; Keith Sigel; Carlos Cordon-Cardo; Marc Feldmann; Samir Parekh; Miriam Merad; Sacha Gnjatic
Journal: Nat Med Date: 2020-08-24 Impact factor: 53.440

4. pROC: an open-source package for R and S+ to analyze and compare ROC curves.

Authors: Xavier Robin; Natacha Turck; Alexandre Hainard; Natalia Tiberti; Frédérique Lisacek; Jean-Charles Sanchez; Markus Müller
Journal: BMC Bioinformatics Date: 2011-03-17 Impact factor: 3.307

5. Well-aerated Lung on Admitting Chest CT to Predict Adverse Outcome in COVID-19 Pneumonia.

Authors: Davide Colombi; Flavio C Bodini; Marcello Petrini; Gabriele Maffi; Nicola Morelli; Gianluca Milanese; Mario Silva; Nicola Sverzellati; Emanuele Michieletti
Journal: Radiology Date: 2020-04-17 Impact factor: 11.105

6. CT Quantitative Analysis and Its Relationship with Clinical Features for Assessing the Severity of Patients with COVID-19.

Authors: Dong Sun; Xiang Li; Dajing Guo; Lan Wu; Ting Chen; Zheng Fang; Linli Chen; Wenbing Zeng; Ran Yang
Journal: Korean J Radiol Date: 2020-07 Impact factor: 3.500

7. CT Imaging Features of 2019 Novel Coronavirus (2019-nCoV).

Authors: Michael Chung; Adam Bernheim; Xueyan Mei; Ning Zhang; Mingqian Huang; Xianjun Zeng; Jiufa Cui; Wenjian Xu; Yang Yang; Zahi A Fayad; Adam Jacobi; Kunwei Li; Shaolin Li; Hong Shan
Journal: Radiology Date: 2020-02-04 Impact factor: 11.105

8. Early prediction of disease progression in COVID-19 pneumonia patients with chest CT and clinical characteristics.

Authors: Zhichao Feng; Qizhi Yu; Shanhu Yao; Lei Luo; Wenming Zhou; Xiaowen Mao; Jennifer Li; Junhong Duan; Zhimin Yan; Min Yang; Hongpei Tan; Mengtian Ma; Ting Li; Dali Yi; Ze Mi; Huafei Zhao; Yi Jiang; Zhenhu He; Huiling Li; Wei Nie; Yin Liu; Jing Zhao; Muqing Luo; Xuanhui Liu; Pengfei Rong; Wei Wang
Journal: Nat Commun Date: 2020-10-02 Impact factor: 14.919

9. QIBA guidance: Computed tomography imaging for COVID-19 quantitative imaging applications.

Authors: Ricardo S Avila; Sean B Fain; Chuck Hatt; Samuel G Armato; James L Mulshine; David Gierada; Mario Silva; David A Lynch; Eric A Hoffman; Frank N Ranallo; John R Mayo; David Yankelevitz; Raul San Jose Estepar; Raja Subramaniam; Claudia I Henschke; Alex Guimaraes; Daniel C Sullivan
Journal: Clin Imaging Date: 2021-02-25 Impact factor: 2.420