Literature DB >> 32935114

Longitudinal immune profiling of mild and severe COVID-19 reveals innate and adaptive immune dysfunction and provides an early prediction tool for clinical progression.

Andre F Rendeiro, Joseph Casano, Charles Kyriakos Vorkas, Harjot Singh, Ayana Morales, Robert A DeSimone, Grant B Ellsworth, Rosemary Soave, Shashi N Kapadia, Kohta Saito, Christopher D Brown, JingMei Hsu, Christopher Kyriakides, Steven Chui, Luca Cappelli, Maria Teresa Cacciapuoti, Wayne Tam, Lorenzo Galluzzi, Paul D Simonson, Olivier Elemento, Mirella Salvatore, Giorgio Inghirami.

Abstract

With a rising incidence of COVID-19-associated morbidity and mortality worldwide, it is critical to elucidate the innate and adaptive immune responses that drive disease severity. We performed longitudinal immune profiling of peripheral blood mononuclear cells from 45 patients and healthy donors. We observed a dynamic immune landscape of innate and adaptive immune cells in disease progression and absolute changes of lymphocyte and myeloid cells in severe versus mild cases or healthy controls. Intubation and death were coupled with selected natural killer cell KIR receptor usage and IgM+ B cells and associated with profound CD4 and CD8 T cell exhaustion. Pseudo-temporal reconstruction of the hierarchy of disease progression revealed dynamic time changes in the global population recapitulating individual patients and the development of an eight-marker classifier of disease severity. Estimating the effect of clinical progression on the immune response and early assessment of disease progression risks may allow implementation of tailored therapies.

Entities: Disease Gene Species

Year: 2020 PMID： 32935114 PMCID： PMC7491529 DOI： 10.1101/2020.09.08.20189092

Source DB: PubMed Journal: medRxiv

Introduction

Coronavirus disease-2019 (COVID-19), caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), is a global pandemic that (as of August 2020) has infected over 25 million people worldwide, caused over 840,000 deaths, and strains health systems on an unprecedented scale. COVID-19 has heterogeneous clinical manifestation, ranging from mild symptoms such as cough and low-grade fever to severe conditions including respiratory failure and death[1,2]. While most patients with mild disease develop an appropriate immune response that culminates with viral clearance[3], severe disease manifestations have been linked to lymphopenia and immune hyperresponsiveness leading to cytokine release syndrome (CRS)[1,2,4,5]. The most effective therapeutic approaches developed so far for severe cases involve either general immunosuppression with glucocorticoids[6] or selective neutralization of interleukin 6 (IL-6) with tocilizumab[7], a monoclonal antibody used to manage CRS in indications such as rheumatoid arthritis[8]. The efficacy of these therapies strongly support a key role for immune dysregulation in the pathogenesis of COVID-19. However, neither treatment has achieved high clinical remission rates in patients with severe COVID-19[9,10], suggesting that other immunological or immune-independent attributes may contribute to severity, treatment failure, and ultimately patient death. Thus, in-depth characterization of immune responses to SARS-CoV-2 infection is urgently needed. Recent characterization efforts have uncovered broad dysregulation of the innate immune system[11] coupled with altered inflammatory responses[12] and impaired adaptive immunity[13]. Specifically, the adaptive immune compartment of COVID-19 patients exhibits marked lymphopenia[4,14,15], polarization of T cells toward a memory phenotype[14] and functional exhaustion[16-18], demonstrating that SARS-CoV-2 infection induces both cellular[19,20] and humoral responses[21]. However, the molecular and cellular mechanisms through which SARS-CoV-2 infection induces these broad immunological derangements in only some patients remains to be elucidated. Further, little is known about the role of the innate immune responses that constitute the first defense against SARS-CoV-2 infection. Moreover, the degree of interaction between various immune compartments and demographic factors and medical comorbidities is unclear. The most prominent risk factors for severe disease and death by COVID-19 include age, cardiovascular or oncological comorbidities, and immunosuppression[22-24]. Additionally, men appear to be at significantly higher risk for severe COVID-19 than women[25]. While mortality rates are estimated at 4–6% in the general population, high-risk populations experience mortality rates >60%[26]. Clarifying the early immunological alterations associated with mild versus severe COVID-19 may not only offer therapeutically actionable targets, but also enable the identification of cases at highest risk for clinical deterioration and death. The development of an effective clinical decision-making tool rooted in immunological monitoring has the potential to optimize patient care and resource utilization. By profiling mild and severe COVID-19 patients and healthy donors with flow cytometry, we demonstrate that SARS-CoV-2 is associated with broad dysregulation of the circulating immune system, characterized by the relative loss of lymphoid cells coupled to expansion of myeloid cells. Severe cases demonstrated enrichment of natural killer (NK) cells expressing the immunosuppressive receptor killer cell immunoglobulin-like receptor, two Ig domains and short cytoplasmic tail 4 (KIR2DS4; CD158i), as well as alterations in the B cell compartment marked by reduced CD19, CD20, and IgM+ cells. These immune profiles enable reconstruction of a hierarchy of disease progression with pseudo-temporal modeling, which allows estimation of dynamic longitudinal changes within individual patients. Our approach also estimates the effect of clinical factors on immune dysregulation and thus establishes an immune-monitoring tool for disease progression.

Results

SARS-CoV-2 infection causes major changes in the circulating immune system

We conducted an observational study of 45 individuals with COVID-19 that were treated at New York Presbyterian Hospital and Lower Manhattan Hospitals, Weill Cornell Medicine (IRB 20–03021645) as in- or outpatients between April and July, 2020. The disease was categorized as “mild” if the patient was not admitted or required <6 L non-invasive supplemental oxygen to maintain SpO2 >92% (n = 21). Patients with “severe” disease required hospitalization and received >6 L supplemental oxygen or mechanical ventilation (n = 15). Blood samples were collected at enrollment and, when permissibsible, approximately every 7 days thereafter. Samples were also collected from non-hospitalized individuduals who had recovered from mild, laboratory-confirmed SARS-CoV-2 infection (“convalescent” group, n = 9). and from healthy COVID-19 negative donors (n = 12) (Figure 1a).The median age of COVID-19 patients was 65 years, which was significantly higher than healthy donors (30 years) (Supplementantary Tables 1–2; Supplementary Figure 1).

Figure 1:

Immuno-profiling of COVID-19 patients reveals a disarrayed immune system.

a) Composition of the study cohort. b) Description of immune panels and their target epitopes. c) Composition of major immune compartments as a percentage of all live CD45+ cells. d) Abundance of major lymphphoid compartments as a percentage of all lymphocytes. For (c) and (d), the upper panels divide patients by general disease status and three lower panels further divide the study subjects by clinical intervention or outcome. Significance was assessed using Mann-Whitney U tests and corrected for multiple testing with the Benjamini-Hochberg false discovery rate (FDR). **, FDR-adjusted p-value <0.01; *, FDR-adjusted p-value of 0.01–0.05.

We performed high-dimensional immune cell profiling of circulating blood by flow cytometry based seven independent fluorochrome-conjugated antibody panels, each targeting a specific surface protrotein marker of T, B, NK, and myeloid-derived suppressor cells (MDSCs) (Figure 1b, Supplementary Figure 2 and Supplementary Tables 3–7). Longitudinal sampling was performed in eight patients, one in the “mild” and seven in the “severe” group, and included at least three samples per patient, making a complete dataset including 102 samples from 57 individuals. Consistent with previous reports[12,15,27,14], we observed global loss of lymphocytes among CD45+ cells and enrichment of the myeloid cell compartment in the peripheral blood of COVID-19 patients compared with healthy donors (Figure 1c, top). This was exacerbated in patients with severe disease, particularly among hospitalized patients who required mechanical ventilation or died, compared with individuals with mild disease (Figure 1c, bottom). This lymphocyte depletion was primarily observed in the T and NK cell compartments (Figure 1d). There was no difference in the abundance of B cells between mild and severe groups. These results highlight a major shift in peripheral immune cell absolute abundance from the lymphoid to myeloid lineage (Supplementary Tables 6–7).

SARS-CoV-2 infection causes imbalances in the naive and memory T cell compartments and induces exhaustion

We next profiled CD4+ and CD8+ T-cells in COVID-19 patients and healthy donors (Supplementary Figure 2). The CD4/CD8 ratio correlated positively with disease severity (Figure 2a)[14,19]. There was also an expansion of memory T cells (CD45RO+) with reciprocal contraction of the naive compartment (CD45RA+) in severe cases relative to mild disease or healthy donors (Figure 2b). We next quantified the abundance of populations expressing C-C motif chemokine receptor 7 (CCR7; CD197), selectin L (SELL; CD62L), and FAS cell surface death receptor (CD95). Within CD45RA+ cells, effector CCR7− (TEFF) populations were increased in COVID-19 patients and those with severe disease, especially in the CD8+ compartment (Figure 2c). Conversely, there was significant depletion of CD8+CD45RO+CD95− T cells in patients, which was exacerbated with severe disease.

Figure 2:

T cells from COVID-19 patients have high levels of CD25, FAS, and exhaustion markers.

a) The ratio of CD4 to CD8 cells is dependent on disease state and clinical intervention. b) The abundance of CD45RA/RO cells in either CD4+ or CD8+ compartments is dependent on disease state or clinical intervention. c) Abundance of immune populations changes significantly between disease states. d) Uniform Manifold Approximation and Projection (UMAP) projection of all cells colored by either surface receptor expression, cluster assignment, or disease severity. e) Immune phenotype of each cluster (top) and its composition in disease severity (bottom). f) Expression levels of CD25 and FAS receptors in the UMAP projection. g) FAS expression across all clusters depending on disease severity (left) and the proportion of cells not expressing FAS for each sample (right). h) Scatter plot of CD25 and FAS expression for each cell according to disease severity. i) Abundance of CD4+ CXCR5+ PD-1+ TFH by disease severity. j) Immune populations with significantly different amounts of cells expressing immune checkpoint receptors by disease severity.Significance was assessed by Mann-Whitney U tests and corrected for multiple testing with the Benjamini-Hochberg false discovery rate (FDR). **, FDR-adjusted p-value <0.01; *, FDR-adjusted p-value 0.01–0.05.

To characterize these populations more objectively and independently of manual gating, we analyzed the expression of eight surface proteins at the single-cell level by jointly embedding CD3+ cells from all samples with the Uniform Manifold Approximation and Project (UMAP) method and clustering them (Figure 2d). Not all clusters contained cells from all severity groups proportionally. Specifically, clusters 12, 18, and 21, which are characterized by reduced FAS expression, were enriched for cells from healthy donors (Figure 2e). Moreover, there was increased expression of CD95 in samples from COVID-19 patients that correlated with disease severity (Figure 2f–g), and FAS– cells were particularly depleted in all patients (Figure 2g). Indeed, CD95+CD25+ T cells were increased in severe cases, while no difference was observed between convalescent patients and healthy donors (Figure 2h). Next, we assessed the frequency of CD4 regulatory T cells (TREG, characterized by CD127dimCD25bright). As markers of follicular helper T cells (T ), we also measured CD4+, CXCR5+ PD1+, ICOS+ TFH, which are critical to B cells in the initiation and maintenance of humoral immune responses[28]. We found a significant but modest increase in TFH in mild and severe COVID-19 cases, with their presence in convalescent patients similar to in healthy donors (Supplementary Figure 4a/b). However, upon considering a broader spectrum of TFH cells regardless of ICOS expression (Supplementary Figure c), CD4+, CXCR5+, PD-1+ TFH were most abundant in COVID-19 patients with mild disease (Figure 2i). In TREG, severe COVID-19 patients showed significant increase compared with healthy donors (Supplementary Figure 4c), while previous reports showed an increase in patients with mild course[5,29]. To investigate T cell functional phenotypes, we assessed the expression of co-inhibitory T cell receptors. We observed sustained increase of Programmed Cell Death 1 (PD-1) in COVID-19 patients compared with healthy donors in both CD4 and CD8 compartments. At the same time, V-set immunoregulatory receptor (VISTA) and lymphocyte-activating gene 3 (LAG3) were upregulated in mild cases (Figure 2j). Exhausted T cell phenotypes, with high expression of VISTA and LAG-3, can be encountered in chronic viral diseases[30], including chronic SARS-CoV-2 infection[17]. This phenotype suggests that these inhibitory receptors may operate at least partially via non-overlapping immunosuppressive signals that negatively regulate T cell responses during chronic viral infection[16]. These results highlight a shift toward an activated T cell memory phenotype in COVID-19 patients, with a potential role for CD95-mediated cell death. By and large, convalescent patients and healthy donors displayed similar immunotypes in comparison with COVID-19 patients. However, we did identify populations such as CD45RA+, CCR7+, CD62L−, FAS− CD8+ TEFF cells, which remained significantly different to healthy donors up to ~2 months into recovery (Figure 2c). These cells may represent “T stem memory (TSM) cells” with poor expansion potential[31] and/or aberrant terminally differentiated effector memory (TEM) cells[32]

SARS-CoV-2 induces expansion of polymorphonuclear MDSCs and biases NK KIR usage

Having observed myeloid expansion in COVID-19 patients (Figure 1c), we next investigated the abundance of the MDSC subset. These elements are activated by interleukin 6 (IL-6)[33] and have immunomodulatory functions in cancer[34,35] and viral infections[36]. Our flow cytometry panel considered CD3−, CD56−, CD19−, HLA-DR−/dim, CD33+, CD11b+ cells and focused on distinguishing CD14−, CD15+ granulocytic cells (G-MDSCS); CD14+, CD15−/dim monocytic-like cells (M-MDSC); and CD14−, CD15−/dim immature cells (I-MDSC) from each other. G-MDSCs were rarely detected in healthy donors but were prevalent in mild and severe COVID-19 patients (Figure 3a/b). Convalescent patients showed numbers of G- and M-MDSCs closer to healthy donors, with a non-significant increase in I-MDSCs compared with healthy donors. Conversely, I-MDSC cells, while relatively rare as a fraction of all immune cells, were further reduced with disease (Figure 3a/b). Since neutrophils are phenotypically similar to MDSCs, we compared their abundance with MDSCs. While there was a positive correlation between G-MDSCs and a high neutrophil count, neither could account entirely for the other (Supplementary Figure 3).

Figure 3:

Emergence of granulocytic MDSCs and preferential expression of specific NK cell receptors in the innate immune system of COVID-19 patients.

a) Abundance of MDSCs as a percentage of all immune cells according to disease severity. b) Uniform Manifold Approximation and Projection (UMAP) projection of all cells from all patients colored by the expression levels of surface receptors, derived clusters, or disease severity among all patients. c) Immune profile of each cluster from (b) based on the expression of surface markers (top) and composition in disease severity (bottom). d) Expression levels of CD15 dependent on disease severity (left) and quantification of cells expressing it (right) according to CD16, CD3, and CD33 expression. e) Abundance of cells expressing various KIR receptors as a percentage of NK cells according to disease severity. f) UMAP projection of all cells from all patients colored by the expression of surface receptors, derived clusters, or disease severity. g) Immune profile of each cluster from (f) based on the expression of surface markers (top) and composition in disease severity (bottom). h) Expression levels of all four measured KIR receptors in each disease state. Significance was assessed using Mann-Whitney U tests and corrected for multiple testing with the Benjamini-Hochberg false discovery rate (FDR). **, FDR-adjusted p-value <0.01; *, FDR-adjusted p-value 0.01–0.05.

Next, we created a joint embedding of 2.4 million CD16+ cells from all samples using the UMAP method, deriving clusters based on similar cells (Figure 3c). Clusters containing CD15+ cells were disproportionately enriched in samples from COVID-19 patients, while clusters with CD3+, IL4R (CD124) were mostly composed of cells from healthy donors (Figure 3d). In addition, CD15 expression was most prominent in COVID-19 patients, particularly in severe cases, but when selecting for CD3− or CD3− CD33+ cells, convalescent patients possessed a number of CD15+ cells more similar to patients with active disease than healthy donors. Next, we focused on innate lymphoid cells and determined the expression of KIR receptors in CD56+, CD16bright NK cells. While we observed no significant differences in the relative abundance of KIR receptors among COVID-19 patients with mild disease and healthy controls (Figure 3f), a significantly higher proportion of cells expressed CD158i (NKG2A) in severe patients compared with mild or convalescent individuals. Moreover, we observed fewer CD158e (KIR3DL1) cells in patients with mild disease compared with severe patients and a lower proportion of cells not expressing any of the measured receptors (KIR–) in patients with severe disease. To further explore NK cell subsets independent of conventional gating, we harnessed single-cell analysis and integrated >500,000 cells in a UMAP representation, identifying cell clusters based on surface marker expression (Figure 3g). Clusters significantly enriched in CD158i expressing cells were paucicellular in healthy donors compared with COVID-19 patients (Figure 3g, bottom), and the relative frequency of CD158i-expressing cells was lower in healthy donors, regardless of the expression of other KIR receptors (Figure 3h). Since the expression of KIR variants is stochastic, the apparent selection of KIR-expressing cells in severe COVID-19 patients could indicate that a viral antigen presented by major histocompatibility complex (MHC) class I molecules with higher affinity for CD158i could select for NK cells expressing this receptor.

B cells of COVID-19 patients show distinct patterns of immunoglobulin expression associated with disease severity

Since B cells play a critical role in adaptive immunity, we investigated the expression levels of surface CD19, CD20, IgM, and IgG in circulating cells. Despite the backdrop of a relative decrease in B cell numbers as disease progresses, we observed only a mild, non-significant increase in plasmacytoid cells in patients with severe COVID-19 compared with healthy donors (Figure 4a). However, the number of IgM+ CD19+ CD20+ B cells was decreased in patients with severe disease compared with mild, while IgG+, CD19+, CD20+ cells remained comparable across all patients (Figure 4b). Next, we visualized single cells from all patients in a common UMAP plot and assigned clusters based on surface marker expression (Figure 4c). This approach identified two distinct groups based on the expression of surface IgM, with the total number of IgM+ cells within clusters increased in severe COVID-19 patients (Figure 4d). Conversely, healthy donors displayed B cells with high expression of surface CD19+ and CD20+ antigens (Figure 4d). Closer inspection of CD19 and CD20 expression identified two distinct populations that differ in CD20 levels (Figure 4e). This approach also revealed that the relative abundance of circulating CD19 and CD20bright B cells was lower in COVID-19 patients compared with healthy individuals regardless of disease severity.

Figure 4:

B cells of COVID-19 patients are marked by a shift toward a plasmocytic IgM phenotype.

a-b) The abundance of total B cells, plasma, and IgG+ and IgG+ cells between disease states. c) Uniform Manifold Approximation and Projection (UMAP) projection of all cells colored by surface receptor expression, cluster assignment, or disease severity. d) Immunophenotype of each cluster (top) and its composition by disease severity (bottom). e) Identification and quantification of five populations of B cells dependent on CD20 and CD19 expression. f) Comparison of the abundance of the populations identified in e) between disease states. Significance was assessed using Mann-Whitney U tests and corrected for multiple testing with Benjamini-Hochberg FDR. **, FDR-adjusted p-value <0.01; *, FDR-adjusted p-value 0.01–0.05.

To shed light on the functional relevance of these different B cell subsets, we quantified the expression of IgG and IgM in each population identified based on CD19 and CD20 co-staining (Figure 4f). Circulating CD19low B cells (populations A and B) were enriched for IgG+ cells in patients with mild and severe COVID-19 and IgM+ cells in severe COVID-19 patients, while convalescent patients resembled healthy donors. No such difference was observed with CD19+ and CD20bright B cells (population C) and CD19+ CD20+ and CD19+ CD20− B cells (populations D and E). Overall, despite dwindling numbers of B cells overall, specific subsets of B cells, especially those with lower CD19 expression, have distinct immunoglobulin expression patterns in COVID-19 patients, with severe patients more frequently bearing IgM+ B cells. We speculate that these findings may be related to the plasmacytoid differentiation and immunoglobulin switching programs, which may be dysfunctional due to SARS-CoV-2 infection.

Pseudo-temporal modeling unveils a highly dynamic immune cell landscape of COVID-19 over time

Having characterized the main circulating compartments of the immune system, we next sought to leverage the high-dimensionality of the dataset and relate individual immune fingerprints of different patients using hierarchical clustering (Figure 5a). Not only did individuals with similar clinilinical conditions tend to cluster together, but clinical factors such as hospitalization or intubation also appeared to be linked to immunotypes. We hypothesized that this underlying data structure would be useful for reconstructing the clinical course of COVID-19. Thus, we employed pseudotime inference to reconstruct an underlying latent space from a healthy state to a severe disease state (Figure 5b/c)

Figure 5:

Pseudo-temporal reconstitution of disease progression reveals a hierarchy of immune changes in COVID-19 disease.

a) Hierarchical clustering of the abundance of immune populations for all samples reveals an organized structure by disease states and clinical factors. b) Projection of immune profiles into a two-dimensional latent space that reconstructs the hierarchy of disease progression. The x-axis represents disease progression in the pseudo-temporal space. c) Distribution of samples grouped by disease state along the pseudo-temporal axis derived in (b). d) Immune populations associated with the pseudo-temporal axis represented by either the absolute change in percentage in their extremes (x-axis) or strength of linear association (y-axis). e) Heatmap of immunotypes and immune populations sorted by their order or relative abundance in the pseudo-temporal axis, respectively. f) Clusters of immune populations based on their abundance along the pseudo-temporal axis. g) Examples of immune populations from each cluster in (f).

Further analysis of the inferred space enabled identification of circulating immune cell populations associated with disease progression. In particular, we identified a space driven by a decrease of lymphocytes, gain of myeloid cells (G-MDSCs in particular), and a terminally activated/exhausted T cell phenotype (Figure 5d). Besides discovering immune signatures associated with each degree of severity, this analysis allows the relative positioning of each time point in relation to the continuos changes characterized by pseudotime (Figure 5e). Variable changes associated with the pseudo-temporal axis could be classified in three clusters (Figure 5f). The first was composed of 68 populations, with an increase toward higher disease severity with representatives such as the fractction of myeloid cells, PD-1+ CD4+ T-cells, and CD62L− cells among CD45RA+, CD8+ T cells (Figure 5g left). The second corresponded to a virtually stable cluster with 52 populations such as IgM+ B cells, with only mild fluctuation in the intermediate stage (Figure 5g, center). The third included a cluster with a steady decrease by disease severity, encompassing the overall lymphoid population as well as B cells and CD45RA+ CD4+ T cells (Figure 5g, right). This effectively establishes a temporal hierarchy of changes as disease progresses in which populations such as CD62L+, CCR7+ , CD45RA+,CD8+ T cells have a steady decline and others such as B cells have a stronger decline toward the severe end of the pseudo-temporal timeline. Additionally, the dynamic character of changes raises the possibility of using flow cytometry to improve COVID-19 patient stratification based on real-time immunological monitoring. Although our observations do not indicate causality, immunologi variations in the pseudo-temporal dimension may offer testable hypotheses on COVID-19 progression mechanisms.

Integration of clinical and demographic factors affecting COVID-19 immunity and stratification of patients by disease severity

Since various clinical and demographic factors influence disease incidence and mortality[1,2,37], we investigated the interaction between SARS-CoV-2 infection, the circulating immune system, and various demographic and clinical factors. Thus, we fit regularized linear models to the proportional flow cytometry data with co-variates such as sex, race, age, disease severity, presence of comorbidities, hospitalization, intubation, and death (Supplementary Figure 5a). We also estimated the interaction of sex with clinical variables such as disease severity, hospitalization, intubation, and death. The resulting network of significant effects identified several clinical factors associated with specific immune cell populations, highlighting how age, sex, and disease severity jointly influence the circulating immune systems in patients with COVID-19 (Figure 6a).

Figure 6:

Factors conditioning the immune response during COVID-19 and predicting disease severity

a) Directed graph of clinical factors (green) and immune populations (pink). Edges represent the association between factors and immune populations and are colored by the direction and strength of association (blue, negative; red, positive). b) Abundance of select immune populations with significantly different responses between sexes dependent on outcome. c) Estimated coefficients of change for severe vs. mild disease (left) o tocilizumab treatment (right) for immune populations that change discordantly. d–e) Abundance of select immune populations with significantly different responses between sexes dependent on tocilizumab treatment (d) or intubation (e). f) Graphical depiction of the machine learning framework for predicting disease severity using the earliest available samples per patient and cross-validation. g–h) Performance of classifiers trained with real or randomly shuffled labels and either all immune populations (g) or with selection for the top most predictive eight populations (h). i) Predicted severity scores over time since symptoms started for immune profiles from patients with at least three longitudinal sampling points. j) Relative expression of CD25, CD45RA and CD45RO over time in four patients from (g). **, FDR-adjusted p-value <0.01; *, FDR-adjusted p-value 0.01–0.05.

As a baseline, we could recover known effects independent of disease, such as a higher CD4:CD8 ratio in females than males and an overall decrease of the lymphoid population with age (Supplementary Figure 5b). Lastly, we found associations between sex and clinical variables such as a significantly higher fraction of CD62L+, CCR7+, CD45RO+, CD4+ T cells in males that died compared with females (Figure 6b, left) and much lower total lymphocyte levels in females that died compared with males (Figure 6b, right). Regarding the effect of tocilizumab on the immune system, we compared post-treatment samples from eight treated severe patients to seven severe untreated patients. While we observed the largest effect in certain subsets of CD4+ T cells, there was also an increased relative abundance of B cells and a decrease in T cells expressing the co-inhibitory receptor hepatitis A virus cellular receptor 2 (HAVCR2; TIM3) (Supplementary Figure 5c). Moreover, the signature associated with severe versus mild patients was broadly counteracted by tocilizumab (Figure 6c). Associations between sex and clinical variables were found, such as a lower fraction of CD62L+, CCR7+, CD45RA+, CD8+ T cells in females treated with tocilizumab compared with males, contrary to the opposing trend in untreated individuals (Figure 6d), or the lower frequency of CD158a NK cells in female intubated patients (Figure 6e). Since there is a need to stratify patients to provide better, more effective, and less costly care, particularly in the earlier stages of disease, we hypothesized that the high dimensionality of the immunotypes would make it possible to train a classifier to predict disease severity early on. A random forest classifier was trained to distinguish patients with mild from severe disease using only the earliest available sample of each patient in a cross-validated manner (Figure 6f). We observed good performance of the classifier (median area under receiving operator curve [ROC AUC], 0.81) compared with one with randomized severity labels (median ROC AUC, 0.49) (Figure 6g), providing good balance between true positive and false positive rates. Since our dataset is composed of immune populations from seven flow cytometry panels, we tested whether a smaller number of variables could discern patients with mild and severe disease courses. With only eight variables, the classifier could distinguish patients with different disease severities, albeit with lower performance (ROC AUC, 0.73 vs. 0.49 with randomized labels) (Figure 6h). Furthermore, we hypothesized that our classifier could be used for real-time immuno-monitoring of COVID-19 patients. Thus, we applied it to subsequent samples of patients with more than three samples collected over the disease course, while withholding those samples from the training set (Figure 6i). Patient 26, who had an overall mild disease course, had all samples classified as mild; severe patients often showed dynamic severity probabilities over time, with at least one timepoint classified as severe disease. To exemplify how this prediction relates back to flow cytometry data, we illustrate the aggregated expression of the activation marker CD25 and CD45RA/RO in single T cells over time (Figure 6j). Patients with lower predicted severity toward the end of their course (e.g., patient 23) tended to have less CD25 expression and increased CD45RA expression, while the opposite was also true (e.g., patient 16). Patients with predictions that were either more stable or dynamic over time (patients 26 and 24, respectively) showed dynamics of expression in accordance to their overall predicted pattern over time. This proof-of-principle work demonstrates our ability to leverage high-content immune profiling to predict overall disease course and provides the basis for real-time immune-monitoring of COVID-19 patients.

Discussion

Here, we describe the circulating immune landscape of COVID-19 patients compared with healthy individuals. Consistent with previous reports[11,14,15,29], we demonstrate that disease progression is dominated by the progressive loss of circulating lymphocytes and gain of myeloid cells. We also detected selective expansion of NK populations and MDSCs, suggesting that the innate compartment may contribute to the immunological disarray of COVID-19 patients. We then harnessed this multi-dimensional dataset to generate a machine-learning classifier that could predict disease severity using a defined flow-cytometric signature. Our work provides a proof-of-concept that an immune-monitoring algorithm could provide a rapid and personalized approach to manage COVID-19. While previous studies have focused on lymphocyte populations[12,14,15,38], to our knowledge the role of innate immune cells is less understood[39]. Our study highlights the expansion of MDSCs, especially G-MDSCs, in severe COVID-19 patients. Unlike their natural counterparts, these elements have suppressive function[40] that impairs immune responses in cancer[34] and derails effective responses against bacterial and viral infections by the adaptive immune system[41,42]. Given the overall depletion of the immune system’s lymphoid branch during COVID-19, an interesting hypothesis is that G-MDSCs and other myeloid cells represent uncontrolled negative feedback. These elements ultimately contribute to the establishment of pan-immunosuppression, leading to dysregulated responses from the adaptive immune system. It will be essential to establish whether they are actively recruited to infected lungs and whether they are causally involved in disease pathogenesis or represent a systemic compensatory response to inflammation. Since MDSCs are virtually absent in healthy individuals, questions arise regarding the mechanisms of their genesis and tissue recruitment and how they interact with lung tissue. At the same time, our novel observation that NK cells expressing the CD158i variant are over-represented in patients with severe disease raises the question of whether this variant and other KIRs implicate NK cells in disease progression (Figure 3). Within the adaptive immune system, the mechanisms leading to severe immune depletion, a landmark seen with disease progression and markedly apparent in autopsy samples, are unknown[43]. To this end, we observed increased CD25+ T cells in COVID-19 patients, indicating a higher state of activation (Figure 2), but also an increase in CD95+ with disease progression. This phenotype was significantly marked in severe patients, consistent with a recent study[38]. FAS has a crucial role in mediating cell death via FAS ligand engagement, as in activation-induced cell death (AICD), or by shifting cells to a more apoptotic-prone phenotype. While FAS is a natural regulatory checkpoint of T cells, it plays a role in autoimmunity[44] and cancer[45], and AICD is involved in loss of CD4+ and CD8+ cells in HIV patients[46]. However, severely exhausted T cells can undergo apoptosis, and virus-specific T cell decline can favor viral escape[47-49]. Indeed, similar to previous reports[17,18,50], T cells displayed an overall exhausted phenotype, with overexpression of VISTA, TIM3, LAG3, TIGIT, and PD-1 co-inhibitory receptors in COVID-19 patient T cell populations. This likely results in inability of the adaptive immune system to keep viral proliferation in check. In the B cell compartment, we observed lower expression of CD19 in COVID-19 patients and higher expression of membrane-bound IgM and IgG in both mild and severe patients. These data suggest that under viral exposure, B cells undergo plasmacytoid maturation and immunoglobulin switching. Remarkably, several patients displayed higher IgM than IgG CD19+CD20+/– cells, suggesting abnormal and delayed maturation of plasma cells (Figure 4). While the implications remain speculative, they do warrant further investigation given the central role of B cells in the development of immunity by COVID-19 patients. Taking advantage of our dataset’s high-dimensional characteristics and pseudo-temporal modeling, we constructed a COVID-19 disease course landscape. This strategy reveals a continuum of disease progression between healthy state, mild disease, and severe disease. Remarkably, convalescent patients displayed immune phenotypes similar to healthy donors, suggesting a possible return to a largely healthy state, as previously suggested based on the exhaustion phenotype in adaptive responses[18]. Conversely, we could speculate that the immune landscape of mild/convalescent patients never achieved the level of disarray observed in severe patients. While there were marked differences between patients with prevalent mild or severe disease, their recognition remains a unique challenge. One interesting open question is whether the changes associated with mild vs. severe disease protect against disease progression or, conversely, which immune populations related to severe illness play a role in the progression to severe disease. While proof-of-principle, our classifier of severe disease shows robustness and overall value in predicting disease progression based on immune profiling and near-real-time disease monitoring. Thus, it may be valuable to inform clinical action like that proposed in chronic diseases with other high-dimensional assays[50-52]. Moreover, we demonstrated that a classifier with a limited number of markers retains good performance. If confirmed in large cohorts, it could provide a useful approach to stratify patients and predict clinical evolution using a rapid and economical assay. Although our study confirmed some findings and provides new data on the innate immune landscape of COVID-19 patients, we recognize limitations that could be overcome with larger sample sizes and matched control populations. Indeed, large population studies may discover new immune populations and find a relationship with disease progression or clinical factors. In this current pandemic, profiling a larger sample of the population and investigating multiple time points may help identify viral adaptation to the host (particularly when coupled to analysis of viral sequence) in patients with different outcomes. These programs may be achieved if an effective institutional organization, multicentric networking, and substantial financial support are available. The targeted nature of flow cytometry interrogates limited sets of immune populations and implies that only certain molecules can be effectively profiled. In our study we used mainly proportional data when comparing the abundance of immune populations between patient groups. While this may not necessarily imply absolute changes in cell numbers, we observed good overall agreement between changes in proportions and absolute counts when comparing severe and mild disease status (Supplementary Figure 6 and Supplementary Table 7). This highlights the importance of studies employing orthogonal modalities such as cytokine profiling[51], single-cell RNA sequencing[52-56], and their integration[57,58]. Nevertheless, even without orthogonal studies, our machine learning approach for predicting disease severity demonstrates predictive potential, although it should be tested in a validation cohort before use in a clinical setting. Lastly, we wish to note the work of others and their complementary findings. For example, Laing et al.[32] used peripheral blood flow cytometry and circulating cytokine measurements to demonstrate apparent immune dysregulation in COVID-19 patients. They highlighted additional interesting, complementary features, including increased IL-6, IL-10, and IP-10 and depletions of basophils, plasmacytoid dendritic cells, TH1 cells, and TH17 cells. Incorporating their most differentiating markers with ours could yield a more complete yet targeted panel of markers with more predictive power to determine which patients will rapidly progress to a severe disease state. This incorporation of additional differentiating markers should be pursued in future studies. Collectively, our study highlights a profound imbalance in the COVID-19 immune landscape, characterized by G-MDSC expansion and T cell exhaustion that may open avenues for clinical translation. Further, our approach provides a powerful tool to predict clinical outcomes and tailor more effective and proactive therapies to COVID-19 patients.

Methods

Study design, sample acquisition, and clinical data

The study was approved by the Institutional Review Board of Weill Cornell Medicine. Participants were recruited from patients hospitalized at New York Presbyterian Hospital from April to July 2020. Some participants in a COVID-19 convalescent plasma donor screening program with prior confirmed diagnosis (by RT-PCR or serology) were given the option to contribute a sample for this research. ARDS was categorized in accordance with the Berlin definition reflecting each subject’s worst oxygenation level and with physicians adjudicating chest radiographs[59]. Informed consent was obtained from all participants.

Flow cytometry

For flow cytometric analysis of circulating leukocytes, peripheral blood was collected in Na-heparin. Except for the MDSC panel, in which PBMCs were prepared by density gradient centrifugation, erythrocytes were lysed with BD Pharm Lyse. Peripheral blood was washed in Dulbecco’s PBS (DPBS), lysed in 1× BD Pharm Lyse, and washed again in DPBS. PBMC cell suspensions were prepared with Ficoll-Paque following the manufacturer’s protocol. Cells were stored briefly in storage medium (10% heat-inactivated fetal bovine serum/1% L-glutamine/1% pen-strep) before staining with antibody cocktails. For each panel, one million cells were stained with specific cocktails of fluorochrome-conjugated antibodies (Supplementary Table 3-4). Cells were washed with DPBS then stained with dead cell dye (BD Fixable Viability Stain 700) before washing with wash buffer (0.5% BSA/DPBS/NaN3). Cells were then treated with 50 μL of Fc-blocking solution (2% normal rabbit serum/10% BD Fc Block/DPBS)before application of a 100-μL antibody cocktail diluted in wash buffer. Samples were stained within 6 hours of sample collection and analyzed on a BD Biosciences FACSCanto flow cytometer within 2 hours of staining. The stopping gate was set to acquire 500,000 viable, nucleated single cells.

Supervised quantification of immune cell populations (gating)

Immune populations were quantified by manual analysis with BD FACSDiva. Absolute counts of populations were exported to CSV and relative population sizes were calculated in Microsoft Excel. Gating for each panel started with a time gate, followed by a singlet gate (FSC-A vs. FSC-H). Next, viable cells (dead cell dye vs. FSC-A) and nucleated cells (FSC-A vs. SSC-A) were gated. Populations of MDSCs were gated sequentially from leukocytes (CD45 vs. SSC-A), then CD3/CD56/CD19 (Lin)- and HLA-DR −/dim cells, followed by CD33+ and CD11b+ cells. From there, granulocytic cells were defined as CD14− and CD15+, monocytic cells as CD14+ and CD15 −/dim and immature forms as CD14– CD15 −/dim. TREG were defined by sequential gating of lymphocytes (CD45 vs. SSC), T cells (CD45 vs. CD3), T helper (CD8 vs. CD4), and finally TREG were defined as CD127 dim and CD25+. The TFH panel was gated the same as in the T cell regulatory panel down to the CD4+ helper gate. Under this gate, CD185+ cells were quantified (CD185 vs. CD8) and the ICOS bright, PD-1 bright (CD278 vs. CD279) cells were gated. The ICOS bright, PD-1 bright TFH gate was placed under the CD185+ gate to identify the population with all the phenotypic markers of TFH lineage in this panel. The analysis of the T cell memory and checkpoint panels started with identifying T cells (CD3 vs. SSC), then the CD4 helper and CD8 cytotoxic subsets. To analyze the T cell checkpoint panel, individual exhaustion markers were gated on histogram plots. The T cell memory panel was further subdivided into CD45RA+/CD45RO− and CD45RA−/CD45RO+ subsets. Under these gates, two quadrant gates were placed on CD62L vs. CCR7 and CD62L vs. FAS. Gating for the B-cell panel began with CD45 vs. CD19 then FSC-A and SSC-A to identify cells of the B lineage. The CD20+ and CD20− subsets were gated (CD19 vs. CD20) and IgG and IgM were quantified within the CD20+ subset (IgM vs. IgG). NK cells were identified by sequential gates on CD56 vs. CD3, FSC-A vs. SSC-A, and CD56 vs. CD16. CD56+, CD16 bright, mature NK cells were then interrogated for their reactivity with individual anti-KIR (CD158) and anti-NKG2A (CD159a) antibodies with gates on histogram plots. KIR-negative NK were identified by sequential gating on CD158a vs. CD158b double-negative, then CD158i and CD158e double-negative subsets.

Statistical testing

Non-parametric Mann-Whitney U tests were employed to assess the significance of pairwise changes in the proportions of immune populations between severity groups using the Pingouin package, version 0.3.7. Multiple test correction was performed with the Benjamini-Hochberg FDR method.

Single-cell analysis of immune cell populations

To select cells from the events, single cells were gated using forward-side scatter height and area, CD45-positivity, viability dye-negativity, and the major marker of each panel (e.g.. CD3 for T cell memory panel). Compensation was applied using FlowKit[60] version 0.5.0, and an inverse hyperbolic transformation (AsinhTransform) was applied with parameters t = 10000, m = 4.5, a = 0. To construct a shared latent representation for all cells, dimensionality reduction was performed with Principal Component Analysis (PCA), a neighbor graph was computed using 15 neighbors per cell, UMAP[61] with default parameters, and Leiden clustering, all using the Scanpy package[62] version 1.5.1. For each discovered single-cell cluster, a proportion of cells was calculated in relation to a specific clinical factor after normalization by the frequency of the same factor in the cohort.

Pseudotime inference and time series modeling of immune cell dynamics during disease progression

To learn a latent manifold of the data, the non-linear method Laplacian Eingenmaps[63] was used as implemented in the “SpectralEmbedding” method of the scikit-learn framework[64] (version 0.23.0) with default parameters. A z-scored matrix of proportional data was input for all immune cell populations (variables) and patient samples (observations). To rank the features by their association with the learned space, Pearson’s correlation was calculated between the first component and each variable, in addition to the fold and absolute change in the variable between the top and bottom 10% of the samples in each extreme of the embedding. The same procedure applied to a Uniform Manifold Approximation and Projection (UMAP) latent representation of the same data yielded similar results, with the exception that the spread of samples according to disease progression was parallel to multiple learned axes rather than single. To rank variables by the amount of change in both real time since the reported start of symptoms for a single patient or over the learned latent space across all patients, GPy package[65] was used to fit Gaussian Process regression models on the learned pseudotime axis (independent variable) and the abundance of each immune cell population (dependent variables). A variable radial basis function (RBF) kernel and a constant kernel (both with an added noise kernel) were fitted and the log-likelihood and standard deviation of the posterior probability of the two were compared as described previously[66]. To cluster the abundance of immune populations based on their dynamics over the pseudotime axis, the same kernels were used to fit a Mixture of Hierarchical Gaussian Processes (MOHGP) as implemented in the GPClust package[67,68] using 8 as an initial guess of number of clusters.

Linear modelling of immune cell type abundances

Due to the proportional nature of the dataset, generalized linear models were fit using a Gamma-distributed noise model with a log-link function. Ridge regularization was employed to ensure robust coefficients given the low abundance of some populations, and the model was fit with ordinary least squares optimization using the statsmodels package[69] version 0.11.1. Categorical variables were one-hot encoded and numeric ones such as age or days since symptoms started were kept as years or days, respectively; the date of acquisition was transformed into days and scaled to the unit interval. Since values for clinical categorical variables and comorbidities were only available to COVID-19 patients, various models were employed that aimed to explore different aspects of immune system change during COVID-19: Comparison of healthy donors to COVID-19 patients: sex + race + age + batch + COVID19 Effect of clinical/demographic factors on COVID-19 patients: sex + race + batch + COVID19 + severity group + hospitalization + intubation + death + diabetes + obesity + hypertension + age in years + days since symptoms start Effect of tocilizumab treatment on severe patients only: sex + age + batch + tocilizumab To generate a graph of interactions between factors and immune populations, significant coefficients (FDR-adjusted p-value <0.05) were used as undirected edges between factors and immune populations. For edges between factors, the Pearson correlation between factors across immune populations was used. Exclusively for visualization, coefficients for the continuous variables “age” and “time since symptoms started” were multiplied by half of the median of the values of that variable (33.0 and 10.8, respectively) to make the range of coefficients comparable with the categorical variables. Visualizations were produced using Gephi version 0.9.2 with the Force Atlas2 layout with parameters “LinLog mode”, “scaling factor” 8.0, and “gravity” 11.0.

Prediction of disease severity from immunotypes

A Random Forest Classifier was trained as implemented in scikit-learn framework[64] (version 0.23.0) to distinguish between cases with “mild” and “severe” disease using 10-fold cross validation. The cross-validation loop was repeated 100 times and models were fit with real or randomized labels. Test set performance was assessed with the ROC-AUC. To investigate the performance of the classifier, feature importance was averaged across cross validation folds and iterations and the log fold importance of the real models over the randomized labels was calculated. A sign was added to the feature importance depending on the sign of the Pearson correlation of each variable with each class. Only the earliest temporal sample of each patient was used to ensure lack of data leakage (avoid training/testing on samples from the same patient without stratified cross validation) and to maximize the utility of the model. The same cross-validation scheme was used to develop a classifier using a subset of features but including feature selection using mutual information inside the cross-validation loop. To predict severity longitudinally for single patients, a model was trained on the initial samples from all other patients and tested on the samples of the patient in question.

Data availability

Quantification of immune cell populations is available as a Supplementary Table file. Hierarchical data format files with single cell data (h5ad) are available as indicated in the repository with source code for the study (https://github.com/ElementoLab/covid-flowcyto) and raw FCS files will be made available upon post-peer review publication.

Code availability

The source code for the analysis is available on GitHub at the following URL: https://github.com/ElementoLab/covid-flowcyto

57 in total

1. Tocilizumab in patients with severe COVID-19: a retrospective cohort study.

Authors: Giovanni Guaraldi; Marianna Meschiari; Alessandro Cozzi-Lepri; Jovana Milic; Roberto Tonelli; Marianna Menozzi; Erica Franceschini; Gianluca Cuomo; Gabriella Orlando; Vanni Borghi; Antonella Santoro; Margherita Di Gaetano; Cinzia Puzzolante; Federica Carli; Andrea Bedini; Luca Corradi; Riccardo Fantini; Ivana Castaniere; Luca Tabbì; Massimo Girardis; Sara Tedeschi; Maddalena Giannella; Michele Bartoletti; Renato Pascale; Giovanni Dolci; Lucio Brugioni; Antonello Pietrangelo; Andrea Cossarizza; Federico Pea; Enrico Clini; Carlo Salvarani; Marco Massari; Pier Luigi Viale; Cristina Mussini
Journal: Lancet Rheumatol Date: 2020-06-24

2. Histopathology and ultrastructural findings of fatal COVID-19 infections in Washington State: a case series.

Authors: Benjamin T Bradley; Heather Maioli; Robert Johnston; Irfan Chaudhry; Susan L Fink; Haodong Xu; Behzad Najafian; Gail Deutsch; J Matthew Lacy; Timothy Williams; Nicole Yarid; Desiree A Marshall
Journal: Lancet Date: 2020-07-16 Impact factor: 79.321

3. Deep immune profiling of COVID-19 patients reveals distinct immunotypes with therapeutic implications.

Authors: Divij Mathew; Josephine R Giles; Amy E Baxter; Derek A Oldridge; Allison R Greenplate; Jennifer E Wu; Cécile Alanio; Leticia Kuri-Cervantes; M Betina Pampena; Kurt D'Andrea; Sasikanth Manne; Zeyu Chen; Yinghui Jane Huang; John P Reilly; Ariel R Weisman; Caroline A G Ittner; Oliva Kuthuru; Jeanette Dougherty; Kito Nzingha; Nicholas Han; Justin Kim; Ajinkya Pattekar; Eileen C Goodwin; Elizabeth M Anderson; Madison E Weirick; Sigrid Gouma; Claudia P Arevalo; Marcus J Bolton; Fang Chen; Simon F Lacey; Holly Ramage; Sara Cherry; Scott E Hensley; Sokratis A Apostolidis; Alexander C Huang; Laura A Vella; Michael R Betts; Nuala J Meyer; E John Wherry
Journal: Science Date: 2020-07-15 Impact factor: 47.728

4. Marked T cell activation, senescence, exhaustion and skewing towards TH17 in patients with COVID-19 pneumonia.

Authors: Sara De Biasi; Marianna Meschiari; Lara Gibellini; Caterina Bellinazzi; Rebecca Borella; Lucia Fidanza; Licia Gozzi; Anna Iannone; Domenico Lo Tartaro; Marco Mattioli; Annamaria Paolini; Marianna Menozzi; Jovana Milić; Giacomo Franceschi; Riccardo Fantini; Roberto Tonelli; Marco Sita; Mario Sarti; Tommaso Trenti; Lucio Brugioni; Luca Cicchetti; Fabio Facchinetti; Antonello Pietrangelo; Enrico Clini; Massimo Girardis; Giovanni Guaraldi; Cristina Mussini; Andrea Cossarizza
Journal: Nat Commun Date: 2020-07-06 Impact factor: 14.919

5. Clinical features of patients infected with 2019 novel coronavirus in Wuhan, China.

Authors: Chaolin Huang; Yeming Wang; Xingwang Li; Lili Ren; Jianping Zhao; Yi Hu; Li Zhang; Guohui Fan; Jiuyang Xu; Xiaoying Gu; Zhenshun Cheng; Ting Yu; Jiaan Xia; Yuan Wei; Wenjuan Wu; Xuelei Xie; Wen Yin; Hui Li; Min Liu; Yan Xiao; Hong Gao; Li Guo; Jungang Xie; Guangfa Wang; Rongmeng Jiang; Zhancheng Gao; Qi Jin; Jianwei Wang; Bin Cao
Journal: Lancet Date: 2020-01-24 Impact factor: 79.321

6. Impaired type I interferon activity and inflammatory responses in severe COVID-19 patients.

Authors: Jérôme Hadjadj; Nader Yatim; Darragh Duffy; Frédéric Rieux-Laucat; Solen Kernéis; Benjamin Terrier; Laura Barnabei; Aurélien Corneau; Jeremy Boussier; Nikaïa Smith; Hélène Péré; Bruno Charbit; Vincent Bondet; Camille Chenevier-Gobeaux; Paul Breillat; Nicolas Carlier; Rémy Gauzit; Caroline Morbieu; Frédéric Pène; Nathalie Marin; Nicolas Roche; Tali-Anne Szwebel; Sarah H Merkling; Jean-Marc Treluyer; David Veyer; Luc Mouthon; Catherine Blanc; Pierre-Louis Tharaux; Flore Rozenberg; Alain Fischer
Journal: Science Date: 2020-07-13 Impact factor: 47.728