Lena F Schimke, Alexandre H C Marques, Gabriela Crispim Baiocchi, Caroline Aliane de Souza Prado, Dennyson Leandro M Fonseca, Paula Paccielli Freire, Desirée Rodrigues Plaça, Igor Salerno Filgueiras, Ranieri Coelho Salgado, Gabriel Jansen-Marques, Antonio Edson Rocha Oliveira, Jean Pierre Schatzmann Peron, Gustavo Cabral-Miranda, José Alexandre Marzagão Barbuto, Niels Olsen Saraiva Camara, Vera Lúcia Garcia Calich, Hans D Ochs, Antonio Condino-Neto, Katherine A Overmyer, Joshua J Coon, Joseph Balnis, Ariel Jaitovich, Jonas Schulte-Schrepping, Thomas Ulas, Joachim L Schultze, Helder I Nakaya, Igor Jurisica, Otávio Cabral-Marques.
Abstract
Severe COVID-19 patients present a clinical and laboratory overlap with other hyperinflammatory conditions such as hemophagocytic lymphohistiocytosis (HLH). However, the underlying mechanisms of these conditions remain to be explored. Here, we investigated the transcriptome of 1596 individuals, including patients with COVID-19 in comparison to healthy controls, other acute inflammatory states (HLH, multisystem inflammatory syndrome in children [MIS-C], Kawasaki disease [KD]), and different respiratory infections (seasonal coronavirus, influenza, bacterial pneumonia). We observed that COVID-19 and HLH share immunological pathways (cytokine/chemokine signaling and neutrophil-mediated immune responses), including gene signatures that stratify COVID-19 patients admitted to the intensive care unit (ICU) and COVID-19_nonICU patients. Of note, among the common differentially expressed genes (DEGs), there is a cluster of neutrophil-associated genes that reflects a generalized hyperinflammatory state, since it is also dysregulated in patients with KD and bacterial pneumonia. These genes are dysregulated at the protein level across several COVID-19 studies and form an interconnected network with differentially expressed plasma proteins that point to neutrophil hyperactivation in COVID-19 patients admitted to the ICU. scRNAseq analysis indicated that these genes are specifically upregulated across different leukocyte populations, including lymphocyte subsets and immature neutrophils. Artificial intelligence modeling confirmed the strong association of these genes with COVID-19 severity. Thus, our work indicates putative therapeutic pathways for intervention.
Keywords: COVID-19; acute inflammatory states; integrative analysis of omics data; neutrophil activation; systems biology; transcriptome profile
Year: 2022 PMID: 35269470 PMCID: PMC8909161 DOI: 10.3390/cells11050847
Source DB: PubMed Journal: Cells ISSN: 2073-4409 Impact factor: 7.666
Dataset Information and sample size used for transcriptome analysis.
| Database | Dataset ID | Seq. Method | Sample Type | Disease Type of Patients (Sample Size) | Type of Controls | Original Study |
|---|---|---|---|---|---|---|
| GEO | GSE152418 | bulk-RNA seq | PBMC | COVID-19 ( | healthy controls ( | Arunachalam et al., 2020 [ |
| GEO | GSE157103 | bulk-RNA seq | PBL | COVID-19_ICU ( | SARS-CoV-2 negative ICU ( | Overmyer et al., 2020 [ |
| GEO | GSE152075 | bulk-RNA seq | nph swab | COVID-19 ( | SARS-CoV-2 negative ( | Liebermann et al., 2020 [ |
| GEO | GSE156063 | bulk-RNA seq | nph swab | COVID-19 ( | NIRD ( | Mick et al., 2020 [ |
| EGA | EGAS | scRNA seq | PBL/ | Cohort1: | healthy controls ( | Schulte-Schrepping et al., 2020 [ |
| GEO | GSE26050 | microarray | PBMC | HLH ( | healthy controls ( | Sumegi et al., 2011 [ |
| GEO | GSE163151 | bulk-RNA seq | nph swab | COVID-19 ( | healthy controls ( | Ng et al., 2021 [ |
| GEO | GSE152641 | bulk-RNA seq | PBL | COVID-19 ( | healthy controls ( | Thair et al., 2021 [ |
| GEO | GSE161731 | bulk-RNA seq | PBL | COVID-19 ( | healthy controls ( | McClain et al., 2021 [ |
| GEO | GSE178388 | bulk-RNA seq | PBL | MIS-C ( | healthy controls ( | Beckmann et al., 2021 [ |
| GEO | GSE73461 | microarray | PBL | KD ( | healthy controls ( | Wright et al., 2018 [ |
nph, nasopharyngeal; PBMC, peripheral blood mononuclear cells; PBL, peripheral blood leucocytes; HLH, hemophagocytic lymphohistiocytosis; bact. pneum., bacterial pneumonia; CoV, coronavirus other than SARS-CoV-2; MIS-C, multisystem inflammatory syndrome in children; KD, Kawasaki disease.
Figure 1. Transcriptional overlap between COVID-19 and HLH. (A) Number of differentially expressed genes (DEGs, up- and down-regulated) by dataset. (B) Circos plot showing 237 common up-regulated DEGs between HLH and the different COVID-19 datasets (red lines: the number at the end of each line indicates the exact number of shared DEGs), divided into three overlapping subgroups (detailed in Supplementary Tables S2 and S3). The thickness of each line represents the number of genes shared between the different datasets. (C) Protein-protein interaction network among the 237 transcripts and seven genes causing HLH due to inborn errors of immunity (IEI). Node colours denote Gene Ontology Biological Processes. The label (gene name) colours represent transcripts from Overlap 1 (green), Overlap 2 (red), and Overlap 3 (blue). The center circle and side circles represent common molecules across all three or two overlapping datasets, respectively. The upper left subnetwork represents the interactions among the seven genes associated with HLH; those from the overlaps are in bold. The circle on the upper left (gene names not shown) contains 1329 proteins connected by 217 interactions with the seven HLH/IEI-associated genes. The full network comprises 1538 proteins and 2522 direct physical interactions obtained from the IID database ver. 2021-05 [47].
Figure 2. Cytokine/chemotaxis and neutrophil-associated transcriptional signatures predominate in the COVID-19 and HLH overlap. (A) Dot plot showing the most significant biological processes (BP) enriched by the 237 common up-regulated transcripts of COVID-19 and HLH datasets. The dot size is proportional to the number of genes enriching the gene ontology (GO) term, and the color is proportional to the adjusted p-value (green is more significant than blue). (B) Network highlighting genes and cellular component (CC) associations. Only enriched terms with adjusted p-value < 0.05 are shown by small grey circles. The degree of association is displayed by edge color and thickness (e.g., lighter colors and thinner edges signify fewer connections). Node color represents different GO CCs. Both enriched CCs and BPs were analyzed using ClusterProfiler with R programming. (C–E) Bubble heatmaps showing the hierarchical clustering based on Euclidean distance of expression patterns of genes associated with (C) cytokine signaling, (D) chemotaxis, and (E) neutrophil-mediated immunity in COVID-19 and HLH datasets. The color of the circles corresponds to log2 fold change (log2FC). Pleiotropic genes belonging to more than one category are shown in bold (Supplementary Table S8).
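The two quantities behind panels C–E, the log2 fold change and the Euclidean distance used for clustering, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the function names, the pseudocount, and the example values are hypothetical.

```python
from math import log2, sqrt


def log2_fold_change(mean_case, mean_control, pseudocount=1.0):
    """log2 ratio of mean expression in cases versus controls.

    The pseudocount is an assumption made here to avoid division by zero;
    the source does not state how low counts were handled.
    """
    return log2((mean_case + pseudocount) / (mean_control + pseudocount))


def euclidean_distance(profile_a, profile_b):
    """Euclidean distance between two equally long log2FC profiles."""
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must have the same length")
    return sqrt(sum((a - b) ** 2 for a, b in zip(profile_a, profile_b)))


# Hypothetical log2FC profiles of two genes across three datasets.
gene_x = [2.0, 1.5, 0.5]
gene_y = [2.0, 1.5, 2.5]
print(euclidean_distance(gene_x, gene_y))
```

Genes with a small pairwise distance have similar log2FC patterns across datasets and are placed close together by hierarchical clustering.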
Figure 3. Infection with SARS-CoV-2 impacts the correlation between cytokine/chemotaxis- and neutrophil-mediated immunity genes. (A,B) Estimated correlations of cytokine signaling/chemotaxis and neutrophil-mediated immunity molecules ranging from −1 to 1 versus their corresponding first two canonical variates (x-CV1 and x-CV2 for cytokine/chemotaxis-related genes; y-CV1 and y-CV2 for neutrophil-mediated immunity genes) in (A) controls and (B) COVID-19 patients. Cytokine/chemotaxis and neutrophil-mediated immunity genes with a Spearman rank correlation of ≥0.7 are colored in green and blue, respectively, while those with a Spearman rank correlation of <0.7 are gray in both groups. (C) Scatter plots with marginal boxplots display the relationship between variables (genes). Correlation coefficient (ρ) and significance level (p-value) for each correlation are shown within each graph.
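The Spearman rank correlation and the 0.7 cut-off mentioned in this caption can be sketched in plain Python. This is an illustrative sketch only: the helper names and the expression values are hypothetical, and it does not reproduce the canonical correlation analysis itself.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: the Pearson correlation of the rank vectors.

    Raises ZeroDivisionError if either input is constant.
    """
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


# Hypothetical expression values for two genes across five samples.
gene_a = [1.2, 2.4, 3.1, 4.8, 5.0]
gene_b = [0.9, 2.0, 2.2, 6.1, 7.3]
rho = spearman_rho(gene_a, gene_b)
print(rho, rho >= 0.7)  # the caption's threshold for coloring a gene
```

Because the coefficient depends only on ranks, it captures any monotonic relationship and is insensitive to outlying expression values.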
Figure 4. Transcripts stratifying severe COVID-19 from other respiratory diseases and HLH from healthy controls. (A) Schematic overview of study design and patient classification of dataset GSE157103 reported by Overmyer et al. [26]. Created with BioRender.com. (B) Protein-protein interaction (PPI) network highlighting interactions among the 158 proteins and the 25 genes significant for severe COVID-19_ICU, while keeping their other interacting partners (n = 9921) in the middle circle. The node colour denotes Gene Ontology Biological Process terms. The left circle shows 123 proteins and 554 interactions, the upper right half circle shows 21 proteins and 29 interactions, and the lower right half circle shows 25 proteins and 65 interactions. Molecules involved in neutrophil-mediated immunity are highlighted with a blue node outline. (C) Correlation matrices of the 25 DEGs in controls and COVID-19 groups (Controls, left matrix; COVID-19_nonICU, middle matrix; and COVID-19_ICU, right matrix). The color scale bar represents the Pearson’s correlation coefficient, containing negative and positive correlations from −1 to 1, respectively. (D) Scatter plots with marginal boxplots display the relationship between the eight genes stratifying severe COVID-19. Correlation coefficient (ρ) and significance level (p-value) for each correlation are shown within each graph. (E) Principal Component Analysis (PCA) with spectral decomposition shows the stratification of COVID-19_ICU from COVID-19_nonICU and other respiratory diseases (Control_nonICU and Control_ICU). Variables with positive correlation point to the same side of the plot, in contrast to negatively correlated variables, which point to opposite sides. Confidence ellipses are shown for each group/category. Bar plots associated with the PCA represent the sample distribution across the PCA axes. (F) PCA displaying the stratification of HLH patients and healthy controls, based on the same 25 DEGs as in (E).
Figure 5. Severe COVID-19 shares a common neutrophil activation signature with other acute inflammatory states. (A) Schematic overview of the additional datasets included to evaluate the modulation of the 25 DEGs strongly associated with COVID-19_ICU. Created with BioRender.com. (B) Bubble heatmap showing the hierarchical clustering based on one minus Spearman rank correlation of the expression pattern of these 25 DEGs across different datasets. Cluster 1 comprises genes associated with neutrophil degranulation and neutrophil-mediated immunity enriched terms, while cluster 2 includes genes enriched in inflammatory response and cytokine-mediated signaling pathway gene ontology (GO) categories. The color of the circles corresponds to the up- and downregulation according to the log2 fold change (log2FC) of each DEG, while the circle size denotes the significance level of each DEG according to the adjusted p-value. HC, healthy controls; COV, seasonal coronavirus other than SARS-CoV-2.
Figure 6. Multi-layered transcriptomic analysis associates neutrophil activation signature with COVID-19 severity. (A) Schematic overview of sample cohort and classification of scRNA seq dataset obtained by Schulte-Schrepping et al. [29] and used for the following analysis. Created with BioRender.com. (B) Heatmap showing scRNA seq expression of differentially expressed genes (DEGs) associated with disease severity. Cells and cohorts (controls, mild and severe COVID-19) are indicated by different colors in the legends. (C) Box plots of scRNA seq expression demonstrating that 11 of the 21 genes identified in (B) are up-regulated when comparing severe and mild COVID-19 patients. (D) Box plots of the 11 genes stratifying COVID-19_ICU patients from COVID-19_nonICU patients obtained from the bulk-RNA seq dataset from Overmyer et al. [26]. Significant differences between groups are indicated by asterisks (* p ≤ 0.05, ** p ≤ 0.01, *** p ≤ 0.001 and **** p < 0.0001). (E) Box plots of microarray data illustrating that the disease severity association of COVID-19 detected by scRNA seq corresponds to the expression differences of these genes between HLH patients and controls obtained from the dataset published by Sumegi et al. [30]. Significant differences between groups are indicated by asterisks (* p ≤ 0.05, ** p ≤ 0.01 and *** p ≤ 0.001).
Figure 7. Random Forest prediction analysis suggests potential biomarkers for severe COVID-19. (A) Receiver operating characteristics (ROC) curve of 11 genes from COVID-19_ICU compared to COVID-19_nonICU patients with an area under the curve (AUC) of 82.4% for both groups. 1 = COVID-19_nonICU; 2 = COVID-19_ICU. (B) Stable curve showing the number of trees and the out-of-bag (OOB) error rate, with a median of 27.03%. 1 = COVID-19_nonICU; 2 = COVID-19_ICU. (C) Variable importance scores plot based on Gini decrease and number (no) of nodes for each variable, showing which variables are more likely to be essential in the random forest’s prediction. (D) Ranking of the top 10 variables according to mean minimal depth (vertical bar with the mean value in it) calculated using trees. The blue color gradient reveals the min and max minimal depth for each variable. The range of the x-axis is from zero to the maximum number of trees for the feature. (E) Mean minimal depth variable interaction plot showing the most frequently occurring interactions between the variables on the left side in light blue, and the least frequently occurring interactions on the right side of the graph in dark blue. The red horizontal line indicates the smallest mean minimal depth and the black lollipop represents the unconditional mean minimal depth of a variable.
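An AUC such as the one in panel A can be read as the probability that a randomly chosen positive case (here, COVID-19_ICU) receives a higher classifier score than a randomly chosen negative case (COVID-19_nonICU), with ties counted as one half. The sketch below computes that quantity directly from pairwise comparisons. It is a minimal illustration with hypothetical scores and a hypothetical function name, not the authors' random forest pipeline.

```python
def auc_from_scores(positive_scores, negative_scores):
    """Area under the ROC curve via pairwise comparison.

    Counts the fraction of (positive, negative) pairs in which the
    positive case scores higher; tied pairs contribute one half.
    """
    if not positive_scores or not negative_scores:
        raise ValueError("both groups need at least one score")
    wins = 0.0
    for p in positive_scores:
        for q in negative_scores:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))


# Hypothetical predicted probabilities of ICU admission.
icu_scores = [0.9, 0.8, 0.4]
non_icu_scores = [0.5, 0.3, 0.2]
print(auc_from_scores(icu_scores, non_icu_scores))
```

The pairwise form is quadratic in the number of samples, which is acceptable for cohorts of this size; a rank-based formula gives the same value more efficiently on large datasets.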