
COVID-19-specific transcriptomic signature detectable in blood across multiple cohorts.

Tommi Välikangas1, Sini Junttila1, Kalle T Rytkönen1,2, Anu Kukkonen-Macchi1, Tomi Suomi1, Laura L Elo1,2.   

Abstract

The coronavirus disease 2019 (COVID-19) caused by the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is spreading across the world despite vast global vaccination efforts. Consequently, many studies have looked for potential human host factors and immune mechanisms associated with the disease. However, most studies have focused on comparing COVID-19 patients to healthy controls, while fewer have elucidated the specific host factors distinguishing COVID-19 from other infections. To discover genes specifically related to COVID-19, we reanalyzed transcriptome data from nine independent cohort studies, covering multiple infections, including COVID-19, influenza, seasonal coronaviruses, and bacterial pneumonia. The identified COVID-19-specific signature consisted of 149 genes, involving many signals previously associated with the disease, such as induction of a strong immunoglobulin response and hemostasis, as well as dysregulation of cell cycle-related processes. Additionally, potential new gene candidates related to COVID-19 were discovered. To facilitate exploration of the signature with respect to disease severity, disease progression, and different cell types, we also offer an online tool for easy visualization of the selected genes across multiple datasets at both bulk and single-cell levels.
Copyright © 2022 Välikangas, Junttila, Rytkönen, Kukkonen-Macchi, Suomi and Elo.


Keywords:  COVID-19; RNA-seq; infection; severity; signature; single-cell

Year:  2022        PMID: 35991542      PMCID: PMC9388772          DOI: 10.3389/fgene.2022.929887

Source DB:  PubMed          Journal:  Front Genet        ISSN: 1664-8021            Impact factor:   4.772


Introduction

The coronavirus disease 2019 (COVID-19) caused by the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has flared into a worldwide pandemic. Despite ongoing massive vaccination efforts, the disease is still actively spreading in many parts of the world. Although a large proportion of SARS-CoV-2-infected individuals remain asymptomatic or experience only mild symptoms, an estimated 6%–15% of them develop severe symptoms (Guan et al., 2020; Wu and McGoogan, 2020). The case fatality rate is estimated at 1.2%–2.6% (Russell et al., 2020; Verity et al., 2020; Wu and McGoogan, 2020), with severe and fatal outcomes substantially more common in older patients (Verity et al., 2020; Wu and McGoogan, 2020; Yang et al., 2020). Severe forms of the disease are often associated with a hyperinflammatory state, the so-called cytokine storm, in which increased levels of many proinflammatory cytokines and lymphopenia have been observed (Huang and Pranata, 2020; Ong et al., 2020; Pedersen and Ho, 2020). Overall, the high mortality in COVID-19 is a consequence of alveolar damage and pneumonia, cardiovascular complications, and multi-organ failure (Brunetta et al., 2021). Several studies have sought to elucidate the human host factors and immune mechanisms related to COVID-19, its severity, and post-infection recovery. Beyond studies of the respiratory microenvironment, a number of bulk and single-cell RNA-sequencing (RNA-seq) studies have focused on responses observed in the periphery using whole blood (Ng et al., 2021), peripheral blood mononuclear cells (PBMCs) (Lee et al., 2020; Wilk et al., 2020; Liu et al., 2021), or isolated subsets of blood cells (Brunetta et al., 2021; Overmyer et al., 2021).
However, most of these studies have compared the transcriptome profiles of COVID-19 patients with those of healthy controls (e.g., Lee et al., 2020; Liu et al., 2021), while far fewer have compared COVID-19 signatures with those of other common respiratory infections, such as influenza (Lee et al., 2020; McClain et al., 2021; Ng et al., 2021). To establish a robust transcriptomic signature specific to COVID-19 and to deepen the understanding of the host processes related specifically to the disease, we reanalyzed transcriptome data from nine independent cohort studies (Arunachalam et al., 2020; Lee et al., 2020; Wilk et al., 2020; Brunetta et al., 2021; Combes et al., 2021; Liu et al., 2021; McClain et al., 2021; Ng et al., 2021; Overmyer et al., 2021), covering over 500 individual profiles, including patients with common respiratory infections (influenza, seasonal coronavirus, and bacterial pneumonia) together with COVID-19 patients and healthy controls. We discovered a COVID-19 specific signature that appeared systematically across the cohorts. Additionally, we explored the association of the signature genes with disease severity, disease progression, and different cell types. To facilitate investigation of the signature, we also present an online tool for easy visualization of the selected genes across multiple datasets at both bulk and single-cell levels (https://elolab.shinyapps.io/COVID19).

Results

To establish a robust COVID-19 specific transcriptomic signature, we analyzed a total of nine previously published COVID-19 bulk or single-cell RNA-seq datasets, containing a total of 511 individuals (Table 1). Two of the datasets (Lee et al., 2020; McClain et al., 2021) were used for signature identification, while the other seven datasets (Arunachalam et al., 2020; Wilk et al., 2020; Brunetta et al., 2021; Combes et al., 2021; Liu et al., 2021; Ng et al., 2021; Overmyer et al., 2021) were used for validating the signal. All of the datasets were preprocessed as similarly as possible and the reproducibility optimized test statistic (ROTS) was used to detect the COVID-19 specific signal (Seyednasrollah et al., 2016; Suomi et al., 2017). To facilitate further use of the data and the results, we also compiled all the datasets and the associated clinical and other information as an online resource for visualizing the COVID-19 specific transcriptomic signatures. The tool is freely available at https://elolab.shinyapps.io/COVID19/.
TABLE 1

Datasets used in this study.

Name | Sample type | Number of samples | Cohort | Method | Accession | Reference
McClain | Whole blood | 46 COVID-19, 59 seasonal coronavirus, 17 influenza, 20 bacterial pneumonia, 19 healthy | US | RNA-seq | GSE161731 | McClain et al. (2021)
Lee | PBMC | 11 COVID-19, 5 influenza, 4 healthy | South Korea | scRNA-seq | GSE149689 | Lee et al. (2020)
Ng | Whole blood | 7 COVID-19, 20 influenza, 6 bacterial sepsis, 20 healthy | US | RNA-seq | GSE163151 | Ng et al. (2021)
Combes | Whole blood | 21 COVID-19, 11 non-COVID, 14 healthy | US | scRNA-seq | GSE163668 | Combes et al. (2021)
Overmyer | Leukocytes | 102 COVID-19, 26 non-COVID | US | RNA-seq | GSE157103 | Overmyer et al. (2021)
Arunachalam | PBMC | 17 COVID-19, 17 healthy | US | RNA-seq | GSE152418 | Arunachalam et al. (2020)
Liu | PBMC | 33 COVID-19, 14 healthy | China | CITE-seq | GSE161918 | Liu et al. (2021)
Wilk | PBMC | 7 COVID-19, 6 healthy | US | scRNA-seq | GSE150728 | Wilk et al. (2020)
Brunetta | Monocytes | 6 COVID-19, 3 healthy | Italy | RNA-seq | GSE160351 | Brunetta et al. (2021)
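
As a quick consistency check, the per-group sample counts in Table 1 can be tallied and compared with the 511 individuals stated in the text. The sketch below is only an illustration: the dictionary layout and the function names are our own, and the counts are transcribed from Table 1.

```python
# Sample counts per dataset and group, transcribed from Table 1.
TABLE1 = {
    "McClain": {"COVID-19": 46, "seasonal coronavirus": 59, "influenza": 17,
                "bacterial pneumonia": 20, "healthy": 19},
    "Lee": {"COVID-19": 11, "influenza": 5, "healthy": 4},
    "Ng": {"COVID-19": 7, "influenza": 20, "bacterial sepsis": 6, "healthy": 20},
    "Combes": {"COVID-19": 21, "non-COVID": 11, "healthy": 14},
    "Overmyer": {"COVID-19": 102, "non-COVID": 26},
    "Arunachalam": {"COVID-19": 17, "healthy": 17},
    "Liu": {"COVID-19": 33, "healthy": 14},
    "Wilk": {"COVID-19": 7, "healthy": 6},
    "Brunetta": {"COVID-19": 6, "healthy": 3},
}


def dataset_totals(table):
    """Total number of samples in each dataset."""
    return {name: sum(groups.values()) for name, groups in table.items()}


def group_total(table, group):
    """Total number of samples of one group across all datasets."""
    return sum(groups.get(group, 0) for groups in table.values())


totals = dataset_totals(TABLE1)
grand_total = sum(totals.values())
print(grand_total)                        # 511
print(group_total(TABLE1, "COVID-19"))    # 250
```

The grand total agrees with the number of individuals reported in the Results.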

COVID-19-specific transcriptomic signature of 149 genes appears systematically across multiple cohorts

To discover a robust set of genes related specifically to COVID-19 in peripheral blood, we considered two datasets, covering COVID-19 patients, patients with other common respiratory infections, as well as healthy controls: the McClain data are a whole blood RNA-seq dataset (McClain et al., 2021), while the Lee data are a single-cell RNA-seq dataset of PBMCs (Lee et al., 2020). To focus on COVID-19 specific signals, we identified those genes that were differentially expressed between COVID-19 patients and healthy controls at a false discovery rate (FDR) of 0.05 but not between any other disease state (influenza, seasonal coronavirus, and bacterial pneumonia) and healthy controls. This analysis suggested altogether 212 COVID-19 specific genes from the two datasets (Figures 1A,B). Out of these genes, 123 were discovered from the whole blood McClain data and 95 from the PBMC Lee data, with an overlap of six findings between the datasets: IGHG1, IGHG3, IGHG4, IGLC2, CMTM5, and GP9.
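
The selection rule described above amounts to a set difference: genes significant in the COVID-19 versus healthy contrast, minus genes significant in any other disease versus healthy contrast. The following is a minimal sketch of that logic, not the authors' pipeline. The function name, the input layout, the use of "<=" at the cutoff, and the toy FDR values (including the placeholder genes GENE_A and GENE_B) are all assumptions made for illustration.

```python
FDR_CUTOFF = 0.05  # threshold stated in the text; "<=" is an assumption


def covid_specific_genes(fdr_by_contrast, covid="COVID-19", cutoff=FDR_CUTOFF):
    """Genes significant for COVID-19 vs healthy but for no other contrast.

    fdr_by_contrast maps a contrast name (disease vs healthy) to {gene: FDR}.
    """
    def significant(contrast):
        return {g for g, fdr in fdr_by_contrast[contrast].items() if fdr <= cutoff}

    covid_hits = significant(covid)
    other_hits = set()
    for contrast in fdr_by_contrast:
        if contrast != covid:
            other_hits |= significant(contrast)
    return covid_hits - other_hits


# Hypothetical FDR values, for illustration only.
toy = {
    "COVID-19": {"GP9": 0.001, "GENE_A": 0.01, "GENE_B": 0.20},
    "influenza": {"GP9": 0.60, "GENE_A": 0.03, "GENE_B": 0.04},
    "bacterial pneumonia": {"GP9": 0.30, "GENE_A": 0.50, "GENE_B": 0.90},
}
specific = covid_specific_genes(toy)
print(specific)  # {'GP9'}

# Union of the two discovery sets reported in the text: 123 + 95 - 6 shared.
n_union = 123 + 95 - 6
print(n_union)  # 212
```

In the toy input, GENE_A is excluded because it is also significant in influenza, and GENE_B because it is not significant in COVID-19. The last two lines reproduce the reported count of 212 genes from the sizes of the two discovery sets and their overlap.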
FIGURE 1

Determination of the COVID-19 specific signature. (A) Differentially expressed genes in COVID-19 and other infections compared to healthy controls in the McClain and Lee data. Genes that were differentially expressed specifically in the COVID-19 patients compared to healthy controls and not between other disease states (influenza, bacterial pneumonia, seasonal coronavirus) and healthy controls are shown with a red background. The Venn diagram on the right shows the number of shared COVID-19 specific differentially expressed genes in the McClain and Lee data. (B) Heatmaps of the expression levels of COVID-19 specific genes detected from the McClain and Lee data. (C) Fold changes of the signature genes in COVID-19 and other infections compared to healthy controls across multiple independent cohorts.

To further refine and validate our COVID-19 specific set of genes, we used six additional independent datasets, including both bulk and single-cell RNA-seq data from whole blood (Combes et al., 2021; Ng et al., 2021), PBMCs (Arunachalam et al., 2020; Wilk et al., 2020; Liu et al., 2021), or leukocytes (Overmyer et al., 2021). Of the 212 genes, altogether 149 were consistently changed across the datasets (Figure 1C; Supplementary Table S1). The majority of these genes (85%) were upregulated in COVID-19 when compared to the healthy controls.

COVID-19-specific signature is dominated by immunoglobulin-related genes

Our COVID-19 specific signature was dominated by immunoglobulin-related genes, including several immunoglobulin heavy chain variable (IGHV), immunoglobulin lambda variable (IGLV), immunoglobulin lambda constant (IGLC), immunoglobulin heavy constant gamma (IGHG), and immunoglobulin kappa variable (IGKV) region genes, among others. In particular, four of the six genes shared between the McClain and Lee data were immunoglobulin-encoding genes: IGHG1, IGHG3, IGHG4, and IGLC2, all of which were consistently upregulated in a COVID-19 specific manner across the datasets. Similarly, other immunoglobulin-related genes were consistently upregulated in COVID-19 across the datasets, such as pentraxin 3 (PTX3), which has earlier been associated with COVID-19 (Brunetta et al., 2021). The functional enrichment results among the COVID-19 specific genes were also dominated by the immunoglobulin-related signal (Table 2, Supplementary Table S2), the most enriched processes being the classical pathway of complement activation (GO:0006958, hypergeometric test, FDR < 10^−53) and immunoglobulin production (GO:0002377, FDR < 10^−23). Among the more specific terms, Fc receptor-related processes were enriched, such as the Fc-gamma receptor signaling pathway (GO:0038094, FDR < 10^−34). Overall, the detected immunoglobulin signal was strong and consistent across the datasets. While this immunoglobulin signal appeared to be specifically related to COVID-19 in the McClain data, the involved genes were not expressed highly enough in the Lee and Ng datasets, which also include other infections, to confirm this specificity.
TABLE 2

Functionally enriched gene sets among the COVID-19 specific signature. Summary gene sets at false discovery rate (FDR) of 0.05 are shown.

Summary gene set | Gene set ID | Source of gene set | FDR
Complement activation, classical pathway | GO:0006958 | GO biological processes | <10^−53
Immunoglobulin production | GO:0002377 | GO biological processes | <10^−23
RUNX1 regulates genes involved in megakaryocyte differentiation and platelet function | R-HSA-8936459 | Reactome gene sets | <10^−5
Antimicrobial humoral response | GO:0019730 | GO biological processes | <10^−4
Platelet activation | GO:0030168 | GO biological processes | <10^−4
Mitotic nuclear division | GO:0140014 | GO biological processes | <0.01
Multicellular organismal homeostasis | GO:0048871 | GO biological processes | 0.02
Positive regulation of fibroblast proliferation | GO:0048146 | GO biological processes | 0.02
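
For readers unfamiliar with the hypergeometric enrichment test mentioned above, the following sketch shows how an upper-tail p-value for the overlap between a signature and a gene set can be computed, followed by a multiple-testing adjustment. It is an illustration only: the text does not state which FDR procedure or background gene universe was used, so the Benjamini-Hochberg step, the function names, and the example numbers are assumptions.

```python
from math import comb


def hypergeom_upper_tail(k, n_drawn, n_annotated, n_background):
    """P(X >= k) when n_drawn genes are drawn from n_background genes,
    n_annotated of which belong to the gene set of interest."""
    total = comb(n_background, n_drawn)
    upper = min(n_drawn, n_annotated)
    tail = sum(
        comb(n_annotated, i) * comb(n_background - n_annotated, n_drawn - i)
        for i in range(k, upper + 1)
    )
    return tail / total


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical numbers: 10 of 149 signature genes fall in a 50-gene set,
# with an assumed background of 20,000 genes.
p = hypergeom_upper_tail(10, 149, 50, 20000)
print(p < 1e-6)  # True
```

Exact integer binomial coefficients keep the tail probability accurate even for very small p-values, which matters for enrichments as strong as those in Table 2.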

COVID-19-specific signature involves induction of hemostasis

Another strongly induced signal in the COVID-19 specific signature was related to hemostasis, including significant enrichment of platelet activation (GO:0030168, FDR < 10^−4) (Table 2, Supplementary Table S2). Two of the six genes shared between the McClain and Lee data were related to platelet function: glycoprotein IX platelet (GP9) and CKLF-like MARVEL transmembrane domain-containing protein 5 (CMTM5), both of which were consistently upregulated in a COVID-19 specific manner across the datasets. GP9 is a small membrane glycoprotein localized on human platelets and is associated with hemostasis and platelet adhesion to blood vessels in injured vascular surfaces (Resource Coordinators et al., 2017), whereas CMTM5 has been associated with platelet function in response to aspirin and is related to cardiovascular outcomes (Voora et al., 2013). Interestingly, the COVID-19 specific signature was also highly enriched with genes from the Reactome pathway “RUNX1 regulates genes involved in megakaryocyte differentiation and platelet function” (R-HSA-8936459, FDR < 10^−5), showing consistent COVID-19 specific upregulation. Megakaryocytes are large bone marrow cells responsible for the production of platelets (Choi et al., 1995), while the runt-related transcription factor 1 (RUNX1) is considered a master regulator in hematopoiesis, involved in the maturation of hematopoietic stem cells into mature blood cells (Okuda et al., 2001).

Several cell cycle-related genes are dysregulated in COVID-19

Finally, we observed enrichment of the cell cycle and mitotic division-related processes among our COVID-19 specific genes (Table 2, Supplementary Table S2). Many of the cell cycle-related genes, such as aurora kinase B (AURKB) and cyclin-dependent kinase inhibitor 1A (CDKN1A, p21), were consistently upregulated in a COVID-19 specific manner, while some were consistently downregulated, such as cyclin-dependent kinase inhibitor 1C (CDKN1C, p57). Coronaviruses in general have been shown to manipulate the cell cycle of host cells, especially the arrest of the cell cycle at specific cell cycle checkpoints (Dove et al., 2006; Simabuco et al., 2020; Su et al., 2020) and also mitotic events (Bock and Ortea, 2020). In agreement, several distinct terms related to different cell cycle phases, such as mitotic prometaphase (R-HSA-68877, FDR = 0.02), mitotic sister chromatid segregation (GO:0000070, FDR < 0.01), and mitotic nuclear division (GO:0140014, FDR < 0.01) were enriched among our COVID-19 specific signature (Supplementary Table S2).

COVID-19-specific signature is associated with disease severity

Next, we investigated the association of our COVID-19 specific signature with the severity of the disease using those six datasets (McClain, Lee, Arunachalam, Combes, Liu, and Overmyer) that had severity information available. The patients were divided into two categories based on whether they required mechanical ventilation or intensive care (severe) or not (mild) according to the original studies. All the patients in both categories were hospitalized apart from patients in the mild category in the McClain data. Many of the COVID-19 specific upregulated genes tended to have higher expression in severe disease than in milder disease, although considerable variation was observed between the datasets (Figure 2A). Likewise, many of the COVID-19 specific downregulated genes tended to have lower expression in severe disease than in milder disease (Figure 2B). The hemostasis-related genes in particular were generally and consistently upregulated in the severe cases, including the genes shared between the McClain and Lee data: GP9 and CMTM5 (Figures 2A,C). The immunoglobulin-encoding genes, on the other hand, showed considerable variation between the datasets, with the exception of IGHV4-34, which was systematically upregulated in the severe cases in all four datasets where it was detected (Figures 2A,C).
FIGURE 2

Association of the COVID-19 specific signature genes with disease severity. Heatmaps show logarithmic fold changes between the severe and mild cases across the different datasets separately for (A) upregulated and (B) downregulated COVID-19 specific genes. The significance of the Wilcoxon rank-sum test is indicated with asterisks in the heatmaps: *p < 0.05, **p < 0.01, and ***p < 0.001. (C) Representative examples of genes with consistently higher or lower expression in the severe disease, with the individual expression values scaled by the average of the mild cases for each dataset; visualizations of all signature genes are available in our online tool (https://elolab.shinyapps.io/COVID19/).

To investigate whether specific functions were related to disease severity, we explored functional enrichment among the 52 genes that were consistently upregulated in severe disease compared to milder disease across the datasets (Figure 2A). The most enriched functional terms were related to RUNX1 (“RUNX1 regulates genes involved in megakaryocyte differentiation and platelet function,” R-HSA-8936459, FDR < 10^−8), hemostasis (R-HSA-109582, FDR < 10^−6), platelet activation, signaling and aggregation (R-HSA-76002, FDR < 0.001), and blood coagulation (GO:0007596, FDR < 0.001), suggesting dysregulation related to platelet function and blood clotting in patients with more severe disease (Supplementary Table S2).
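
The severity comparison in Figure 2 combines a logarithmic fold change between severe and mild cases with a Wilcoxon rank-sum test. The sketch below illustrates both quantities for a single gene in a single dataset. It is not the authors' code: the base-2 logarithm, the use of group means, and the normal approximation without tie or continuity correction are assumptions, and all function names are our own.

```python
from math import erfc, log2, sqrt


def log2_fold_change(severe, mild):
    """log2 ratio of mean expression (severe / mild); values must be positive."""
    return log2((sum(severe) / len(severe)) / (sum(mild) / len(mild)))


def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank-sum (Mann-Whitney U) test, normal approximation.

    Ties receive average ranks; no tie or continuity correction is applied.
    Returns the U statistic of x and the approximate p-value.
    """
    pooled = sorted((v, idx) for idx, v in enumerate(list(x) + list(y)))
    ranks = [0.0] * len(pooled)
    i = 0
    while i < len(pooled):
        j = i
        while j + 1 < len(pooled) and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[pooled[k][1]] = avg_rank
        i = j + 1
    n1, n2 = len(x), len(y)
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    mu = n1 * n2 / 2
    sigma = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u1 - mu) / sigma
    return u1, erfc(abs(z) / sqrt(2))


def significance_stars(p):
    """Asterisk code used in the Figure 2 legend."""
    if p < 0.001:
        return "***"
    if p < 0.01:
        return "**"
    if p < 0.05:
        return "*"
    return ""


# Hypothetical expression values for one gene.
mild = [1.0, 2.0, 3.0]
severe = [4.0, 5.0, 6.0]
u, p_value = rank_sum_test(mild, severe)
print(log2_fold_change(severe, mild), u, significance_stars(p_value))
```

For small groups an exact test would be preferable to the normal approximation; the approximation is used here only to keep the example free of external dependencies.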

COVID-19-specific signature is not generally associated with time from symptom onset

In addition to disease severity, we investigated the association of our COVID-19 specific signature genes with time from symptom onset using those four datasets (Arunachalam, Combes, Liu, and Wilk) that had the symptom onset information available. The associations were determined using Pearson correlation between the reported number of days from the symptom onset and the measured gene expression level, scaled by the corresponding control average. Most (∼80%) of the signature genes did not show a significant correlation with the time from symptom onset (p > 0.05, Supplementary Table S3). Among the genes showing a trend (p < 0.05), most of them had larger expression changes at the early stages of the disease than at the later stages, at which the expression levels typically become closer to the healthy controls again (Figure 3A). However, two exceptions stood out: for peptidylprolyl isomerase F (PPIF) and T cell receptor delta constant (TRDC) the differences tended to be larger at the later stages (Figure 3B).
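
The association described above can be sketched as follows for a single gene: expression values of the patients are scaled by the control average, and the Pearson correlation with days from symptom onset is computed. This is an illustration under stated assumptions: scaling is taken here to mean division by the control mean (the text does not specify division versus subtraction), the function names are our own, and the numbers are hypothetical.

```python
from math import sqrt


def scale_by_control_mean(values, control_values):
    """Divide each patient value by the mean of the healthy controls."""
    control_mean = sum(control_values) / len(control_values)
    return [v / control_mean for v in values]


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def t_statistic(r, n):
    """Test statistic for H0: r = 0; compare to a t distribution with n - 2 df."""
    return r * sqrt((n - 2) / (1 - r * r))


# Hypothetical data: days from symptom onset and expression of one gene.
days = [2, 5, 9, 14, 20]
expression = [8.0, 6.5, 5.0, 4.5, 4.0]
controls = [4.0, 4.2, 3.8]

scaled = scale_by_control_mean(expression, controls)
r = pearson_r(days, scaled)
print(round(r, 2), round(t_statistic(r, len(days)), 2))
```

A negative r in this toy example corresponds to the pattern in Figure 3A, where expression moves back toward the control level over time. The p-value would be obtained from the t distribution with n − 2 degrees of freedom, which is not computed here.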
FIGURE 3

Association of COVID-19 specific signature genes with time from symptom onset. Representative examples of genes showing a significant association between expression and time from symptom onset, including (A) genes becoming closer to the healthy controls at later stages and (B) genes showing larger differences at the later stages. Gene expression levels of COVID-19 patients were scaled to healthy controls within each dataset and plotted as a function of days since symptom onset across all datasets that had the symptom onset information available (Arunachalam, Combes, Liu, and Wilk), including linear regression with 95% confidence interval. Additionally, the Pearson correlation coefficients (r) and the corresponding p-values are shown. Visualizations of all signature genes are available in our online tool (https://elolab.shinyapps.io/COVID19/).


COVID-19-specific transcriptional signal comes from multiple cell types

To further investigate the source of the COVID-19 specific expression signal, we analyzed the signature genes at the single-cell level using three single-cell RNA-seq datasets (Wilk, Lee, and Combes) with harmonized cell type annotations (Hao et al., 2021) (Figure 4A).
FIGURE 4

COVID-19 specific expression signature in single-cell RNA-seq data. (A) Uniform manifold approximation and projection (UMAP) clustering of the Wilk, Lee, and Combes single-cell RNA-seq datasets together with harmonized cell type annotations. (B) Contribution of the different cell types to the observed bulk expression of the signature genes, as measured by the relative proportions of sequencing reads assigned to the different cell subsets that are common across the datasets. Neutrophil proportion was available only in the Combes data and is shown separately. The upregulated and downregulated COVID-19 specific genes are indicated by the green and purple text, respectively. (C) Expression of selected representative genes in the single cells of the Wilk dataset; visualizations of all signature genes are available in our online tool (https://elolab.shinyapps.io/COVID19/). These include FCRL2 and TNFRSF17 as examples of responses in B cells and plasmablasts, CDKN1C in CD16 monocytes as an example of cell cycle regulation, and LCN2 as an example of a novel marker from a cell population of developing neutrophils defined by Wilk. (D) Comparison of logarithmic fold changes between COVID-19 cases and healthy controls for genes preferentially expressed in monocytes, as indicated in panel (B). The data are from seven bulk expression datasets (y-axis) and isolated monocytes (x-axis, Brunetta data). The significance of Pearson correlation is indicated with asterisks: *p < 0.05, **p < 0.01, ***p < 0.001.

In general, there were more platelets in the COVID-19 cases than in the healthy controls (Supplementary Table S4). In particular, in the whole-blood Combes data, ∼17% of the cells in COVID-19 cases were platelets, while the proportion was on average 9% in the healthy controls (Wilcoxon test p < 0.05). Similarly, there were significantly more neutrophils in the COVID-19 cases than in the controls in the Combes data (49% vs. 21%, p < 0.0001).
When considering only the PBMC cell types, the COVID-19 cases had more plasmablasts in the Wilk and Combes data (on average 9% and 2%, respectively) than the controls (0.5% or below in both datasets, p < 0.01), whereas in the Lee data, both groups had less than 0.5% of the cells classified as plasmablasts. There was also a systematic tendency to have larger proportions of CD14 monocytes in COVID-19 (on average 30%–38%) than in controls (on average 15%–23%, p < 0.001 in Combes, p < 0.1 in Wilk, p = 0.2 in Lee data, Supplementary Table S4).

Next, we investigated which cell types contributed most to the observed bulk expression of the signature genes (Figures 4B,C). This was done by assigning the sequencing reads to different cell subsets and determining their relative proportions. Among the COVID-19 specific signature genes, ∼30% were preferentially expressed in platelets (Figure 4B). All of them were upregulated in the bulk datasets, likely reflecting the increased proportion of platelets in COVID-19 compared to controls. A total of ∼40% of the signature genes were preferentially expressed in B cells or plasmablasts, with most genes expressed in both cell types. These were essentially the immunoglobulin-related genes, which were upregulated in COVID-19 compared to the controls. In total, ∼20% of the signature genes were preferentially expressed in T cells or NK cells; however, many of these genes were also expressed in several other cell types. In addition, ∼10% of the COVID-19 specific signature genes were preferentially expressed in CD14 monocytes and ∼1% in CD16 monocytes, including both upregulated and downregulated genes; many of these genes were also expressed in neutrophils in the whole-blood Combes data.

Finally, we confirmed the observed monocyte-related COVID-19 specific signal by comparing the bulk expression levels of the preferentially monocyte-expressed genes to those observed in isolated monocytes (Brunetta data). Indeed, the bulk expression changes of the monocyte-specific genes were well in line with the changes calculated from the Brunetta monocyte data, with consistent upregulation and downregulation (Figure 4D) (Brunetta et al., 2021), confirming the relevance of the detected signature and potential differences in the monocytes between COVID-19 patients and controls.
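
The cell type contribution analysis described above (assigning sequencing reads to cell subsets and determining their relative proportions) can be illustrated for a single gene as follows. This is a rough sketch: the read counts are hypothetical, the function names are our own, and treating the cell type with the largest read share as the "preferred" one is an assumption, since the exact criterion is not given in this section.

```python
def read_proportions(reads_by_cell_type):
    """Share of a gene's reads contributed by each cell type."""
    total = sum(reads_by_cell_type.values())
    return {cell_type: n / total for cell_type, n in reads_by_cell_type.items()}


def preferred_cell_type(reads_by_cell_type):
    """Cell type contributing the largest share of reads (assumed criterion)."""
    proportions = read_proportions(reads_by_cell_type)
    return max(proportions, key=proportions.get)


# Hypothetical read counts for one signature gene, summed over cells per type.
gene_reads = {
    "Platelet": 700,
    "CD14 monocyte": 150,
    "B cell": 50,
    "T cell": 60,
    "NK cell": 40,
}
proportions = read_proportions(gene_reads)
print(preferred_cell_type(gene_reads), proportions["Platelet"])  # Platelet 0.7
```

Because the proportions are computed from pooled reads, a cell type can dominate either through high per-cell expression or simply through being abundant, which is consistent with the interpretation of the platelet-associated genes given above.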

COVID-19-specific signature is associated with multiple known COVID-19-related drugs

Finally, we investigated the associations of the COVID-19 specific signature with known drug and chemical compound signatures using the Library of Integrated Network-based Cellular Signatures (LINCS) database (Koleti et al., 2017; Keenan et al., 2018). Only the most relevant drugs and chemical compounds were considered, requiring both statistical significance (p < 0.01) and a sufficiently large effect size as measured by concordance (concordance value < −0.35). Interestingly, several of the top associated drugs and compounds showing negative concordance have earlier been suggested to be useful against COVID-19 by independent sources and by distinct mechanisms (Table 3, Supplementary Table S5). For instance, the anti-inflammatory theophylline, which is used for the treatment of asthma and chronic obstructive pulmonary disease, has been associated with increased respiratory rate and oxygenation score in COVID-19 pneumonia patients (Wall et al., 2021), and its potential as a relevant candidate to treat COVID-19 patients was recently reviewed based on computational studies (Montaño et al., 2022). Cyclosporin A has been associated with decreased COVID-19 mortality (Guisado-Vasco et al., 2020) and it has been demonstrated to act as an antiviral against SARS-CoV-2 in preclinical infection models (Sauerhering et al., 2022). Fenofibrate has been suggested to enable faster recovery of COVID-19 patients compared to patients treated with standard care (Nahmias et al., 2021). The aminoglycoside antibiotic amikacin has been predicted in silico both as a potential inhibitor of the main protease of SARS-CoV-2 (Ahmed et al., 2021) and another enzyme (Elbadwi et al., 2021), as well as a potential inhibitor of the interaction between the SARS-CoV-2 spike protein S1 domain and host ACE2 receptor (Prajapat et al., 2020).
The cancer drug lapatinib has been suggested to effectively block SARS-CoV-2 replication in human pulmonary fibroblasts (MRC5 cell line) (Raymonda et al., 2020), while another cancer drug gemcitabine has been shown to block the viral protein expression in virus-infected human lung epithelial cells (Calu-3) (Jang et al., 2021) and kidney epithelial cells (Vero-E6) (Zhang et al., 2020a). The cancer drug sunitinib, on the other hand, has been reported to reduce SARS-CoV-2 infectivity (Wang et al., 2020). Captopril (DrugBank DB01197) and quinapril (DrugBank DB00881) belong to angiotensin-converting enzyme (ACE) inhibitors, which are widely used for the treatment of hypertension and have been associated with reduced risks of COVID-19 (Hippisley-Cox et al., 2020; Tepasse et al., 2022).
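
The two-threshold filter on the drug signatures (p < 0.01 and concordance below −0.35) can be written as a simple selection step. The sketch below is illustrative only: the record layout and function name are our own, three rows are taken from Table 3, and the two entries named compound_X and compound_Y are hypothetical and included only to show what the filter rejects. Concordance values in Table 3 are rounded to two decimals, so a value printed as −0.35 may correspond to an unrounded value just below the threshold.

```python
P_CUTOFF = 0.01
CONCORDANCE_CUTOFF = -0.35


def select_signatures(records, p_cutoff=P_CUTOFF, conc_cutoff=CONCORDANCE_CUTOFF):
    """Keep negatively concordant, significant signatures, sorted by p-value."""
    kept = [
        rec for rec in records
        if rec["p"] < p_cutoff and rec["concordance"] < conc_cutoff
    ]
    return sorted(kept, key=lambda rec: rec["p"])


records = [
    {"name": "Theophylline", "concordance": -0.51, "p": 0.002},   # Table 3
    {"name": "Lapatinib", "concordance": -0.42, "p": 0.0001},     # Table 3
    {"name": "Cyclosporin A", "concordance": -0.44, "p": 0.009},  # Table 3
    {"name": "compound_X", "concordance": -0.20, "p": 0.0005},    # hypothetical
    {"name": "compound_Y", "concordance": -0.50, "p": 0.03},      # hypothetical
]

selected = [rec["name"] for rec in select_signatures(records)]
print(selected)  # ['Lapatinib', 'Theophylline', 'Cyclosporin A']
```

Requiring both conditions removes compounds that are significant but only weakly anti-correlated with the signature (compound_X) as well as those with a large effect size but insufficient statistical support (compound_Y).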
TABLE 3

Top drug and chemical compound signatures negatively associated with the COVID-19 specific signature. Signatures with p < 0.01 and negative concordance value below −0.35 are listed.

Drug/Compound | Signature ID | Source of signature | Concordance | p-value
Lapatinib | PG_2820 | Pharmacogenomics | −0.42 | 0.0001
Gemcitabine | PG_2488, PG_2404 | Pharmacogenomics | −0.38 | 0.0005
Sulfadimethoxine | DM_4847 | Drug matrix | −0.55 | 0.001
Geldanamycin | PG_2042, PG_2102 | Pharmacogenomics | −0.36 | 0.001
Sunitinib | PG_4065 | Pharmacogenomics | −0.35 | 0.001
Amikacin | DM_1692 | Drug matrix | −0.52 | 0.002
Theophylline | DM_4986 | Drug matrix | −0.51 | 0.002
Enoxacin | DM_2916 | Drug matrix | −0.51 | 0.002
Alendronic acid | DM_1630 | Drug matrix | −0.50 | 0.003
Fenofibrate | DM_3102 | Drug matrix | −0.50 | 0.003
Captopril | DM_2148 | Drug matrix | −0.49 | 0.004
Ofloxacin | DM_4225, DM_4227 | Drug matrix | −0.48 | 0.005
Stannous fluoride | DM_4809, DM_4810 | Drug matrix | −0.47 | 0.006
Quinapril | DM_4566 | Drug matrix | −0.47 | 0.006
Cyproheptadine | DM_2616 | Drug matrix | −0.46 | 0.007
Choline chloride | DM_2382 | Drug matrix | −0.45 | 0.009
Cyclosporin A | DM_2594 | Drug matrix | −0.44 | 0.009
Besides confirming the relevance of our COVID-19 specific gene signature, these results also suggest possible new drugs with potential connections to COVID-19. For instance, enoxacin is a broad-spectrum antibiotic that has recently been suggested to also have antiviral activity against various viruses by enhancing RNA interference (RNAi) as an antiviral defense mechanism (Xu et al., 2019; Scroggs et al., 2020). A recent in silico analysis suggested that the RNA genome of SARS-CoV-2 is a suitable substrate for DICER activity and that enoxacin is a promising candidate for COVID-19 treatment (Ahmadi and Moradi, 2021). Similarly, ofloxacin has been suggested to enhance RNAi activity (Zhang et al., 2008). The potential antiviral property of fluoroquinolone antibiotics (such as ofloxacin) against DNA and RNA viruses is well documented (Ikeda et al., 1987; Witvrouw et al., 1998; Dalhoff, 2015). The potential action of fluoroquinolones such as ciprofloxacin, moxifloxacin, and levofloxacin has been demonstrated for the treatment of SARS-CoV-2 associated pneumonia (Karampela and Dalamaga, 2020; Marciniec et al., 2020) and these antibiotics were also recommended to treat community-acquired pneumonia in COVID-19 patients (Metlay and Waterer, 2020). Cyproheptadine (DrugBank DB00434) is a serotonin antagonist. Interestingly, a recent study of in vivo platelet activation reported a significant COVID-19 specific increase in plasma serotonin levels compared to healthy controls and patients with acute respiratory distress syndrome without COVID-19 (Zaid et al., 2021). In a case study, COVID-19 patients whose symptoms resembled serotonin syndrome were treated with cyproheptadine (Keith et al., 2021).

COVID-19 shares several transcriptomic changes with other viral and bacterial infections

While our focus was on host genes related specifically to COVID-19 in peripheral blood, we also investigated functional enrichment of pathways and processes among genes similarly regulated in COVID-19 and other viral or bacterial infections (influenza, seasonal coronavirus, and bacterial pneumonia). Altogether, 246 genes shared differential regulation between all these disease states versus the healthy controls in the McClain data (Figure 1A), the majority of which (89%) were downregulated. Functional enrichment analysis of the shared differentially regulated genes identified the KEGG ribosome pathway as the most distinctively enriched (hsa03010, FDR < 10^-63, Supplementary Table S2), with almost all of the genes downregulated in the different disease states, suggesting an overall downregulation of the ribosome pathway. Similarly, several other ribosome-related functionalities were enriched, such as the TRBP-containing protein complex involved in microRNA-mediated silencing (CORUM:5380, FDR < 10^-10), ribonucleoprotein complex subunit organization (GO:0071826, FDR < 10^-9), and the mitochondrial 55S ribosome (CORUM:320, FDR < 10^-5).
Furthermore, investigation of the shared differentially regulated genes in COVID-19 and any other disease state (grey area in Figure 1A Venn diagrams), excluding the effect of the 246 common genes regulated across all the disease states, resulted in 180 genes, of which 59% were downregulated. Functional enrichment analysis of these genes did not reveal any strong enrichment (Supplementary Table S2), suggesting variable functions among the host genes. This is in sharp contrast to the COVID-19 specific signature or the genes commonly regulated between all the disease states, for which striking enrichments were discovered. The genes were included in various processes, such as those related to cell death, immunoglobulin production, interferon response, and hemostasis, but none of them remained statistically significant after multiple hypothesis correction.
For comparison, processes active specifically in bacterial pneumonia (top 500 most differentially expressed genes in patients with bacterial pneumonia compared to the healthy controls and not detected in any other disease state) were related to the regulation of leukocyte activation (GO:0002694, FDR < 10^-15), lymphocyte activation (GO:0046649, FDR < 10^-15), T helper cell 17 (Th17) differentiation (hsa04659, FDR < 10^-15), neutrophil degranulation (R-HSA-6798695, FDR < 10^-10), and T cell selection (GO:0045058, FDR < 10^-7), suggesting a larger involvement of adaptive immune responses in the respiratory infections related to bacterial pneumonia.

Discussion

Using altogether nine independent transcriptomic datasets from diverse cohort studies and various types of blood samples, we discovered a signature of 149 genes consistently and specifically related to COVID-19, providing a comprehensive view of the specific disease-related host processes. The identified COVID-19 specific signature confirmed many processes previously associated with the disease in multiple studies, including induction of the immunoglobulin and hemostasis signals, as well as dysregulation of the cell cycle. Moreover, many specific genes previously associated with COVID-19 showed consistent dysregulation across multiple datasets, supporting their relevance in the disease. In addition to genes previously associated with COVID-19, we also identified multiple genes that have not yet been widely studied in the context of COVID-19. These included, for instance, COVID-19 specific upregulation of tumor necrosis factor receptor superfamily members 13B and 17 (TNFRSF13B and TNFRSF17), predominantly found in B cells and involved in immune responses; upregulation of the regulator of G protein signaling 1 (RGS1), which has previously been linked to multiple immune-mediated diseases such as celiac disease, type 1 diabetes, and multiple sclerosis (Smyth et al., 2008; International Multiple Sclerosis Genetics Consortium et al., 2011); upregulation of the G protein-coupled receptor 84 (GPR84), which is a pro-inflammatory receptor that has previously been associated with inflammatory bowel disease (Planell et al., 2017); and downregulation of Fc receptor-like 2 (FCRL2), which encodes a member of the immunoglobulin receptor superfamily. One of the strongest COVID-19 specific signals observed in this study was related to specific immunoglobulin genes, which were consistently upregulated in COVID-19 patients compared to healthy controls across multiple datasets. 
Furthermore, the discovered immunoglobulin genes were not similarly upregulated in other infections in the McClain data, suggesting a possible COVID-19 specific upregulation for the particular combination of the discovered immunoglobulin genes. Our findings were in agreement with those of McClain et al. (2021), who observed the immunoglobulin pathways and specific immunoglobulin-related genes as upregulated in COVID-19 patients when compared to other infections and healthy controls. Investigation of the immunoglobulin-related genes at the single-cell level suggested that the observed bulk signal came from B cells and plasmablasts (class-switched B cells), possibly reflecting a markedly strong induction of B cell differentiation to antibody-producing plasmablasts in COVID-19 patients. Immunoglobulin-related genes have been recently reported to be similarly upregulated in COVID-19 when compared to healthy patients and patients with active influenza infection (Bibert et al., 2021). The B cell-mediated humoral immune response plays a critical role in preventing and neutralizing COVID-19 infection and partly depends on the somatic recombination and differential usage of the immunoglobulin genes in producing a diverse repertoire of B cell receptors and associated antibodies (He et al., 2021). It is conceivable that a COVID-19 infection induces a strong immunoglobulin signal involving a distinct combination of immunoglobulin-related genes. However, even though suggestive, the extent to which the discovered immunoglobulin signal in this study is specific to COVID-19 and not other respiratory infections requires further confirmation from future studies.
The elevated expression of the immunoglobulin-related genes was observed in both severe and mild cases. While Overmyer et al. (2021) observed upregulation of many immunoglobulin genes in patients with severe COVID-19 when compared to those with mild disease, we did not observe systematic associations between immunoglobulin-related genes and disease severity across the datasets. The only exception was IGHV4-34, which was systematically upregulated in the severe cases in all three datasets where it was detected. Curiously, IGHV4-34 has an inherent ability to encode autoreactive antibodies; IGHV4-34 antibodies represent a major proportion of serum antibodies, especially in systemic lupus erythematosus (SLE), and they are associated with the disease severity, while they are underrepresented in the serum of healthy adults (van Vollenhoven et al., 1999; Pugh-Bernard et al., 2001). In line with our study, increased usage of IGHV4-34 has also been observed in COVID-19 compared to healthy controls (Galson et al., 2020). It would be interesting to study whether autoreactivity contributes to the development of long COVID, with symptoms often resembling those observed in autoimmune diseases (Galson et al., 2020; Khamsi, 2021).
Immunoglobulin antibodies against the SARS-CoV-2 spike protein antigens have been shown to develop rapidly in individuals infected with the virus (Secchi et al., 2020). Intravenous immunoglobulin injections from healthy donors or recovering patients (Nabih, 2021) have been used to treat COVID-19 patients with promising results (Herth et al., 2020; Cao et al., 2021), with the rationale to suppress the hyperactive immune responses seen in patients with severe disease (Tzilas et al., 2020). Interestingly, PTX3 has been suggested as a biomarker for the unresponsiveness to intravenous immunoglobulin treatment of patients with Kawasaki disease, which causes inflammation of blood vessels throughout the body (PTX3 and PREDICTS INTRAVENOUS IMMUNOGLOBULIN UNRESPONSIVENESS IN PATIENTS WITH KAWASAKI DISEASE, 2011; Kitoh et al., 2021). 
PTX3 is involved in humoral innate immunity and regulation of inflammation, including neutrophil recruitment and complement cascade regulation (Deban et al., 2008; Deban et al., 2010). Uncontrolled complement activation has been associated with severe COVID-19 (Risitano et al., 2020). Recently, PTX3 was identified as a predictor of 28-day mortality of hospitalized COVID-19 patients, with increased PTX3 hypothesized to reflect the failure to regulate uncontrolled inflammation (Brunetta et al., 2021). Several previous studies have shown that the interferon response is elevated in COVID-19 when compared to healthy controls, but it appears to be less strongly induced in COVID-19 compared to other infections such as influenza (Lee et al., 2020; Wilk et al., 2020; Liu et al., 2021; McClain et al., 2021; Ng et al., 2021), especially in the severe disease (Combes et al., 2021; Liu et al., 2021). We did not observe an enrichment of genes directly related to the interferon response among our COVID-19 specific gene signature. Only one gene directly related to the interferon response was detected in our COVID-19 specific gene signature: interferon-alpha inducible protein 27 (IFI27). IFI27 was consistently up-regulated in COVID-19 when compared to the healthy controls, but the comparisons against other infections and in relation to disease severity varied depending on the dataset, perhaps reflecting inconsistent interferon response observed in previous studies. IFI27 was also strongly associated with the time from the symptom onset, with the initial high expression decreasing close to control levels relatively quickly. Although the interferon signal among our COVID-19 specific genes was mostly absent, some upregulated chemokines were identified as COVID-19 specific. 
These included the C-X-C motif chemokine ligand 3 (CXCL3), which is involved in the migration and adhesion of monocytes (Smith et al., 2005), and the platelet factor 4 (PF4), which is also known as the C-X-C motif chemokine ligand 4 (CXCL4) and is a chemotaxis inducer for neutrophils, monocytes, and fibroblasts (Eisman et al., 1990). Chemokines have been suggested to be deeply involved in COVID-19 and even the main cause of the acute respiratory syndrome and cytokine storm associated with the most severe forms of the disease (Majumdar and Murphy, 2020; Coperchini et al., 2021; Khalil et al., 2021). Furthermore, neutrophilia (i.e., a high number of circulating neutrophils) (Coperchini et al., 2021; Ng et al., 2021), neutrophil degranulation (Ng et al., 2021; Overmyer et al., 2021), and high chemokine expression (Ng et al., 2021; Overmyer et al., 2021) have previously been associated with severe COVID-19. Curiously, Wilk et al. (2020) observed a novel cell population of developing neutrophils in COVID-19. These neutrophils appeared similar to plasmablasts and neutrophil progenitors in their gene expression and were suspected to be possibly derived from plasmablasts or through emergency granulopoiesis (Wilk et al., 2020). Interestingly, LCN2 was identified to be upregulated in a COVID-19 specific manner in our analysis across the datasets and was preferentially expressed in that cell population. Recently, Meizlish et al. (2021) suggested LCN2 as a discriminator of critical illness in COVID-19, being highly enriched in neutrophil precursors in circulation. Earlier studies have implicated LCN2 to deactivate macrophages, worsening the inflammatory response and negatively affecting the outcome of pneumococcal pneumonia (Warszawska et al., 2013). The identified COVID-19 specific signature involved a clear upregulation of hemostasis signal when compared to healthy controls. 
The signal was stronger in severe disease, with many of the hemostasis-related genes consistently upregulated when comparing severe cases to milder disease. This was well in line with previous studies. Several studies have shown significant increases in platelet activation, platelet reactivity, and platelet-leukocyte aggregates in COVID-19 compared to healthy blood donors (Hottz et al., 2020; Manne et al., 2020; Comer et al., 2021). Platelet activation has also been reported to correlate with COVID-19 severity (Hottz et al., 2020; Comer et al., 2021) and platelet hyperreactivity has been suggested as a primary driver of thrombosis contributing to organ failure and death in the severe disease (Zaid et al., 2021). Notably, a recent study reported that platelet activation was significantly higher in COVID-19 patients compared to patients with acute respiratory distress syndrome without COVID-19 (Zaid et al., 2021), supporting the COVID-19 specificity of the signal. In association with hemostasis and platelet function, we also observed a highly enriched Reactome pathway “RUNX1 regulates genes involved in megakaryocyte differentiation and platelet function” (Matthews et al., 2009) in our COVID-19 specific signature. The related genes were consistently upregulated in COVID-19 patients when compared to healthy controls and other infections. RUNX1 is a transcription factor and a master regulator that is involved in the maturation of hematopoietic stem cells into mature cells, and it has been previously related to angiogenesis and fibrosis (O’Hare et al., 2021). Accumulated clonal mutations in hematopoietic stem cells have been associated with an increased risk of severe COVID-19 (Bolton et al., 2020). 
Moreover, inhibition of RUNX1 has been shown to enhance symptoms of lung fibrosis in a mouse model (O’Hare et al., 2021), while overexpression of RUNX1 has been observed in the lungs of severe COVID-19 patients who died of the disease, with widespread thrombosis and microangiopathy and related vascular angiogenesis much more prevalent in COVID-19 than in influenza (Ackermann et al., 2020). In line with this, the RUNX1-related genes of our COVID-19 signature were mostly detected as upregulated in the severe form of the disease when compared to the milder disease. Neutrophils are known to interact extensively with platelets during inflammatory conditions, and they modulate each other’s functions (Lisman, 2018; Ramirez et al., 2019; Zucoloto and Jenne, 2019). Among such interactions, platelets have been shown to induce the formation of neutrophil extracellular traps, which are known to kill pathogens but also be involved in thrombin activation and coagulation initiation, which has been associated with hypercoagulability in vascular disorders (Zucoloto and Jenne, 2019). Interestingly, COVID-19 has been strongly associated with a hypercoagulative phenotype (Becker, 2020; Goshua et al., 2020), and thrombosis (Ackermann et al., 2020; McClain et al., 2021), and widespread microangiopathy in the lungs (Ackermann et al., 2020). Furthermore, while platelets are well known to be associated with hemostasis (Lisman, 2018; Zucoloto and Jenne, 2019), the myriad of interactions between the two cell types also suggests the involvement of neutrophils in hemostasis and blood coagulation during inflammatory conditions (Lisman, 2018; Ramirez et al., 2019; Zucoloto and Jenne, 2019). Another clear signal detected among our COVID-19 specific genes was related to cell cycle and mitotic division control. 
Coronaviruses in general have been shown to manipulate the cell cycle of the host cells, especially the arrest of the cell cycle at specific cell cycle checkpoints (Dove et al., 2006; Simabuco et al., 2020; Su et al., 2020) and also mitotic events in COVID-19 (Bock and Ortea, 2020). Furthermore, several mRNA molecules related to the cell cycle and mitotic processes have been observed to be upregulated in response to COVID-19 infection (Bouhaddou et al., 2020). For instance, CDKN1A has been observed to have significantly higher expression in COVID-19 patients than in healthy controls (Bordoni et al., 2021), which is in agreement with our results of COVID-19 specific CDKN1A upregulation. The protein product of CDKN1A, p21, is an essential mediator of p53-dependent cell arrest (Bordoni et al., 2021). While multiple genes in our COVID-19 specific signature were associated with disease severity, their association with time from symptom onset was less evident, as could be expected. Among the genes showing association, the expression changes typically tended to become closer to the healthy controls over time, with two outstanding exceptions: TRDC and PPIF. TRDC encodes the constant region of the T cell receptor delta chain, and it is considered a marker of gamma-delta T cells. In line with our finding that the expression of TRDC decreased in the blood of COVID-19 patients, previous studies have suggested decreased proportions of gamma-delta T cells in the blood of hospitalized COVID-19 patients compared to healthy controls (Wilk et al., 2020). The decrease has been associated with their recruitment to airway tissues (Caron et al., 2021) and disease severity (Zhang et al., 2020b). PPIF, whose expression increased in the blood of COVID-19 patients, is a cyclophilin that is an essential component of the mitochondrial permeability transition pore. 
The opening of the pore has been implicated in the pathophysiology of multiple diseases, such as muscular dystrophies, ischemia-reperfusion injury, and various neurological diseases, while inhibition of PPIF has been suggested as a therapeutic strategy to delay it (Briston et al., 2019). Intriguingly, one of our identified top drug candidates was cyclosporin A, which is indeed a cyclophilin inhibitor. Cyclosporin A is widely used to prevent organ rejection after transplantation, but it has recently been shown to have substantial antiviral activity against SARS-CoV-2, and preliminary clinical trials on COVID-19 patients have reported a lower incidence of death among the cyclosporin A treated patients, as recently reviewed by Devaux et al. (2021).
Overall, an investigation of the associations of the COVID-19 specific transcriptomic signature with known drug and chemical compound signatures identified several drugs and chemical compounds with known relations to COVID-19, providing further support for the relevance of our COVID-19 specific signature. Additionally, this provided opportunities to gain further insights into possible new drug relationships with the disease, although further investigations of the findings are needed to establish their potential in COVID-19 treatment.
In addition to the newly discovered COVID-19 specific gene signature, we explored genes similarly regulated in COVID-19 and other viral or bacterial infections when compared to healthy controls. The majority of these genes were down-regulated and they were highly enriched in ribosome-related processes. 
Consistent with our results, different viruses such as influenza (Bercovich-Kinori et al., 2016), HIV-1 (Kleinman et al., 2014), vaccinia (Dai et al., 2017), and SARS-CoV-2 (Banerjee et al., 2020; Hsu et al., 2021) have been associated with a global inhibition of host mRNA translation upon infection, which might represent a common strategy employed by several viruses to shut down native host protein synthesis (Bercovich-Kinori et al., 2016; Hsu et al., 2021). Similarly, the bacterial agent Legionella pneumophila, causative of pneumonia in humans, has been observed to target and inhibit host mRNA translation and protein synthesis (Belyi, 2020).
Taken together, our results offer a rich resource to comprehensively investigate the COVID-19 specific host responses in circulating blood, providing support for many signals previously associated with the disease and a solid foundation for future research into the specific mechanisms related to COVID-19. To facilitate such exploration, we also offer a web-based software platform enabling information-rich visualization of the transcriptomic profiles across multiple datasets at both bulk and single-cell levels. The expression of specific genes can be compared between COVID-19 patients and healthy controls, as well as patients with other infections. Gene expression can also be investigated in relation to many relevant attributes, such as age, sex, disease severity, disease progression, and cell-type specificity. The software platform is freely available at https://elolab.shinyapps.io/COVID19/.

Methods

Transcriptomic datasets and their preprocessing

The transcriptomic datasets used in this study (Table 1) were downloaded from the Gene Expression Omnibus (GEO) as raw count matrices, except for the monocyte Brunetta data, for which only the preprocessed data were available, and the single-cell Wilk data, which were downloaded from the COVID-19 Cell Atlas (https://www.covid19cellatlas.org) as preprocessed data. The single-cell Lee and Combes datasets were processed using Seurat (v.4.0.1) in R as in the original publications (Lee et al., 2020; Combes et al., 2021). Cell type annotations for all the single-cell datasets were performed using the Azimuth tool with the human PBMC reference (Hao et al., 2021). Since the PBMC reference did not include neutrophils, the neutrophil annotations for the whole-blood Combes data were retrieved from the original study. To identify gene signals comparable to those from the bulk RNA-seq datasets, the raw gene-wise count values from all cells belonging to a sample were aggregated using the R package muscat (v.1.4.0) (Crowell et al., 2020), resulting in a pseudobulk expression matrix with genes as rows and samples as columns.
All bulk and pseudobulk RNA-seq datasets were preprocessed as similarly as possible from the raw count matrices. First, lowly expressed genes were filtered out, retaining only genes that had a counts per million (CPM) value above the threshold in at least as many samples as the size of the smallest experimental group in the data. The threshold was determined for each dataset as the CPM value corresponding to a read count of ten in the sample with the smallest library size. The data were normalized using the trimmed mean of M-values (TMM) method from the Bioconductor package edgeR (v.3.26.8). For the analysis, we used log2-transformed CPM values with an offset of 1. In the Lee data, one influenza sample (“Flu 5”) was removed as an outlier. In the Combes data, nine samples (one healthy control, three COVID-19 positives, and five COVID-19 negatives) were excluded because they contained fewer than 1,000 cells.
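
The aggregation, filtering, and transformation steps described above can be sketched as follows. This is a minimal illustration in Python and not the authors' R pipeline: the function names `pseudobulk` and `filter_and_log_cpm` and the input layout are hypothetical, and the TMM normalization performed with edgeR is deliberately left out, so plain library sizes stand in for TMM-adjusted ones.

```python
import math

def pseudobulk(cell_counts, cell_to_sample):
    """Sum raw counts over all cells belonging to each sample.

    cell_counts: {cell_id: {gene: raw count}}
    cell_to_sample: {cell_id: sample_id}
    Returns {sample_id: {gene: summed count}}.
    """
    out = {}
    for cell, genes in cell_counts.items():
        sample = out.setdefault(cell_to_sample[cell], {})
        for gene, count in genes.items():
            sample[gene] = sample.get(gene, 0) + count
    return out

def filter_and_log_cpm(counts, groups, min_count=10):
    """Filter lowly expressed genes and return log2(CPM + 1) values.

    counts: {gene: [raw count per sample]}
    groups: experimental group label of each sample, in the same order.
    NOTE: library sizes are plain column sums; no TMM scaling is applied.
    """
    n_samples = len(groups)
    # Library size of each sample (column sum over all genes).
    lib_sizes = [sum(row[j] for row in counts.values()) for j in range(n_samples)]
    # CPM equivalent of `min_count` reads in the sample with the smallest library.
    threshold = min_count / min(lib_sizes) * 1e6
    # Size of the smallest experimental group.
    min_group = min(groups.count(g) for g in set(groups))
    kept = {}
    for gene, row in counts.items():
        cpm = [c / n * 1e6 for c, n in zip(row, lib_sizes)]
        # Keep the gene if enough samples exceed the CPM threshold.
        if sum(v > threshold for v in cpm) >= min_group:
            kept[gene] = [math.log2(v + 1) for v in cpm]
    return kept
```

In this sketch the library sizes are computed before filtering, which is an assumption; the text does not state the order used in the original analysis.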

Defining the COVID-19-specific gene signature

For defining the COVID-19 specific gene signature, we used the whole blood McClain data (McClain et al., 2021) and the PBMC Lee data (Lee et al., 2020). Only one sample per individual was considered in the analysis; if an individual had multiple samples, their median was used for each gene. The reproducibility optimized test statistic (ROTS) (v.1.12.0) (Suomi et al., 2017) was first applied to determine differentially expressed genes between the COVID-19 cases and healthy controls separately in both datasets. Genes with a false discovery rate (FDR) below 0.05 were considered differentially expressed. To focus on COVID-19 specific signals, we then identified those genes that were differentially expressed between COVID-19 patients and healthy controls but not similarly differentially expressed in the same direction between any other disease state (influenza, seasonal coronavirus, and bacterial pneumonia) and healthy controls. Finally, the union of these COVID-19 specific genes was further refined using six additional datasets: only genes that had a concordant fold change in at least four of these datasets and an opposite change in at most one dataset were included in our final COVID-19 specific signature.
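
The selection logic of this section can be written down compactly. The following Python sketch assumes that the differential expression results are already available as dictionaries mapping each significant gene to the sign of its change (+1 or -1); the function names, the input layout, and the handling of genes with conflicting directions in the two discovery datasets (they are dropped here) are assumptions made for illustration, not details taken from the study.

```python
def covid_specific(de_by_disease, target="COVID-19"):
    """Genes significant in `target` vs. healthy controls that are not
    significant in the same direction in any other disease state.

    de_by_disease: {disease: {gene: +1 or -1}} for significant genes only.
    """
    specific = {}
    for gene, sign in de_by_disease[target].items():
        shared = any(other.get(gene) == sign
                     for name, other in de_by_disease.items() if name != target)
        if not shared:
            specific[gene] = sign
    return specific

def union_signatures(*signatures):
    """Union of candidate genes; genes with conflicting signs are dropped
    (an assumption of this sketch)."""
    merged, conflicting = {}, set()
    for signature in signatures:
        for gene, sign in signature.items():
            if gene in merged and merged[gene] != sign:
                conflicting.add(gene)
            merged.setdefault(gene, sign)
    return {g: s for g, s in merged.items() if g not in conflicting}

def refine(candidates, validation_lfc, min_concordant=4, max_opposite=1):
    """Keep genes whose fold change agrees with the candidate direction in at
    least `min_concordant` validation datasets and disagrees in at most
    `max_opposite`.

    validation_lfc: list of {gene: log2 fold change}, one dict per dataset.
    """
    kept = {}
    for gene, sign in candidates.items():
        values = [lfc[gene] for lfc in validation_lfc if gene in lfc]
        concordant = sum(1 for v in values if v * sign > 0)
        opposite = sum(1 for v in values if v * sign < 0)
        if concordant >= min_concordant and opposite <= max_opposite:
            kept[gene] = sign
    return kept
```

A gene that is not measured in a validation dataset counts as neither concordant nor opposite in this sketch.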

Association of the COVID-19-specific gene signature with disease severity and time from symptom onset

For the exploration of disease severity, we used those six datasets (McClain, Lee, Arunachalam, Combes, Liu, Overmyer) that had severity information available. The patients were divided into two categories based on whether they required mechanical ventilation or intensive care unit care (severe) or not (mild) according to the original studies. All the patients in both categories were hospitalized apart from patients in the mild category in the McClain data. To determine the significance of differences between the severe and mild cases, the Wilcoxon rank-sum test was used. For functional enrichment analysis, genes consistently upregulated in severe disease in at least four datasets were considered. For determining the association of the COVID-19 specific genes with time from symptom onset, we considered those four datasets (Arunachalam, Combes, Liu, Wilk) that had the symptom onset information available. The associations were determined using Pearson correlation between the reported number of days from the symptom onset and the measured gene expression level, scaled by the corresponding control average.
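
The two statistics named above can be illustrated with a short, dependency-free Python sketch. Several choices here are assumptions made for illustration only: the rank-sum test uses the normal approximation without tie or continuity correction, the "scaling by the control average" is implemented as subtracting the control mean from log-scale expression, and all function names are hypothetical. Note that the Pearson correlation is unchanged by adding a constant or multiplying by a positive constant, so this particular choice of scaling does not affect the coefficient.

```python
import math

def midranks(values):
    """1-based ranks with ties replaced by their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1  # mean of ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = average
        i = j + 1
    return ranks

def rank_sum_test(severe, mild):
    """Two-sided Wilcoxon rank-sum test (normal approximation).
    Returns (z, p). No tie correction is applied to the variance."""
    n1, n2 = len(severe), len(mild)
    ranks = midranks(list(severe) + list(mild))
    w = sum(ranks[:n1])                       # rank sum of the first group
    mean = n1 * (n1 + n2 + 1) / 2
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (w - mean) / sd
    p = math.erfc(abs(z) / math.sqrt(2))      # 2 * (1 - Phi(|z|))
    return z, p

def pearson(a, b):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)

def onset_association(days, expression, control_expression):
    """Correlation between days from symptom onset and expression relative
    to the control average (assumed here to be a difference on log scale)."""
    control_mean = sum(control_expression) / len(control_expression)
    scaled = [e - control_mean for e in expression]
    return pearson(days, scaled)
```

For small groups an exact rank-sum p-value would normally be preferred over the normal approximation used in this sketch.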

Functional enrichment and associations with known drugs and chemical perturbations

To explore functional enrichment in the detected COVID-19 specific gene signature, we used the Metascape platform (Zhou et al., 2019). The following ontology sources were considered: KEGG Pathways, GO Biological Processes, Reactome Gene Sets, Canonical Pathways, and WikiPathways. All genes in the genome were used as the background. Metascape calculates the statistical significance of enrichment using the hypergeometric distribution and adjusts for multiple testing using the Benjamini-Hochberg method. Terms with a p-value < 0.01, a minimum of three genes, and an enrichment of at least 1.5 are further grouped into clusters using hierarchical clustering with Kappa scores as the similarity measure. Sub-trees with a similarity of >0.3 were considered a cluster, and the most significant term within a cluster was used to represent the cluster. For investigating the associations of the COVID-19 specific signature genes with known drugs and chemical compound signatures, we used the Library of Integrated Network-based Cellular Signatures (LINCS) database and the associated integrative web-based platform (iLINCS) (Koleti et al., 2017; Keenan et al., 2018). The upregulated and downregulated COVID-19 specific genes were used as the query signatures and the DrugMatrix and pharmacogenomics transcriptional signatures as the iLINCS signatures. Concordance was determined on the basis of the correlation between the query signature and the iLINCS signatures. Signatures with p < 0.01 and a negative concordance value below −0.35 were considered.
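
The enrichment statistics mentioned above (hypergeometric test, Benjamini-Hochberg adjustment, and the term filters) can be illustrated with the following Python sketch. It is not Metascape's code: the function names and argument layout are hypothetical, and only the general formulas are shown.

```python
from math import comb

def hypergeom_pvalue(k, n_term, n_query, n_background):
    """P(X >= k) for X ~ Hypergeometric: probability of seeing at least k
    term genes in a query of n_query genes drawn from n_background genes,
    of which n_term belong to the term."""
    upper = min(n_term, n_query)
    favourable = sum(comb(n_term, i) * comb(n_background - n_term, n_query - i)
                     for i in range(k, upper + 1))
    return favourable / comb(n_background, n_query)

def enrichment_factor(k, n_term, n_query, n_background):
    """Observed fraction of term genes in the query over the expected one."""
    return (k / n_query) / (n_term / n_background)

def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

def passes_filters(k, n_term, n_query, n_background,
                   max_p=0.01, min_genes=3, min_enrichment=1.5):
    """Term filters quoted in the text: p < 0.01, at least three genes,
    and an enrichment of at least 1.5."""
    return (hypergeom_pvalue(k, n_term, n_query, n_background) < max_p
            and k >= min_genes
            and enrichment_factor(k, n_term, n_query, n_background) >= min_enrichment)
```

The subsequent clustering of terms by Kappa similarity is not covered by this sketch.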

Cell type proportions and cell type-specific expression

The Wilcoxon rank-sum test was used to calculate the statistical significance of differences in the cell type proportions between the COVID-19 patients and healthy controls in the single-cell datasets. To investigate which cell types contributed most to the observed bulk expression of the signature genes, the sequencing reads were assigned to different cell types and their relative proportions across the cell types were calculated.
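
As a rough illustration of the second step, the share of a gene's reads contributed by each cell type can be computed by summing the per-cell counts within each annotated cell type and dividing by the total. The Python sketch below assumes per-cell counts for a single gene and a cell type label per cell; the function name and input layout are hypothetical and not taken from the study.

```python
from collections import defaultdict

def cell_type_shares(cell_counts, cell_types):
    """Relative proportion of one gene's reads coming from each cell type.

    cell_counts: {cell_id: read count of the gene in that cell}
    cell_types: {cell_id: annotated cell type}
    Returns {cell_type: fraction of reads}, or {} if the gene has no reads.
    """
    totals = defaultdict(int)
    for cell, count in cell_counts.items():
        totals[cell_types[cell]] += count
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {cell_type: total / grand_total for cell_type, total in totals.items()}
```

Because the shares depend on both per-cell expression and the number of cells of each type, an abundant cell type with low expression can contribute as much as a rare cell type with high expression.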

References (106 in total)

1.  Clinical characteristics and outcomes among hospitalized adults with severe COVID-19 admitted to a tertiary medical center and receiving antiviral, antimalarials, glucocorticoids, or immunomodulation with tocilizumab or cyclosporine: A retrospective observational study (COQUIMA cohort).

Authors:  Pablo Guisado-Vasco; Sofia Valderas-Ortega; Maria Maravillas Carralón-González; Ana Roda-Santacruz; Lucia González-Cortijo; Gabriel Sotres-Fernández; Eva María Martí-Ballesteros; José Manuel Luque-Pinilla; Elena Almagro-Casado; Félix J La Coma-Lanuza; Ruth Barrena-Puertas; Esteban Javier Malo-Benages; María José Monforte-Gómez; Rocío Diez-Munar; Esther Merino-Lanza; Lorena Comeche-Casanova; Margarita Remirez-de-Esparza-Otero; María Correyero-Plaza; Manuel Recio-Rodríguez; Margarita Rodríguez-López; María Dolores Sánchez-Manzano; Cristina Andreu-Vázquez; Israel John Thuissard-Vasallo; José María Echave-Sustaeta María-Tomé; Daniel Carnevali-Ruiz
Journal:  EClinicalMedicine       Date:  2020-10-15

Review 2.  The Library of Integrated Network-Based Cellular Signatures NIH Program: System-Level Cataloging of Human Cells Response to Perturbations.

Authors:  Alexandra B Keenan; Sherry L Jenkins; Kathleen M Jagodnik; Simon Koplev; Edward He; Denis Torre; Zichen Wang; Anders B Dohlman; Moshe C Silverstein; Alexander Lachmann; Maxim V Kuleshov; Avi Ma'ayan; Vasileios Stathias; Raymond Terryn; Daniel Cooper; Michele Forlin; Amar Koleti; Dusica Vidovic; Caty Chung; Stephan C Schürer; Jouzas Vasiliauskas; Marcin Pilarczyk; Behrouz Shamsaei; Mehdi Fazel; Yan Ren; Wen Niu; Nicholas A Clark; Shana White; Naim Mahi; Lixia Zhang; Michal Kouril; John F Reichard; Siva Sivaganesan; Mario Medvedovic; Jaroslaw Meller; Rick J Koch; Marc R Birtwistle; Ravi Iyengar; Eric A Sobie; Evren U Azeloglu; Julia Kaye; Jeannette Osterloh; Kelly Haston; Jaslin Kalra; Steve Finkbiener; Jonathan Li; Pamela Milani; Miriam Adam; Renan Escalante-Chong; Karen Sachs; Alex Lenail; Divya Ramamoorthy; Ernest Fraenkel; Gavin Daigle; Uzma Hussain; Alyssa Coye; Jeffrey Rothstein; Dhruv Sareen; Loren Ornelas; Maria Banuelos; Berhan Mandefro; Ritchie Ho; Clive N Svendsen; Ryan G Lim; Jennifer Stocksdale; Malcolm S Casale; Terri G Thompson; Jie Wu; Leslie M Thompson; Victoria Dardov; Vidya Venkatraman; Andrea Matlock; Jennifer E Van Eyk; Jacob D Jaffe; Malvina Papanastasiou; Aravind Subramanian; Todd R Golub; Sean D Erickson; Mohammad Fallahi-Sichani; Marc Hafner; Nathanael S Gray; Jia-Ren Lin; Caitlin E Mills; Jeremy L Muhlich; Mario Niepel; Caroline E Shamu; Elizabeth H Williams; David Wrobel; Peter K Sorger; Laura M Heiser; Joe W Gray; James E Korkola; Gordon B Mills; Mark LaBarge; Heidi S Feiler; Mark A Dane; Elmar Bucher; Michel Nederlof; Damir Sudar; Sean Gross; David F Kilburn; Rebecca Smith; Kaylyn Devlin; Ron Margolis; Leslie Derr; Albert Lee; Ajay Pillai
Journal:  Cell Syst       Date:  2017-11-29       Impact factor: 10.304

3.  Regulation of leukocyte recruitment by the long pentraxin PTX3.

Authors:  Livija Deban; Remo Castro Russo; Marina Sironi; Federica Moalli; Margherita Scanziani; Vanessa Zambelli; Ivan Cuccovillo; Antonio Bastone; Marco Gobbi; Sonia Valentino; Andrea Doni; Cecilia Garlanda; Silvio Danese; Giovanni Salvatori; Marica Sassano; Virgilio Evangelista; Barbara Rossi; Elena Zenaro; Gabriela Constantin; Carlo Laudanna; Barbara Bottazzi; Alberto Mantovani
Journal:  Nat Immunol       Date:  2010-03-07       Impact factor: 25.606

4.  Global absence and targeting of protective immune states in severe COVID-19.

Authors:  Alexis J Combes; Tristan Courau; Nicholas F Kuhn; Kenneth H Hu; Arja Ray; William S Chen; Nayvin W Chew; Simon J Cleary; Divyashree Kushnoor; Gabriella C Reeder; Alan Shen; Jessica Tsui; Kamir J Hiam-Galvez; Priscila Muñoz-Sandoval; Wandi S Zhu; David S Lee; Yang Sun; Ran You; Mélia Magnen; Lauren Rodriguez; K W Im; Nina K Serwas; Aleksandra Leligdowicz; Colin R Zamecnik; Rita P Loudermilk; Michael R Wilson; Chun J Ye; Gabriela K Fragiadakis; Mark R Looney; Vincent Chan; Alyssa Ward; Sidney Carrillo; Michael Matthay; David J Erle; Prescott G Woodruff; Charles Langelier; Kirsten Kangelaris; Carolyn M Hendrickson; Carolyn Calfee; Arjun Arkal Rao; Matthew F Krummel
Journal:  Nature       Date:  2021-01-25       Impact factor: 49.962

5.  Gemcitabine, lycorine and oxysophoridine inhibit novel coronavirus (SARS-CoV-2) in cell culture.

Authors:  Ya-Nan Zhang; Qiu-Yan Zhang; Xiao-Dan Li; Jin Xiong; Shu-Qi Xiao; Zhen Wang; Zhe-Rui Zhang; Cheng-Lin Deng; Xing-Lou Yang; Hong-Ping Wei; Zhi-Ming Yuan; Han-Qing Ye; Bo Zhang
Journal:  Emerg Microbes Infect       Date:  2020-12       Impact factor: 7.163

6.  Comparison of Antiviral Activity of Gemcitabine with 2'-Fluoro-2'-Deoxycytidine and Combination Therapy with Remdesivir against SARS-CoV-2.

Authors:  Yejin Jang; Jin Soo Shin; Myoung Kyu Lee; Eunhye Jung; Timothy An; Uk-Il Kim; Kyungjin Kim; Meehyein Kim
Journal:  Int J Mol Sci       Date:  2021-02-04       Impact factor: 5.923

7.  Increased Pentraxin 3 Levels Correlate With IVIG Responsiveness and Coronary Artery Aneurysm Formation in Kawasaki Disease.

Authors:  Toshiyuki Kitoh; Tsuyoshi Ohara; Taichiro Muto; Akihisa Okumura; Reizo Baba; Yusuke Koizumi; Yuka Yamagishi; Hiroshige Mikamo; Kenji Daigo; Takao Hamakubo
Journal:  Front Immunol       Date:  2021-04-12       Impact factor: 7.561

Review 8.  Theophylline: Old Drug in a New Light, Application in COVID-19 through Computational Studies.

Authors:  Luis M Montaño; Bettina Sommer; Juan C Gomez-Verjan; Genaro S Morales-Paoli; Gema Lizbeth Ramírez-Salinas; Héctor Solís-Chagoyán; Zuly A Sanchez-Florentino; Eduardo Calixto; Gloria E Pérez-Figueroa; Rohan Carter; Ruth Jaimez-Melgoza; Bianca S Romero-Martínez; Edgar Flores-Soto
Journal:  Int J Mol Sci       Date:  2022-04-09       Impact factor: 6.208

9.  Characteristics of and Important Lessons From the Coronavirus Disease 2019 (COVID-19) Outbreak in China: Summary of a Report of 72 314 Cases From the Chinese Center for Disease Control and Prevention.

Authors:  Zunyou Wu; Jennifer M McGoogan
Journal:  JAMA       Date:  2020-04-07       Impact factor: 56.272
