Nathan D Maulding, Spencer Seiler, Alexander Pearson, Nicholas Kreusser, Joshua M Stuart.
Abstract
The SARS-CoV-2 pandemic has challenged humankind's ability to quickly determine the cascade of health effects caused by a novel infection. Even with the unprecedented speed at which vaccines were developed and introduced into society, identifying therapeutic interventions and drug targets for patients infected with the virus remains important as new strains of the virus evolve, or as future coronaviruses emerge that are resistant to current vaccines. The application of transcriptomic RNA sequencing of infected samples may shed new light on the pathways involved in viral mechanisms and host responses. We describe the application of the previously developed "dual RNA-seq" approach to investigate, for the first time, the co-regulation between the human and SARS-CoV-2 transcriptomes. Together with differential expression analysis, we describe the tissue specificity of SARS-CoV-2 expression, an inferred lipopolysaccharide response, and co-regulation of CXCLs, SPRRs, and S100s with SARS-CoV-2 expression. Lipopolysaccharide response pathways in particular offer promise for future therapeutic research and the prospect of subgrouping patients based on chemokine expression, which may help explain the vastly different reactions patients have to infection. Taken together, these findings highlight unappreciated SARS-CoV-2 expression signatures and emphasize new considerations and mechanisms for SARS-CoV-2 therapeutic intervention.
Year: 2022 PMID: 35079083 PMCID: PMC8789814 DOI: 10.1038/s41598-022-05342-4
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Figure 4 (A) Venn diagram of the overlap in gene and pathway results indicated by the DE, CE-Dendro, CE-Net, and CE-PageRank views. (B) Sankey diagram showing genes found by at least two of the four views and the pathway themes in which they participate. Genes are colored by the views in which they are found: white, light blue, and dark blue indicate genes found by DE and 1, 2, or 3 other CE views, respectively, and red indicates a gene found by a CE view but not by DE. Genes are connected to themes created from a list of gProfiler pathways (Suppl. Table 1). (C) The consensus network for the cross-analysis overlaps between these four views comprises four gene modules. Genes are displayed as circular nodes and pathways are displayed as triangular nodes. Nodes with red borders represent results that did not manifest in traditional differential expression analysis and therefore demonstrate the power of dRAP not only to highlight important findings in DE analysis but also to reveal previously hidden signatures. (D) The red module (CXCL5, CXCL8, CCL20, ASS1, HIF1A) is indicative of chemokine activity, cytokines, and a lipopolysaccharide response. (E) The blue module (IFIH1, PRDM1, BIRC3, FOSL1, DTX3L, S100A8) indicates an innate immune response. (F) The lime module (SPRR2A, SPRR2D, SPRR2E, KRT6B, ALOX15B, PI3) indicates cornification/keratinization and epithelial cell differentiation changes. (G) The black module (OAS1, IFITM3) indicates regulation of viral genome replication.
Figure 1 (A) Illustration of the dual RNAseq alignment pipeline (dRAP) using both host and guest genomes. Dual RNAseq creates a single reference index for RNAseq read alignment with STAR by appending “guest” genomes to the host genome, allowing reads to be aligned simultaneously against multiple transcriptomes, with the best overall alignment selected. An example is shown where dRAP can resolve that a single read has a better match to the guest genome, with only 1 mismatch (green arrow), than to the host genome, with 3 mismatches (red vertical lines). The SARS-CoV-2 guest genome (NC_045512v2) is depicted with its annotated set of genes designated as open reading frames, structural proteins, or accessory factors. (B) Overview of the dRAP application to SARS-CoV-2 analysis, enabling the detection of coexpression associations between transcripts originating from human (red lines) and virus (blue lines). RNAseq reads from SARS-CoV-2 infected samples from human cell lines (A549 and NHBE) and patients (BALF, PBMC, and Lung) were collected from public datasets (Blanco-Melo et al. 2020 and Xiong et al. 2020). Traditional RNAseq (dashed green arrows), which does not quantify both human and viral transcripts, allows only differential expression analysis, while dRAP (black arrows) enables both downstream differential and coexpression analyses between host and virus. A549, NHBE, and BALF samples were selected for downstream analyses because they contained SARS-CoV-2 transcripts. Several coexpression (CE) analyses examined linear (CE-Net) and nonlinear (CE-Dendro) relationships between the human and SARS-CoV-2 transcriptomes, as well as genes of influence (CE-PageRank) in the gene regulatory network. Each CE view, along with DE, produced a set of results for the A549, NHBE, and BALF groups. Genes implicated in two or more CE views were collected and used to determine enriched pathways (denoted “consensus pathways”). A “consensus network” was determined by including genes and pathways found by two or more views.
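The "two or more views" rule in the caption can be illustrated with a minimal sketch. The function name `consensus_genes` and the gene sets assigned to each view are assumptions made for illustration; they are not the paper's code or its per-view results.

```python
from collections import Counter

def consensus_genes(views, min_views=2):
    """Return the genes reported by at least `min_views` of the given views."""
    # Count each gene once per view, regardless of duplicates within a view.
    counts = Counter(gene for genes in views.values() for gene in set(genes))
    return {gene for gene, n in counts.items() if n >= min_views}

# Toy gene sets per view (illustrative membership only, not the paper's results).
views = {
    "DE": {"CXCL8", "OAS1", "IFITM3"},
    "CE-Dendro": {"CXCL8", "SPRR2A"},
    "CE-Net": {"SPRR2A", "HIF1A"},
    "CE-PageRank": {"OAS1"},
}

print(sorted(consensus_genes(views)))
```

Raising `min_views` tightens the consensus; with these toy sets, requiring three views leaves no genes.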
SARS-CoV-2 differential gene expression for infected patient tissue and cell line samples compared with non-infected samples.
| Gene | BALF Padj | BALF Log2(FC) | A549 Padj | A549 Log2(FC) | NHBE Padj | NHBE Log2(FC) | PBMC Padj | PBMC Log2(FC) | Lung Padj | Lung Log2(FC) |
|---|---|---|---|---|---|---|---|---|---|---|
| Cov2_ORF10 | 1.11E−32 | 18.3 | 2.11E−17 | 10.9 | 7.83E−177 | 4.7 | DNQ | DNQ | 1.28E−05 | 11.9 |
| Cov2_N | 2.78E−50 | 21.4 | 5.83E−26 | 12.8 | 2.11E−58 | 6.9 | DNQ | DNQ | 1.75E−05 | 10.1 |
| Cov2_ORF8 | 7.45E−41 | 18.3 | 1.60E−11 | 9.7 | 5.93E−17 | 4.3 | DNQ | DNQ | DNQ | 9.6 |
| Cov2_ORF7b | 2.51E−04 | 11 | DNQ | 0.3 | DNQ | −1.3 | DNQ | DNQ | DNQ | DNQ |
| Cov2_ORF7a | 4.86E−37 | 18.6 | 2.02E−13 | 9.7 | 8.81E−17 | 5.5 | DNQ | DNQ | DNQ | 9.3 |
| Cov2_ORF6 | 2.81E−17 | 14.9 | DNQ | 5.9 | DNQ | 1.7 | DNQ | DNQ | DNQ | DNQ |
| Cov2_M | 1.51E−42 | 19.1 | 1.71E−16 | 10.6 | 2.52E−21 | 6.5 | DNQ | DNQ | DNQ | 9.6 |
| Cov2_E | 1.20E−24 | 15.2 | DNQ | 7.7 | DNQ | 3.6 | DNQ | DNQ | DNQ | DNQ |
| Cov2_ORF3a | 8.40E−40 | 19 | 8.07E−14 | 9.8 | 3.12E−13 | 7 | DNQ | DNQ | DNQ | 9 |
| Cov2_S | 1.16E−58 | 22.7 | 5.49E−18 | 11.1 | 2.24E−73 | 6.3 | DNQ | 0.4 | DNQ | 9.9 |
| Cov2_ORF1ab | 4.04E−69 | 24.7 | 5.01E−13 | 9.6 | 1.34E−20 | 4.8 | DNQ | 0.4 | 1.57E−03 | 10.6 |
Patient tissue types show dramatically different expression profiles with PBMC and Lung biopsy tissue rarely ever passing detection limits while BALF tissues show robust expression in infected patients. Cell lines also display strong SARS-CoV-2 expression although the magnitude of fold change was far less than that observed in BALF samples. “DNQ” stands for “did not qualify”, which indicates genes that did not pass Cook’s distance filtering in DESeq2 analysis.
Figure 2 (A) dRAP is sensitive enough to detect subtle differences in SARS-CoV-2 transcript quantities, resulting in differential expression within the SARS-CoV-2 transcriptome. SARS-CoV-2 expression is also shown to be highly dependent on the system being studied. Patient BALF samples show high amounts of SARS-CoV-2, while PBMC and Lung patient samples display low or no SARS-CoV-2. (B–F) Log2 fold change comparison of differentially expressed genes in infected versus non-infected samples shows that the tissue specificity of SARS-CoV-2 extends to the degree of concordance observed in the human transcriptome. Using the chi-square test, statistically significant concordance was observed between the NHBE and A549 cell lines (p < 1e−5) (B) and between BALF patient samples and the NHBE (p < 0.05) (C) and A549 (p < 0.01) (E) cell lines. However, no concordance was observed between BALF and Lung (p = 0.26) (D), and significant discordance was observed between BALF and PBMC (p < 1e−38) (F) patient samples. This suggests that the lack of SARS-CoV-2 expression observed in Lung and PBMC samples is also associated with significantly altered human expression, making these tissue types not ideal for learning SARS-CoV-2 mechanisms.
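One way a chi-square concordance test on fold changes could be set up is sketched below. It assumes the test is applied to a 2x2 table of log2 fold change signs (up or down) for genes shared between two comparisons, with 1 degree of freedom and no continuity correction; the caption does not spell out the exact table layout, so this is an assumption, and `sign_concordance_chi2` and the toy values are hypothetical.

```python
import math

def sign_concordance_chi2(lfc_a, lfc_b):
    """Chi-square test (1 df, no continuity correction) on the 2x2 table of
    log2 fold change signs for genes present in both comparisons.

    Returns (chi2 statistic, p-value, fraction of genes with matching sign).
    """
    # table[i][j]: i = direction in A, j = direction in B (0 = up, 1 = down)
    table = [[0, 0], [0, 0]]
    for gene in set(lfc_a) & set(lfc_b):
        a, b = lfc_a[gene], lfc_b[gene]
        if a == 0 or b == 0:
            continue  # a zero fold change has no direction
        table[int(a < 0)][int(b < 0)] += 1

    n = sum(map(sum, table))
    rows = [sum(row) for row in table]
    cols = [table[0][j] + table[1][j] for j in range(2)]
    if n == 0 or 0 in rows or 0 in cols:
        raise ValueError("each direction must occur at least once in both comparisons")

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = rows[i] * cols[j] / n
            chi2 += (table[i][j] - expected) ** 2 / expected

    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value, (table[0][0] + table[1][1]) / n

# Toy fold changes: 20 genes agree in direction, 4 disagree (made-up values).
toy_a = {f"g{i}": 1.5 for i in range(12)} | {f"g{i}": -1.5 for i in range(12, 24)}
toy_b = {f"g{i}": 2.0 for i in range(10)} | {f"g{i}": -2.0 for i in range(10, 12)}
toy_b |= {f"g{i}": -2.0 for i in range(12, 22)} | {f"g{i}": 2.0 for i in range(22, 24)}

chi2, p_value, agreement = sign_concordance_chi2(toy_a, toy_b)
print(f"chi2={chi2:.2f}, p={p_value:.4f}, agreement={agreement:.2f}")
```

A small p-value alone does not distinguish concordance from discordance, so the agreement fraction is returned alongside it: values well above 0.5 point to concordance and values well below 0.5 to discordance.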
Figure 3 By clustering SARS-CoV-2 and human expression patterns concurrently, we observed that SARS-CoV-2 transcripts were localized in a small clade visualized in red (A). This clade of coexpression with SARS-CoV-2 transcripts contains a set of genes associated with SARS-CoV-2 mechanisms in infection (B). The histogram distribution of PageRank values for the A549 (C) and NHBE (D) cell lines shows that the SARS-CoV-2 genes are highly influential. However, in BALF samples (E), SARS-CoV-2 genes are at the lower end of the PageRank distribution, likely because the numerous differentially expressed genes create a much larger set than that for the cell lines. In C–E, the green line marks the 80th percentile of the distribution, and the small red nodes along the distribution represent SARS-CoV-2 genes.
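The PageRank histograms and the 80th-percentile marker can be illustrated with a small, self-contained sketch. It uses plain power iteration on a toy directed graph; the edges, their direction, the damping factor of 0.85, and the nearest-rank percentile rule are all assumptions for illustration and are not taken from the paper's network or settings.

```python
import math

def pagerank(edges, damping=0.85, iterations=100):
    """PageRank by power iteration on a directed edge list of (source, target)."""
    nodes = sorted({node for edge in edges for node in edge})
    out_links = {node: [] for node in nodes}
    for src, dst in edges:
        out_links[src].append(dst)

    rank = {node: 1 / len(nodes) for node in nodes}
    for _ in range(iterations):
        new_rank = {node: (1 - damping) / len(nodes) for node in nodes}
        for node in nodes:
            # Nodes without outgoing edges spread their rank over all nodes.
            targets = out_links[node] or nodes
            share = damping * rank[node] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank

def nearest_rank_percentile(values, pct):
    """Nearest-rank percentile: smallest value with at least pct% of values at or below it."""
    ordered = sorted(values)
    k = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[k - 1]

# Hypothetical edges; real coexpression edges would come from the CE analysis.
edges = [
    ("CXCL8", "Cov2_N"),
    ("HIF1A", "Cov2_N"),
    ("OAS1", "Cov2_N"),
    ("OAS1", "CXCL8"),
]

rank = pagerank(edges)
cutoff = nearest_rank_percentile(rank.values(), 80)
influential = sorted(gene for gene, score in rank.items() if score >= cutoff)
print(influential)
```

In this toy graph the viral node collects the most incoming weight and therefore sits at or above the cutoff, which is the pattern the caption describes for the cell lines.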