| Literature DB >> 34205050 |
Roberto J Cieza1,2, Jonathan L Golob2, Justin A Colacino3,4, Christiane E Wobus1.
Abstract
Acute gastroenteritis (AGE) has a significant disease burden on society. Noroviruses, rotaviruses, and astroviruses are important viral causes of AGE but are relatively understudied enteric pathogens. Recent developments in novel biomimetic human models of enteric disease are opening new possibilities for studying human-specific host-microbe interactions. Human intestinal enteroids (HIE), which are epithelium-only intestinal organoids derived from stem cells isolated from human intestinal biopsy tissues, have been successfully used to culture representative norovirus, rotavirus, and astrovirus strains. Previous studies investigated host-virus interactions at the intestinal epithelial interface by individually profiling the epithelial transcriptional response to a member of each virus family by RNA sequencing (RNA-seq). Despite differences in the tissue origin, enteric virus used, and hours post infection at which RNA was collected in each data set, the uniform analysis of publicly available datasets identified a conserved epithelial response to virus infection focused around "type I interferon production" and interferon-stimulated genes. Additionally, transcriptional changes specific to only one or two of the enteric viruses were also identified. This study can guide future explorations into common and unique aspects of the host response to virus infections in the human intestinal epithelium and demonstrates the promise of comparative RNA-seq analysis, even if performed under different experimental conditions, to discover universal and virus-specific genes and pathways responsible for antiviral host defense.Entities:
Keywords: astrovirus; human intestinal enteroids; meta-analysis; norovirus; rotavirus; transcriptomics
Mesh:
Year: 2021 PMID: 34205050 PMCID: PMC8227290 DOI: 10.3390/v13061059
Source DB: PubMed Journal: Viruses ISSN: 1999-4915 Impact factor: 5.048
Figure 1Samples across the three datasets showed divergence due to tissue origin of the HIE line but due to viral infection when looking at each dataset individually. Nonlinear dimensionality reduction with t-SNE of the 16 samples belonging to the three datasets were analyzed. Samples are colored by the segment of the small intestine that was used to derive HIE and each one of the datasets corresponds to a different small intestinal segment (duodenum, ileum and jejunum). The t-SNE plot shows that clustering is primarily driven by the tissue origin of HIE (A). Nonlinear dimensionality reduction with t-SNE for each individual dataset (6, 12 and 4 samples for astrovirus-infected (B), norovirus-infected (C), and rotavirus-infected (D) HIEs datasets, respectively) shows clustering of samples due to treatment (mock-treated versus virus-infected) as well as the HIE line since in the norovirus-infected and rotavirus-infected HIE datasets more than one HIE line were used. Plots were generated with Rtsne and perplexity values were chosen for each plot after stability.
Figure 2A large degree of overlap of non-weakly expressed genes is observed across all datasets. Estimated gene count matrices for each RNA-seq dataset were filtered to remove genes with low constant expression levels (non-weakly expressed) with the package HTSFilter in R. Log transformed estimated counts for selected genes associated with IFN and Toll-like receptor signaling are shown. (A) A total of 11,262 shared genes were identified in the three RNA-seq datasets, and the norovirus-infected ileal HIEs contained the largest number of genes identified in a single dataset. (B) Genes detected in all three datasets and annotated in the Entrez Molecular Sequence Database System as part of the IFN response or involved in Toll-like receptor signaling are shown. Each column shows individual replicates (mock-treated samples in grey and virus-infected samples in black) for each study. Highlighted in green are genes coding for IFN or IFN receptors, while genes associated with Toll-like receptor signaling are in blue.
Differential expression analysis summary. Differential analysis of RNA-seq data was performed on the three published RNA-seq datasets. Listed are the total number non-weakly expressed genes that were analyzed in each dataset to determine differentially expressed genes (DEG).
| Differential Expression Analysis Summary | |||||
|---|---|---|---|---|---|
| Dataset | HIEs | # Genes Analyzed | DEGs | Up-Regulated-DEGs | Down-Regulated DEGs |
| astrovirus-infected | duodenal | 11,824 | 42 | 42 | 0 |
| norovirus-infected | ileal | 12,917 | 109 | 109 | 0 |
| rotavirus-infected | jejunal | 12,416 | 75 | 71 | 4 |
Figure 3Differential expression analysis in HIEs infected with astrovirus, norovirus and rotavirus. Differentially expressed genes (DEGs) for the three re-analyzed RNA-seq datasets are shown. DEGs with Log2 fold-change (FC) > 0.58 when compared to mock-treated samples and an FDR of less than 1% are shown red, while genes with an FDR of less than 1% (adjusted p-value < 0.01) that were below the Log2 FC threshold are shown in blue. FDR was calculated using an Empirical Bayes (EB) approach with the R package ashr. DEGs were identified in astrovirus-infected duodenal-HIEs (A), norovirus-infected ileal-HIEs (B) and rotavirus-infected jejunal-HIEs (C) using the package DESeq2 and plotted using the package EnhanceVolcano in R.
Figure 4Shared differentially expressed genes (DEGs) across datasets, which are part of the conserved epithelial response to viral infection, are enriched in IFN-associated genes. Shared and unique DEGs across all datasets were determined using the package DESeq2 and heatmaps generated with the package ComplexHeatmap in R. For a gene to be classified as differentially expressed it had to meet the criteria of an actual fold-change of at least 1.5 (Log2 FC > 0.58) with a false discovery rate (FDR) cutoff of 1% (adjusted p-value < 0.01) using an Empirical Bayes (EB) approach with the R package ashr. The intersection and union of DEGs across the three RNA-seq datasets evaluated is shown (A). Expression pattern for DEGs shared across all datasets is shown, with genes annotated in the Entrez Molecular Sequence Database System as part of IFN responses highlighted in red and genes with a symbol containing IFI (IFN-inducible) or ISG (IFN-stimulated genes) in orange (B). Partially shared DEGs in two out of three RNA-seq datasets are also shown (C). The sampling time point in hours post-infection (h.p.i) at which each RNA-seq dataset was generated is indicated.
Differentially expressed genes (DEGs) shared in all datasets that are linked to IFN responses. This list contains DEGs that were identified in each dataset and thus thought to contribute to the conserved antiviral response of human intestinal enteroids (HIEs) as well as their association with Type I, II and III IFN responses according to the Interferome database.
| Shared Differentially Expressed Genes (DEGs) and IFN Responses | ||||||
|---|---|---|---|---|---|---|
| HGNC Symbol | Ensemble Gene ID | Entrez Gene ID | Entrez Gene Description | Type I IFN | Type II IFN | Type III IFN |
|
| ENSG00000168062 | 116071 | basic leucine zipper ATF-like transcription factor 2 | yes | yes | yes |
|
| ENSG00000134326 | 129607 | cytidine/uridine monophosphate kinase 2 | yes | yes | no |
|
| ENSG00000169245 | 3627 | C-X-C motif chemokine ligand 10 | yes | yes | no |
|
| ENSG00000107201 | 23586 | DExD/H-box helicase 58 | yes | yes | yes |
|
| ENSG00000137628 | 55601 | DExD/H-box helicase 60 | yes | yes | no |
|
| ENSG00000108771 | 79132 | DExH-box helicase 58 | yes | yes | no |
|
| ENSG00000133106 | 94240 | epithelial stromal interaction 1 | yes | yes | yes |
|
| ENSG00000130589 | 85441 | helicase with zinc finger 2 | yes | yes | yes |
|
| ENSG00000138646 | 51191 | HECT and RLD domain containing E3 ubiquitin protein ligase 5 | yes | yes | no |
|
| ENSG00000196684 | 84941 | hematopoietic SH2 domain containing | yes | yes | no |
|
| ENSG00000163565 | 3428 | interferon gamma inducible protein 16 | yes | yes | no |
|
| ENSG00000115267 | 64135 | interferon induced with helicase C domain 1 | yes | yes | yes |
|
| ENSG00000185745 | 3434 | interferon induced protein with tetratricopeptide repeats 1 | yes | yes | yes |
|
| ENSG00000119922 | 3433 | interferon induced protein with tetratricopeptide repeats 2 | yes | yes | yes |
|
| ENSG00000119917 | 3437 | interferon induced protein with tetratricopeptide repeats 3 | yes | yes | yes |
|
| ENSG00000185885 | 8519 | interferon induced transmembrane protein 1 | yes | yes | no |
|
| ENSG00000187608 | 9636 | ISG15 ubiquitin like modifier | yes | yes | yes |
|
| ENSG00000078081 | 27074 | lysosomal associated membrane protein 3 | yes | yes | no |
|
| ENSG00000002549 | 51056 | leucine aminopeptidase 3 | yes | yes | no |
|
| ENSG00000157601 | 4599 | MX dynamin like GTPase 1 | yes | yes | yes |
|
| ENSG00000183486 | 4600 | MX dynamin like GTPase 2 | yes | yes | no |
|
| ENSG00000111335 | 4939 | 2′-5′-oligoadenylate synthetase 2 | yes | yes | no |
|
| ENSG00000111331 | 4940 | 2′-5′-oligoadenylate synthetase 3 | yes | yes | yes |
|
| ENSG00000135114 | 8638 | 2′-5′-oligoadenylate synthetase like | yes | yes | yes |
|
| ENSG00000178685 | 84875 | poly(ADP-ribose) polymerase family member 10 | yes | yes | yes |
|
| ENSG00000134321 | 91543 | radical S-adenosyl methionine domain containing 2 | yes | yes | no |
|
| ENSG00000101347 | 25939 | SAM and HD domain containing deoxynucleoside triphosphate triphosphohydrolase 1 | yes | yes | no |
|
| ENSG00000115415 | 6772 | signal transducer and activator of transcription 1 | yes | yes | yes |
|
| ENSG00000170581 | 6773 | signal transducer and activator of transcription 2 | yes | yes | yes |
|
| ENSG00000184979 | 11274 | ubiquitin specific peptidase 18 | yes | yes | yes |
|
| ENSG00000132530 | 54739 | XIAP associated factor 1 | yes | yes | no |
|
| ENSG00000124256 | 81030 | Z-DNA binding protein 1 | yes | yes | no |
Figure 5Gene Ontology (GO) over-representation analysis showed that the top over-represented biological processes (BP) for all datasets were associated with defense response to virus and type I IFN responses. Gene ontology (GO) analysis was performed on the list of DEG identified for each independently analyzed dataset to identify over-represented biological processes (BP). The top 10 over-represented BP categories are shown for each dataset as well as the number of genes contributing to the over-representation of each BP. Astrovirus-infected duodenal HIE (A), norovirus-infected ileal HIE (B), and rotavirus-infected jejunal HIE (C) are shown. An adjusted p-value cutoff of 0.001 was used to determine significantly over-represented BP. Shared and unique over-represented BP categories across all datasets were determined from the list of over-represented BP categories obtained for each dataset (D). Over-represented BP categories were identified with the package clusterProfiler in R.
Summary of Gene Ontology (GO) over-representation analysis. Over-representation analysis of biological processes (BP) was performed with the package ClusterProfiler in R on three published datasets of virus infected HIEs. An FDR cutoff of 0.1% (adjusted p value < 0.001) with the Benjamini–Hochberg (BH) step-up procedure was used to determine significantly over-represented BP. Listed are the total number of DEGs identified for each dataset as well as the number of DEGs that were assigned to a BP after over-representation analysis. For each dataset, the number of BP over-represented are also shown.
| Gene Ontology (GO) Over-Representation Analysis Summary | ||||
|---|---|---|---|---|
| Dataset | HIE | # DEGs | # DEGs Assigned to BP Categories | # BP Categories Over-Represented |
| astrovirus-infected | duodenal | 42 | 41 | 18 |
| norovirus-infected | ileal | 109 | 103 | 26 |
| rotavirus-infected | jejunal | 75 | 69 | 23 |
Over-representation analysis of biological processes (BP) with the package ClusterProfiler in R was used to identify uniquely over-represented BPs for each dataset. A false discovery rate (FDR) adjusted p-value cutoff of <0.001 using the Benjamini–Hochberg (BH) step-up procedure was used to determine significant over-represented BP categories. BP categories unique to each dataset are indicated including the gene ontology (GO) term ID and description as well as the genes assigned to the GO term ID.
| Uniquely Over-Represented Biological Processes (BPs) for each Dataset Analyzed | |||||||
|---|---|---|---|---|---|---|---|
| Dataset | HIEs | GO Term ID | GO Term Description | Gene | Gene | adj. | Gene ID |
| astrovirus-infected | duodenal | GO:0050688 | regulation of defense response to virus | 0.12 | 5 | 0.00012 |
|
| norovirus-infected | ileal | GO:1903900 | regulation of viral life cycle | 0.27 | 28 | 0.00000 |
|
| norovirus-infected | ileal | GO:0019058 | viral life cycle | 0.27 | 28 | 0.00000 |
|
| norovirus-infected | ileal | GO:0034340 | response to type I interferon | 0.26 | 27 | 0.00000 |
|
| norovirus-infected | ileal | GO:0000209 | protein polyubiquitination | 0.15 | 15 | 0.00000 |
|
| norovirus-infected | ileal | GO:0002683 | negative regulation of immune system process | 0.12 | 12 | 0.00020 |
|
| norovirus-infected | ileal | GO:0051701 | interaction with host | 0.11 | 11 | 0.00002 |
|
| norovirus-infected | ileal | GO:0043123 | positive regulation of I-kappaB kinase/NF-kappaB signaling | 0.10 | 10 | 0.00006 |
|
| norovirus-infected | ileal | GO:0046718 | viral entry into host cell | 0.09 | 9 | 0.00001 |
|
| norovirus-infected | ileal | GO:0051092 | positive regulation of NF-kappaB transcription factor activity | 0.09 | 9 | 0.00007 |
|
| norovirus-infected | ileal | GO:0071360 | cellular response to exogenous dsRNA | 0.04 | 4 | 0.00029 |
|
| rotavirus-infected | jejunal | GO:0048525 | negative regulation of viral process | 0.25 | 17 | 0.00000 |
|
| rotavirus-infected | jejunal | GO:0050778 | positive regulation of immune response | 0.23 | 16 | 0.00000 |
|
| rotavirus-infected | jejunal | GO:0007259 | receptor signaling pathway via JAK-STAT | 0.10 | 7 | 0.00002 |
|
| rotavirus-infected | jejunal | GO:0002753 | cytoplasmic pattern recognition receptor signaling pathway | 0.07 | 5 | 0.00110 |
|
Figure 6DEG expression patterns in shared over-represented biological processes (BP) in the three RNA-seq datasets after gene ontology (GO) over-representation analysis. Heatmap-like functional classification displays in which differentially expressed genes (DEGs) were assigned to the BP categories “response to virus” (A), “type I interferon production” (B) and “regulation of response to cytokine stimulus” (C). For each dataset, genes that were not differentially expressed for the indicated RNA-seq dataset are indicated in grey. Genes annotated in the Entrez Molecular Sequence Database System as part of IFN responses are highlighted in red. Genes with a symbol containing IFI (IFN-inducible) or ISG (IFN-stimulated genes) are in orange, while genes coding for IFN proteins are in green. Log2 fold-change (FC) values within the range of each dataset are shown. Over-represented BP categories were identified with the package clusterProfiler and heatmaps generated with the package ComplexHeatmap in R. The sampling time point in hours post-infection (h.p.i) at which each RNA-seq dataset was generated is indicated.
Figure 7Transcription factor binding site (TFBS) analysis of differentially expressed gene sets (DEGs). Displayed are the enriched TFBS present in the list of 63 DEGs that were shared or partially shared across the three RNA-seq datasets (astrovirus, norovirus and rotavirus-infected HIEs) (A), the 47 DEGs only identified in norovirus-infected HIEs (B), and the 18 DEGs only identified in rotavirus-infected HIEs (C). Enrichment Score (ES) values are displayed for each transcription factor (TF) reflecting the degree to which a TF is significantly enriched (FDR-adjusted p value < 0.05) in the subset of DEGs as well as the number of ChIP-seq datasets used in the analysis of the TF. Transcription factor enrichment analysis (TFEA) was performed with the package TFEA.ChIP in R.