Michael T Zimmermann, Brian Kabat, Diane E Grill, Richard B Kennedy, Gregory A Poland.
Abstract
BACKGROUND: Identifying the biologic functions of groups of genes identified in high-throughput studies currently requires considerable time and/or bioinformatics experience. This is due in part to each resource being housed within a separate database, requiring users to know about each resource and to integrate across them. Integrating across resources and merging the result with the data under study is an increasingly common bioinformatics task that is time-consuming and often repeated for each study.
Keywords: Enrichment analysis; Gene annotation; Gene networks; Genomic data interpretation; Knowledge generation; Open-source software; Software tools; Systems biology; Transcriptomics
Year: 2019 PMID: 31355053 PMCID: PMC6644632 DOI: 10.7717/peerj.6994
Source DB: PubMed Journal: PeerJ ISSN: 2167-8359 Impact factor: 3.061
Figure 1. RITAN facilitates rapid and comprehensive annotation and network integration.
We use the arbitrarily chosen example of GSE9988 (Dower et al., 2008), a study of anti-TREM1 effects on influenza response. This study considered multiple treatments, and lists of genes associated with each comparison of those treatments have been published. (A) From an input gene list (blue nodes), RITAN leverages multiple network resources to identify neighbors (green nodes) and (B) performs integrated term enrichment. Using RITAN, multiple resources were combined into a single enrichment analysis with results presented as a heatmap where greater intensity indicates greater statistical significance. The values in the heatmap are the −log10(q-value), capped at 10 for visualization to make better use of the color scale. Adding a study design matrix above the enrichment heatmap (also by RITAN) visually reveals that significant enrichment scores are defined by the up-regulated gene lists and that they capture aspects of inflammation, innate immunity, cytokine signaling, and interferon signaling. Considering the two comparisons emphasized by a black border, LPS and anti-TREM1-antibody vs. control-treated, there are many terms and pathways exhibiting significant differences in enrichment. (C) RITAN also integrates across multiple network biology resources. The genes associated with up-regulated responses in Fig. 1B are further contextualized by identification of their known inter-relationships. The network information can be directly mined, used in computation, or visualized. In this case, we have uploaded the integrated network returned by RITAN to Cytoscape for visualization. We have scaled each gene by the number of interactions it has as an indication of how “central” it is to the response.
The network facilitates hypothesis generation by visually displaying how anti-TREM1 response (indicated by red outline) is differentially associated with different NFkB subunits, with differential Akt regulation via JUN, PTEN, GRB2, etc., and with different neutrophil chemotactic signals (e.g., FPR3 or CXCL3). Thus, the network is a hypothesis-generating tool, focused by first identifying the most relevant subset of results from our integrated enrichment analysis. (D) RITAN integrates network biology resources, which can be directly visualized or imported into Cytoscape. In this example, JUN, NFKB1, TNF, and multiple cytokines were not differentially expressed, but their interactions with the differentially expressed genes were identified, implicating their activity in cellular responses.
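The heatmap values described for panel (B) can be illustrated with a minimal sketch. This is not RITAN's implementation (RITAN is an R package, and the caption does not state which test it uses); it assumes a one-sided hypergeometric test per term, Benjamini-Hochberg adjustment to obtain q-values, and the −log10(q-value) capped at 10 as the caption states. All function names, gene names, and term names below are hypothetical.

```python
import math

def hypergeom_sf(k, N, K, n):
    """P(X >= k) when n genes are drawn from N, of which K carry the term."""
    upper = min(K, n)
    hits = sum(math.comb(K, i) * math.comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / math.comb(N, n)

def benjamini_hochberg(pvals):
    """Benjamini-Hochberg adjusted p-values (q-values), in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    q = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        q[i] = running_min
    return q

def capped_score(q, cap=10.0):
    """-log10(q), capped so a few extreme terms do not dominate the color scale."""
    if q <= 0.0:
        return cap
    return min(-math.log10(q), cap)

def enrich(gene_list, term_sets, background_size):
    """Map each term name to its capped -log10(q-value).

    Assumes every gene in gene_list and term_sets belongs to the background.
    """
    genes = set(gene_list)
    terms = list(term_sets)
    pvals = [
        hypergeom_sf(len(genes & term_sets[t]), background_size, len(term_sets[t]), len(genes))
        for t in terms
    ]
    return {t: capped_score(q) for t, q in zip(terms, benjamini_hochberg(pvals))}

# Toy data (hypothetical): 1,000 background genes and two 20-gene terms.
toy_terms = {
    "toy_inflammation": {f"g{i}" for i in range(0, 20)},
    "toy_metabolism": {f"g{i}" for i in range(500, 520)},
}
toy_query = [f"g{i}" for i in range(0, 10)] + [f"g{i}" for i in range(600, 605)]
scores = enrich(toy_query, toy_terms, background_size=1000)
```

In this toy case the query shares 10 of its 15 genes with `toy_inflammation`, so that term reaches the cap, while `toy_metabolism` shares none and scores zero.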
Figure 2. RITAN facilitates comparing resource similarities.
(A) Within the DisGeNet resource, we compare the overlap of genes annotated to each disease. A disease is shown if it shares at least 80% overlap with another disease, ignoring self-overlap. (B) Between two resources, DisGeNet and GO Slim, we show interactions among terms when any pair shares at least 95% overlap.
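The thresholded comparison in this caption can be sketched as follows. The caption does not define "overlap", so this sketch assumes the overlap coefficient (shared genes divided by the size of the smaller set); RITAN itself may use a different measure. The function names and toy gene sets are hypothetical.

```python
from itertools import combinations

def overlap_fraction(a, b):
    """Shared genes as a fraction of the smaller set (an assumed definition)."""
    if not a or not b:
        return 0.0
    return len(a & b) / min(len(a), len(b))

def similar_pairs(term_sets, threshold):
    """Return (term_a, term_b, overlap) for every distinct pair at or above threshold.

    combinations() never pairs a term with itself, which matches
    "ignoring self-overlap" in the caption.
    """
    pairs = []
    for (name_a, set_a), (name_b, set_b) in combinations(term_sets.items(), 2):
        frac = overlap_fraction(set_a, set_b)
        if frac >= threshold:
            pairs.append((name_a, name_b, frac))
    return pairs

# Toy gene sets (hypothetical names and members).
toy_diseases = {
    "toy_disease_x": {"A", "B", "C", "D", "E"},
    "toy_disease_y": {"A", "B", "C", "D", "F", "G"},
    "toy_disease_z": {"X", "Y"},
}
linked = similar_pairs(toy_diseases, threshold=0.8)
```

Here `toy_disease_x` and `toy_disease_y` share 4 of the 5 genes in the smaller set, so they are the only pair retained at the 80% threshold. Comparing two resources, as in panel (B), would amount to passing the terms of both resources in one dictionary and raising the threshold to 0.95.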
Figure 3. RITAN facilitates hypothesis generation and exploration.
We used TCGA BRCA differential gene expression data and RITAN to explore specific hypotheses. (A) We annotated differentially expressed genes among luminal A (LumA), Basal, and HER2+ subtypes by their presence in protein complexes, rare disease definitions, and known regulatory programs. The overlapping genes for each combination of annotations indicate additional hypotheses to explore. We used a five-way Venn diagram to show how many genes have each annotation. (B) An example of an additional hypothesis is how functional term enrichment may differ, not only between disease subtypes, but also by contribution to rare disease definitions. We show enrichment plots by subgroup, formatted similarly to Fig. 1. (C) Another example is how pathway activation may differ. (D) Similar to Fig. 2, RITAN can be used to explore relationships among resources and to filter each resource to a unique subset. The analysis script for this figure is short (93 lines including comments and plotting; available in Supplemental Data), emphasizing the flexibility and simplicity afforded by RITAN.
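The counts behind a Venn diagram such as panel (A) can be sketched by tallying, for each gene, the exact combination of annotations it carries. This is an illustrative sketch only, not the authors' analysis script; the function name, gene names, and annotation names are hypothetical.

```python
from collections import Counter

def annotation_combinations(genes, annotations):
    """Count genes by the exact set of annotations they belong to.

    Each key is a frozenset of annotation names, i.e., one region of the Venn
    diagram; the empty frozenset collects genes with no annotation.
    """
    counts = Counter()
    for gene in genes:
        combo = frozenset(name for name, members in annotations.items() if gene in members)
        counts[combo] += 1
    return counts

# Toy input (hypothetical): four genes and two annotation sources.
toy_annotations = {
    "toy_protein_complex": {"g1", "g2"},
    "toy_rare_disease": {"g2", "g3"},
}
regions = annotation_combinations(["g1", "g2", "g3", "g4"], toy_annotations)
```

Genes that fall in a multi-annotation region (here `g2`) are the ones the caption describes as suggesting additional hypotheses to explore.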