Alyssa Obermayer, Li Dong, Qianqian Hu, Michael Golden, Jerald D Noble, Paulo Rodriguez, Timothy J Robinson, Mingxiang Teng, Aik-Choon Tan, Timothy I Shaw.
Abstract
High-throughput transcriptomic and proteomic analyses are now routinely applied to study cancer biology. However, complex omics integration remains challenging and often time-consuming. Here, we developed DRPPM-EASY, an R Shiny framework for integrative multi-omics analysis. We applied our application to analyze RNA-seq data generated from a USP7 knockdown in a T-cell acute lymphoblastic leukemia (T-ALL) cell line, which identified upregulated expression of a TAL1-associated proliferative signature in T-ALL cell lines. Next, we performed proteomic profiling of the USP7 knockdown samples. Through DRPPM-EASY-Integration, we performed a concurrent analysis of the transcriptome and proteome and identified consistent disruption of the protein degradation machinery and spliceosome in samples with USP7 silencing. To further illustrate the utility of the R Shiny framework, we developed DRPPM-EASY-CCLE, a Shiny extension preloaded with the Cancer Cell Line Encyclopedia (CCLE) data. The DRPPM-EASY-CCLE app facilitates sample querying and phenotype assignment by incorporating meta information, such as genetic mutation, metastasis status, sex, and collection site. As proof of concept, we verified the expression of a TP53-associated DNA damage signature in TP53-mutated ovarian cancer cells. Altogether, our open-source application provides an easy-to-use framework for omics exploration and discovery.
Keywords: CCLE; R Shiny application; RNA-seq; T-cell acute lymphoblastic leukemia; multi-omics analysis; proteomics
Year: 2022 PMID: 35205126 PMCID: PMC8869715 DOI: 10.3390/biology11020260
Source DB: PubMed Journal: Biology (Basel) ISSN: 2079-7737
Figure 1. DRPPM-EASY expression analysis pipeline. (A) Schematic workflow of DRPPM-EASY. The pipeline takes as input an expression matrix, a sample meta-file specifying sample grouping, and a gene set database for GSEA. A GSEA enriched signature table is generated as a preprocessing step and is used as input to the R Shiny app. The app provides two modes of exploring the data: (1) general differential gene expression analysis and (2) gene set enrichment analysis. The results of the analysis can be downloaded as output tables. (B) Schematic of the integrative analysis with three major features for pathway signature comparison. The app has three modes of integrative analysis: (1) scatter plot mode, (2) correlation plot mode, and (3) paired multi-omics analysis.
Data Exploration Module.
| ID | App Function | Description |
|---|---|---|
| E1 | Unsupervised Heatmap | Top variable gene selection; expression data is log2 transformed, then z-normalized; user-specified clustering method |
| E2 | Scatter Plot | User selects two genes of interest; expression values compared via interactive scatter plot (log2 transformation is optional) |
| E3 | Custom Heatmap | Visualize user-selected genes and samples; expression data is log2 transformed and z-normalized; user-specified clustering method |
| E4 | Box Plot | Gene expression in each group is shown; expression values are log2 transformed; groups compared for statistical differences |
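
The preprocessing listed for the heatmap functions (log2 transformation, top variable gene selection, z-normalization) can be illustrated with a minimal Python sketch. This is not the app's R code. The pseudocount of 1, the use of the sample standard deviation, and all function and gene names are assumptions made for illustration; the mean-absolute-deviation ranking follows the wording of the Figure 2 caption.

```python
import math
from statistics import mean, stdev


def log2_transform(matrix, pseudocount=1.0):
    """Return log2(x + pseudocount) per value; the pseudocount is an assumption."""
    return {gene: [math.log2(v + pseudocount) for v in values]
            for gene, values in matrix.items()}


def mean_abs_deviation(values):
    """Mean absolute deviation of one gene around its mean."""
    centre = mean(values)
    return mean([abs(v - centre) for v in values])


def top_variable_genes(matrix, n):
    """Return the n gene names with the largest mean absolute deviation."""
    ranked = sorted(matrix, key=lambda g: mean_abs_deviation(matrix[g]), reverse=True)
    return ranked[:n]


def z_normalize(values):
    """Centre one gene on 0 and scale by its sample standard deviation."""
    centre = mean(values)
    spread = stdev(values)
    if spread == 0:
        # A gene with no variation carries no signal; map it to zeros.
        return [0.0] * len(values)
    return [(v - centre) / spread for v in values]


def heatmap_matrix(matrix, n):
    """log2 transform, keep the n most variable genes, then z-normalize each."""
    logged = log2_transform(matrix)
    return {gene: z_normalize(logged[gene]) for gene in top_variable_genes(logged, n)}
```

Calling `heatmap_matrix(expr, 100)` on a gene-by-sample dictionary would yield row-scaled values ready for clustering; the clustering method itself is left to the user, as in the table.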
Differential Expression Analysis Module.
| ID | App Function | Description |
|---|---|---|
| DEA1 | Volcano Plot | User selects comparison groups; differential gene expression analysis with LIMMA; up- and downregulated differentially expressed genes determined with user input |
| DEA2 | MA Plot | User selects comparison groups; differential gene expression analysis with LIMMA; up- and downregulated differentially expressed genes determined with user input |
| DEA4 | Pathway Enrichment Analysis | User selects comparison groups and gene set/pathway; differential gene expression analysis with LIMMA; pathway enrichment analysis using enrichR |
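
How user-supplied cutoffs might split genes into up- and downregulated sets for a volcano plot can be sketched in Python as follows. The sketch starts from log2 fold changes and adjusted p-values that are already computed; it does not reproduce LIMMA's moderated statistics. The default cutoffs (|log2FC| of 1, adjusted p of 0.05) and all names are assumptions for illustration, not the app's actual defaults.

```python
import math


def classify_gene(log2fc, adj_p, fc_cutoff=1.0, p_cutoff=0.05):
    """Label one gene 'up', 'down', or 'ns' (not significant)."""
    if adj_p < p_cutoff:
        if log2fc >= fc_cutoff:
            return "up"
        if log2fc <= -fc_cutoff:
            return "down"
    return "ns"


def volcano_points(results, fc_cutoff=1.0, p_cutoff=0.05):
    """Turn {gene: (log2fc, adj_p)} into (gene, x, y, label) tuples.

    x is the log2 fold change and y is -log10(adjusted p-value).
    """
    points = []
    for gene, (log2fc, adj_p) in results.items():
        # Floor the p-value so that a reported 0 does not break log10.
        y = -math.log10(max(adj_p, 1e-300))
        label = classify_gene(log2fc, adj_p, fc_cutoff, p_cutoff)
        points.append((gene, log2fc, y, label))
    return points
```

Plotting the tuples with one color per label gives the usual volcano layout; changing `fc_cutoff` or `p_cutoff` only relabels points and leaves their coordinates unchanged.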
Gene Set Enrichment Analysis Module.
| ID | App Function | Description |
|---|---|---|
| GA1 | Enrichment Plot | User selects comparison groups; signal-to-noise ranking performed on expression data; GSEA function performed with chosen gene set |
| GA2 | Gene Expression Heatmap | User selects comparison groups; signal-to-noise ranking performed on expression data; GSEA function performed with chosen gene set; expression data log2 transformed and scaled; genes from chosen gene set displayed in the heatmap |
| GA3 | GSEA Summary Table | Displays user pre-generated enriched signatures table |
| GA4 | Generate Summary Table | GSEA function performed on expression data with user input GMT file; enriched signatures table produced is displayed |
| GA5 | ssGSEA Boxplots | User selects gene set and single-sample GSEA method; groups compared for statistical differences |
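
The signal-to-noise ranking that precedes GSEA can be sketched in Python. The sketch uses the standard definition of the metric, the difference of group means divided by the sum of group standard deviations. Whether the app applies extra adjustments (for example a minimum standard deviation) is not stated here, so none are included; the function names and group labels are hypothetical.

```python
import math
from statistics import mean, stdev


def signal_to_noise(case, control):
    """(mean(case) - mean(control)) / (sd(case) + sd(control)).

    Each group needs at least two samples for a sample standard deviation.
    """
    diff = mean(case) - mean(control)
    noise = stdev(case) + stdev(control)
    if noise == 0:
        # Assumption: zero noise gives 0 for equal means, signed infinity otherwise.
        return 0.0 if diff == 0 else math.copysign(math.inf, diff)
    return diff / noise


def rank_genes(expr, labels, case_label, control_label):
    """Return (gene, score) pairs sorted from most up in case to most down.

    expr maps gene -> list of values; labels gives the group of each sample.
    """
    case_idx = [i for i, lab in enumerate(labels) if lab == case_label]
    ctrl_idx = [i for i, lab in enumerate(labels) if lab == control_label]
    scores = []
    for gene, values in expr.items():
        score = signal_to_noise([values[i] for i in case_idx],
                                [values[i] for i in ctrl_idx])
        scores.append((gene, score))
    return sorted(scores, key=lambda pair: pair[1], reverse=True)
```

The resulting ordered list is the kind of ranked input a GSEA routine walks through when computing a running enrichment score for a chosen gene set.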
Integrative Analysis.
| ID | App Function | Description |
|---|---|---|
| IA1 | Scatter Plot Comparison | User input features are merged and plotted; samples are colored based on metadata type |
| IA2 | Correlation Rank Plot | Relationship between ssGSEA score and gene expression assessed; correlation can be computed as Spearman, Pearson, or Kendall; correlation values plotted by rank from lowest to highest |
| IA3 | Matrix Comparison File Upload | Upload two expression matrices and two metadata files |
| IA4 | Log2FC Comparison Scatter Plot | Differential gene expression analysis with LIMMA performed on both matrices; log2 fold change values subset and difference between matrices calculated; expression data displayed as scatter plot |
| IA5 | Reciprocal GSEA | Differential gene expression analysis with LIMMA; four gene sets derived from differentially expressed genes (two upregulated and two downregulated gene sets); GSEA performed on the reciprocal data |
| IA6 | Reciprocal ssGSEA | Differential gene expression analysis with LIMMA; four gene sets derived from differentially expressed genes (two upregulated and two downregulated gene sets); ssGSEA performed on the reciprocal data |
| IA7 | Venn Diagram | Differential gene expression analysis with LIMMA; overlapping differentially expressed genes identified; Fisher's exact test, Cohen's kappa, and Jaccard index calculated to compare the two matrices and across user-selected pathways |
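
The three overlap statistics named for the Venn diagram function can be computed from a 2x2 table of gene membership, as in the following Python sketch. It assumes a shared gene universe (for example, genes measured in both matrices) and a one-sided Fisher's exact test for enrichment of the overlap. Both are assumptions, since the sidedness and background used by the app are not described here, and all names are hypothetical.

```python
from math import comb


def contingency(set_a, set_b, universe):
    """Counts of genes in both sets, only A, only B, and neither."""
    u = set(universe)
    a, b = set(set_a) & u, set(set_b) & u
    both, only_a, only_b = len(a & b), len(a - b), len(b - a)
    return both, only_a, only_b, len(u) - both - only_a - only_b


def jaccard(both, only_a, only_b):
    """Intersection size divided by union size."""
    union = both + only_a + only_b
    return both / union if union else 0.0


def cohen_kappa(both, only_a, only_b, neither):
    """Agreement between the two gene lists beyond chance."""
    n = both + only_a + only_b + neither
    observed = (both + neither) / n
    expected = ((both + only_a) * (both + only_b)
                + (only_b + neither) * (only_a + neither)) / n ** 2
    if expected == 1:
        return float("nan")  # kappa is undefined when chance agreement is total
    return (observed - expected) / (1 - expected)


def fisher_greater(both, only_a, only_b, neither):
    """One-sided Fisher's exact p-value: overlap at least as large as observed."""
    n = both + only_a + only_b + neither
    size_a, size_b = both + only_a, both + only_b
    tail = sum(comb(size_a, k) * comb(n - size_a, size_b - k)
               for k in range(both, min(size_a, size_b) + 1))
    return tail / comb(n, size_b)
```

For instance, the upregulated genes from the transcriptome and from the proteome could be passed as `set_a` and `set_b`, with the genes quantified in both data sets as `universe`.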
Figure 2. Example expression analysis of RNA-seq data from USP7-silenced Jurkat cells. (A) Unsupervised clustering of the RNA sequencing data using the top 100 genes ranked by mean absolute deviation (MAD). (B) Differential gene expression analysis comparing USP7 knockdown and scramble. Genes upregulated after USP7 knockdown are shown in red and genes downregulated after USP7 knockdown are shown in blue (USP7-associated targets). (C) Boxplot showing USP7 expression in log2 FPKM. (D) Gene set enrichment analysis of MYC targets. (E) Boxplot showing the single-sample GSVA analysis of the TAL1 gene set. (F) Boxplot showing the single-sample GSVA analysis of the Hallmark Apoptosis gene set.
Figure 3. Example integrated analysis of proteomics and transcriptomics data from USP7-silenced Jurkat cells. (A) Jurkat samples treated with USP7 shRNA and scramble were profiled by RNA sequencing and TMT mass spectrometry. (B) The log2 fold change from the differential expression analyses is plotted. Positive log2FC indicates upregulated expression after USP7 silencing. Negative log2FC indicates downregulated expression after USP7 knockdown. Dotted lines indicate the −1 and 1 log2FC cutoffs. (C) Upregulated and downregulated gene signatures derived from differentially expressed mRNAs. (D) Venn diagram of genes differentially upregulated (top panel) and downregulated (bottom panel) in the transcriptome (left) and proteome (right). (E) Upregulated and downregulated gene signatures derived from differentially expressed proteins. (F,G) Reciprocal GSEA of differentially expressed genes derived from the transcriptome and examined in the proteomics data (F). Similarly, differentially expressed proteins were first derived and then examined in the transcriptome data by GSEA (G).
Figure 4. Example use case analysis of CCLE expression data. (A) Drop-down menu selection of sample cohort and sample phenotype characteristic. CCLE ovary samples and TP53 mutation status were selected from the drop-down menu. (B) Single-sample GSEA analysis of genes defining the DNA damage response by Amundson et al. Analyzed samples were selected from the drop-down menu in (A). (C) Drop-down menu selection of sample cohort and sample phenotype characteristic. CCLE non-small cell lung cancer samples and the phenotype associated with KRAS mutation status were selected from the drop-down menu. (D) Single-sample GSEA analysis of genes negatively regulating the DNA damage response. (E) Single-sample GSEA of genes defining stress granule assembly and disassembly. Gene sets were compiled from Biological Pathways from the Gene Ontology database (GOBP). Analyzed samples were selected from the drop-down menu in (C).