Irina Kuznetsova, Artur Lugmayr, Oliver Rackham, Aleksandra Filipovska.
Abstract
Advances in omics technologies have generated exponentially larger volumes of biological data; however, their analysis and interpretation are limited to computationally proficient scientists. We created OmicsVolcano, an interactive open-source software tool that enables visualization and exploration of high-throughput biological data while highlighting features of interest using a volcano plot interface. In contrast to existing tools, our software and user-interface design allow it to be used without any programming skills to generate high-quality, presentation-ready images.
Keywords: Bioinformatics; Genomics; Proteomics; RNA-seq
Year: 2021 PMID: 33532728 PMCID: PMC7821039 DOI: 10.1016/j.xpro.2020.100279
Source DB: PubMed Journal: STAR Protoc ISSN: 2666-1667
Figure 1. The OmicsVolcano home page
The software requires four steps to generate the interactive omics-data visualization plots. The “file” option uploads the input data as an ASCII file containing IDs, gene names, gene descriptions, log2 fold changes, and adjusted p values. The “explore” option provides the core functions of the software: plot, custom gene or protein selection, mitochondrial processes, multiple process visualization, and cellular compartment visualization. The statistical significance and the y-axis threshold are customized by adjusting the slider widgets. Additional widgets, e.g., “upload a gene file”, “insert a list of genes”, “select organism”, and “show mitochondrial process”, allow the data exploration to be customized. The “export” option exports data as tables or graphics in various pre-defined formats. A manual is available through the “help” option, and the software package version, along with additional information about the software, can be found in the “about” option.
Operating environments on which the software was tested
Recommended hardware: minimum 4 GB memory. Memory requirements may increase with input data size.
Processors: 1 required, 2 recommended.
Example data are provided with the software package. User input files for omics datasets should be tab-separated or semicolon-separated files in ASCII/text format. Each file should contain five columns with the column names shown in Table 2: ID, GeneSymbol, Description, Log2FC, AdjPValue.
Input file example
The input file must begin with a header row, and the column names are case-sensitive. The rows that follow contain the values processed by the OmicsVolcano software.
| ID | GeneSymbol | Description | Log2FC | AdjPValue |
|---|---|---|---|---|
| Q4U4S6 | Xirp2 | Xin actin-binding repeat-containing protein 2 OS=Mus musculus OX=10090 GN=Xirp2 PE=1 SV=1 | 6.64 | 1.33E-08 |
| Q497D7 | Rpl30fo | Rpl30 protein OS=Mus musculus OX=10090 GN=Rpl30 PE=2 SV=1 | 2.14 | 0.8 |
| Q9CPP6 | Ndufa5 | NADH dehydrogenase [ubiquinone] 1 alpha subcomplex subunit 5 OS=Mus musculus OX=10090 GN=Ndufa5 PE=1 SV=3 | -1.52 | 6.24E-08 |
| P09055 | Itgb1 | Integrin beta-1 OS=Mus musculus OX=10090 GN=Itgb1 PE=1 SV=1 | 0.08 | 6.29E-08 |
| … | … | … | ... | … |
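
The input requirements above can be checked before a file is uploaded. The following is a minimal sketch in Python, independent of the OmicsVolcano R code; the function name `read_omics_table` and the delimiter-detection rule are assumptions made for illustration. It verifies that the five case-sensitive column names are present and that the numeric columns can be parsed.

```python
import csv
import io

# Column names quoted in the text; matching is case-sensitive.
REQUIRED_COLUMNS = ["ID", "GeneSymbol", "Description", "Log2FC", "AdjPValue"]


def read_omics_table(text):
    """Parse tab- or semicolon-separated text and check the header row.

    Hypothetical helper: returns a list of dicts with Log2FC and AdjPValue
    converted to float, and raises ValueError on a bad header or value.
    """
    lines = text.splitlines()
    header_line = lines[0] if lines else ""
    # Assumption: the file is tab-separated if the header contains a tab,
    # otherwise it is treated as semicolon-separated.
    delimiter = "\t" if "\t" in header_line else ";"
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    found = reader.fieldnames or []
    missing = [name for name in REQUIRED_COLUMNS if name not in found]
    if missing:
        raise ValueError(f"missing column(s): {', '.join(missing)}")

    rows = []
    # Data rows start on line 2 because line 1 is the header.
    for line_number, row in enumerate(reader, start=2):
        try:
            row["Log2FC"] = float(row["Log2FC"])
            row["AdjPValue"] = float(row["AdjPValue"])
        except (TypeError, ValueError) as exc:
            raise ValueError(
                f"line {line_number}: Log2FC and AdjPValue must be numeric"
            ) from exc
        rows.append(row)
    return rows
```

A file whose header reads `Gene Symbol` instead of `GeneSymbol` would be rejected by this check, which mirrors the case-sensitive matching described above.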
Figure 2. Initializing the OmicsVolcano software in RStudio
Screen capture showing how to select “Run App” in RStudio to open the software.
Figure 3. A volcano plot example with specific interactively selected gene labels
Transcriptomic data are visualized with significantly increased transcripts shown in light red and significantly decreased transcripts shown in light blue. Non-significant transcripts are shown in gray. Three increased and three decreased transcripts are labeled in the plot as examples, showing the software’s capability to label individual transcripts of interest.
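The color scheme in this caption follows from two values per transcript: the log2 fold change on the x-axis and the adjusted p value, conventionally plotted as its negative base-10 logarithm on the y-axis. The sketch below is a Python illustration, not the OmicsVolcano implementation; the function name and the default cut-offs (`p_cutoff=0.05`, `fc_cutoff=0.0`) are assumptions, since in the software the thresholds are set with the slider widgets.

```python
import math

# Colors named in the figure caption for each category.
COLORS = {
    "increased": "light red",
    "decreased": "light blue",
    "non-significant": "gray",
}


def classify_point(log2_fc, adj_p, p_cutoff=0.05, fc_cutoff=0.0):
    """Return (x, y, category) for one row of a volcano plot.

    x is the log2 fold change and y is -log10(adjusted p value).
    p_cutoff and fc_cutoff are illustrative defaults (assumptions),
    not values taken from OmicsVolcano.
    """
    # A p value of exactly zero has no finite logarithm.
    y = -math.log10(adj_p) if adj_p > 0 else math.inf
    significant = adj_p < p_cutoff and abs(log2_fc) >= fc_cutoff
    if not significant or log2_fc == 0:
        category = "non-significant"
    elif log2_fc > 0:
        category = "increased"
    else:
        category = "decreased"
    return log2_fc, y, category


# Rows from the input file example: (GeneSymbol, Log2FC, AdjPValue).
for symbol, fc, p in [("Xirp2", 6.64, 1.33e-08), ("Ndufa5", -1.52, 6.24e-08)]:
    x, y, category = classify_point(fc, p)
    print(symbol, category, COLORS[category])
```

With these assumed defaults, a transcript with a very small fold change but a small adjusted p value still counts as changed; raising `fc_cutoff` would move such points into the gray group.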
Figure 4. Volcano custom plot example including user-defined searches
Transcriptomic data are visualized with significantly increased transcripts shown in light red and significantly decreased transcripts shown in light blue. Non-significant transcripts are shown in gray. User-specified transcripts or proteins can be searched in the right-hand box under “Custom Gene list” or uploaded in a user-defined file. The file-based import is useful when a large number of transcripts or proteins are to be searched and visualized in the plot. Matching transcripts or proteins are shown in dark red if they are significantly increased and in dark blue if they are significantly reduced within the dataset.
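The highlighting rule for a custom gene list can be sketched as follows. This is a hedged Python illustration rather than the OmicsVolcano code: the helper names, the accepted list format (one symbol per line or comma-separated), the case-insensitive matching, and the gray color for selected but non-significant entries are all assumptions, because the caption does not specify them.

```python
def parse_gene_list(text):
    """Split typed or file-based input into gene symbols.

    Assumption: symbols are separated by line breaks or commas.
    """
    symbols = []
    for line in text.splitlines():
        for item in line.split(","):
            item = item.strip()
            if item:
                symbols.append(item)
    return symbols


def highlight_color(symbol, category, selected):
    """Pick a display color for one point.

    category is "increased", "decreased", or "non-significant";
    selected is a set of lower-cased gene symbols from the custom list.
    """
    # Assumption: matching ignores letter case.
    is_selected = symbol.lower() in selected
    if category == "increased":
        return "dark red" if is_selected else "light red"
    if category == "decreased":
        return "dark blue" if is_selected else "light blue"
    # Assumption: non-significant points stay gray even when selected.
    return "gray"


selected = {s.lower() for s in parse_gene_list("Ndufs2\nGatc, Cox7a1\n")}
print(highlight_color("Ndufs2", "decreased", selected))
print(highlight_color("Xirp2", "increased", selected))
```

Storing the selection as a set keeps each lookup constant-time, which matters for the file-based import when many transcripts or proteins are searched at once.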
Figure 5. Volcano plot showing mitochondrial processes
Transcriptomic data are visualized and mitochondrial transcripts are highlighted in dark red, dark blue, and dark gray, if they are significantly increased, decreased, or unchanged, respectively. Specific processes can be selected from the dropdown menu and these will be highlighted in either dark red or dark blue depending on their change within the dataset.
Figure 6. Volcano plot showing specific selection and color coding of multiple mitochondrial processes
This dropdown menu allows the user to select specific processes, assign a custom color to each process, and visualize them on the plot, so that multiple processes can be displayed at the same time.
Figure 7. Volcano plot showing cellular localizations
This feature enables the user to explore changes in specific cellular locations by selecting cellular compartments from the dropdown menu. Entries in the input file that belong to the selected compartment (in this case the endoplasmic reticulum) are highlighted according to their change: increased in dark red and decreased in dark blue.
| REAGENT or RESOURCE | SOURCE | IDENTIFIER |
|---|---|---|
| RNA sequencing data | GEO | |
| Proteomics data | PRIDE | |
| R version 3.6.1 and 4.0.2 | R Core Team (2019). R: A language and environment for statistical computing | |
| shiny version 1.4.0 | | |
| shinydashboard version 0.7.1 | | |
| shinydashboardPlus version 0.7.1 | | |
| shinyWidgets version 0.5.0 | | |
| shinythemes version 1.1.2 | | |
| shinyjs version 2.0.0 | | |
| dplyr version 0.8.3 | | |
| plotly version 4.9.1 | | |
| ggplot2 version 3.2.1 | | |
| crosstalk version 1.0.0 | | |
| DT version 0.12 | | |
| svglite version 1.2.3 | | |
| stringr version 1.4.0 | | |
| config version 0.3 | | |
| colourpicker version 1.1.0 | | |
| gridExtra version 2.3 | | |
| OmicsVolcano | this manuscript | |

| Example |
|---|
| Ndufs2 |
| Gatc |
| Cox7a1 |
| lmnb1 |
| Ndufa8 |