Literature DB >> 33239692

VolcaNoseR is a web app for creating, exploring, labeling and sharing volcano plots.

Joachim Goedhart1, Martijn S Luijsterburg2.   

Abstract

Comparative genome- and proteome-wide screens yield large amounts of data. To efficiently present such datasets and to simplify the identification of hits, the results are often presented in a type of scatterplot known as a volcano plot, which shows a measure of effect size versus a measure of significance. The data points with the largest effect size and a statistical significance beyond a user-defined threshold are considered as hits. Such hits are usually annotated in the plot by a label with their name. Volcano plots can represent ten thousands of data points, of which typically only a handful is annotated. The information of data that is not annotated is hardly or not accessible. To simplify access to the data and enable its re-use, we have developed an open source and online web tool with R/Shiny. The web app is named VolcaNoseR and it can be used to create, explore, label and share volcano plots ( https://huygens.science.uva.nl/VolcaNoseR ). When the data is stored in an online data repository, the web app can retrieve that data together with user-defined settings to generate a customized, interactive volcano plot. Users can interact with the data, adjust the plot and share their modified plot together with the underlying data. Therefore, VolcaNoseR increases the transparency and re-use of large comparative genome- and proteome-wide datasets.

Entities:  

Year:  2020        PMID: 33239692      PMCID: PMC7689420          DOI: 10.1038/s41598-020-76603-3

Source DB:  PubMed          Journal:  Sci Rep        ISSN: 2045-2322            Impact factor:   4.379


Introduction

The volcano plot visualizes complex datasets generated by genomic screening or proteomic approaches. It is essentially a scatter plot, in which the coordinates of data points are defined by effect size and statistical significance[1,2]. Volcano plots typically show the data of hundreds to ten thousands of genes or proteins. Examples of such datasets are gene expression changes measured by RNA-seq[3], genome-wide loss-of-function CRISPR screens[4], or mapping the interactome of proteins-of-interest by mass spectrometry[5]. Although volcano plots are based on rich datasets, only a handful of data points are usually labeled with a gene or protein name. This enables the visual identification of hits and simplifies the interpretation of the complex dataset. Nevertheless, the data points that are not annotated may be of equal interest. Therefore, it is highly desirable to have easy access to the information of all data points from such large datasets. Volcano plots are typically generated using commercial software or with software that requires the user to write scripts. A viable alternative is provided by dedicated free web apps that allow users to generate plots through a graphical user interface (GUI). Several web apps are available[6,7], but these do not generate interactive plots and only have limited options for customization and annotation. Moreover, there is currently no easy and straightforward way of sharing the volcano plot together with the data. Therefore, we decided to generate a web-based online tool for generating and sharing volcano plots, similar to other plotting apps that we previously generated[8,9]. Here, we report an open source web app for generating, exploring, labeling and sharing volcano plots. The web app is created with R/Shiny and is dubbed VolcaNoseR. Below we discuss the features of the app.

Availability, code and issue reporting

The VolcaNoseR webtool is available at: https://huygens.science.uva.nl/VolcaNoseR or at (as long as the bandwidth limit is not reached): https://goedhart.shinyapps.io/VolcaNoseR/. The code was written using R (https://www.r-project.org) and Rstudio (https://www.rstudio.com). To run the app, several freely available packages are required: shiny, ggplot2, magrittr, dplyr, ggrepel, shinycssloaders, DT, RCurl and readxl. The code of version 1.0.3 reported in this manuscript is archived at Zenodo.org: https://doi.org/10.5281/zenodo.4002791. Up-to-date code and new releases will be made available on GitHub, together with information on running the app locally: https://github.com/JoachimGoedhart/VolcaNoseR. The GitHub page of VolcaNoseR is the preferred way to communicate issues and request features (https://github.com/JoachimGoedhart/VolcaNoseR/issues). Alternatively, the users can contact the developers by email or Twitter. Contact information is found on the “About” page of the app.

Data input and format

The data can be supplied via file upload. The accepted file formats are text (with extension CSV or TXT) and spreadsheets (with extension XLS or XLSX). Different delimiters are acceptable for the text format, including the Comma Separate Values (CSV) format. Upload of Excel workbooks with multiple sheets is also supported. Alternatively, a CSV file from an online data repository can be used through a URL. A limitation of the app on the Huygens server (https://huygens.science.uva.nl/VolcaNoseR) is the file size of ~ 1 Mb. Larger files are accepted when the app is run locally from R (up to 10 Mb) or from the shinyapps webserver (https://goedhart.shinyapps.io/VolcaNoseR/). To demonstrate the features of the app, example data is included of which the details can be found elsewhere[3,10]. After data upload, the user selects the columns that hold the information on the fold change (for the x-coordinate) and the significance (for the y-coordinate). Selecting a column with gene or protein names is optional.

Data visualization

A typical volcano plot shows the log2 of the fold change on the x-axis and minus log10 of the p-value on the y-axis. The data is shown as dots and their size and transparency can be adjusted. The position of the individual points is defined by these coordinates. By hovering over the data points, the information about the data can be accessed immediately and dynamically. When the pointer (mouse) is near a data point, the x- and y-coordinate and the name is retrieved, providing the user with easy access to the underlying data. In some cases, it may be desirable to display a 90 degrees rotated volcano plot. This option is available and will depict the fold change on the y-axis and the significance on the x-axis.

Thresholds and hits

The user can set threshold values for the fold change and the significance. The threshold values are indicated by dashed lines in the plot and used to classify the data as ‘unchanged’, ‘decreased’ or ‘increased’. The data are colored according to this classification and this can be shown in a legend. The ‘top hits’ can be automatically detected and ranked based on a number of criteria. The default criterion is the Manhattan distance (|ΔX| +|ΔY|) of the data from the origin (0,0). The other criteria are Euclidean distance (SQRT(ΔX2 + ΔY2)), absolute fold change or significance. The data are sorted based on the selected criterion and the 10 top-ranking data points are selected. The number of top ranking data points can be adjusted by the user. It is possible to annotate only ‘increased’ or ‘decreased’ or all significantly changed (‘increased’ and ‘decreased’) data points. The top-ranking hits are shown in the plot and there is an option to list them in a table. Finally, the user can manually search and select the names of genes or proteins of interest, which will be annotated in the plot and added to the table. The standard colors to indicate ‘unchanged’, ‘increased’ and ‘decreased’ are respectively grey, red and blue. Another color combination that is available is grey, blue and green. Users can also define their own color scheme.

Output

Users can customize the titles and sizes for the axes labels. The plot that is generated by the app can be directly retrieved by drag-and-drop from the web browser. In addition, the plot can be downloaded as a PNG or PDF file. The PNG is a lossless bitmap format. The PDF allows for downstream processing/editing with software that can handle vector-based graphics.

Sharing data and plot settings

All settings that are defined in the user interface can be stored as a URL, as was previously implemented for PlotsOfData and PlotTwist[8,9]. When the data is retrieved from an external online resource, this hyperlink is included in the URL. The URL with settings is sufficient to (1) launch the app, (2) retrieve the data, and (3) plot the data according to user-defined settings. Once the plot is available, it can be adjusted and a new URL reflecting the new settings can be obtained. This feature enables transparent reporting of all the data and simplifies re-use of the data (Fig. 1).
Figure 1

The standard output of the VolcaNoseR app for example dataset 1 (

taken from a published RNA-seq dataset). The annotated dots are the ten data points that have the largest (Manhattan) distance from the origin and are above the thresholds indicated by the dashed line. Direct access to an interactive plot and all the data is provided by this hyperlink.

The standard output of the VolcaNoseR app for example dataset 1 ( taken from a published RNA-seq dataset). The annotated dots are the ten data points that have the largest (Manhattan) distance from the origin and are above the thresholds indicated by the dashed line. Direct access to an interactive plot and all the data is provided by this hyperlink. We illustrate this feature with data from proteomic screens that we have recently published[5]. These data are deposited and publicly available at the data repository zenodo.org, https://doi.org/10.5281/zenodo.3713174. Volcano plots are generated with VolcaNoseR using the data from the CSV files in the repository. Next, the URL that encodes all necessary information was generated using the ‘clone current setting’ button. With this unique URL, the data is retrieved and a plot is generated by VolcaNoseR based on the parameters that are stored in the URL. For instance, this URL produces an interactive plot of which a static version is shown in Fig. 2A:
Figure 2

(A) Volcano plot based on proteomics data generated from a published paper[5]. The interactive plot and all of the data is accessible through this hyperlink. (B) Volcano plot from the same data as (A) but replotted to show both significantly increased and decreased hits with the Enrichment and Significance threshold set to 3 and with a different color scale. The plot is accessible through this hyperlink.

(A) Volcano plot based on proteomics data generated from a published paper[5]. The interactive plot and all of the data is accessible through this hyperlink. (B) Volcano plot from the same data as (A) but replotted to show both significantly increased and decreased hits with the Enrichment and Significance threshold set to 3 and with a different color scale. The plot is accessible through this hyperlink. https://huygens.science.uva.nl:/VolcaNoseR/?data=5;;Difference_CSB_GFP;p_value_CSB_GFP;Gene_names&vis=4;0.8;1.5;2;increased&can=10;;;&layout=;;;;;log2;minus_log10;X;600;800&label=TRUE;GFP-CSB-WTvsGFP-NLS;TRUE;Enrichment(log2);Significance(-log10p-value);;24;24;18;6;&url=https://zenodo.org/record/3713174/files/CSV_1-GFP-CSB-WT_vs_GFP-NLS.csv Users can easily access the data and plot through this URL, inspect the data and replot it. Suppose that a user is interested in both showing and annotating increased and decreased proteins with more stringent threshold levels, the user can replot the data as shown in Fig. 2B. The URL can be copied and shared. This URL would be: https://huygens.science.uva.nl/VolcaNoseR/?data=5;;Difference_CSB_GFP;p_value_CSB_GFP;Gene_names&vis=4;0.8;-3,3;3;significant;manh&can=20;;;&layout=;;;;;X;600;800&color=3;none&label=TRUE;GFP-CSB-WT%20vs%20GFP-NLS;TRUE;Enrichment%20(log2);Significance%20(-log10%20p-value);;24;24;18;6;&url=https://zenodo.org/record/3713174/files/CSV_1-GFP-CSB-WT_vs_GFP-NLS.csv A list of settings that can be stored in the URL is available in a supplemental document (Supplementary information S1 text).

Data re-use

To demonstrate the re-use of data, we examined the results of a recently published genome-wide CRISPR-based proliferation screen in a retinal pigment epithelial (RPE1) cell line[4]. First, we retrieved the data of the 2D proliferation screens in wildtype and TP53 knockout cell lines (shown in Fig. 1 of that paper). The data of each of the screens was converted to a CSV file and deposited at zenodo.org, https://doi.org/10.5281/zenodo.3843685. Next, we used the CSV file as input for VolcaNoseR and inspected the volcano plot (Fig. 3). Given our interest in G protein-coupled receptor signaling[11], we looked for components of this signaling module. The GNAS gene was among the significant hits in the 2D proliferation screens in both wildtype and TP53 knockout cells (Fig. 3A,B), suggesting that it has an antiproliferative role in RPE1 cells. This result is in line with recent work on GNAS in the context of sonic hedgehog signaling[12,13]. This finding nicely demonstrates that the re-use of data from genome-wide, CRISPR-based screens is an efficient way to generate or test hypotheses. Here, we show that the VolcaNoseR web tool can be used to mine current datasets and communicate new observations (Fig. 3), which can be easily shared through hyperlinks for re-use.
Figure 3

Volcano plots generated from a published genome-wide CRISPR screen dataset[4] show that GNAS has an antiproliferative role in RPE1 cell growth. The volcano plots were generated with VolcaNoseR using published data and setting the threshold for the Log2(Fold Change) and − Log10(q-value) to 2. Significant hits are depicted in red and reflect genes that have an antiproliferative function. The top five candidates and GNAS are labeled. (A) Results from a 2D proliferation assay on RPE1 cells, plot and data are accessible through this hyperlink. (B) Results from a 2D proliferation assay on RPE1 TP53−/− cells, plot and data are accessible through this hyperlink.

Volcano plots generated from a published genome-wide CRISPR screen dataset[4] show that GNAS has an antiproliferative role in RPE1 cell growth. The volcano plots were generated with VolcaNoseR using published data and setting the threshold for the Log2(Fold Change) and − Log10(q-value) to 2. Significant hits are depicted in red and reflect genes that have an antiproliferative function. The top five candidates and GNAS are labeled. (A) Results from a 2D proliferation assay on RPE1 cells, plot and data are accessible through this hyperlink. (B) Results from a 2D proliferation assay on RPE1 TP53−/− cells, plot and data are accessible through this hyperlink.

Conclusion

Volcano plots are data visualizations that can plot a large amount of information. Unfortunately, only a fraction of the data is labeled in static figures and, therefore, the vast majority of the information is inaccessible. To provide access to all of the data represented in a volcano plot, we developed an interactive online plotting tool. A unique feature of the web app that sets it apart from other software for making volcano plots is that VolcanoseR enables an easy and straightforward way of sharing the volcano plot together with the data. By hovering over the plot with a pointer, each data point can be inspected. In addition, user-defined candidates can be labeled in the plot and listed in a table. Together, these features enable access to all the information that the plot is based on. Finally, the web app can be used to share the data and the plot to allow other users to interact with the data and reuse it. Therefore, VolcaNoseR increases the transparency and re-use of large comparative genome- and proteome-wide datasets.
  12 in total

Review 1.  Physical biology of GPCR signalling dynamics inferred from fluorescence spectroscopy and imaging.

Authors:  Sergei Chavez-Abiega; Joachim Goedhart; Frank Johannes Bruggeman
Journal:  Curr Opin Struct Biol       Date:  2019-07-15       Impact factor: 6.809

Review 2.  Volcano plots in analyzing differential expressions with mRNA microarrays.

Authors:  Wentian Li
Journal:  J Bioinform Comput Biol       Date:  2012-10-15       Impact factor: 1.122

3.  G protein-coupled receptors control the sensitivity of cells to the morphogen Sonic Hedgehog.

Authors:  Ganesh V Pusapati; Jennifer H Kong; Bhaven B Patel; Mina Gouti; Andreas Sagner; Ria Sircar; Giovanni Luchetti; Philip W Ingham; James Briscoe; Rajat Rohatgi
Journal:  Sci Signal       Date:  2018-02-06       Impact factor: 8.192

Review 4.  Statistical tests for differential expression in cDNA microarray experiments.

Authors:  Xiangqin Cui; Gary A Churchill
Journal:  Genome Biol       Date:  2003-03-17       Impact factor: 13.583

5.  CRISPR Screens Uncover Genes that Regulate Target Cell Sensitivity to the Morphogen Sonic Hedgehog.

Authors:  Ganesh V Pusapati; Jennifer H Kong; Bhaven B Patel; Arunkumar Krishnan; Andreas Sagner; Maia Kinnebrew; James Briscoe; L Aravind; Rajat Rohatgi
Journal:  Dev Cell       Date:  2017-12-28       Impact factor: 12.270

6.  In vivo identification of GTPase interactors by mitochondrial relocalization and proximity biotinylation.

Authors:  Alison K Gillingham; Jessie Bertram; Farida Begum; Sean Munro
Journal:  Elife       Date:  2019-07-11       Impact factor: 8.140

7.  PlotTwist: A web app for plotting and annotating continuous data.

Authors:  Joachim Goedhart
Journal:  PLoS Biol       Date:  2020-01-13       Impact factor: 8.029

8.  msVolcano: A flexible web application for visualizing quantitative proteomics data.

Authors:  Sukhdeep Singh; Marco Y Hein; A Francis Stewart
Journal:  Proteomics       Date:  2016-09       Impact factor: 3.984

9.  Impaired LXRα Phosphorylation Attenuates Progression of Fatty Liver Disease.

Authors:  Natalia Becares; Matthew C Gage; Maud Voisin; Elina Shrestha; Lucia Martin-Gutierrez; Ning Liang; Rikah Louie; Benoit Pourcet; Oscar M Pello; Tu Vinh Luong; Saioa Goñi; Cesar Pichardo-Almarza; Hanne Røberg-Larsen; Vanessa Diaz-Zuccarini; Knut R Steffensen; Alastair O'Brien; Michael J Garabedian; Krista Rombouts; Eckardt Treuter; Inés Pineda-Torra
Journal:  Cell Rep       Date:  2019-01-22       Impact factor: 9.423

10.  Genome-wide Screens Implicate Loss of Cullin Ring Ligase 3 in Persistent Proliferation and Genome Instability in TP53-Deficient Cells.

Authors:  Alexandros P Drainas; Ruxandra A Lambuta; Irina Ivanova; Özdemirhan Serçin; Ioannis Sarropoulos; Mike L Smith; Theocharis Efthymiopoulos; Benjamin Raeder; Adrian M Stütz; Sebastian M Waszak; Balca R Mardin; Jan O Korbel
Journal:  Cell Rep       Date:  2020-04-07       Impact factor: 9.423

View more
  52 in total

1.  Structurally derived universal mechanism for the catalytic cycle of the tail-anchored targeting factor Get3.

Authors:  Michelle Y Fry; Vladimíra Najdrová; Ailiena O Maggiolo; Shyam M Saladi; Pavel Doležal; William M Clemons
Journal:  Nat Struct Mol Biol       Date:  2022-07-18       Impact factor: 18.361

2.  A Designer Nanoparticle Platform for Controlled Intracellular Delivery of Bioactive Macromolecules: Inhibition of Ubiquitin-Specific Protease 7 in Breast Cancer Cells.

Authors:  Wynton D McClary; Alexis Catala; Wei Zhang; Fabia Gamboni; Monika Dzieciatkowska; Sachdev S Sidhu; Angelo D'Alessandro; Carlos E Catalano
Journal:  ACS Chem Biol       Date:  2022-07-07       Impact factor: 4.634

3.  Pathogenesis and prognosis of primary oral squamous cell carcinoma based on microRNAs target genes: a systems biology approach.

Authors:  Amir Taherkhani; Shahab Shahmoradi Dehto; Shokoofeh Jamshidi; Setareh Shojaei
Journal:  Genomics Inform       Date:  2022-09-30

4.  Multi-omics reveals mechanisms of resistance to potato root infection by Spongospora subterranea.

Authors:  Sadegh Balotf; Richard Wilson; David S Nichols; Robert S Tegg; Calum R Wilson
Journal:  Sci Rep       Date:  2022-06-25       Impact factor: 4.996

5.  Proteomic characterisation of triple negative breast cancer cells following CDK4/6 inhibition.

Authors:  Melina Beykou; Mar Arias-Garcia; Theodoros I Roumeliotis; Jyoti S Choudhary; Nicolas Moser; Pantelis Georgiou; Chris Bakal
Journal:  Sci Data       Date:  2022-07-11       Impact factor: 8.501

6.  ESCPE-1 mediates retrograde endosomal sorting of the SARS-CoV-2 host factor Neuropilin-1.

Authors:  Boris Simonetti; James L Daly; Lorena Simón-Gracia; Katja Klein; Saroja Weeratunga; Carlos Antón-Plágaro; Allan Tobi; Lorna Hodgson; Philip A Lewis; Kate J Heesom; Deborah K Shoemark; Andrew D Davidson; Brett M Collins; Tambet Teesalu; Yohei Yamauchi; Peter J Cullen
Journal:  Proc Natl Acad Sci U S A       Date:  2022-06-13       Impact factor: 12.779

7.  A metabolic regulatory network for the Caenorhabditis elegans intestine.

Authors:  Sushila Bhattacharya; Brent B Horowitz; Jingyan Zhang; Xuhang Li; Hefei Zhang; Gabrielle E Giese; Amy D Holdorf; Albertha J M Walhout
Journal:  iScience       Date:  2022-06-30

8.  The onset of PI3K-related vascular malformations occurs during angiogenesis and is prevented by the AKT inhibitor miransertib.

Authors:  Piotr Kobialka; Helena Sabata; Sandra D Castillo; Mariona Graupera; Odena Vilalta; Leonor Gouveia; Ana Angulo-Urarte; Laia Muixí; Jasmina Zanoncello; Oscar Muñoz-Aznar; Nagore G Olaciregui; Lucia Fanlo; Anna Esteve-Codina; Cinzia Lavarino; Biola M Javierre; Veronica Celis; Carlota Rovira; Susana López-Fernández; Eulàlia Baselga; Jaume Mora
Journal:  EMBO Mol Med       Date:  2022-06-13       Impact factor: 14.260

9.  SuperPlotsOfData-a web app for the transparent display and quantitative comparison of continuous data from different conditions.

Authors:  Joachim Goedhart
Journal:  Mol Biol Cell       Date:  2021-01-21       Impact factor: 4.138

10.  Garlic-Derived Organosulfur Compounds Regulate Metabolic and Immune Pathways in Macrophages and Attenuate Intestinal Inflammation in Mice.

Authors:  Ling Zhu; Laura J Myhill; Audrey I S Andersen-Civil; Stig M Thamsborg; Alexandra Blanchard; Andrew R Williams
Journal:  Mol Nutr Food Res       Date:  2022-03-07       Impact factor: 6.575

View more

北京卡尤迪生物科技股份有限公司 © 2022-2023.