
msVolcano: A flexible web application for visualizing quantitative proteomics data.

Sukhdeep Singh, Marco Y Hein, A Francis Stewart.

Abstract

We introduce msVolcano, a web application for the visualization of label-free mass spectrometric data. It is optimized for the output of the MaxQuant data analysis pipeline of interactomics experiments and generates volcano plots with lists of interacting proteins. The user can optimize the cutoff values to find meaningful significant interactors for the tagged protein of interest. Optionally, stoichiometries of interacting proteins can be calculated. Several customization options are provided to the user for flexibility, and publication-quality outputs (tabular and graphical) can be downloaded. AVAILABILITY: msVolcano is implemented in the R statistical language using Shiny. It can be accessed freely at http://projects.biotec.tu-dresden.de/msVolcano/.
© 2016 The Authors. Proteomics Published by Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim.

Keywords:  Bioinformatics; Data visualization; Interaction proteomics; Label free quantification; Web Application

Year:  2016        PMID: 27440201      PMCID: PMC5096246          DOI: 10.1002/pmic.201600167

Source DB:  PubMed          Journal:  Proteomics        ISSN: 1615-9853            Impact factor:   3.984


Abbreviations: AE, affinity enrichment; AP, affinity purification; LFQ, label‐free quantification; QC, quality control; QQ, quantile‐quantile
The analysis of protein–protein interactions and complex networks using affinity purification or affinity enrichment coupled to mass spectrometry (AP/MS, AE/MS) is a commonly used technique in proteomics. The technology produces high‐quality protein interaction data [1] and is scalable to proteome‐wide levels [2]. Even though isotope‐labeling methods have been developed to detect and quantify protein–protein interactions [3], label‐free approaches are gaining momentum due to their simplicity and applicability [4]. While different quantification strategies exist for label‐free data, such as those based on spectral counting, methods that make use of peptide intensities (also known as extracted ion currents) are regarded as the most accurate [5, 6]. Such methods generate quantitative profiles of peptides or proteins across samples, which can be analyzed by established statistical methods, e.g. by a modified t‐test across replicate experiments [7].
MaxQuant is an integrated suite of algorithms for the analysis of high‐resolution quantitative MS data [8]. Its MaxLFQ module normalizes the contribution of individual peptide fractions and extracts the maximum available quantitative information to calculate highly reliable relative label‐free quantification (LFQ) intensity profiles [6], which are exported as tab‐delimited text files for downstream analysis. Though various post‐processing tools to analyze the output of MaxQuant exist [9, 10], Perseus [11] is the most widely used.
To identify interactors of a tagged protein of interest (termed the “bait”) in the presence of a vast number of background binding proteins, replicates of affinity‐enriched bait samples are compared to a set of negative control samples. Although our primary application is protein interaction experiments, the workflow is generalizable to any case‐control scenario.
A Student's t‐test or Welch's test can be used to determine which proteins are significantly enriched along with the specific bait. A volcano plot is a good way to visualize this kind of analysis [12]. When the negative logarithmic p values derived from the statistical test are plotted against the differences between the logarithmized mean protein intensities of the bait and control samples, unspecific background binders center around zero. The enriched interactors appear in the right section of the plot, whereas ideally no protein should appear in the left section when compared to an empty control (because these would represent proteins depleted by the bait). The larger the difference between the group means (i.e. the enrichment) and the higher the negative logarithmic p value (i.e. the reproducibility), the more the interactors shift towards the upper right section of the plot, which represents the area of highest confidence for an interaction.
In any quantitative workflow, determining a threshold is a crucial step. This threshold separates statistically significant outliers, which are most likely to represent biological findings, from background proteins, which inevitably occur in any measurement. Threshold placement can be performed empirically, or automatically based on desired false discovery rates, and often benefits from some manual optimization.
Downstream analysis of proteomics data can be challenging for non‐specialized users and a burden for mass spectrometry core facilities. To facilitate the analysis and presentation of AE‐MS data, we present msVolcano, a user‐modulated, freely accessible web application. It requires the MaxQuant output of an interaction dataset that was analysed using the MaxLFQ module. LFQ intensity profiles retain the absolute scale of the original summed‐up peptide intensities [6], serving as a proxy for absolute protein abundance.
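As a minimal sketch of how a single point of such a volcano plot can be computed, the following Python snippet log2‐transforms the intensities of one protein, applies Welch's test and returns the difference of the group means together with the negative log10 p value. It is an illustration only, not the msVolcano implementation (which is written in R). The function names are hypothetical, the p value is obtained by simple numerical integration of the t density (so it is only approximate for very small p values), and zero or missing intensities are assumed to have been handled beforehand.

```python
import math
from statistics import mean, variance

def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(t * t / df))

def two_sided_p(t, df, steps=2000):
    """Two-sided p value via Simpson's rule on [0, |t|] (steps must be even)."""
    a = abs(t)
    if a == 0:
        return 1.0
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_mass = total * h / 3          # P(0 < T < |t|)
    return max(1e-300, 1.0 - 2.0 * central_mass)

def volcano_point(bait, control):
    """Return (difference of log2 means, -log10 p) for one protein."""
    b = [math.log2(v) for v in bait]
    c = [math.log2(v) for v in control]
    diff = mean(b) - mean(c)
    vb, vc = variance(b) / len(b), variance(c) / len(c)
    t = diff / math.sqrt(vb + vc)         # Welch's t statistic
    # Welch-Satterthwaite approximation of the degrees of freedom
    df = (vb + vc) ** 2 / (vb ** 2 / (len(b) - 1) + vc ** 2 / (len(c) - 1))
    return diff, -math.log10(two_sided_p(t, df))

# Made-up intensities for one protein in three bait and three control replicates
x, y = volcano_point([1024, 2048, 4096], [32, 64, 128])
# x is 5.0 (log2 units); y is about 2.44
```

A protein with a large positive x and a large y would land in the upper right section of the plot.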
The purpose of msVolcano is to implement all steps of downstream data analysis in a simple and intuitive user interface that requires no bioinformatics knowledge or specialized software. To this end, msVolcano automatically extracts the relevant data columns and filters out hits to the decoy database and potential contaminants. A visual quality control (QC) output is generated, allowing the user to monitor the correlation between replicates, the fraction of missing values and the behavior of the population of imputed values, as shown in Fig. 2. Quantile‐Quantile (QQ) plots are also provided.
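The column extraction and filtering step can be sketched as follows. This is not the msVolcano code. The column names used here ("Protein IDs", "Reverse", "Potential contaminant" and the "LFQ intensity " prefix) follow the usual layout of a MaxQuant protein table but should be treated as assumptions, since they can differ between MaxQuant versions. The protein identifiers and intensities in the example are invented.

```python
import csv
import io

def read_filtered_lfq(handle):
    """Return (LFQ column names, kept rows) from a tab-delimited protein table."""
    reader = csv.DictReader(handle, delimiter="\t")
    lfq_columns = [name for name in reader.fieldnames
                   if name.startswith("LFQ intensity ")]
    kept = []
    for row in reader:
        # Assumed convention: a "+" flags decoy (reverse) hits and contaminants.
        if row.get("Reverse") == "+" or row.get("Potential contaminant") == "+":
            continue
        kept.append({
            "Protein IDs": row["Protein IDs"],
            **{name: float(row[name]) for name in lfq_columns},
        })
    return lfq_columns, kept

# Invented three-row table: one regular hit, one decoy hit, one contaminant
example = (
    "Protein IDs\tLFQ intensity bait_1\tLFQ intensity ctrl_1\tReverse\tPotential contaminant\n"
    "P00001\t2500000\t0\t\t\n"
    "REV__P00002\t130000\t90000\t+\t\n"
    "CON__P00003\t880000\t910000\t\t+\n"
)
lfq_columns, kept = read_filtered_lfq(io.StringIO(example))
```

Only the first row survives the filter. Its zero control intensity is the kind of missing value that the imputation step would then have to address.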
Figure 2

QC plot using a dataset from a budding yeast study (sample data in msVolcano) [14]. (A) Top row: distribution of the raw values (LFQ intensities, in blue) overlaid with the distribution of imputed values (in red) for each selected LFQ column. For contrast, comparisons are made between unrelated sample replicates; the mismatch is immediately apparent in these plots, which also helps the user to catch possible errors or sample mix‐ups. (B) Pairwise (2×2) scatter plots between the chosen LFQ columns, with a local regression (lowess) fit displayed as a red line, together with the Pearson correlation coefficient. For visual clarity, the number of scatter plots is restricted to the number of histograms displayed above them.

A user‐defined statistical test is then performed between the selected bait and control samples, and the tool generates a volcano plot as shown in Fig. 1. We implemented a recently introduced hyperbolic curve threshold [14] (dotted double lines in the volcano plot in Fig. 1), which is defined by two parameters, the curvature c and the minimum fold change x0, and divides enriched proteins into mildly and strongly enriched [14]. The cutoff parameters can be adjusted by the user and monitored in the graphical output. The user has access to the plot aesthetics and can view the original input file and its subset of significant interactors in the inbuilt browser. A publication‐quality PDF plot can be generated and exported along with the subset of the original data limited to the significant interactors.
In addition to the identities of interacting proteins, their stoichiometries relative to the bait are crucial for understanding the molecular function of protein complexes [2, 15]. Thus, optional stoichiometry calculations have been implemented. We use a modified version of intensity‐based absolute quantification (iBAQ) [16] to estimate protein abundance for stoichiometry calculations, in which LFQ intensities are normalized by the number of theoretical tryptic peptides between seven and 30 amino acids in length, as described [2] (Fig. 1b). It has been shown that the number of theoretical peptides is a good and easily calculated proxy to control for the size and sequence properties of each protein that affect how much signal it can generate in the mass spectrometer. Theoretical peptides are pre‐calculated for the most commonly used proteomes of model organisms and are matched based on the proteins’ UniProt IDs. The stoichiometry (st) is calculated from the size‐normalized protein intensities (sIp).
msVolcano provides a web platform for the quick visualization of label‐free mass spectrometric data and can be freely accessed globally.
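The formula for the hyperbolic threshold is not reproduced in this text. The sketch below therefore assumes a form commonly used for such cutoffs, y > c / (x − x0) for x > x0, where x is the difference of the logarithmized group means and y is the negative log10 p value. This form, the function names and the two example parameter pairs are assumptions made for illustration and are not taken from msVolcano.

```python
def passes_threshold(x, y, c, x0):
    """True if the point (x, y) lies above the assumed hyperbola y = c / (x - x0).

    Only the enriched (right-hand) side of the volcano plot is considered.
    """
    if x <= x0:
        return False
    return y > c / (x - x0)

def classify(x, y, mild=(1.0, 1.0), strong=(2.0, 2.0)):
    """Label a protein using two hypothetical (c, x0) parameter pairs."""
    if passes_threshold(x, y, *strong):
        return "strong"
    if passes_threshold(x, y, *mild):
        return "mild"
    return "background"

labels = [classify(5.0, 3.0), classify(2.5, 1.0), classify(0.2, 5.0)]
# labels == ["strong", "mild", "background"]
```

Raising c bends the curve away from the axes and raising x0 shifts it to the right, so both changes make the cutoff stricter.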
With the underlying hyperbolic curve parameters and other statistics, the user can intuitively separate the true protein interaction partners from the false positives without the need to write code. With its ftp file input support, the user can quickly analyze and re‐analyse the results of interactomics experiments stored on their own cloud servers. Along with the optionally calculated stoichiometries, all results can be exported in publication‐quality tabular or graphical format.
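The size normalization behind the stoichiometry estimate can be illustrated with a short sketch. It makes several simplifying assumptions that are not stated in the text: trypsin is taken to cleave after every K or R, missed cleavages are ignored, and the stoichiometry is taken as the ratio of the size‐normalized intensity of a protein to that of the bait. The function names, the toy sequence and the intensities are invented.

```python
def digest(sequence):
    """In-silico digest: cleave after every K or R (simplified rule)."""
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        if residue in "KR":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])   # C-terminal peptide
    return peptides

def count_theoretical_peptides(sequence, min_len=7, max_len=30):
    """Number of theoretical peptides with a length of 7 to 30 amino acids."""
    return sum(min_len <= len(p) <= max_len for p in digest(sequence))

def stoichiometry(lfq_protein, n_protein, lfq_bait, n_bait):
    """Assumed definition: ratio of size-normalized intensities to the bait."""
    return (lfq_protein / n_protein) / (lfq_bait / n_bait)

# Toy sequence yielding peptides of length 7, 3, 32 and 8
toy = "AAAAAAK" + "CCR" + "D" * 31 + "K" + "EEEEEEEE"
n_toy = count_theoretical_peptides(toy)        # 2 peptides are within 7-30
ratio = stoichiometry(300.0, 6, 1000.0, 10)    # 50 / 100 = 0.5
```

Under these assumptions, a ratio of 0.5 would suggest roughly one copy of the interactor for every two copies of the bait.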
Figure 1

The interface is divided into three sections: sidebar, body panel and column selection panel (left to right). The sidebar provides access to file upload, plot aesthetics, cutoff parameters, missing data imputation [13], stoichiometry and export options. The body panel has six tabs; the default tab, labeled “MS Volcano Plot”, displays the volcano plot. The second tab, “Subsets”, displays the input data filtered for the significant interactors. The “Data Preview” tab displays the user‐defined data for scrutiny. The “Quality Control” tab displays a series of overlaid histograms, scatter plots and QQ plots that help in assessing the correlation between replicates, the distribution of missing values and the behavior of the imputed value population. The “GettingStarted” and “About” tabs display specific and general information about the web interface. When the user uploads a file or enters an ftp link, all LFQ columns are scanned and displayed in the column selection panel on the right side. The user then selects the respective bait and control columns (minimum two) and optionally enters the name of the bait in the provided text box. When the “Update Plot” button is pressed, the plot is generated.

The authors have declared no conflict of interest.
  16 in total

1.  A human interactome in three quantitative dimensions organized by stoichiometries and abundances.

Authors:  Marco Y Hein; Nina C Hubner; Ina Poser; Jürgen Cox; Nagarjuna Nagaraj; Yusuke Toyoda; Igor A Gak; Ina Weisswange; Jörg Mansfeld; Frank Buchholz; Anthony A Hyman; Matthias Mann
Journal:  Cell       Date:  2015-10-22       Impact factor: 41.582

2.  MaxQuant enables high peptide identification rates, individualized p.p.b.-range mass accuracies and proteome-wide protein quantification.

Authors:  Jürgen Cox; Matthias Mann
Journal:  Nat Biotechnol       Date:  2008-11-30       Impact factor: 54.908

3.  Global quantification of mammalian gene expression control.

Authors:  Björn Schwanhäusser; Dorothea Busse; Na Li; Gunnar Dittmar; Johannes Schuchhardt; Jana Wolf; Wei Chen; Matthias Selbach
Journal:  Nature       Date:  2011-05-19       Impact factor: 49.962

4.  A map of general and specialized chromatin readers in mouse tissues generated by label-free interaction proteomics.

Authors:  H Christian Eberl; Cornelia G Spruijt; Christian D Kelstrup; Michiel Vermeulen; Matthias Mann
Journal:  Mol Cell       Date:  2012-11-29       Impact factor: 17.970

Review 5.  Label-free quantitative proteomics trends for protein-protein interactions.

Authors:  Stephen Tate; Brett Larsen; Ron Bonner; Anne-Claude Gingras
Journal:  J Proteomics       Date:  2012-11-12       Impact factor: 4.044

6.  MSstats: an R package for statistical analysis of quantitative mass spectrometry-based proteomic experiments.

Authors:  Meena Choi; Ching-Yun Chang; Timothy Clough; Daniel Broudy; Trevor Killeen; Brendan MacLean; Olga Vitek
Journal:  Bioinformatics       Date:  2014-05-02       Impact factor: 6.937

7.  Quantitative proteomics combined with BAC TransgeneOmics reveals in vivo protein interactions.

Authors:  Nina C Hubner; Alexander W Bird; Jürgen Cox; Bianca Splettstoesser; Peter Bandilla; Ina Poser; Anthony Hyman; Matthias Mann
Journal:  J Cell Biol       Date:  2010-05-17       Impact factor: 10.539

8.  Accurate protein complex retrieval by affinity enrichment mass spectrometry (AE-MS) rather than affinity purification mass spectrometry (AP-MS).

Authors:  Eva C Keilhauer; Marco Y Hein; Matthias Mann
Journal:  Mol Cell Proteomics       Date:  2014-11-02       Impact factor: 5.911

9.  DAPAR & ProStaR: software to perform statistical analyses in quantitative discovery proteomics.

Authors:  Samuel Wieczorek; Florence Combes; Cosmin Lazar; Quentin Giai Gianetto; Laurent Gatto; Alexia Dorffer; Anne-Marie Hesse; Yohann Couté; Myriam Ferro; Christophe Bruley; Thomas Burger
Journal:  Bioinformatics       Date:  2016-09-06       Impact factor: 6.937

10.  Accurate proteome-wide label-free quantification by delayed normalization and maximal peptide ratio extraction, termed MaxLFQ.

Authors:  Jürgen Cox; Marco Y Hein; Christian A Luber; Igor Paron; Nagarjuna Nagaraj; Matthias Mann
Journal:  Mol Cell Proteomics       Date:  2014-06-17       Impact factor: 5.911
