Meena Choi, Jeremy Carver, Cristina Chiva, Manuel Tzouros, Ting Huang, Tsung-Heng Tsai, Benjamin Pullman, Oliver M Bernhardt, Ruth Hüttenhain, Guo Ci Teo, Yasset Perez-Riverol, Jan Muntel, Maik Müller, Sandra Goetze, Maria Pavlou, Erik Verschueren, Bernd Wollscheid, Alexey I Nesvizhskii, Lukas Reiter, Tom Dunkley, Eduard Sabidó, Nuno Bandeira, Olga Vitek.
Abstract
MassIVE.quant is a repository infrastructure and data resource for reproducible quantitative mass spectrometry-based proteomics, which is compatible with all mass spectrometry data acquisition types and computational analysis tools. A branch structure enables MassIVE.quant to systematically store raw experimental data, metadata of the experimental design, scripts of the quantitative analysis workflow, intermediate input and output files, as well as alternative reanalyses of the same dataset.
Year: 2020 PMID: 32929271 PMCID: PMC7541731 DOI: 10.1038/s41592-020-0955-0
Source DB: PubMed Journal: Nat Methods ISSN: 1548-7091 Impact factor: 28.547
Figure 1: Outline of the MassIVE.quant repository structure, and reanalysis of three DDA-based experiments.
Each step can be performed with multiple algorithms and software tools, generating tool-specific files in diverse formats. For the experiments in the figure, MassIVE.quant stores the intermediate outputs from combinations of algorithms and tools for peptide ion identification and quantification. For example, DDA:Choi2017 was processed with eight combinations of parameter settings in Skyline. Each reanalysis is saved with a unique reanalysis ID, prefixed by RMSV, under the experiment repository prefixed by MSV in MassIVE.quant.
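The experiment/reanalysis relationship described in this caption can be pictured as a small data model. The following is only an illustrative sketch, not MassIVE.quant's actual schema or API: the class names, field names and the accession numbers used below are hypothetical placeholders, and the only detail taken from the caption is that experiment IDs are prefixed by MSV and reanalysis IDs by RMSV.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Reanalysis:
    """One stored reanalysis (hypothetical model, not the MassIVE.quant schema)."""
    reanalysis_id: str                 # must carry the RMSV prefix
    tool: str                          # e.g. "Skyline"
    files: List[str] = field(default_factory=list)  # intermediate outputs

    def __post_init__(self) -> None:
        if not self.reanalysis_id.startswith("RMSV"):
            raise ValueError("reanalysis IDs are prefixed by RMSV")


@dataclass
class Experiment:
    """One experiment repository holding any number of reanalyses."""
    experiment_id: str                 # must carry the MSV prefix
    label: str
    reanalyses: Dict[str, Reanalysis] = field(default_factory=dict)

    def __post_init__(self) -> None:
        if not self.experiment_id.startswith("MSV"):
            raise ValueError("experiment IDs are prefixed by MSV")

    def add(self, reanalysis: Reanalysis) -> None:
        # Each reanalysis ID must be unique within the experiment.
        if reanalysis.reanalysis_id in self.reanalyses:
            raise ValueError("duplicate reanalysis ID")
        self.reanalyses[reanalysis.reanalysis_id] = reanalysis


# Placeholder accession numbers, invented for illustration only.
experiment = Experiment("MSV000000001", "DDA:Choi2017")
experiment.add(Reanalysis("RMSV000000001", "Skyline", ["settings_1.csv"]))
experiment.add(Reanalysis("RMSV000000002", "Skyline", ["settings_2.csv"]))
print(sorted(experiment.reanalyses))
```

Keying the reanalyses by their ID keeps alternative analyses of the same raw data side by side under one experiment, which is the property the caption emphasizes.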
Figure 2: Reanalyses of DIA:Selevsek2015, profiling changes in proteome abundance of S. cerevisiae in response to osmotic stress over six time points: T0 (0 min), T1 (15 min), T2 (30 min), T3 (60 min), T4 (90 min) and T5 (120 min); n = 3 biologically independent samples per time point (RMSV000000251).
(a)–(d), Discrepancies in quantification of protein YKL096W across data processing tools. Gray lines, fragments reported by each tool; red lines, protein quantification summarized by MSstats. (a) Skyline:lowCV used Skyline to quantify a subset of the fragments with low coefficient of variation; (b) Skyline:all used Skyline to quantify all detectable peptides, with a maximum of six fragments each; (c) data processed by Spectronaut; (d) data processed by DIA-Umpire. (e)–(h), Discrepancies in detecting differential abundance for protein YKL096W across data processing tools, with statistical analysis by MSstats: Skyline:lowCV (e), Skyline:all (f), Spectronaut (g) and DIA-Umpire (h). Dark red dots (centers of the error bars), model-based estimates of log2(fold change) of protein abundance, as determined by MSstats. Error bars, 95% confidence intervals for the log2(fold change), as determined by MSstats. *Adjusted P < 0.05. (i)–(l), Volcano plots summarizing differential abundance between T5 and T0: Skyline:lowCV (i), Skyline:all (j), Spectronaut (k) and DIA-Umpire (l). Dashed line, FDR = 0.05; blue dots, significantly down-regulated proteins; red dots, significantly up-regulated proteins (counts are shown at the top left corner; other time points are shown in Supplementary Fig. 3). (m), Number of differentially abundant proteins across all time points and all tools, FDR = 0.05. (n), Venn diagram of differentially abundant proteins between the two processing approaches by Skyline, comparing T5 versus T0. (o), Venn diagram of differentially abundant proteins across all tools, comparing T5 versus T0 (other time points are shown in Supplementary Fig. 4).
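The counts in the volcano plots and the overlaps in the Venn diagrams rest on two simple operations: thresholding adjusted P values at FDR = 0.05, and intersecting the resulting protein sets across tools. The sketch below illustrates both in plain Python. It assumes Benjamini-Hochberg adjustment, which the caption does not specify; the function names, the protein labels and all numeric values are invented toy data, not results from the study, and the code does not reproduce MSstats itself.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted P values, in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value to the smallest so that the adjusted
    # values stay monotone in the rank.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def classify(results, fdr=0.05):
    """Split proteins into up- and down-regulated sets.

    results maps protein -> (log2 fold change, raw P value).
    """
    proteins = list(results)
    adjusted = benjamini_hochberg([results[p][1] for p in proteins])
    up = {p for p, q in zip(proteins, adjusted) if q < fdr and results[p][0] > 0}
    down = {p for p, q in zip(proteins, adjusted) if q < fdr and results[p][0] < 0}
    return up, down


# Toy T5-versus-T0 results from two hypothetical processing tools.
tool_a = {"protA": (2.0, 0.001), "protB": (-1.5, 0.004),
          "protC": (0.2, 0.6), "protD": (1.1, 0.03)}
tool_b = {"protA": (1.8, 0.002), "protB": (-0.4, 0.3),
          "protC": (0.1, 0.8), "protD": (0.9, 0.04)}

up_a, down_a = classify(tool_a)
up_b, down_b = classify(tool_b)

# Venn-style comparison of the differentially abundant proteins.
sig_a, sig_b = up_a | down_a, up_b | down_b
shared = sig_a & sig_b
only_a = sig_a - sig_b
only_b = sig_b - sig_a
print(len(up_a), len(down_a), len(up_b), len(down_b))
print(sorted(shared), sorted(only_a), sorted(only_b))
```

In this toy example the two tools agree on only one protein even though the underlying fold changes are similar, which is the kind of discrepancy panels (m) to (o) summarize for the real data.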