Peter Blattmann, Moritz Heusel, Ruedi Aebersold.
Abstract
SWATH-MS is an acquisition and analysis technique of targeted proteomics that enables measuring several thousand proteins with high reproducibility and accuracy across many samples. OpenSWATH is popular open-source software for peptide identification and quantification from SWATH-MS data. Several tools exist for downstream statistical and quantitative analysis, such as MSstats, mapDIA and aLFQ. However, transferring data from OpenSWATH to these downstream tools is currently technically challenging. Here we introduce the R/Bioconductor package SWATH2stats, which allows convenient processing of the data into a format directly readable by the downstream analysis tools. In addition, SWATH2stats supports annotation, analysis of the variation and reproducibility of the measurements, FDR estimation, and advanced filtering before the processed data are submitted to downstream tools. These functionalities are important for quickly assessing the quality of SWATH-MS data. Hence, SWATH2stats is a new open-source tool that combines several practical functionalities for analyzing, processing, and converting SWATH-MS data and thus facilitates the efficient analysis of large-scale SWATH/DIA datasets.
Year: 2016 PMID: 27054327 PMCID: PMC4824525 DOI: 10.1371/journal.pone.0153160
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Fig 1. Overview of the R/Bioconductor package SWATH2stats.
(A) SWATH2stats uses the OpenSWATH results or results from similar software. The information on the experimental design for annotation of the conditions and replicates can be provided separately or extracted from the OpenSWATH data. The data are processed in five steps (annotation, analyzing the data, false discovery rate (FDR) estimation, filtering, format conversion) using different functions (Table 1) until the data can be directly exported in a suitable format for the downstream analysis tools aLFQ, MSstats, and mapDIA. (B) Example plots from the package showing the correlation of signals between injections and the coefficient of variation (CV) across conditions. (C) Example plots showing how the estimated global FDR or the FDR by run changes depending on different score criteria.
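The dependence of the estimated FDR on the score cutoff described for panel (C) can be illustrated with a minimal target-decoy sketch. This is not the SWATH2stats implementation. It assumes that the FDR at a given m-score cutoff is estimated as the number of decoys passing the cutoff divided by the number of targets passing it, optionally scaled by a factor. The names `estimate_fdr`, `loosest_cutoff_for_fdr`, and `false_target_fraction` are hypothetical, and the records are toy data.

```python
from typing import List, Optional, Tuple

# Each record is (m_score, is_decoy). Toy values for illustration only.
Record = Tuple[float, bool]


def estimate_fdr(records: List[Record], cutoff: float,
                 false_target_fraction: float = 1.0) -> float:
    """Estimate the FDR among targets with m_score <= cutoff.

    Assumption: one passing decoy stands for `false_target_fraction`
    false targets (1.0 is the most conservative choice).
    """
    targets = sum(1 for score, is_decoy in records
                  if score <= cutoff and not is_decoy)
    decoys = sum(1 for score, is_decoy in records
                 if score <= cutoff and is_decoy)
    if targets == 0:
        return 0.0
    return false_target_fraction * decoys / targets


def loosest_cutoff_for_fdr(records: List[Record],
                           candidate_cutoffs: List[float],
                           target_fdr: float) -> Optional[float]:
    """Return the largest candidate cutoff whose estimated FDR stays
    at or below target_fdr, or None if no candidate qualifies."""
    passing = [c for c in candidate_cutoffs
               if estimate_fdr(records, c) <= target_fdr]
    return max(passing) if passing else None


records = [
    (0.0001, False), (0.0005, False), (0.001, False), (0.002, False),
    (0.003, False), (0.006, True), (0.008, False), (0.009, False),
    (0.02, True), (0.03, False), (0.05, True),
]
cutoffs = [0.001, 0.005, 0.01, 0.05]

# Estimated FDR at each candidate cutoff, and the loosest cutoff
# that keeps the estimate at or below 20 %.
fdr_by_cutoff = {c: estimate_fdr(records, c) for c in cutoffs}
best = loosest_cutoff_for_fdr(records, cutoffs, target_fdr=0.2)
```

In this toy data set, loosening the cutoff admits more decoys relative to targets, so the estimated FDR rises. That is the kind of trade-off the plots in panel (C) are meant to expose.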
Table 1. Functions included in the SWATH2stats package.
| Function name | Short description of function |
|---|---|
| sample_annotation | Annotates the samples with the study design (replicates, conditions) |
| reduce_OpenSWATH_output | Reduces the data table to fewer columns |
| transform_MSstats_OpenSWATH | Converts the data from an MSstats-like format to the OpenSWATH format |
| import_data | Transforms the column names of a data frame to the format required by SWATH2stats |
| count_analytes | Counts the analytes across the different samples |
| plot_correlation_between_samples | Plots the correlation between samples |
| plot_variation | Plots the coefficient of variation across samples |
| plot_variation_vs_total | Plots the coefficient of variation in the whole data versus within replicates |
| write_matrix_peptides | Calculates the summed signal per peptide in the different samples |
| write_matrix_proteins | Calculates the summed signal per protein in the different samples |
| assess_decoy_rate | Counts the number of decoy assays in the data |
| assess_fdr_byrun | Estimates the FDR using a target-decoy approach within each run |
| assess_fdr_overall | Estimates the FDR using a target-decoy approach across all runs |
| mscore4assayfdr | Calculates an m-score threshold for reaching a certain assay FDR |
| mscore4pepfdr | Calculates an m-score threshold for reaching a certain peptide FDR |
| mscore4protfdr | Calculates an m-score threshold for reaching a certain protein FDR |
| plot.fdr_cube | Generates plots from fdr_cube objects (FDR estimates by run) |
| plot.fdr_table | Generates plots from fdr_table objects (FDR estimates overall) |
| filter_proteotypic_peptides | Selects only data from peptides that are proteotypic |
| filter_mscore | Selects only data quantified with a certain m-score threshold |
| filter_mscore_condition | Selects only data quantified with a certain m-score threshold and quantified a certain number of times within a condition |
| filter_mscore_freqobs | Selects only data quantified with a given m-score threshold and a certain frequency of observation across the different samples |
| filter_on_max_peptides | Selects only a given number of the most intense peptides per protein |
| filter_on_min_peptides | Selects only proteins that have a minimal number of peptides quantified |
| disaggregate | Transforms the data into transition-level format |
| convert4pythonscript | Converts the data into the format used by a supplied Python script that transforms large data into transition-level format |
| convert4aLFQ | Converts the data into the format for the R package aLFQ |
| convert4mapDIA | Converts the data into the format for the C++ software mapDIA |
| convert4MSstats | Converts the data into the format for the R package MSstats |
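To make the idea behind filtering by m-score and frequency of observation more concrete, the following is a minimal sketch in Python. It does not reproduce the package's R code or its exact semantics. It assumes that a measurement is kept only if it is not a decoy, passes the m-score cutoff, and belongs to a peptide that passes the cutoff in at least a given fraction of all runs. The function name `filter_by_score_and_frequency`, the row layout, and the values are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Set


def filter_by_score_and_frequency(rows: List[dict], cutoff: float,
                                  min_fraction: float) -> List[dict]:
    """Keep target rows with m_score <= cutoff whose peptide passes
    the cutoff in at least min_fraction of all runs in the data."""
    all_runs = {row["run"] for row in rows}

    # Step 1: drop decoys and rows that fail the m-score cutoff.
    passing = [row for row in rows
               if not row["decoy"] and row["m_score"] <= cutoff]

    # Step 2: count in how many distinct runs each peptide passes.
    runs_per_peptide: Dict[str, Set[str]] = defaultdict(set)
    for row in passing:
        runs_per_peptide[row["peptide"]].add(row["run"])

    # Step 3: keep only peptides observed often enough.
    needed = min_fraction * len(all_runs)
    return [row for row in passing
            if len(runs_per_peptide[row["peptide"]]) >= needed]


# Toy data: peptide A passes in all three runs, peptide B in only one.
rows = [
    {"peptide": "A", "run": "run1", "m_score": 0.001, "decoy": False},
    {"peptide": "A", "run": "run2", "m_score": 0.002, "decoy": False},
    {"peptide": "A", "run": "run3", "m_score": 0.005, "decoy": False},
    {"peptide": "B", "run": "run1", "m_score": 0.001, "decoy": False},
    {"peptide": "B", "run": "run2", "m_score": 0.2, "decoy": False},
    {"peptide": "B", "run": "run3", "m_score": 0.5, "decoy": False},
    {"peptide": "DECOY_C", "run": "run1", "m_score": 0.001, "decoy": True},
]

kept = filter_by_score_and_frequency(rows, cutoff=0.01, min_fraction=0.6)
```

With a cutoff of 0.01 and a minimum frequency of 60 %, only the three measurements of peptide A remain. Peptide B is removed entirely even though one of its measurements passes the cutoff, which is how a frequency-of-observation filter differs from a plain m-score filter.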