William R French, Lisa J Zimmerman, Birgit Schilling, Bradford W Gibson, Christine A Miller, R Reid Townsend, Stacy D Sherrod, Cody R Goodwin, John A McLean, David L Tabb.
Abstract
We report the implementation of high-quality signal processing algorithms into ProteoWizard, an efficient, open-source software package designed for analyzing proteomics tandem mass spectrometry data. Specifically, a new wavelet-based peak-picker (CantWaiT) and a precursor charge determination algorithm (Turbocharger) have been implemented. These additions into ProteoWizard provide universal tools that are independent of vendor platform for tandem mass spectrometry analyses and have particular utility for intralaboratory studies requiring the advantages of different platforms convergent on a particular workflow or for interlaboratory investigations spanning multiple platforms. We compared results from these tools to those obtained using vendor and commercial software, finding that in all cases our algorithms resulted in a comparable number of identified peptides for simple and complex samples measured on Waters, Agilent, and AB SCIEX quadrupole time-of-flight and Thermo Q-Exactive mass spectrometers. The mass accuracy of matched precursor ions also compared favorably with vendor and commercial tools. Additionally, typical analysis runtimes (∼1-100 ms per MS/MS spectrum) were short enough to enable the practical use of these high-quality signal processing tools for large clinical and research data sets.
Keywords: Continuous wavelet transformation; deisotoping; mass spectrometry; open-source software; peak-picking; precursor charge determination; signal deconvolution
Year: 2014 PMID: 25411686 PMCID: PMC4324452 DOI: 10.1021/pr500886y
Source DB: PubMed Journal: J Proteome Res ISSN: 1535-3893 Impact factor: 4.466
Summary of Samples Used for Evaluating ProteoWizard Signal Processing Algorithms
| instrument | sample | replicates | min. charge | max. charge |
|---|---|---|---|---|
| Agilent 6530 | Bovine 6 | 8 | +1 | +5 |
| Agilent 6550 | Serum (Human) | 4 | +2 | +14 |
| AB SCIEX Triple TOF 5600 | UPS1 (Human) | 5 | +2 | +8 |
| AB SCIEX Triple TOF 5600 | | 3 | +2 | +8 |
| AB SCIEX Triple TOF 5600 | Rat Liver | 3 | +2 | +8 |
| Thermo Q-Exactive | Bovine 6 | 5 | +2 | +6 |
| Thermo Q-Exactive | Jurkat (Human) | 1 | +2 | +6 |
| Waters Synapt G2 | Bovine 6 | 3 | +1 | +8 |
| Waters Synapt G2 | Yeast | 5 | +1 | +8 |
| Waters Synapt G2 | | 1 | +1 | +8 |
| Waters Synapt G2-S | K562 (Human) | 3 | +1 | +8 |
For each vendor, sample complexity increases moving from top to bottom. The Turbocharger precursor charge search range is also listed and was taken from the vendor-reported ranges for each dataset.
Figure 1. Comparison of peak list size distributions for the most complex samples analyzed on four instruments. The x axis represents the number of peaks in an individual MS/MS spectrum. Gray curves correspond to results from vendor or commercial software, and colored curves correspond to results produced by CantWaiT peak-picking. Note that the samples used for each platform are different and therefore comparisons should be made only within each individual panel.
Figure 2. Median precursor mass accuracy difference between precursors matched from ProteoWizard vs vendor/commercial software. Positive values indicate that ProteoWizard's median mass accuracy is better (i.e., smaller on an absolute scale) than the median mass accuracy of vendor/commercial software (value on y axis = |MA_vendor| − |MA_Pwiz|, where MA is the median mass accuracy). Red data correspond to Waters Synapt G2/G2-S; green, AB SCIEX Triple TOF 5600; blue, Agilent 6530/6550 QqTOF; and purple, Thermo Q-Exactive. Sample complexity increases for each vendor moving from left to right.
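The quantity plotted in Figure 2 can be illustrated with a minimal Python sketch. The function names are hypothetical, and expressing mass error in parts per million is an assumption, since the caption does not state the unit; only the final difference of absolute medians is taken from the caption.

```python
from statistics import median


def ppm_error(observed_mz, theoretical_mz):
    """Signed mass error in parts per million (unit is an assumption)."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def median_accuracy_difference(vendor_errors, pwiz_errors):
    """Return |MA_vendor| - |MA_Pwiz|, where MA is the median signed error.

    A positive result means the ProteoWizard median error is closer to zero.
    """
    return abs(median(vendor_errors)) - abs(median(pwiz_errors))


# Made-up error values for illustration only, not data from the study.
vendor = [-4.0, -3.0, -2.0]
pwiz = [0.5, 1.0, 1.5]
difference = median_accuracy_difference(vendor, pwiz)
```

With these illustrative numbers the medians are -3.0 and 1.0, so the difference is positive, which under the caption's convention would favor ProteoWizard.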
Figure 3. Logarithm of the ratio of the number of distinct peptides identified by ProteoWizard to the number of peptides identified by vendor/commercial software and searched using (Top) MyriMatch and (Bottom) MS-GF+. Positive values indicate that identifications were higher with ProteoWizard signal processing. Circles indicate that all signal processing was performed within ProteoWizard, and triangles correspond to cases where ProteoWizard peak lists were combined with vendor-reported precursor charges and monoisotopic m/z values. The symbol colors indicate the vendor; see the caption to Figure 2 for details. Note that rat liver data analyzed by MS-GF+ were removed due to suspected software errors.
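As a rough illustration of the Figure 3 metric, the sketch below computes a log ratio of distinct peptide counts. The caption does not give the base of the logarithm, so base 10 is an assumption here, and the function name and peptide sequences are hypothetical.

```python
import math


def log_identification_ratio(pwiz_peptides, vendor_peptides):
    """Log10 of (distinct ProteoWizard peptides / distinct vendor peptides).

    Base 10 is an assumption; the sign of the result is the same in any base.
    """
    n_pwiz = len(set(pwiz_peptides))      # count distinct sequences only
    n_vendor = len(set(vendor_peptides))
    return math.log10(n_pwiz / n_vendor)


# Made-up peptide lists for illustration; repeats are counted once.
pwiz_ids = ["PEPTIDEK", "SAMPLER", "ANOTHERK", "SAMPLER"]
vendor_ids = ["PEPTIDEK", "SAMPLER"]
ratio = log_identification_ratio(pwiz_ids, vendor_ids)
```

Here three distinct peptides against two gives a positive value, which the caption reads as more identifications with ProteoWizard processing.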
Figure 4. Percent of vendor-assigned precursor charges that were assigned the same charge from Turbocharger for the most complex sample analyzed from each vendor (except for Waters). Error bars span ±1 SD and are visible when greater than the symbol size. For each charge state, a sample size of at least 200 was required for inclusion. Note that comparisons should not be inferred between vendors, as each data set is different in sample and size, but should be made only to the agreement of Turbocharger with the vendor charge state assignments.
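The agreement measure in Figure 4 can be sketched as follows. This is a minimal illustration, not the authors' code: the function name and input layout are hypothetical, and applying the 200-spectrum threshold within a single run is an assumption, since the caption does not say whether it applies per replicate or overall.

```python
from collections import Counter


def charge_agreement(pairs, min_count=200):
    """Percent agreement per vendor-assigned charge state.

    pairs: iterable of (vendor_charge, turbocharger_charge) tuples.
    Charge states with fewer than min_count spectra are omitted.
    """
    totals = Counter()
    matches = Counter()
    for vendor_z, turbo_z in pairs:
        totals[vendor_z] += 1
        if vendor_z == turbo_z:
            matches[vendor_z] += 1
    return {
        z: 100.0 * matches[z] / n
        for z, n in sorted(totals.items())
        if n >= min_count
    }


# Made-up assignments: 200 spectra at +2 (190 agree), 50 spectra at +3.
example = [(2, 2)] * 190 + [(2, 3)] * 10 + [(3, 3)] * 50
agreement = charge_agreement(example)
```

In this example the +3 state is dropped because it falls below the threshold, leaving only the +2 agreement. Error bars like those in the figure would require repeating the calculation over several runs and taking the standard deviation.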