MOTIVATION: Mass spectrometry (MS) has become the method of choice for protein/peptide sequence and modification analysis. The technology employs a two-step approach: ionized peptide precursor masses are detected and selected for fragmentation, and the fragment mass spectra are collected for computational analysis. Current precursor selection schemes are based on data- or information-dependent acquisition (DDA/IDA), where fragmentation mass candidates are selected by intensity and are subsequently included in a dynamic exclusion list to avoid constant refragmentation of highly abundant species. DDA/IDA methods do not exploit the valuable information contained in the fractional mass of the high-accuracy precursor mass measurements delivered by current instrumentation.

RESULTS: We extend previous contributions suggesting that fractional mass information allows targeted fragmentation of analytes of interest. We introduce a non-linear Random Forest classification and a discrete mapping approach, both of which can be trained to discriminate among arbitrary fractional mass patterns for an arbitrary number of analyte classes. These methods can be used to increase fragmentation efficiency for specific subsets of analytes or to select suitable fragmentation technologies on the fly. We show that theoretical generalization error estimates transfer into practical application, and that their quality depends on the accuracy of the prior distribution estimate of the analyte classes. The methods are applied to two real-world proteomics datasets.

AVAILABILITY: All software used in this study is available from http://software.steenlab.org/fmf

CONTACT: hanno.steen@childrens.harvard.edu

SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
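To make the idea of a trainable discrete mapping concrete, the following is a minimal sketch of one way such a lookup could work. It is not the authors' software: the binning scheme (nominal mass windows crossed with fractional mass bins), the function names, the parameter defaults, and the class labels are all assumptions chosen for illustration, and the training masses are made-up toy values rather than measurements.

```python
import math
from collections import Counter, defaultdict


def fractional_mass(mass):
    """Return the part of a positive mass after the decimal point."""
    return mass - math.floor(mass)


def bin_key(mass, nominal_width=100.0, frac_bins=10):
    """Map a mass to a (nominal-mass window, fractional-mass bin) cell.

    Both grid parameters are hypothetical defaults, not published values.
    """
    nominal_bin = math.floor(mass / nominal_width)
    frac_bin = min(int(fractional_mass(mass) * frac_bins), frac_bins - 1)
    return nominal_bin, frac_bin


def train_discrete_map(masses, labels, nominal_width=100.0, frac_bins=10):
    """Store the majority class label observed in each occupied grid cell."""
    counts = defaultdict(Counter)
    for mass, label in zip(masses, labels):
        counts[bin_key(mass, nominal_width, frac_bins)][label] += 1
    return {cell: c.most_common(1)[0][0] for cell, c in counts.items()}


def predict(mapping, mass, default=None, nominal_width=100.0, frac_bins=10):
    """Look up the class for a precursor mass; unseen cells give `default`."""
    return mapping.get(bin_key(mass, nominal_width, frac_bins), default)


# Toy training set (invented values, two hypothetical analyte classes).
toy_masses = [1000.42, 1000.47, 1000.44, 1000.91, 1000.96]
toy_labels = ["class_a", "class_a", "class_b", "class_b", "class_b"]
toy_map = train_discrete_map(toy_masses, toy_labels)

decision = predict(toy_map, 1000.45)  # falls in the cell dominated by class_a
```

In an acquisition setting, a lookup like `predict` could be evaluated per detected precursor to decide whether to fragment it, or which fragmentation technique to use. Cells never seen during training return `default`, so the caller must decide how to treat them; a majority vote per cell also ignores class priors, which the abstract identifies as a factor in the quality of the error estimates.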