Xiaoxiao Kong¹, Cavan Reilly. ¹Division of Biostatistics, School of Public Health, University of Minnesota, A460 Mayo Building (MMC 303), Minneapolis, MN 55455-0378, USA.
Abstract
MOTIVATION: In mass spectrometry (MS), spectra must be aligned to correct for experimental variation in mass-to-charge ratio. Most MS-based proteomic data analysis methods take a two-step approach: they first identify peaks, and then perform alignment and statistical inference on the identified peaks only. However, the peak identification step relies on prior information about the proteins of interest or on a peak detection model, both of which are subject to error. In addition, features such as peak shape and peak width are lost in simple peak detection, although they are informative for correcting mass variation in the alignment step. RESULTS: We present a novel Bayesian approach that aligns the complete spectra. The approach is based on a parametric model that assumes the spectrum and the alignment function are Gaussian processes, with the alignment function constrained to be monotone. We show how to use the expectation-maximization algorithm to find the posterior mode of the set of alignment functions and the mean spectrum for a patient population. After alignment, we conduct tests at the level of the peaks identified from the absolute difference between the mean spectra of two patient populations, while controlling for error attributable to multiple comparisons. CONTACT: cavanr@biostat.umn.edu.
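To make the idea of a monotone alignment function concrete, the following is a minimal sketch in Python with NumPy. It is not the Gaussian process model or the expectation-maximization procedure described in the abstract; it only illustrates how a strictly increasing map of the m/z axis can be built from positive increments and applied to a whole spectrum by interpolation. The function names `monotone_warp` and `align_spectrum`, the simulated peak, and the random increments are all assumptions made for illustration.

```python
import numpy as np

def monotone_warp(mz, increments):
    """Build a strictly increasing map of the m/z grid from positive increments.

    Hypothetical helper: the cumulative sum of positive increments is
    monotone by construction, and it is rescaled so that the endpoints
    of the grid are mapped to themselves.
    """
    increments = np.asarray(increments, dtype=float)
    if increments.size != mz.size - 1:
        raise ValueError("need one increment per gap between grid points")
    if np.any(increments <= 0):
        raise ValueError("increments must be positive for a monotone warp")
    cum = np.concatenate(([0.0], np.cumsum(increments)))
    return mz[0] + (mz[-1] - mz[0]) * cum / cum[-1]

def align_spectrum(mz, intensity, warped_mz):
    """Evaluate the observed spectrum at the warped positions (linear interpolation)."""
    return np.interp(warped_mz, mz, intensity)

# Simulated spectrum: a single Gaussian-shaped peak on a regular m/z grid.
mz = np.linspace(1000.0, 1100.0, 201)
intensity = np.exp(-0.5 * ((mz - 1050.0) / 2.0) ** 2)

# Equal increments give the identity map, so the spectrum is unchanged.
identity = monotone_warp(mz, np.ones(mz.size - 1))
unchanged = align_spectrum(mz, intensity, identity)

# Unequal positive increments give a non-trivial but still monotone warp.
rng = np.random.default_rng(0)
warped = monotone_warp(mz, rng.uniform(0.5, 1.5, mz.size - 1))
aligned = align_spectrum(mz, intensity, warped)
```

Because the whole intensity curve is interpolated rather than a list of detected peaks, peak shape and width are carried through the warp, which is the property the abstract argues is lost by peak-first methods.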