Kashyap Nagaraja, Ulisses Braga-Neto.
Abstract
Selected reaction monitoring (SRM) has become one of the main methods for low-mass-range-targeted proteomics by mass spectrometry (MS). However, in most SRM-MS biomarker validation studies, the sample size is very small, and in particular smaller than the number of proteins measured in the experiment. Moreover, the data can be noisy because the instrument detects only a small number of ions per peptide. This article addresses those issues with a model-based Bayesian method for classification of SRM-MS data. The methodology is likelihood-free: it uses approximate Bayesian computation implemented via a Markov chain Monte Carlo procedure, together with a kernel-based Optimal Bayesian Classifier. Extensive experimental results demonstrate that the proposed method outperforms classical methods such as linear discriminant analysis and 3-nearest neighbors (3NN) when the sample size is small, the dimensionality is large, the data are noisy, or a combination of these.
Keywords: Markov chain Monte Carlo (MCMC); Optimal Bayesian Classifier (OBC); Proteomics; approximate Bayesian computation (ABC); biomarker; selected reaction monitoring (SRM)
Year: 2018 PMID: 30083051 PMCID: PMC6071182 DOI: 10.1177/1176935118786927
Source DB: PubMed Journal: Cancer Inform ISSN: 1176-9351
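The abstract describes a likelihood-free approach in which approximate Bayesian computation is run inside a Markov chain Monte Carlo procedure. As a rough illustration of that general idea only, the sketch below applies a generic ABC-MCMC sampler to a toy problem (inferring the mean of a Gaussian). Everything in it is an assumption made for illustration: the function names `abc_mcmc`, `simulate`, and `log_prior`, the toy model, the sample mean as summary statistic, the tolerance `epsilon`, and the random-walk proposal. It is not the authors' SRM-MS model or their classifier.

```python
import numpy as np

def abc_mcmc(observed, simulate, summary, log_prior, theta0,
             n_steps=5000, epsilon=0.1, step=0.5, seed=0):
    """Generic ABC-MCMC with a symmetric Gaussian random-walk proposal.

    Illustrative sketch only: the likelihood is never evaluated. A proposal
    can be accepted only if data simulated from it land within `epsilon`
    of the observed summary statistic.
    """
    rng = np.random.default_rng(seed)
    s_obs = summary(observed)
    theta = float(theta0)
    chain = np.empty(n_steps)
    for t in range(n_steps):
        proposal = theta + step * rng.normal()
        lp_new = log_prior(proposal)
        if np.isfinite(lp_new):
            # Simulate a data set at the proposed parameter value.
            s_sim = summary(simulate(proposal, rng))
            close = abs(s_sim - s_obs) <= epsilon
            # Symmetric proposal, so the acceptance ratio is the prior ratio.
            if close and np.log(rng.uniform()) < lp_new - log_prior(theta):
                theta = proposal
        chain[t] = theta
    return chain

# Toy model (hypothetical): 50 draws from a unit-variance Gaussian.
def simulate(theta, rng):
    return rng.normal(theta, 1.0, size=50)

# Flat prior on [-10, 10] (an arbitrary choice for the toy problem).
def log_prior(theta):
    return 0.0 if -10.0 <= theta <= 10.0 else -np.inf

observed = np.random.default_rng(1).normal(2.0, 1.0, size=50)
chain = abc_mcmc(observed, simulate, np.mean, log_prior,
                 theta0=observed.mean())
print("approximate posterior mean:", chain.mean())
```

The chain is started at the observed sample mean so that it begins in a region where simulated data can match the observation. Shrinking `epsilon` tightens the approximation to the posterior but lowers the acceptance rate.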
Parameters used in the experiment.
| Parameter | Symbol | Value/range |
|---|---|---|
| Instrument response factor | | 5 |
| Noise severity | | 0.03, 3.6 |
| Peptide efficiency factor | | [0.1, 1] |
| Shape (gamma distribution) | | Unif(1.6, 2.4), Unif(4, 6) |
| Scale (gamma distribution) | | Unif(9e6, 11e6), Unif(90, 110) |
| Purification | | |
| Coefficient of variation | | Unif(0.3, 0.5) |
| Fold change | | Unif(1.5, 1.6) |
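Several entries in the table are uniform ranges rather than fixed values, which suggests that those parameters are drawn at random for each simulated experiment. The sketch below shows one way such draws could be made with NumPy. It rests on assumptions that this record does not confirm: that the first shape range goes with the first scale range (and the second with the second), that each range is sampled once per experiment, and the names `SETTINGS`, `draw_hyperparameters`, and `draw_gamma_values`, which are hypothetical. What quantity the gamma distribution models is not stated here, so the code only draws generic gamma variates.

```python
import numpy as np

# Ranges copied from the parameter table. Pairing the first shape range
# with the first scale range is an assumption, not stated in the source.
SETTINGS = {
    "setting_a": {"shape": (1.6, 2.4), "scale": (9e6, 11e6)},
    "setting_b": {"shape": (4.0, 6.0), "scale": (90.0, 110.0)},
}
CV_RANGE = (0.3, 0.5)           # coefficient of variation
FOLD_CHANGE_RANGE = (1.5, 1.6)  # fold change

def draw_hyperparameters(setting, rng):
    """Draw one value per uniformly distributed table entry."""
    ranges = SETTINGS[setting]
    return {
        "shape": rng.uniform(*ranges["shape"]),
        "scale": rng.uniform(*ranges["scale"]),
        "cv": rng.uniform(*CV_RANGE),
        "fold_change": rng.uniform(*FOLD_CHANGE_RANGE),
    }

def draw_gamma_values(params, n, rng):
    """Draw n gamma variates using the sampled shape and scale."""
    return rng.gamma(shape=params["shape"], scale=params["scale"], size=n)

rng = np.random.default_rng(0)
params = draw_hyperparameters("setting_b", rng)
values = draw_gamma_values(params, 5, rng)
print(params)
print(values)
```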
| 1. Generate |
| 1. Generate the mean vector |
| 1. Choose a set of kernel bandwidth parameters |
Figure 1. Average classification error rates against sample size for a fixed number of selected proteins. ABC-MCMC indicates approximate Bayesian computation-Markov chain Monte Carlo; LDA, linear discriminant analysis.
Figure 2. Average classification error rates against number of selected proteins for a fixed sample size. ABC-MCMC indicates approximate Bayesian computation-Markov chain Monte Carlo; LDA, linear discriminant analysis.
Figure 3. Average classification error rates against the coefficient of variation for a fixed sample size per class and fixed number of selected proteins. ABC-MCMC indicates approximate Bayesian computation-Markov chain Monte Carlo; LDA, linear discriminant analysis.
Figure 4. Average classification error rates against the lower bound for the peptide efficiency factor for a fixed sample size per class and fixed number of selected proteins. ABC-MCMC indicates approximate Bayesian computation-Markov chain Monte Carlo; LDA, linear discriminant analysis.