Thang V Pham, Sander R Piersma, Marc Warmoes, Connie R Jimenez. OncoProteomics Laboratory, Department of Medical Oncology, VUmc-Cancer Center Amsterdam, VU University Medical Center, De Boelelaan 1117, 1081 HV Amsterdam, The Netherlands. t.pham@vumc.nl
Abstract
MOTIVATION: Spectral count data generated from label-free tandem mass spectrometry-based proteomic experiments can be used to quantify protein abundances reliably. Comparing spectral count data from different sample groups, such as control and disease, is an essential step in statistical analysis for determining altered protein levels and for biomarker discovery. Fisher's exact test, the G-test, the t-test and the local-pooled-error technique (LPE) are commonly used for differential analysis of spectral count data. However, our initial experiments in two cancer studies show that the current methods fail to declare as significant, at the 95% confidence level, a number of protein markers that have been judged to be differential on the basis of the biology of the disease and the spectral count numbers. A shortcoming of these tests is that they do not take into account within- and between-sample variations together. Hence, our aim is to improve upon existing techniques by incorporating both the within- and between-sample variations.

RESULTS: We propose to use the beta-binomial distribution to test the significance of differential protein abundances expressed in spectral counts in label-free mass spectrometry-based proteomics. The beta-binomial test naturally normalizes for total sample count. Experimental results show that the beta-binomial test performs favorably in comparison with other methods on several datasets in terms of both true detection rate and false positive rate. In addition, it can be applied to experiments with one or more replicates, and to multiple-condition comparisons. Finally, we have implemented a software package for parameter estimation of two beta-binomial models and the associated statistical tests.

AVAILABILITY AND IMPLEMENTATION: A software package implemented in R is freely available for download at http://www.oncoproteomics.nl/.

SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
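To make the idea concrete, the following is a minimal Python sketch of one possible beta-binomial likelihood-ratio test for a single protein across two sample groups. It is not the authors' R package, and the abstract does not specify the exact models or estimation procedure, so several choices here are assumptions: the parameterization by a group mean and a shared dispersion (`size`, the sum of the two beta parameters), the crude pattern-search optimizer, the chi-square approximation with one degree of freedom, and all function names (`bb_loglik`, `maximize`, `bb_test`). The within-sample variation enters through the binomial sampling of counts out of each sample's total, and the between-sample variation through the beta distribution on the per-sample proportion.

```python
import math


def _clamp(value, low, high):
    return max(low, min(high, value))


def bb_loglik(counts, totals, mean, size):
    """Beta-binomial log-likelihood up to a constant.

    The binomial coefficients are dropped because they cancel in the
    likelihood ratio. mean = a / (a + b), size = a + b.
    """
    a, b = mean * size, (1.0 - mean) * size
    ll = 0.0
    for x, n in zip(counts, totals):
        ll += (math.lgamma(x + a) + math.lgamma(n - x + b)
               - math.lgamma(n + a + b)
               + math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b))
    return ll


def maximize(f, start, max_iter=5000):
    """Crude derivative-free pattern search (an assumption, chosen for brevity)."""
    point, best, step = list(start), f(start), 1.0
    for _ in range(max_iter):
        if step < 1e-7:
            break
        improved = False
        for i in range(len(point)):
            for delta in (step, -step):
                trial = list(point)
                trial[i] += delta
                value = f(trial)
                if value > best + 1e-12:
                    point, best, improved = trial, value, True
        if not improved:
            step /= 2.0
    return point, best


def bb_test(group_a, group_b):
    """Return a p-value for a difference in mean proportion between two groups.

    Each group is a pair (counts, totals): spectral counts of the protein and
    total spectral counts per sample. Null model: one shared mean. Alternative:
    one mean per group. Both share a single dispersion parameter (assumption).
    """
    groups = [group_a, group_b]

    def loglik(params, shared_mean):
        # params = [logit(mean), ..., log(size)], clamped for numerical safety
        size = math.exp(_clamp(params[-1], -10.0, 15.0))
        total = 0.0
        for g, (counts, totals) in enumerate(groups):
            z = _clamp(params[0 if shared_mean else g], -30.0, 30.0)
            total += bb_loglik(counts, totals, 1.0 / (1.0 + math.exp(-z)), size)
        return total

    pooled = sum(sum(c) for c, _ in groups) / sum(sum(t) for _, t in groups)
    pooled = _clamp(pooled, 1e-6, 1.0 - 1e-6)
    z0 = math.log(pooled / (1.0 - pooled))

    null_params, ll_null = maximize(lambda p: loglik(p, True),
                                    [z0, math.log(100.0)])
    # Start the alternative at the null optimum so that ll_alt >= ll_null.
    start = [null_params[0], null_params[0], null_params[1]]
    _, ll_alt = maximize(lambda p: loglik(p, False), start)

    stat = max(0.0, 2.0 * (ll_alt - ll_null))
    # Survival function of the chi-square distribution with 1 degree of freedom
    return math.erfc(math.sqrt(stat / 2.0))
```

Because each count is modeled relative to its own sample total, no separate normalization step is needed before calling `bb_test`. For example, `bb_test(([2, 3, 1], [1000, 1100, 900]), ([30, 28, 35], [1000, 1050, 950]))` compares three control samples with three disease samples for one hypothetical protein.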