MOTIVATION: Protein abundance in quantitative proteomics is often based on observed spectral features derived from liquid chromatography mass spectrometry (LC-MS) or LC-MS/MS experiments. Peak intensities are generally not normally distributed. Furthermore, LC-MS-based proteomics data frequently have large proportions of missing peak intensities because low-abundance spectral features are censored. Recognizing that the peak intensities observed with LC-MS are all positive, skewed and often left-censored, we propose using survival methodology to carry out differential expression analysis of proteins. Standard statistical techniques, including non-parametric tests such as the Kolmogorov-Smirnov and Wilcoxon-Mann-Whitney rank sum tests, and parametric survival models and accelerated failure time (AFT) models with log-normal, log-logistic and Weibull distributions, were used to detect differentially expressed proteins. The statistical operating characteristics of each method are explored using both real and simulated datasets.

RESULTS: Survival methods generally have greater statistical power than standard differential expression methods when the proportion of missing protein-level data is 5% or more. In particular, the AFT models we consider consistently achieve greater statistical power than standard testing procedures, and the gap widens as the proportion of missing data increases.

AVAILABILITY: The testing procedures discussed in this article can all be performed using readily available software such as R. The R code is provided as supplemental material.

CONTACT: ctekwe@stat.tamu.edu.
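
To make the survival formulation concrete, the block below sketches one common way to write an AFT model for left-censored peak intensities. It is an illustration under stated assumptions, not necessarily the exact parameterization used in the article: the symbols $y_i$ (peak intensity), $c_i$ (censoring limit), $g_i$ (group indicator), $\delta_i$ (1 if the intensity is observed, 0 if left-censored) and the coefficients $\beta_0, \beta_1, \sigma$ are all introduced here for illustration.

```latex
% AFT model on the log scale (notation assumed for illustration)
\log y_i = \beta_0 + \beta_1 g_i + \sigma\,\varepsilon_i ,
\qquad \varepsilon_i \sim F \ \text{with density } f

% Standardized residual evaluated at a value t
z_i(t) = \frac{\log t - \beta_0 - \beta_1 g_i}{\sigma}

% Likelihood: observed intensities contribute a density term,
% left-censored intensities contribute P(Y_i \le c_i)
L(\beta_0, \beta_1, \sigma)
  = \prod_{i=1}^{n}
    \left[ \frac{1}{\sigma\, y_i}\, f\bigl(z_i(y_i)\bigr) \right]^{\delta_i}
    \left[ F\bigl(z_i(c_i)\bigr) \right]^{1-\delta_i}

% Differential expression corresponds to testing
H_0 : \beta_1 = 0 \quad \text{versus} \quad H_1 : \beta_1 \neq 0
```

Under this sketch, taking $F$ to be the standard normal, standard logistic, or standard smallest extreme value distribution gives the log-normal, log-logistic, and Weibull models, respectively. The point of the construction is that censored observations still enter the likelihood through $F$ rather than being discarded or imputed.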