Sandra L Taylor1, L Renee Ruhaak2, Robert H Weiss3, Karen Kelly4, Kyoungmi Kim1. 1. Division of Biostatistics, Department of Public Health Sciences, University of California Davis, CA, 95616, USA. 2. Department of Clinical Chemistry and Laboratory Medicine, Leiden University Medical Center, Leiden, The Netherlands. 3. Division of Nephrology, Department of Internal Medicine. 4. Division of Hematology and Oncology, Department of Internal Medicine School of Medicine, University of California, Davis, CA 95616, USA.
Abstract
MOTIVATION: High through-put mass spectrometry (MS) is now being used to profile small molecular compounds across multiple biological sample types from the same subjects with the goal of leveraging information across biospecimens. Multivariate statistical methods that combine information from all biospecimens could be more powerful than the usual univariate analyses. However, missing values are common in MS data and imputation can impact between-biospecimen correlation and multivariate analysis results. RESULTS: We propose two multivariate two-part statistics that accommodate missing values and combine data from all biospecimens to identify differentially regulated compounds. Statistical significance is determined using a multivariate permutation null distribution. Relative to univariate tests, the multivariate procedures detected more significant compounds in three biological datasets. In a simulation study, we showed that multi-biospecimen testing procedures were more powerful than single-biospecimen methods when compounds are differentially regulated in multiple biospecimens but univariate methods can be more powerful if compounds are differentially regulated in only one biospecimen. AVAILABILITY AND IMPLEMENTATION: We provide R functions to implement and illustrate our method as supplementary information CONTACT: sltaylor@ucdavis.eduSupplementary information: Supplementary data are available at Bioinformatics online.
MOTIVATION: High through-put mass spectrometry (MS) is now being used to profile small molecular compounds across multiple biological sample types from the same subjects with the goal of leveraging information across biospecimens. Multivariate statistical methods that combine information from all biospecimens could be more powerful than the usual univariate analyses. However, missing values are common in MS data and imputation can impact between-biospecimen correlation and multivariate analysis results. RESULTS: We propose two multivariate two-part statistics that accommodate missing values and combine data from all biospecimens to identify differentially regulated compounds. Statistical significance is determined using a multivariate permutation null distribution. Relative to univariate tests, the multivariate procedures detected more significant compounds in three biological datasets. In a simulation study, we showed that multi-biospecimen testing procedures were more powerful than single-biospecimen methods when compounds are differentially regulated in multiple biospecimens but univariate methods can be more powerful if compounds are differentially regulated in only one biospecimen. AVAILABILITY AND IMPLEMENTATION: We provide R functions to implement and illustrate our method as supplementary information CONTACT: sltaylor@ucdavis.eduSupplementary information: Supplementary data are available at Bioinformatics online.
Authors: Sheila Ganti; Sandra L Taylor; Omran Abu Aboud; Joy Yang; Christopher Evans; Michael V Osier; Danny C Alexander; Kyoungmi Kim; Robert H Weiss Journal: Cancer Res Date: 2012-05-24 Impact factor: 12.701
Authors: Yuliya Karpievitch; Jeff Stanley; Thomas Taverner; Jianhua Huang; Joshua N Adkins; Charles Ansong; Fred Heffron; Thomas O Metz; Wei-Jun Qian; Hyunjin Yoon; Richard D Smith; Alan R Dabney Journal: Bioinformatics Date: 2009-06-17 Impact factor: 6.937