SungHwan Kim (1), Dongwan Kang (2), Zhiguang Huo (3), Yongseok Park (3), George C Tseng (3,4)
1. Department of Statistics, Keimyung University, Daegu 42601, South Korea.
2. Genomics Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA.
3. Department of Biostatistics, University of Pittsburgh, Pittsburgh, PA 15261, USA.
4. Department of Computational and Systems Biology, University of Pittsburgh, Pittsburgh, PA 15213, USA.
Abstract
Motivation: With the prevalent use of microarrays and massively parallel sequencing, numerous high-throughput omics datasets have become available in the public domain. Integrating the abundant information across these datasets is critical for elucidating biological mechanisms. Because the data are high-dimensional, methods such as principal component analysis (PCA) have been widely applied for effective dimension reduction and exploratory visualization.

Results: In this article, we combine multiple omics datasets that address identical or similar biological hypotheses and introduce two variations of a meta-analytic framework for PCA, namely MetaPCA. Regularization is further incorporated to facilitate sparse feature selection in MetaPCA. We apply MetaPCA and sparse MetaPCA to simulations, to three transcriptomic meta-analysis studies (yeast cell cycle, prostate cancer and mouse metabolism) and to a TCGA pan-cancer methylation study. The results show improved accuracy, robustness and exploratory visualization for the proposed framework.

Availability and implementation: An R package, MetaPCA, is available online (http://tsenglab.biostat.pitt.edu/software.htm).

Contact: ctseng@pitt.edu

Supplementary information: Supplementary data are available at Bioinformatics online.