Meng Lu, Jianhua Z Huang, Xiaoning Qian.
Abstract
We propose Sparse exponential family Principal Component Analysis (SePCA), a method suitable for any type of data following an exponential family distribution, which achieves simultaneous dimension reduction and variable selection for better interpretation of the results. Because of the generality of exponential family distributions, the method applies to a wide range of problems, in particular the analysis of high-dimensional next-generation sequencing data and genetic mutation data in genomics. A sparsity-inducing penalty produces sparse principal component loading vectors, so that the principal components focus on informative variables. Using an equivalent dual form of the SePCA optimization problem, we derive optimal solutions with efficient iterative closed-form updating rules. Results from both simulation experiments and real-world applications demonstrate the superiority of SePCA in reconstruction accuracy and computational efficiency over traditional exponential family PCA (ePCA) and the existing Sparse PCA (SPCA) and Sparse Logistic PCA (SLPCA) algorithms.
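To illustrate how a sparsity-inducing penalty can drive entries of a loading vector to exactly zero, the following is a minimal sketch of soft-thresholding, the proximal operator of an L1 penalty. The abstract does not state which penalty or update rule SePCA uses, so the L1 choice, the function name `soft_threshold`, the threshold `lam`, and the example values are all assumptions made for illustration, not the authors' algorithm.

```python
def soft_threshold(values, lam):
    """Shrink each entry toward zero by lam; entries with |v| <= lam become zero.

    This is the proximal operator of lam * ||v||_1 (an assumed L1 penalty,
    not necessarily the penalty used in SePCA).
    """
    shrunk = []
    for v in values:
        magnitude = max(abs(v) - lam, 0.0)  # reduce the magnitude, floor at zero
        shrunk.append(magnitude if v > 0 else -magnitude)  # restore the sign
    return shrunk


# Hypothetical dense loading vector; small entries are zeroed out.
loading = [3.0, -0.5, 1.2, -2.0]
sparse_loading = soft_threshold(loading, 1.0)
print(sparse_loading)
```

In this sketch, the entry with magnitude below the threshold is set to zero while the larger entries are only shrunk. That is the sense in which a penalized component focuses on a subset of variables.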
Keywords: dimension reduction; exponential family principal component analysis; sparsity
Year: 2016 PMID: 28066030 PMCID: PMC5210214 DOI: 10.1016/j.patcog.2016.05.024
Source DB: PubMed Journal: Pattern Recognit ISSN: 0031-3203 Impact factor: 7.740