Zhixiang Lin, Can Yang, Ying Zhu, John Duchi, Yao Fu, Yong Wang, Bai Jiang, Mahdi Zamanighomi, Xuming Xu, Mingfeng Li, Nenad Sestan, Hongyu Zhao, Wing Hung Wong.
Abstract
Dimension reduction methods are commonly applied to high-throughput biological datasets. However, the results can be hindered by confounding factors, either biological or technical in origin. In this study, we extend principal component analysis (PCA) to propose AC-PCA for simultaneous dimension reduction and adjustment for confounding (AC) variation. We show that AC-PCA can adjust for (i) variations across individual donors present in a human brain exon array dataset and (ii) variations of different species in a model organism ENCODE RNA sequencing dataset. Our approach is able to recover the anatomical structure of neocortical regions and to capture the shared variation among species during embryonic development. For gene selection purposes, we extend AC-PCA with sparsity constraints and propose and implement an efficient algorithm. The methods developed in this paper can also be applied to more general settings. The R package and MATLAB source code are available at https://github.com/linzx06/AC-PCA.
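The abstract does not state the AC-PCA objective, so the following is only a minimal sketch of the general idea of combining dimension reduction with confounder adjustment. It assumes a penalized eigenproblem: directions are chosen to capture variance in the data matrix X while being penalized, with weight `lam`, for variance that aligns with a linear kernel built from the confounder matrix Y. The function name `confounder_adjusted_pca`, the parameter `lam`, and the linear kernel are all assumptions made for illustration; this is not the API of the released R package or MATLAB code, and it omits the sparsity-constrained variant.

```python
import numpy as np


def confounder_adjusted_pca(X, Y, lam, n_components=2):
    """Hypothetical sketch: PCA with a penalty on confounder-aligned variance.

    X   : (n_samples, n_features) data matrix
    Y   : (n_samples, n_confounders) confounder design (e.g. donor indicators)
    lam : non-negative penalty weight (assumed tuning parameter)
    """
    # Center features and confounders so that variances are well defined.
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)

    # Linear kernel on the confounders (an assumption of this sketch).
    K = Yc @ Yc.T

    # Penalized matrix: total variance minus lam times confounder-aligned part.
    M = Xc.T @ Xc - lam * (Xc.T @ K @ Xc)
    M = (M + M.T) / 2.0  # enforce symmetry against round-off

    # Leading eigenvectors of the symmetric matrix give the loadings.
    eigvals, eigvecs = np.linalg.eigh(M)
    order = np.argsort(eigvals)[::-1][:n_components]
    loadings = eigvecs[:, order]

    # Project the centered data onto the loadings.
    scores = Xc @ loadings
    return scores, loadings
```

With `lam = 0` the sketch reduces to ordinary PCA; increasing `lam` steers the leading directions away from variation explained by the confounders.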
Keywords: confounding variation; dimension reduction; transcriptome
Year: 2016; PMID: 27930330; PMCID: PMC5187682; DOI: 10.1073/pnas.1617317113
Source DB: PubMed; Journal: Proc Natl Acad Sci U S A; ISSN: 0027-8424; Impact factor: 11.205