Jung Ae Lee, Kevin K Dobbin, Jeongyoun Ahn.
Abstract
Batch bias has been found in many microarray gene expression studies that involve multiple batches of samples. A serious batch effect can alter not only the distribution of individual genes but also the inter-gene relationships. Even though some efforts have been made to remove such bias, there has been relatively little development of multivariate approaches, mainly because of the analytical difficulty due to the high-dimensional nature of gene expression data. We propose a multivariate batch adjustment method that effectively eliminates inter-gene batch effects. The proposed method utilizes high-dimensional sparse covariance estimation based on a factor model and hard thresholding. Another important aspect of the proposed method is that if one of the batches is known to have been produced under superior conditions, the other batches can be adjusted so that they resemble the target batch. We study high-dimensional asymptotic properties of the proposed estimator and compare the performance of the proposed method with some popular existing methods on simulated data and gene expression data sets.
Keywords: batch effect; factor model; gene expression; high-dimensional covariance estimation
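As an illustration only, the Python sketch below shows one way the two ingredients named in the abstract could fit together: a covariance estimate built from a low-rank factor part plus a hard-thresholded residual, and a linear map that moves a batch toward the mean and covariance of a target batch. The function names, the `n_factors` and `threshold` parameters, the principal-component construction of the factor part, the eigenvalue floor, and the specific transformation are all assumptions made for this sketch; they are not taken from the paper and should not be read as the authors' estimator.

```python
import numpy as np


def factor_threshold_cov(X, n_factors=2, threshold=0.1):
    """Covariance estimate: leading principal components (assumed factor
    part) plus a hard-thresholded residual. X has shape (samples, genes)."""
    Xc = X - X.mean(axis=0)
    S = Xc.T @ Xc / (Xc.shape[0] - 1)          # sample covariance
    eigvals, eigvecs = np.linalg.eigh(S)       # eigenvalues in ascending order
    top = np.argsort(eigvals)[::-1][:n_factors]
    V = eigvecs[:, top]
    low_rank = (V * eigvals[top]) @ V.T        # factor part
    resid = S - low_rank
    resid = (resid + resid.T) / 2              # enforce exact symmetry
    # Hard thresholding: zero out small off-diagonal residual entries.
    kept = np.where(np.abs(resid) >= threshold, resid, 0.0)
    np.fill_diagonal(kept, np.diag(resid))     # the diagonal is never thresholded
    return low_rank + kept


def sym_power(A, power, floor=1e-8):
    """Matrix power of a symmetric matrix via its eigendecomposition.
    Eigenvalues are floored (an assumption of this sketch) so that inverse
    square roots stay defined if thresholding breaks positive definiteness."""
    w, V = np.linalg.eigh((A + A.T) / 2)
    w = np.clip(w, floor, None)
    return (V * w**power) @ V.T


def adjust_to_target(X_batch, X_target, **cov_kwargs):
    """Map X_batch so its mean and estimated covariance match X_target."""
    mu_b = X_batch.mean(axis=0)
    mu_t = X_target.mean(axis=0)
    cov_b = factor_threshold_cov(X_batch, **cov_kwargs)
    cov_t = factor_threshold_cov(X_target, **cov_kwargs)
    A = sym_power(cov_t, 0.5) @ sym_power(cov_b, -0.5)
    return (X_batch - mu_b) @ A.T + mu_t
```

Calling `adjust_to_target(X_batch, X_target, n_factors=2, threshold=0.1)` returns an array with the same shape as `X_batch`. In this sketch the adjusted batch always takes on the target batch's gene-wise means; how closely its covariance matches the target depends on how well the assumed estimator approximates each batch's true covariance.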
Year: 2014 PMID: 24687561 PMCID: PMC4065794 DOI: 10.1002/sim.6157
Source DB: PubMed Journal: Stat Med ISSN: 0277-6715 Impact factor: 2.373