P Baldi (1), A D Long. (1) Department of Information and Computer Science, University of California at Irvine, Irvine, CA 92697-3425, USA. pfbaldi@ics.uci.edu
Abstract
MOTIVATION: DNA microarrays are now capable of providing genome-wide patterns of gene expression across many different conditions. The first level of analysis of these patterns requires determining whether observed differences in expression are significant or not. Current methods are unsatisfactory due to the lack of a systematic framework that can accommodate noise, variability, and low replication often typical of microarray data.
RESULTS: We develop a Bayesian probabilistic framework for microarray data analysis. At the simplest level, we model log-expression values by independent normal distributions, parameterized by corresponding means and variances with hierarchical prior distributions. We derive point estimates for both parameters and hyperparameters, and regularized expressions for the variance of each gene by combining the empirical variance with a local background variance associated with neighboring genes. An additional hyperparameter, inversely related to the number of empirical observations, determines the strength of the background variance. Simulations show that these point estimates, combined with a t-test, provide a systematic inference approach that compares favorably with simple t-test or fold methods, and partly compensate for the lack of replication.
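The regularization idea described in RESULTS can be illustrated with a minimal Python sketch. The abstract does not give the formulas, so everything specific here is an assumption made for illustration: the weighted-average form in `regularized_variance`, the choice of "neighboring genes" as genes adjacent in rank of mean log-expression, the `window` size, and all function names. The hyperparameter `nu0` plays the role of the strength of the background variance; in line with the abstract, one would pick it larger when the number of replicates is small.

```python
import math
from statistics import mean, variance


def background_variances(means, variances, window=2):
    """Assumed notion of 'local background': for each gene, average the
    empirical variances of genes whose mean log-expression ranks within
    `window` positions on either side (the gene itself included)."""
    order = sorted(range(len(means)), key=lambda i: means[i])
    background = [0.0] * len(means)
    for rank, gene in enumerate(order):
        lo = max(0, rank - window)
        hi = min(len(order), rank + window + 1)
        background[gene] = mean(variances[order[j]] for j in range(lo, hi))
    return background


def regularized_variance(s2, n, sigma0_sq, nu0):
    """Shrink the empirical variance s2 (from n replicates) toward the
    background variance sigma0_sq. Assumed form: a weighted average with
    weights nu0 (background) and n - 1 (data); nu0 = 0 returns s2."""
    return (nu0 * sigma0_sq + (n - 1) * s2) / (nu0 + n - 1)


def regularized_t(x, y, bg_x, bg_y, nu0):
    """Unequal-variance t statistic on log-expression replicates x and y,
    with each empirical variance replaced by its regularized version."""
    vx = regularized_variance(variance(x), len(x), bg_x, nu0)
    vy = regularized_variance(variance(y), len(y), bg_y, nu0)
    return (mean(x) - mean(y)) / math.sqrt(vx / len(x) + vy / len(y))
```

With only two replicates per condition, an empirical variance of zero (as in `y = [0.0, 0.0]`) would make a plain t statistic undefined; the background term keeps the denominator positive, which is the sense in which such estimates can partly compensate for low replication.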