Darlene R Goldstein. Ecole Polytechnique Fédérale de Lausanne, Institut de mathématiques, Bâtiment MA, Station 8, CH-1015 Lausanne, Switzerland. Darlene.Goldstein@epfl.ch
Abstract
MOTIVATION: Studies of gene expression using high-density short oligonucleotide arrays have become standard in a variety of biological contexts. Of the measures that have been proposed to quantify expression in these arrays, multi-chip-based measures have been shown to perform well. As gene expression studies increase in size, however, multi-chip expression measures become more demanding in terms of computing memory and time.
RESULTS: A strategic alternative to exact multi-chip quantification on a full large chip set is to approximate expression values based on subsets of chips. This paper introduces an extrapolation method, Extrapolation Averaging (EA), and a resampling method, Partition Resampling (PR), to approximate expression in large studies. An examination of their properties indicates that subset-based methods can perform well compared with exact expression quantification. The focus is on short oligonucleotide chips, but the same ideas apply equally well to any array type for which expression is quantified using an entire set of arrays rather than a single array at a time.
AVAILABILITY: Software implementing Partition Resampling and Extrapolation Averaging is under development as an R package for the BioConductor project.