Markus Schmidberger, Esmeralda Vicedo, Ulrich Mansmann.
Abstract
Microarray data repositories and large clinical applications of gene expression make it possible to analyse several hundred microarrays at one time. Preprocessing such large numbers of microarrays is still a challenge, because the algorithms are limited by the available computer hardware. For example, building classification or prognostic rules from large microarray sets is very time consuming. Here, preprocessing has to be part of the cross-validation and resampling strategy that is necessary to estimate the rule's prediction quality honestly.

This paper proposes the new Bioconductor package affyPara for parallelized preprocessing of Affymetrix microarray data. The data can be partitioned by array, and parallelization of the algorithms follows as a straightforward consequence. Partitioning the data and distributing it to several nodes solves the main memory problems and accelerates preprocessing by up to a factor of 20 for 200 or more arrays.

affyPara is a free and open source package, released under the GPL license and available from the Bioconductor project at www.bioconductor.org. A user guide and examples are provided with the package.
Keywords: R; microarray; normalization; parallel computing; preprocessing
Year: 2009 PMID: 20140068 PMCID: PMC2808179 DOI: 10.4137/bbi.s3060
Source DB: PubMed Journal: Bioinform Biol Insights ISSN: 1177-9322
Figure 1. Flowchart and relative speedup curves for parallelized quantile normalization calculated on the super-computer HLRBII at the LRZ in Munich, Germany.
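
The idea of partitioning by array and then parallelizing quantile normalization can be illustrated with a minimal sketch. This is not the affyPara implementation and does not use its API: it is written in Python rather than R, a thread pool stands in for the cluster nodes, and all function names (`partition`, `partial_sorted_sum`, `apply_reference`, `quantile_normalize_partitioned`) are hypothetical. It assumes every array has the same number of probes and ignores tie handling. Each worker sorts only its own arrays and returns rank-wise sums, the master combines these into a reference distribution, and each worker then maps its arrays onto that reference.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(arrays, n_parts):
    """Split a non-empty list of arrays into at most n_parts contiguous chunks."""
    n_parts = max(1, min(n_parts, len(arrays)))
    size = -(-len(arrays) // n_parts)  # ceiling division
    return [arrays[i:i + size] for i in range(0, len(arrays), size)]


def partial_sorted_sum(chunk):
    """Worker step 1: sort each array and add up the sorted values rank by rank."""
    sums = [0.0] * len(chunk[0])
    for array in chunk:
        for rank, value in enumerate(sorted(array)):
            sums[rank] += value
    return sums


def apply_reference(chunk, reference):
    """Worker step 2: replace each value by the reference value at its rank."""
    normalized_chunk = []
    for array in chunk:
        order = sorted(range(len(array)), key=array.__getitem__)
        normalized = [0.0] * len(array)
        for rank, index in enumerate(order):
            normalized[index] = reference[rank]
        normalized_chunk.append(normalized)
    return normalized_chunk


def quantile_normalize_partitioned(arrays, n_parts=2):
    """Quantile-normalize arrays (lists of intensities) chunk by chunk."""
    chunks = partition(arrays, n_parts)
    with ThreadPoolExecutor(max_workers=len(chunks)) as pool:
        # Each worker only ever holds its own chunk of arrays.
        partials = list(pool.map(partial_sorted_sum, chunks))
        # Master step: combine the small per-chunk sums into mean quantiles.
        reference = [sum(column) / len(arrays) for column in zip(*partials)]
        results = list(pool.map(lambda c: apply_reference(c, reference), chunks))
    return [array for chunk in results for array in chunk]


if __name__ == "__main__":
    data = [[5.0, 2.0, 3.0], [4.0, 1.0, 6.0], [3.0, 4.0, 8.0]]
    for row in quantile_normalize_partitioned(data, n_parts=2):
        print(row)
```

Only the rank-wise sums travel between workers and master, which is why this kind of partitioning can relieve main memory pressure: no single process needs all arrays at once.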