MOTIVATION: Clusters of genes encoding proteins with related functions, or belonging to the same regulatory network, often exhibit expression patterns that are correlated over a large number of conditions. Protein associations and gene regulatory networks can be modelled from expression data. We address the question of which of several normalization methods is optimal prior to computing the correlation of the expression profiles between every pair of genes.

RESULTS: We use gene expression data from five experiments with a total of 78 hybridizations and 23 diverse conditions. Nine methods of data normalization are explored, based on all possible combinations of normalization techniques according to between- and within-gene and experiment variation. We compare the resulting empirical distribution of gene × gene correlations with the expectations and apply cross-validation to test the performance of each method in predicting accurate functional annotation. We conclude that normalization methods based on mixed-model equations are optimal.
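To illustrate why the choice of normalization matters before computing gene × gene correlations, the following is a minimal sketch on simulated data. It is not one of the nine methods compared in the paper, and it does not use mixed-model equations: it applies a simple stand-in normalization (centering each hybridization on its own mean) and compares the resulting correlation matrix with the one obtained from the raw values. The function names, the toy matrix size, and the simulated array effect are all assumptions made for the example.

```python
import numpy as np


def center_by_hybridization(expr):
    """Subtract each hybridization's (column's) mean from its values.

    Hypothetical stand-in for a normalization step; rows are genes,
    columns are hybridizations.
    """
    expr = np.asarray(expr, dtype=float)
    return expr - expr.mean(axis=0, keepdims=True)


def gene_correlations(expr):
    """Pearson correlation between every pair of gene profiles (rows)."""
    return np.corrcoef(expr)


def mean_off_diagonal(corr):
    """Average of the correlations between distinct genes."""
    mask = ~np.eye(corr.shape[0], dtype=bool)
    return corr[mask].mean()


# Simulated data (assumption): independent gene signals plus a strong
# effect shared by all genes measured on the same hybridization.
rng = np.random.default_rng(0)
n_genes, n_hyb = 6, 78
signal = rng.normal(size=(n_genes, n_hyb))
array_effect = rng.normal(scale=3.0, size=(1, n_hyb))
raw = signal + array_effect

corr_raw = gene_correlations(raw)
corr_norm = gene_correlations(center_by_hybridization(raw))

print("mean off-diagonal correlation, raw:     ", mean_off_diagonal(corr_raw))
print("mean off-diagonal correlation, centered:", mean_off_diagonal(corr_norm))
```

In this toy setting the genes are simulated as independent, so the high correlations seen in the raw data come entirely from the shared hybridization effect, and removing that effect changes the empirical distribution of correlations substantially. Comparing such a distribution against its expectation is the kind of check the abstract describes, although the actual methods and criteria used by the authors may differ.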