David C Hoyle¹, Magnus Rattray, Ray Jupp, Andrew Brass. ¹School of Biological Sciences, University of Manchester, Stopford Building, Oxford Rd, Manchester M13 9PT, UK. david.c.hoyle@man.ac.uk
Abstract
MOTIVATION: Typical analysis of microarray data has focused on spot-by-spot comparisons within a single organism. Less attention has been paid to comparing the entire distribution of spot intensities between experiments and between organisms.
RESULTS: Here we show that mRNA transcription data from a wide range of organisms, measured with a range of experimental platforms, show close agreement with Benford's law (Benford, Proc. Am. Phil. Soc., 78, 551-572, 1938) and Zipf's law (Zipf, The Psycho-biology of Language: an Introduction to Dynamic Philology, 1936 and Human Behaviour and the Principle of Least Effort, 1949). The bulk of the distribution of microarray spot intensities is well approximated by a log-normal, with the tail of the distribution being closer to a power law. The variance, σ², of log spot intensity shows a positive correlation with genome size (in terms of number of genes) and is therefore relatively fixed within some range for a given organism. The measured value of σ² can be significantly smaller than the expected value if the mRNA is extracted from a sample of mixed cell types. Our research demonstrates that useful biological findings may result from analyzing microarray data at the level of entire intensity distributions.
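As an illustration of the first-digit comparison described in the abstract, the following minimal sketch checks simulated intensities against Benford's law, P(d) = log10(1 + 1/d). It is not the authors' analysis: the data are synthetic log-normal samples rather than microarray measurements, and the function names, the sample size, the seed and the σ values are all assumptions chosen for illustration. It shows only the general property that a broad log-normal follows Benford's law closely while a narrow one does not.

```python
import math
import random


def benford_expected():
    """Benford's law: P(d) = log10(1 + 1/d) for leading digit d = 1..9."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}


def leading_digit(x):
    """First significant decimal digit of a positive number."""
    # Scientific notation always starts with the leading digit.
    return int(f"{x:.17e}"[0])


def first_digit_frequencies(values):
    """Observed fraction of positive values starting with each digit 1..9."""
    counts = {d: 0 for d in range(1, 10)}
    n = 0
    for v in values:
        if v > 0:
            counts[leading_digit(v)] += 1
            n += 1
    return {d: c / n for d, c in counts.items()}


def max_deviation(sigma, n=50000, seed=0):
    """Largest absolute gap between observed and Benford frequencies.

    Intensities are simulated (an assumption, not real array data) as
    log-normal with standard deviation `sigma` of the natural log.
    """
    rng = random.Random(seed)
    values = [rng.lognormvariate(0.0, sigma) for _ in range(n)]
    observed = first_digit_frequencies(values)
    expected = benford_expected()
    return max(abs(observed[d] - expected[d]) for d in expected)


if __name__ == "__main__":
    for sigma in (0.2, 1.0, 2.0):
        print(f"sigma={sigma}: max deviation = {max_deviation(sigma):.4f}")
```

With real data, the list of simulated values would be replaced by the measured spot intensities, and the same frequency comparison would apply.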