BACKGROUND: The elucidation of networks from a compendium of gene expression data is one of the goals of systems biology and can be a valuable source of new hypotheses for experimental researchers. For Arabidopsis, several thousand microarrays exist, forming a valuable resource from which to learn.

RESULTS: A novel Bayesian network-based algorithm to infer gene regulatory networks from gene expression data is introduced and applied to learn parts of the transcriptomic network in Arabidopsis thaliana from a large number (thousands) of separate microarray experiments. Starting from an initial set of genes of interest, a network is grown iteratively: at each step, the gene from a second, defined set that gives the 'best' learned network structure is added to the model. The gene set for iterative growth can be as large as the entire genome. A number of networks are inferred and analysed; these show (i) agreement with the current literature on the circadian clock network, (ii) the ability to model other networks, and (iii) that the learned network hypotheses can suggest new roles for poorly characterized genes, through addition of relevant genes from an unconstrained list of over 15,000 possible genes. To demonstrate the last point, the method is used to suggest that particular GATA transcription factors are regulators of photosynthetic genes. Additionally, the performance in recovering a known network from different amounts of synthetically generated data is evaluated.

CONCLUSION: Our results show that plausible regulatory networks can be learned from such gene expression data alone. This work demonstrates that network hypotheses can be generated from existing gene expression data for use by experimental biologists.
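The iterative growth step described in RESULTS can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how a network is scored or searched, so the BIC score on discretized expression values, the greedy edge-addition search, the `max_parents` limit, and all function names and toy data here are assumptions made for illustration only.

```python
import math
from collections import Counter
from itertools import permutations


def family_bic(child, parents, data):
    """BIC term of one discrete node given its parents (assumed score)."""
    n = len(data[child])
    configs = list(zip(*(data[p] for p in parents))) if parents else [()] * n
    joint = Counter(zip(configs, data[child]))
    marg = Counter(configs)
    loglik = sum(c * math.log(c / marg[cfg]) for (cfg, _), c in joint.items())
    # Approximation: count parameters over observed parent configurations only.
    n_params = (len(set(data[child])) - 1) * len(marg)
    return loglik - 0.5 * math.log(n) * n_params


def is_ancestor(parents, node, target):
    """True if `target` is `node` itself or one of its ancestors."""
    stack, seen = [node], set()
    while stack:
        cur = stack.pop()
        if cur == target:
            return True
        if cur not in seen:
            seen.add(cur)
            stack.extend(parents[cur])
    return False


def learn_structure(genes, data, max_parents=2):
    """Greedy edge addition; returns (score gain over empty graph, parent sets)."""
    parents = {g: set() for g in genes}
    total_gain = 0.0
    while True:
        best = None
        for src, dst in permutations(genes, 2):
            if src in parents[dst] or len(parents[dst]) >= max_parents:
                continue
            if is_ancestor(parents, src, dst):  # src -> dst would close a cycle
                continue
            old = family_bic(dst, sorted(parents[dst]), data)
            new = family_bic(dst, sorted(parents[dst] | {src}), data)
            gain = new - old
            if gain > 1e-9 and (best is None or gain > best[0]):
                best = (gain, src, dst)
        if best is None:
            return total_gain, parents
        total_gain += best[0]
        parents[best[2]].add(best[1])


def grow_network(seed, candidates, data, n_add):
    """Add, n_add times, the candidate gene whose inclusion scores best."""
    current = list(seed)
    pool = [g for g in candidates if g not in current]
    for _ in range(min(n_add, len(pool))):
        scored = [(learn_structure(current + [g], data)[0], g) for g in pool]
        _, best_gene = max(scored)
        current.append(best_gene)
        pool.remove(best_gene)
    return current, learn_structure(current, data)[1]


# Hypothetical toy data: binarized expression of three genes over 16 arrays.
data = {
    "A": [0, 1] * 8,
    "B": [0, 1] * 8,        # copies A, so it is strongly dependent on it
    "C": [0, 0, 1, 1] * 4,  # statistically independent of A
}
grown, parents = grow_network(["A"], ["B", "C"], data, n_add=1)
```

On this toy data the gene `B` is added to the seed `A`, because linking the two improves the score while `C` adds no dependency. Scoring each candidate by its gain over the edgeless graph is one way to compare networks that contain different genes; the paper may well normalize differently.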