MOTIVATION: Much of a cell's regulatory response to changing environments occurs at the transcriptional level. Particularly in higher organisms, transcription factors (TFs), microRNAs and epigenetic modifications can combine to form a complex regulatory network. Part of this system can be modeled as a collection of regulatory modules: co-regulated genes, the conditions under which they are co-regulated and sequence-level regulatory motifs. RESULTS: We present the Combinatorial Algorithm for Expression and Sequence-based Cluster Extraction (COALESCE) system for regulatory module prediction. The algorithm is efficient enough to discover expression biclusters and putative regulatory motifs in metazoan genomes (>20,000 genes) and very large microarray compendia (>10,000 conditions). Using Bayesian data integration, it can also include diverse supporting data types such as evolutionary conservation or nucleosome placement. We validate its performance using a functional evaluation of co-clustered genes, known yeast and Escherichia coli TF targets, synthetic data and various metazoan data compendia. In all cases, COALESCE performs as well as or better than current biclustering and motif prediction tools, with high accuracy in functional and TF/target assignments and zero false positives on synthetic data. COALESCE provides an efficient and flexible platform within which large, diverse data collections can be integrated to predict metazoan regulatory networks. AVAILABILITY: Source code (C++) is available at http://function.princeton.edu/sleipnir, and supporting data and a web interface are provided at http://function.princeton.edu/coalesce. CONTACT: ogt@cs.princeton.edu; hcoller@princeton.edu. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.