| Literature DB >> 15980544 |
Abstract
The advent of microarray technology has revolutionized the search for genes that are differentially expressed across a range of cell types or experimental conditions. Traditional clustering methods, such as hierarchical clustering, are often difficult to deploy effectively because genes rarely exhibit similar expression patterns across a wide range of conditions. Biclustering of gene expression data (also called co-clustering or two-way clustering) is a non-trivial but promising methodology for identifying groups of genes that show a coherent expression profile across a subset of conditions. Biclustering is therefore a natural screen for genes that are functionally related, participate in the same pathways, are affected by the same drug or pathological condition, or form modules that are potentially co-regulated by a small group of transcription factors. We have developed a web-enabled service called GEMS (Gene Expression Mining Server) for biclustering microarray data. Users may upload expression data and specify a set of criteria. GEMS then performs bicluster mining based on a Gibbs sampling paradigm. The web server provides a flexible and useful platform for the discovery of co-expressed and potentially co-regulated gene modules. GEMS is open source software and is available at http://genomics10.bu.edu/terrence/gems/.
Year: 2005 PMID: 15980544 PMCID: PMC1160230 DOI: 10.1093/nar/gki469
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. Examples of the shape of clusters or biclusters. (a) Algorithms based on similarity in overall gene expression produce ellipsoid or arbitrarily shaped clusters. (b) GEMS sets a width constraint on a subset of genes and produces axis-parallel hyper-rectangular biclusters.
Figure 2. Illustration of GEMS output using NCI60 cDNA expression data as an example. (a) Three biclusters are detected; the numbers of genes in the biclusters are 10, 11 and 10, respectively. (b) Heatmaps of the original T-matrix cDNA expression dataset (truncated) and the three extracted biclusters. The expression values of every gene in the biclusters are consistent across the subset of samples.
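The width constraint mentioned in Figure 1(b) can be illustrated with a small sketch. It assumes one plausible reading of the constraint: within each selected condition, the expression values of the selected genes must span no more than a given width, which makes the bicluster an axis-parallel hyper-rectangle. The function names, the rows-as-genes layout, and the toy data are hypothetical and are not taken from GEMS; the sketch only checks a candidate bicluster and does not implement the Gibbs sampling search.

```python
def column_widths(matrix, genes, conditions):
    """For each selected condition, return max - min of the selected genes' values."""
    widths = []
    for c in conditions:
        values = [matrix[g][c] for g in genes]
        widths.append(max(values) - min(values))
    return widths


def satisfies_width(matrix, genes, conditions, max_width):
    """True if every selected condition spans at most max_width (assumed criterion)."""
    return all(w <= max_width for w in column_widths(matrix, genes, conditions))


# Toy expression matrix (hypothetical data): rows are genes, columns are conditions.
expression = [
    [1.0, 5.0, 2.0, 9.0],
    [1.2, 0.5, 2.1, 3.0],
    [0.9, 7.5, 1.8, 6.0],
    [4.0, 4.2, 8.0, 1.0],
]

genes = [0, 1, 2]    # candidate gene subset
conditions = [0, 2]  # candidate condition subset

widths = column_widths(expression, genes, conditions)
subset_ok = satisfies_width(expression, genes, conditions, max_width=0.5)
all_conditions_ok = satisfies_width(expression, genes, [0, 1, 2, 3], max_width=0.5)
print(widths, subset_ok, all_conditions_ok)
```

In this toy example the three genes agree closely on conditions 0 and 2 but not across all four conditions, which is the situation the abstract describes: genes that rarely co-vary over every condition can still form a coherent group on a subset.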