Ron Shamir, Adi Maron-Katz, Amos Tanay, Chaim Linhart, Israel Steinfeld, Roded Sharan, Yosef Shiloh, Ran Elkon.
Abstract
BACKGROUND: Gene expression microarrays are a prominent experimental tool in functional genomics that has opened the opportunity to gain a global, systems-level understanding of transcriptional networks. Experiments that apply this technology typically generate overwhelming volumes of data, unprecedented in biological research. Mining meaningful biological knowledge out of the raw data is therefore a major challenge in bioinformatics. Of special need are integrative packages that provide biologist users with an advanced yet easy-to-use set of algorithms, together covering the whole range of steps in microarray data analysis.
Year: 2005 PMID: 16176576 PMCID: PMC1261157 DOI: 10.1186/1471-2105-6-232
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Figure 1. A high-level summary of EXPANDER's microarray data analysis flow and of the main algorithms implemented in each analysis step.
Figure 2. All-patterns display of a clustering solution. Each graph represents a specific cluster. The X-axis represents the measured conditions. The Y-axis represents (standardized) expression levels. Each cluster is represented by the mean expression pattern over all the genes assigned to it. Error bars denote ± 1 standard deviation. Clicking within a cell opens a window that lists the genes that are assigned to the cluster.
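The summary plotted for each cluster in this display can be sketched in a few lines. The following is a minimal illustration, not EXPANDER's implementation: the function names `standardize` and `cluster_summary` are hypothetical, and both the per-gene standardization to mean 0 and standard deviation 1 and the use of the population standard deviation are assumptions, since the caption does not specify either.

```python
from statistics import mean, pstdev


def standardize(profile):
    """Scale one gene's profile to mean 0 and standard deviation 1 (assumed scheme)."""
    mu = mean(profile)
    sigma = pstdev(profile)
    if sigma == 0:
        # A flat profile carries no pattern; map it to all zeros.
        return [0.0 for _ in profile]
    return [(x - mu) / sigma for x in profile]


def cluster_summary(profiles):
    """Return (mean pattern, per-condition SD) over the standardized profiles."""
    standardized = [standardize(p) for p in profiles]
    columns = list(zip(*standardized))  # one tuple of values per condition
    means = [mean(col) for col in columns]
    sds = [pstdev(col) for col in columns]  # half-length of each error bar
    return means, sds


# Toy cluster: three genes that differ in scale but share one rising pattern.
toy_cluster = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0], [10.0, 20.0, 30.0]]
means, sds = cluster_summary(toy_cluster)
```

Because the three toy genes have the same shape after standardization, the error bars collapse to (numerically) zero; a looser cluster would show wider bars.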
Figure 3. Matrix displays. (A) Unclustered expression matrix display. Each row corresponds to a gene, and each column to a biological sample. The color of the (i, j) cell in the matrix indicates the expression level of the ith gene in the jth sample. Green represents below-average expression level; red represents above-average expression level (the color scheme can be adjusted by the user). (B) The same dataset as in A, with genes ordered according to a clustering solution. Horizontal white lines separate the different clusters. (C) Unclustered similarity matrix display. The color of the (i, j) cell in the matrix represents the similarity between the expression patterns of the ith and the jth genes over all the samples (hence the matrix is symmetric). Red represents high similarity, and green represents low similarity. (D) Same as in C, with genes ordered on both axes according to a clustering solution. Clusters appear as distinct red blocks along the matrix diagonal, and similar clusters are manifested by off-diagonal reddish blocks.
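The effect described in panels C and D can be reproduced on toy data. The sketch below assumes Pearson correlation as the similarity measure, which the caption does not state, and all function names are hypothetical; it only illustrates why ordering genes by cluster produces blocks along the diagonal.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (assumed similarity measure).

    Profiles with zero variance are not handled and would divide by zero.
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def similarity_matrix(profiles):
    """Symmetric gene-by-gene similarity matrix."""
    return [[pearson(p, q) for q in profiles] for p in profiles]


def order_by_cluster(labels):
    """Gene indices sorted so that genes of the same cluster are adjacent."""
    return sorted(range(len(labels)), key=lambda i: labels[i])


def reorder(matrix, order):
    """Apply the same ordering to both rows and columns."""
    return [[matrix[i][j] for j in order] for i in order]


# Toy data: genes 0 and 2 rise, genes 1 and 3 fall.
profiles = [[1, 2, 3], [3, 2, 1], [2, 4, 6], [6, 4, 2]]
labels = [0, 1, 0, 1]  # hypothetical clustering solution
order = order_by_cluster(labels)
blocks = reorder(similarity_matrix(profiles), order)
```

After reordering, the two clusters occupy the top-left and bottom-right 2 × 2 blocks of `blocks`, where the similarity is high, while the off-diagonal blocks hold the low cross-cluster similarities.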
Figure 4. Bicluster analysis. (A) A bicluster corresponds to a submatrix defined by a subset of rows and a subset of columns. Neither subset is known in advance. After the original data matrix is reordered, the bicluster appears as the rectangle with the yellow border. (B) EXPANDER summarizes bicluster analysis results in a table that lists the dimensions (numbers of genes and conditions) of the biclusters identified and their scores. Clicking on a row in this table pops up a window with the submatrix view of the selected bicluster. Below the table are two examples of biclusters identified in a dataset comprising some 1,000 genes measured across over 70 conditions in human cells. Row and column labels are the gene and condition names of the bicluster, respectively.
Figure 5. GO functional enrichment analysis. (A) Enriched GO categories identified by TANGO in the analyzed gene groups (clusters or biclusters) are displayed as bar diagrams, each corresponding to a specific gene group (i.e., cluster or bicluster). In these diagrams, GO categories are color-coded, and the height of a bar represents the statistical significance (-log10(p-value)) of the observed enrichment for its corresponding category. The percentage of genes in the group assigned to the enriched category is indicated above the bar. (B) Clicking on a bar pops up a window that lists the group's genes that are associated with the corresponding GO category. In this window, genes are linked to central annotation DBs (SGD [25] for yeast, WormBase [26] for worm, FlyBase [27] for fly, and Entrez Gene [28] for human, mouse and rat) where detailed gene descriptions can be found for in-depth analysis.
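To make the bar heights concrete, the following sketch computes an enrichment p-value and its -log10 transform for one gene group and one category. It uses a plain hypergeometric tail test, a common choice for this kind of analysis; the caption does not describe TANGO's actual statistics or any multiple-testing correction, so this should be read as an assumption. The function names are hypothetical.

```python
from math import comb, log10


def hypergeom_tail(total_genes, annotated, group_size, overlap):
    """P(X >= overlap) when group_size genes are drawn without replacement
    from total_genes, of which `annotated` belong to the category."""
    upper = min(annotated, group_size)
    ways = sum(
        comb(annotated, i) * comb(total_genes - annotated, group_size - i)
        for i in range(overlap, upper + 1)
    )
    return ways / comb(total_genes, group_size)


def bar_height(p_value):
    """Significance as plotted in the bar diagram: -log10(p-value)."""
    return -log10(p_value)


def percent_in_category(overlap, group_size):
    """Percentage of the group's genes assigned to the category."""
    return 100.0 * overlap / group_size


# Toy numbers: 10 genes in total, 5 annotated, a group of 5 that are all annotated.
p = hypergeom_tail(10, 5, 5, 5)
height = bar_height(p)
```

In this toy case only one of the 252 possible groups of five genes consists entirely of annotated genes, so `p` equals 1/252.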
Figure 6. Promoter cis-regulatory elements enrichment. (A) Transcription factors (TFs) whose DNA binding site signatures are over-represented in promoters of the genes assigned to the clusters are displayed in bar diagrams. As in the display for the GO analysis (Fig. 5), each diagram corresponds to a specific gene group (cluster or bicluster), and TFs are color-coded and identified by the accession number of their binding site model in the TRANSFAC DB. The statistical significance of the observed enrichment for a TF is represented by the height of its bar (-log10(p-value)). The TF enrichment factor, which is the ratio between the prevalence of the TF hits in the gene group and in the background set of promoters, is indicated above the bar. (B) Clicking on a specific bar pops up a window that lists the genes in the group whose promoters were found to contain a hit for that TF. In this window, genes are linked to the central annotation DB of the analyzed organism, as specified in the legend of Figure 5.
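The enrichment factor shown above each bar is a simple ratio, illustrated below. The sketch interprets "prevalence" as the fraction of promoters containing at least one hit for the TF, which is an assumption, and the function name and toy counts are hypothetical.

```python
def enrichment_factor(group_hits, group_size, background_hits, background_size):
    """Ratio of TF-hit prevalence in the gene group to that in the background.

    Prevalence is taken here as (promoters with a hit) / (promoters examined).
    """
    group_rate = group_hits / group_size
    background_rate = background_hits / background_size
    return group_rate / background_rate


# Toy counts: 20 of 100 group promoters have a hit, versus 500 of 10,000 in the background.
factor = enrichment_factor(20, 100, 500, 10000)
```

A factor above 1 means hits are more prevalent in the group than in the background; the factor alone says nothing about significance, which is what the bar height conveys.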
Major biclusters identified in the test case analysis of the human stress data set.
| Conditions | Genes | Enriched GO category (GO ID, p-value) | Enriched TF signature (TRANSFAC ID, p-value) | Description |
| --- | --- | --- | --- | --- |
| 9 | 79 | DNA replication (GO:0006260, 5.3 × 10⁻⁹) | E2F (M00918, 1.3 × 10⁻⁷) | Down-regulation of DNA replication genes in fibroblasts exposed to DDT or Menadione. |
| 33 | 74 | Mitosis (GO:0007067, 9.3 × 10⁻¹⁹) | NF-Y (M00287, 6.7 × 10⁻²²); IRF-7 (M00453, 9.5 × 10⁻⁵) | Down-regulation of mitotic genes in response to various stresses. |
| 41 | 13 | Mitosis (GO:0007067, 3.3 × 10⁻¹⁰) | NF-Y (M00287, 3.4 × 10⁻⁹) | Down-regulation of mitotic genes in response to various stresses. |
| 5 | 89 | Carboxylic acid metabolism (GO:0019752, 3.4 × 10⁻⁸) | --- | Genes activated in Hela cells in response to Tunicamycin and Menadione. |
| 6 | 145 | Response to unfolded protein (GO:0006986, 1.2 × 10⁻⁷) | --- | Genes activated in Hela cells in response to heat shock. |
| 7 | 142 | Response to unfolded protein (GO:0006986, 7.3 × 10⁻⁹) | AP-2alpha (M00469, 5.6 × 10⁻⁴) | Genes activated in K562 cells in response to heat shock. |
| 10 | 24 | Response to unfolded protein (GO:0006986, 1.5 × 10⁻⁷) | --- | Genes that are activated by heat shock but repressed by crowding in Hela cells. |
| 6 | 105 | Transcription corepressor (GO:0003714, 1.5 × 10⁻⁶) | HIF-1 (M00797, 6.9 × 10⁻⁴) | Genes activated in fibroblasts in response to DDT. |
| 8 | 200 | --- | --- | Genes activated in fibroblasts in response to oxidative stress (H2O2). |
| 9 | 134 | --- | --- | Genes that are repressed by crowding in fibroblasts. |
| 22 | 51 | --- | N-Myc (M00055, 2.7 × 10⁻⁶) | Genes that are repressed in both Hela cells and fibroblasts. |
| 9 | 115 | --- | AP-4 (M00005, 2.1 × 10⁻⁴) | Genes repressed in Hela cells in response to various stresses. |
| 10 | 31 | --- | NFkB (M00051, 7.1 × 10⁻⁴) | Genes activated in Hela cells in response to DDT. |