Alain B Tchagang, Kevin V Bui, Thomas McGinnis, Panayiotis V Benos.
Abstract
BACKGROUND: Time series gene expression data analysis is widely used to study the dynamics of various cell processes. Most of the time series data available today consist of only a few time points, which makes standard clustering techniques difficult to apply.
Year: 2009 PMID: 19695084 PMCID: PMC2743670 DOI: 10.1186/1471-2105-10-255
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Figure 1. Comparison of the two algorithms on simulated data with different noise levels.
Figure 2. Clustering yeast time series data. The most statistically significant clusters identified by (A) ASTRO and (B) MiMeSR.
Evaluation of the clusters identified by ASTRO using Gene Ontology data
| Cluster | Genes in cluster | GO category | Genes in category (%) | p-value |
|---|---|---|---|---|
| A1 | 70 | Cellular biosynthetic process | 41 (58.6%) | 7.0e-16 |
| | | Biosynthetic process | 42 (60.0%) | 2.5e-13 |
| B1 | 86 | Translation | 55 (64.0%) | 1.5e-33 |
| | | Macromolecule biosynthetic process | 55 (64.0%) | 1.5e-27 |
| C1 | 25 | Ribosome biogenesis and assembly | 14 (56.0%) | 9.8e-15 |
| | | Ribonucleoprotein complex biogenesis and assembly | 14 (56.0%) | 9.6e-10 |
| D1 | 74 | Ribosome biogenesis and assembly | 44 (59.5%) | 2.6e-34 |
| | | Ribonucleoprotein complex biogenesis and assembly | 44 (59.5%) | 4.6e-31 |
| E1* | 33 | Sulfur metabolic process | 6 (18.2%) | 9.0e-05 |
| | | Sulfur amino acid metabolic process | 4 (12.1%) | 2.4e-03 |
| F1* | 26 | Amino acid transport | 4 (15.4%) | 1.4e-03 |
| | | Amine transport | 4 (15.4%) | 3.6e-03 |
* More than 20% genes of unknown function
Evaluation of the clusters identified by MiMeSR using Gene Ontology data
| Cluster | Genes in cluster | GO category | Genes in category (%) | p-value |
|---|---|---|---|---|
| A2 | 246 | Ribosome biogenesis and assembly | 103 (42.0%) | 2.0e-64 |
| | | Gene expression | 167 (68.0%) | 1.0e-49 |
| B2* | 66 | Sulfur metabolic process | 13 (20.0%) | 1.2e-12 |
| | | Sulfur amino acid metabolic process | 8 (12.1%) | 4.4e-08 |
| C2 | 133 | Ribosome biogenesis and assembly | 82 (62.0%) | 1.7e-68 |
| | | Ribonucleoprotein complex biogenesis and assembly | 84 (63.2%) | 5.6e-65 |
| D2 | 80 | Translation | 62 (77.5%) | 1.5e-46 |
| | | Macromolecule biosynthetic process | 66 (82.5%) | 1.9e-45 |
| E2 | 109 | Ribosome biogenesis and assembly | 73 (67.0%) | 1.7e-64 |
| | | Ribonucleoprotein complex biogenesis and assembly | 73 (67.0%) | 8.1e-59 |
| F2 | 60 | Ribosome biogenesis and assembly | 42 (70.0%) | 1.6e-37 |
| | | Ribonucleoprotein complex biogenesis and assembly | 42 (70.0%) | 2.2e-34 |
| G2 | 94 | Translation | 76 (81.0%) | 3.1e-60 |
| | | Macromolecule biosynthetic process | 77 (82.0%) | 5.2e-53 |
* More than 20% genes of unknown function
Evaluation of the clusters identified by ASTRO using ChIP-chip data.
| ARO80 | 2/3% | 3/3% | 1/4% | |||
| BAS1 | 1/3% | |||||
| CBF1 | 5/15% | |||||
| CHA4 | ||||||
| DAL81 | ||||||
| FHL1 | 3/10% | |||||
| GCR2 | 3/4% | |||||
| GCN4 | 3/9% | |||||
| MET31 | 1/3% | |||||
| MET32 | ||||||
| MET4 | 1/3% | |||||
| SFP1 | 3/10% | |||||
Bold letter boxes correspond to statistically overrepresented transcription factors in that cluster. Each box contains: (a) the number of genes, (b) the percent of genes in the cluster associated with this transcription factor, and (c) the p-value (Fisher's exact test).
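The footnote above names Fisher's exact test as the source of the overrepresentation p-values. As a rough illustration of that kind of test, the one-sided (enrichment) form can be computed as a hypergeometric upper tail. This is a minimal sketch and not the authors' code: the function name `enrichment_p_value`, the background size, and the number of bound genes in the background are all hypothetical values chosen for the example.

```python
from math import comb


def enrichment_p_value(k, n, K, N):
    """Upper-tail hypergeometric probability P(X >= k).

    k: genes in the cluster bound by the transcription factor
    n: cluster size
    K: genes bound by the factor in the whole background
    N: background size
    This equals the one-sided Fisher's exact test for enrichment.
    """
    if not (0 <= n <= N and 0 <= K <= N and 0 <= k <= min(n, K)):
        raise ValueError("counts are inconsistent")
    total = comb(N, n)  # number of ways to draw any cluster of size n
    # Sum the probability of seeing k or more bound genes in the cluster.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(n, K) + 1))
    return tail / total


# Hypothetical example: 5 bound genes in a 33-gene cluster, with an ASSUMED
# background of 6000 genes of which 200 are bound by the factor.
p = enrichment_p_value(k=5, n=33, K=200, N=6000)
print(f"one-sided p-value: {p:.3g}")
```

Exact integer arithmetic with `math.comb` keeps the sketch dependency-free. A two-sided test, or a different choice of background, would give different values.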
Evaluation of the clusters identified by MiMeSR using ChIP-chip data.
| ARO80 | 4/3% | ||||||
| BAS1 | |||||||
| CBF1 | 10/4% | 3/3% | |||||
| CHA4 | |||||||
| DAL81 | |||||||
| FHL1 | 5/5% | 3/5% | |||||
| GCR2 | |||||||
| GCN4 | |||||||
| MET31 | |||||||
| MET32 | 3/3% | ||||||
| MET4 | |||||||
| SFP1 | 4/3% | ||||||
Bold letter boxes correspond to statistically overrepresented transcription factors in that cluster. Each box contains: (a) the number of genes, (b) the percent of genes in the cluster associated with this transcription factor, and (c) the p-value (Fisher's exact test).
Figure 3. Comparison of clustering approaches. Comparative analysis of clustering approaches using (A) GO data and (B) amino acid starvation ChIP-chip data. The y-axis represents the percent of clusters for which the p-value of their most significant category (GO or ChIP-chip) was lower than the given threshold.
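The y-axis quantity described in this caption can be sketched as follows. The p-values are the most significant GO p-value of each cluster listed in the two GO tables above. Those tables may cover only a subset of the clusters behind the figure, so the percentages this prints are illustrative and are not a reproduction of the plotted curves. The function name `percent_below` and the thresholds are assumptions made for the example.

```python
# Most significant GO p-value per cluster, copied from the GO tables above.
astro_best = {"A1": 7.0e-16, "B1": 1.5e-33, "C1": 9.8e-15,
              "D1": 2.6e-34, "E1": 9.0e-05, "F1": 1.4e-03}
mimesr_best = {"A2": 2.0e-64, "B2": 1.2e-12, "C2": 1.7e-68, "D2": 1.5e-46,
               "E2": 1.7e-64, "F2": 1.6e-37, "G2": 3.1e-60}


def percent_below(best_p_values, threshold):
    """Percent of clusters whose best p-value is strictly below the threshold."""
    if not best_p_values:
        raise ValueError("no clusters given")
    hits = sum(1 for p in best_p_values.values() if p < threshold)
    return 100.0 * hits / len(best_p_values)


# Thresholds are arbitrary illustration points, not the figure's x-axis ticks.
for threshold in (1e-5, 1e-10, 1e-30):
    a = percent_below(astro_best, threshold)
    m = percent_below(mimesr_best, threshold)
    print(f"p < {threshold:g}: ASTRO {a:.1f}%, MiMeSR {m:.1f}%")
```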
Figure 4. Example of the .