Mitja I Kurki, Jussi Paananen, Markus Storvik, Seppo Ylä-Herttuala, Juha E Jääskeläinen, Mikael von Und Zu Fraunberg, Garry Wong, Petri Pehkonen.
Abstract
BACKGROUND: A major challenge in genomic research is identifying significant biological processes and generating new hypotheses from large gene sets. Gene sets often consist of multiple separate biological pathways, controlled by distinct regulatory mechanisms. Many of these pathways and the associated regulatory mechanisms might be obscured by a large number of other significant processes and thus not identified as significant by standard gene set enrichment analysis tools.
Year: 2011 PMID: 21592412 PMCID: PMC3120704 DOI: 10.1186/1471-2105-12-171
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Figure 1. The flow diagram of the TAFFEL analysis. From the top: the list of genes given by the user is annotated with GO and TF information from the Ensembl (20) and cisRED (12) databases. The genes are clustered separately in parallel, based on GO and TF annotations (for simplicity only the TF clustering tree is shown). In each resulting cluster, the enrichment of both GO and TF annotations is determined, providing a basis for suggesting implications between the biological processes and their regulator molecules.
Figure 2. TAFFEL user interface. The clustering trees represent the clustering result for the DE genes after 4 hours of forskolin treatment in HEK293T cells. The genes have been clustered by GO terms (left) and TFs (right). The topmost box represents the whole gene set without clustering. Below that, each level represents clustering into two, three, or more clusters. A green outline marks the cluster number selected by AIC and a blue outline the cluster number selected by dAIC. The clusters obtained from the IEA analysis with FDR p < 0.1 are highlighted with a light blue background on the right side of the cluster box. The best intercorrelating clusters between the trees (the cell morphogenesis cluster in the GO tree and the COUP cluster in the TF tree) are connected with a bold line. The panel at the bottom shows the enriched annotations (left list) and the cluster genes (right list). The cluster related to positive regulation of biosynthetic process is selected in the picture.
Statistically significant clusters in the forskolin (FSK) and up-regulated (sIA↑) and down-regulated (sIA↓) aneurysm datasets
| CLUSTER (dataset) | CLUSTER (clustered by) | CLUSTER (size) | ANNOTATION | P | P LIST | N | N LIST |
|---|---|---|---|---|---|---|---|
| FSK | GO | 26 | transcription from RNA polymerase II promoter | 1.9E-21 | 8.2E-04 | 23 | 46 |
| | | | positive regulation of macromolecule biosynthetic process | 2.5E-21 | 5.7E-02 | 19 | 26 |
| | | | positive regulation of gene expression | 2.5E-21 | 2.4E-02 | 19 | 26 |
| | | | HES-1 | 2.0E-02 | 7.2E-01 | 8 | 32 |
| | | | AhR | 2.9E-02 | 4.9E-02 | 15 | 123 |
| FSK | GO | 50 | macromolecule localization | 8.9E-37 | 8.2E-02 | 39 | 49 |
| | | | protein transport | 8.9E-37 | 1.0E-01 | 37 | 43 |
| | | | establishment of protein localization | 8.9E-37 | 1.1E-01 | 37 | 43 |
| | | | FOXO1 | 3.3E-02 | 5.1E-01 | 10 | 32 |
| FSK | TF | 33 | E2F-4_DP-2 | 4.2E-34 | 1.3E-01 | 31 | 48 |
| | | | Rb_E2F-1_DP-1 | 4.2E-34 | 1.5E-01 | 30 | 43 |
| | | | E2F-4_DP-1 | 6.2E-31 | 2.2E-02 | 29 | 45 |
| | | | organelle organization | 2.5E-03 | 4.0E-03 | 12 | 50 |
| sIA ↑ | GO | 58 | cation transport | 3.8E-08 | 1.8E-01 | 17 | 23 |
| | | | ion transport | 3.8E-08 | 3.7E-01 | 18 | 26 |
| | | | metal ion transport | 1.2E-05 | 4.2E-01 | 12 | 16 |
| | | | MTF-1 | 4.8E-02 | 4.7E-01 | 13 | 30 |
| | | | ATF-1 | 4.8E-02 | 6.5E-01 | 9 | 17 |
| sIA ↑ | TF | 29 | S8 | 1.3E-19 | 8.5E-01 | 23 | 40 |
| | | | Chx10 | 3.8E-14 | 7.3E-01 | 20 | 42 |
| | | | Lhx3a | 8.1E-13 | 6.7E-01 | 19 | 42 |
| | | | amine biosynthetic process | 9.1E-02 | 1.7E-01 | 3 | 4 |
| sIA ↓ | GO | 22 | nervous system development | 1.3E-11 | 1.3E-02 | 17 | 32 |
| | | | generation of neurons | 2.1E-06 | 2.8E-01 | 8 | 10 |
| | | | cell development | 2.1E-06 | 2.8E-01 | 10 | 17 |
| | | | Tal-1 | 3.1E-02 | 1.0E+00 | 4 | 5 |
| | | | AR | 9.4E-02 | 9.8E-01 | 6 | 17 |
| sIA ↓ | GO | 49 | organic acid metabolic process | 1.3E-08 | 2.8E-01 | 13 | 13 |
| | | | carboxylic acid metabolic process | 1.3E-08 | 2.8E-01 | 13 | 13 |
| | | | oxidation reduction | 1.2E-07 | 2.8E-01 | 14 | 16 |
| | | | lipid metabolic process | 5.3E-07 | 4.4E-01 | 13 | 16 |
| | | | NF-1 | 3.7E-02 | 9.8E-01 | 14 | 30 |
The CLUSTER column indicates the clustered dataset, the annotations used for clustering (either GO or TF) and the size of the cluster, respectively. The ANNOTATION column lists the enriched GO terms and TF annotations from TRANSFAC in each cluster. The P and P LIST columns give Benjamini-Hochberg FDR corrected Fisher's exact test p-values for the enrichment of the annotation in the cluster and in the gene list, respectively. The N and N LIST columns show the number of genes associated with the annotation in the cluster and in the gene list.
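As an illustration of how p-values of the kind reported in the P and P LIST columns can be computed, the following minimal sketch implements a Fisher's exact test for enrichment followed by Benjamini-Hochberg adjustment. It is not TAFFEL's own code: the function names, the one-sided (over-representation) alternative, the choice of background set and the example counts are all assumptions made for illustration.

```python
from math import comb


def fisher_enrichment_p(k, n, K, N):
    """One-sided Fisher's exact test p-value for over-representation.

    k: genes with the annotation in the cluster
    n: cluster size
    K: genes with the annotation in the background
    N: background size
    Returns P(X >= k) for X ~ Hypergeometric(N, K, n).
    (The one-sided alternative is an assumption of this sketch.)
    """
    total = comb(N, n)
    upper = min(n, K)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical counts (not taken from the table): three annotations tested
# in a cluster of 20 genes drawn from a background of 200 genes.
counts = [(8, 20, 15, 200), (3, 20, 25, 200), (1, 20, 40, 200)]
raw = [fisher_enrichment_p(*c) for c in counts]
adjusted = benjamini_hochberg(raw)
```

Adjusting all annotations tested in one cluster together, as done here, is only one possible grouping; the source does not state over which set of tests the correction was applied.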