Abstract
BACKGROUND: Recent years have seen the development of various pathway-based methods for the analysis of microarray gene expression data. These approaches have the potential to bring biological insights into microarray studies. A variety of methods have been proposed to construct networks using gene expression data. Because individual pathways do not act in isolation, it is important to understand how different pathways coordinate to perform cellular functions. However, there are no published methods describing how to build pathway clusters that are closely related to traits of interest.
Year: 2008 PMID: 18254968 PMCID: PMC2335306 DOI: 10.1186/1471-2105-9-87
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Figure 1. A Schematic Diagram of How to Identify Clusters of Pathways. Pathway (gene set) information from externally available databases, such as KEGG, BioCarta and GenMAPP, is combined with gene expression data from clinical studies. We perform pathway-based Random Forests classification to obtain class votes. Using Tight Clustering, we identify clusters of pathways that contain pathways with low out-of-bag (OOB) error rates. We then identify the pathway clusters that are consistent across data sets. These pathway clusters are investigated further for possible crosstalk among them.
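The pathway-based classification step in Figure 1 can be illustrated with a minimal sketch. It assumes scikit-learn's `RandomForestClassifier` as a stand-in for whatever Random Forests implementation the study used, and it runs on synthetic data; the function name `pathway_oob_votes`, the pathway names, and all parameter values are hypothetical.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier


def pathway_oob_votes(expr, labels, pathways, n_trees=500, seed=0):
    """Fit one forest per pathway; return OOB class votes and OOB error rates.

    expr     : (n_samples, n_genes) expression matrix
    labels   : (n_samples,) class labels, e.g. ER status coded 0/1
    pathways : dict mapping pathway name -> list of gene column indices
    """
    votes, oob_error = {}, {}
    for name, gene_idx in pathways.items():
        forest = RandomForestClassifier(
            n_estimators=n_trees, oob_score=True, random_state=seed
        )
        # Each forest sees only the genes that belong to this pathway.
        forest.fit(expr[:, gene_idx], labels)
        # Column 1 holds the averaged OOB probability for the second class,
        # which approximates the fraction of OOB trees voting for that class.
        votes[name] = forest.oob_decision_function_[:, 1]
        oob_error[name] = 1.0 - forest.oob_score_
    return votes, oob_error


# Synthetic example (not study data): genes 0-4 carry class signal.
rng = np.random.default_rng(0)
labels = rng.integers(0, 2, size=60)
expr = rng.normal(size=(60, 20))
expr[:, :5] += labels[:, None] * 2.0
pathways = {
    "informative_set": [0, 1, 2, 3, 4],
    "noise_set": [10, 11, 12, 13, 14],
}
votes, oob_error = pathway_oob_votes(expr, labels, pathways, n_trees=200)
```

Each entry of `votes` is a per-sample vector, so stacking the vectors row by row gives a pathways-by-samples matrix of the kind that a clustering step could take as input.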
Figure 2. Tight Clustering. A diagram illustrating Tight Clustering on class votes.
Breast cancer data sets used in this study
| Data sets | Reference | n | Genes | Response type |
| INTEGEN | | 99 | 54613 | ER status |
| Wang (2005) | [10] | 286 | 22215 | ER status |
| Miller (2005) | [11] | 251 | 22215 | ER status |
Tight Cluster Results 1
| Alzheimer's_disease | 17.13 | 23 |
| MAPK3, PELP1, ESR1, PDZK1, HSPB1, CA12, GLS, IL5, JUNB, GATA3, MAP2K3, MAPT, STH, CSNK1A1 | ||
| Nitrogen_metabolism | 17.13 | 40 |
| Gene Symbols of informative genes in this pathway cluster | ||
| MAPK3, PELP1, ESR1, PDZK1, HSPB1, HDAC2, CA12, GLS, IL5, JUNB, GATA3, MAP2K3 | ||
The bold pathways are those with low OOB error rates
Tight Cluster Results 2
| Alanine_and_aspartate_metabolism | 16.78 | 42 |
| Propanoate_metabolism | 17.83 | 59 |
| ABAT, ALDH1A3, GLUL, GMPS, HMGCL, HSD17B4, MAP3K15, MCCC2, PDHA1 | ||
| Alanine_and_aspartate_metabolism | 17.53 | 42 |
| Propanoate_metabolism | 18.33 | 59 |
| GM-Glycolysis_and_Gluconeogenesis | 23.11 | 66 |
| ABAT, ALDH1A3, HMGCL, HSD17B4, MAP3K15, MCCC2, PDHA1 | ||
| Glycosphingolipid_biosynthesis | 23.23 | 34 |
| Propanoate_metabolism | 21.21 | 85 |
| ABAT, GATA3, HSD17B4, MCCC2, PRKAR1B | ||
The bold pathways are those with low OOB error rates
Figure 3. Pathway Clusters. A pathway cluster containing five pathways, three of which share genes and two of which share none.
Proportion of genes showing more than the indicated number of literature support
| Informative genes not in pathway cluster (Table 2) of top 22 pathways for | BC | ER | PR |
| ≥ 1 | 0.44 | 0.33 | 0.20 |
| ≥ 2 | 0.31 | 0.27 | 0.18 |
| ≥ 5 | 0.24 | 0.20 | 0.09 |
| BC | ER | PR | |
| ≥ 1 | 0.42 | 0.42 | 0.33 |
| ≥ 2 | 0.42 | 0.42 | 0.25 |
| ≥ 5 | 0.25 | 0.33 | 0.17 |
| Informative genes not in pathway cluster (Table 2) of top 22 pathways for | BC | ER | PR |
| ≥ 1 | 0.41 | 0.35 | 0.20 |
| ≥ 2 | 0.33 | 0.28 | 0.13 |
| ≥ 5 | 0.24 | 0.24 | 0.11 |
| BC | ER | PR | |
| ≥ 1 | 1.00 | 1.00 | 0.75 |
| ≥ 2 | 1.00 | 0.75 | 0.63 |
| ≥ 5 | 0.63 | 0.63 | 0.25 |
| Informative genes not in pathway cluster (Table 2) of top 22 pathways for | BC | ER | PR |
| ≥ 1 | 0.50 | 0.22 | 0.16 |
| ≥ 2 | 0.44 | 0.38 | 0.25 |
| ≥ 5 | 0.34 | 0.28 | 0.13 |
| BC | ER | PR | |
| ≥ 1 | 0.88 | 0.75 | 0.63 |
| ≥ 2 | 0.75 | 0.63 | 0.50 |
| ≥ 5 | 0.50 | 0.50 | 0.25 |
BC = breast cancer, ER = estrogen receptor, PR = progesterone receptor
Breast Cancer Citations
| In citation | Not in citation | p-value | |
| Genes in pathway cluster (Table 2) | 8 | 4 | |
| Informative genes not in pathway cluster (Table 2) of top 22 pathways | 20 | 25 | 0.149 |
| In citation | Not in citation | p-value | |
| Genes in pathway cluster (Table 2) | 8 | 0 | |
| Informative genes not in pathway cluster (Table 2) of top 22 pathways | 19 | 27 | 0.002 |
| In citation | Not in citation | p-value | |
| Genes in pathway cluster (Table 2) | 7 | 1 | |
| Informative genes not in pathway cluster (Table 2) of top 22 pathways | 16 | 16 | 0.061 |
Estrogen Receptor Citations
| In citation | Not in citation | p-value | |
| Genes in pathway cluster (Table 2) | 8 | 4 | |
| Informative genes not in pathway cluster (Table 2) of top 22 pathways | 16 | 30 | 0.048 |
| In citation | Not in citation | p-value | |
| Genes in pathway cluster (Table 2) | 8 | 0 | |
| Informative genes not in pathway cluster (Table 2) of top 22 pathways | 15 | 30 | 0.0006 |
| In citation | Not in citation | p-value | |
| Genes in pathway cluster (Table 2) | 6 | 2 | |
| Informative genes not in pathway cluster (Table 2) of top 22 pathways | 7 | 25 | 0.0084 |
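The citation tables above each compare two gene groups in a 2x2 layout and report a p-value. The test used is not named in this excerpt, so the following is only a sketch under the assumption of a one-sided Fisher exact test; the function name `fisher_one_sided` is hypothetical.

```python
from math import comb


def fisher_one_sided(a, b, c, d):
    """One-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    Returns P(X >= a), where X is the top-left cell under the
    hypergeometric null with all row and column totals held fixed.
    """
    row1 = a + b          # genes in the pathway cluster
    col1 = a + c          # genes found in citations
    n = a + b + c + d     # all genes in the table
    max_a = min(row1, col1)
    total = comb(n, row1)
    tail = sum(
        comb(col1, k) * comb(n - col1, row1 - k)
        for k in range(a, max_a + 1)
    )
    return tail / total


# Counts copied from two of the tables above (in citation / not in citation).
p_bc = fisher_one_sided(8, 0, 19, 27)   # breast cancer citations
p_er = fisher_one_sided(8, 0, 15, 30)   # estrogen receptor citations
```

Under this assumption the two tables with an empty cell give values of roughly 0.002 and 0.0006, in line with the reported p-values; that agreement is suggestive only and does not establish which test the authors applied.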
Figure 4. Links between GATA3 and CARM1 pathways using HPRD. The connections between genes in the GATA3 and CARM1 pathways, based on information obtained from HPRD.
Shortest Path between GATA3 and other Genes in the Top 22 Pathways (without overlap with GATA3 pathway)
| 2 | 6 [MUC1, SMAD8, IKK-alpha, HDAC4, HNF3-alpha, GATA-1] | 7 | |
| 2 | 1 [MUC1] | 6 (subset of the 7 above) | |
| 2 | 1 [IKK-alpha] | 0 | |
| 3 | |||
| 3 | |||
| 3 | |||
| 3 | |||
| 3 | |||
| 3 | |||
| 3 | |||
| 3 | |||
| 3 | |||
| 3 | |||
| 3 | |||
| 3 | |||
| 3 | |||
| 4 | |||
| 4 | |||
| 4 | |||
| 4 | |||
| 5 | |||
| 5 | |||
| 6 | |||
| 6 | |||
| 6 | |||
| Infinity |
*shown only for the shortest distance
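The distances in the table above are shortest-path lengths in a protein interaction network, with "Infinity" marking genes that cannot be reached. A minimal sketch of that computation using breadth-first search follows; the edge list is a toy placeholder with made-up node names, not HPRD data, and the function names are hypothetical.

```python
from collections import deque


def shortest_path_lengths(edges, source):
    """Breadth-first search over an undirected, unweighted interaction graph.

    Returns a dict mapping every reachable node to its distance from source.
    """
    neighbours = {}
    for a, b in edges:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in neighbours.get(node, ()):
            if nxt not in dist:          # first visit is the shortest route
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist


def distance(dist, target):
    """Distance to target, or infinity when no path exists."""
    return dist.get(target, float("inf"))


# Toy network: SRC - P1 - P2 - P3, plus a disconnected pair X - Y.
edges = [("SRC", "P1"), ("P1", "P2"), ("P2", "P3"), ("X", "Y")]
dist = shortest_path_lengths(edges, "SRC")
```

Here `distance(dist, "P2")` is 2 with one intermediate node (`P1`), while `distance(dist, "Y")` is infinite because `Y` lies in a separate component.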
Shortest Path between CA12 and other Genes in the Top 22 Pathways (without overlap with Nitrogen Metabolism pathway)
| 3 | 1 (HIF-1, NCOA1) | 4 | |
| 3 | 4 (HIF-1 + MAPK3, Beta-catenin, STAT5B, MAPK1) | 6 | |
| 3 | 2 (HIF-1 + MAPK1, MAPK3) | 3 | |
| 4 | |||
| 4 | |||
| 4 | |||
| 5 | |||
| 5 | |||
| 5 | |||
| 6 | |||
| 6 | |||
| 6 | |||
| 6 | |||
| 7 | |||
| 7 | |||
| 7 | |||
| 7 | |||
| 9 | |||
| Infinity |
*shown only for the shortest distance