Seon-Young Kim, David J Volsky.
Abstract
BACKGROUND: Gene set enrichment analysis (GSEA) is a microarray data analysis method that uses predefined gene sets and ranks of genes to identify significant biological changes in microarray data sets. GSEA is especially useful when gene expression changes in a given microarray data set are minimal or moderate.
Year: 2005 PMID: 15941488 PMCID: PMC1183189 DOI: 10.1186/1471-2105-6-144
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Figure 1. Distribution pattern of fold change values in a microarray data set and determination of minimal gene set size in PAGE. A and B. A histogram (A) and a quantile-quantile (Q-Q) plot against the standard normal distribution (B) of fold change values from a microarray data set. The diabetic muscle microarray data set [25] was analyzed as described in the Methods section. The fold change values between normal and patient groups were calculated and used to draw the histogram (A) and Q-Q plot (B). C and D. A histogram (C) and a Q-Q plot (D) of averages of 10 randomly sampled fold change values from the diabetic muscle microarray data. The Kolmogorov-Smirnov normality test was performed with the null hypothesis that the distribution is normal. For the distribution of fold change values (A and B), the null hypothesis was rejected (D = 0.08, p-value < 2.2e-16). For the distribution of averages of 10 randomly sampled fold change values (C and D), the null hypothesis was not rejected (D = 0.0239, p-value = 0.1783).
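The effect described in this caption (raw fold changes deviate from normality, while averages of 10 sampled values are close to normal) can be illustrated with a minimal sketch. It uses simulated heavy-tailed values as a stand-in for real fold changes, which is an assumption made for illustration and not the diabetic muscle data. The helper `ks_statistic_vs_normal` is hypothetical and reports only the Kolmogorov-Smirnov distance D after standardizing with the sample mean and standard deviation; it does not reproduce the p-values quoted above.

```python
import random
import statistics


def ks_statistic_vs_normal(values):
    """Kolmogorov-Smirnov distance D between the empirical CDF of the
    standardized values and the standard normal CDF."""
    mu = statistics.fmean(values)
    sd = statistics.stdev(values)
    z = sorted((v - mu) / sd for v in values)
    n = len(z)
    std_normal = statistics.NormalDist()
    d = 0.0
    for i, x in enumerate(z):
        cdf = std_normal.cdf(x)
        # Largest gap just after and just before each step of the empirical CDF
        d = max(d, (i + 1) / n - cdf, cdf - i / n)
    return d


rng = random.Random(0)

# Simulated heavy-tailed values (difference of two exponentials),
# used here only as a stand-in for real fold change values.
fold_changes = [rng.expovariate(1.0) - rng.expovariate(1.0) for _ in range(5000)]

# Averages of 10 values drawn at random from the simulated fold changes.
means_of_10 = [statistics.fmean(rng.sample(fold_changes, 10)) for _ in range(5000)]

d_raw = ks_statistic_vs_normal(fold_changes)
d_mean = ks_statistic_vs_normal(means_of_10)
print(f"D for raw values:     {d_raw:.4f}")
print(f"D for averages of 10: {d_mean:.4f}")
```

A smaller D for the averaged values than for the raw values is what the central limit theorem predicts, and it is the reason a minimal gene set size matters for a parametric test.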
Comparison of PAGE with GSEA
| Gene Set (PAGE) | Z score | p-value | Gene Set (GSEA) | ES | p-value |
| OXPHOS_HG-U133A | -10.5835 | <1.0E-11 | OXPHOS_HG-U133A | 346.8827 | 0.003 |
| human_mitoDB_6_2002_HG-U133A | -6.7213 | 1.81E-11 | human_mitoDB_6_2002_HG-U133A | 215.9424 | 0.091 |
| mitochondr_HG-U133A | -6.4761 | 9.46E-11 | mitochondr_HG-U133A | 207.9381 | 0.087 |
| MAP00190_Oxidative_phosphorylation | -4.5745 | 4.78E-05 | c20_U133 | 181.1569 | 0.062 |
| c20_U133 | -3.7461 | 0.0002 | MAP00190_Oxidative_phosphorylation | 148.9061 | 0.084 |
| c25_U133 | -2.7617 | 0.0058 | c22_U133 | 142.9006 | 0.028 |
| c21_U133 | -2.1116 | 0.0347 | c29_U133 | 131.4732 | 0.026 |
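The Z score and p-value columns in the table above are numerically consistent with a two-sided test against the standard normal distribution. The following sketch assumes the Z score has the usual form Z = (Sm - mu) * sqrt(m) / sigma, where mu and sigma are the mean and standard deviation of all fold change values and Sm is the mean fold change of the m genes in a gene set. This section does not state the formula, so the function names, the choice of population standard deviation, and the toy numbers are assumptions for illustration.

```python
import math
import statistics


def page_z_score(all_fold_changes, gene_set_fold_changes):
    """Z score of a gene set under the assumed form (Sm - mu) * sqrt(m) / sigma."""
    mu = statistics.fmean(all_fold_changes)
    # Population standard deviation is an assumption; the sample version
    # differs only slightly for array-sized data.
    sigma = statistics.pstdev(all_fold_changes)
    m = len(gene_set_fold_changes)
    s_m = statistics.fmean(gene_set_fold_changes)
    return (s_m - mu) * math.sqrt(m) / sigma


def two_sided_p(z):
    """Two-sided p-value of a Z score under the standard normal distribution."""
    return math.erfc(abs(z) / math.sqrt(2))


# Toy data (hypothetical): five fold changes, with a gene set of two of them.
toy_z = page_z_score([-2.0, -1.0, 0.0, 1.0, 2.0], [1.0, 2.0])
print(f"toy Z score: {toy_z:.4f}")

# Z score of c25_U133 from the table; compare with the tabulated p-value of 0.0058.
print(f"p for Z = -2.7617: {two_sided_p(-2.7617):.4f}")
```

Because the p-value comes from a closed-form normal tail, no permutation of sample labels is needed in this sketch.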
Application of PAGE to different Affymetrix probe level analysis methods
| MAS5 | | | MBEI | | | RMA | | |
| Gene Set | Z score | p-value | Gene Set | Z score | p-value | Gene Set | Z score | p-value |
| Inflammatory Response Pathway | 7.5051 | <1.0E-12 | Inflammatory Responses | 7.5195 | 5.51E-14 | Inflammatory Responses | 7.3487 | 2.00E-13 |
| Eicosanoid Synthesis | 6.5925 | <1.0E-12 | Eicosanoid Synthesis | 3.8957 | 9.79E-05 | Eicosanoid Synthesis | 4.1557 | 3.24E-05 |
| Complement Activation Classical | 3.0382 | 0.0024 | Complement Activation | 3.5487 | 0.0004 | TGF-β Signaling Pathway | 3.0438 | 0.0023 |
| Nucleotide Metabolism | 2.0536 | 0.0400 | Nucleotide Metabolism | 2.4207 | 0.0155 | Complement Activation Classical | 2.9402 | 0.0033 |
| TGF-β Signaling Pathway | 1.9758 | 0.0482 | TGF-β Signaling Pathway | 2.2379 | 0.0250 | Nucleotide Metabolism | 2.5867 | 0.0097 |
| MAPK Cascade | -2.2529 | 0.0243 | Glutamate Metabolism | -1.7249 | 0.0846 | GPCRs Class A Rhodopsin-like | -1.9907 | 0.0465 |
| Translation Factors | -2.3124 | 0.0208 | MAPK Cascade | -1.8685 | 0.0617 | MAPK Cascade | -2.1306 | 0.0331 |
| Krebs-TCA Cycle | -3.2551 | 0.0011 | Proteasome Degradation | -1.9605 | 0.0499 | Krebs-TCA Cycle | -2.6076 | 0.0091 |
| Glycogen Metabolism | -3.3488 | 0.0008 | Krebs-TCA Cycle | -2.1182 | 0.0342 | Proteasome Degradation | -2.7822 | 0.0054 |
| Proteasome Degradation | -3.7468 | 0.0002 | Glycogen Metabolism | -2.8330 | 0.0046 | Glycogen Metabolism | -2.9328 | 0.0034 |
| Fatty Acid Degradation | -3.8286 | 0.0001 | Fatty Acid Degradation | -2.8570 | 0.0043 | Fatty Acid Degradation | -3.4024 | 0.0007 |
| Nuclear Receptors | -4.2579 | 2.06E-05 | Nuclear Receptors | -3.4686 | 0.0005 | Nuclear Receptors | -4.1600 | 3.18E-05 |
| Electron Transport Chain | -6.3789 | 1.78E-10 | Electron Transport Chain | -4.2009 | 2.66E-05 | Electron Transport Chain | -5.1177 | 3.09E-07 |
Comparison of PAGE results from data sets produced using different microarray platforms
| U95A | | | U133A | | | Agilent | | |
| Gene Set | Z score | p-value | Gene Set | Z score | p-value | Gene Set | Z score | p-value |
| Complement Activation Classical | 3.0477 | 0.0023 | Krebs TCA Cycle | 5.0505 | 4.41E-07 | tRNA Synthetases | 3.1714 | 0.0015 |
| Krebs-TCA Cycle | 2.6771 | 0.0074 | Cell Cycle | 3.4088 | 0.0007 | Krebs TCA Cycle | 3.1291 | 0.0018 |
| Nuclear Receptors | 2.6562 | 0.0079 | Translation Factors | 2.8213 | 0.0048 | Proteasome Degradation | 1.9894 | 0.0467 |
| Calcium Channels | 1.6956 | 0.0900 | Nuclear Receptors | 2.3404 | 0.0193 | Glycolysis and Gluconeogenesis | 1.5679 | 0.1169 |
| Apoptosis | 1.4085 | 0.1590 | Complement Activation Classical | 2.2861 | 0.0223 | Steroid Biosynthesis | 1.5109 | 0.1308 |
| TGF Beta Signaling Pathway | -1.8580 | 0.0632 | Matrix Metalloproteinases | -2.2837 | 0.0224 | Ribosomal Proteins | -2.1275 | 0.0334 |
| Inflammatory Response Pathway | -2.2590 | 0.0239 | TGF Beta Signaling Pathway | -3.3762 | 0.0007 | Glycogen Metabolism | -2.4904 | 0.0128 |
| Glycogen Metabolism | -2.6981 | 0.0070 | Inflammatory Response Pathway | -3.5668 | 0.0004 | TGF Beta Signaling Pathway | -2.6910 | 0.0071 |
| Cholesterol Biosynthesis | -4.6988 | 2.62E-06 | Cholesterol Biosynthesis | -5.8165 | 6.01E-09 | Cholesterol Biosynthesis | -4.1642 | 3.12E-05 |
| Gap Junction Proteins-Connexins | -5.7792 | 7.51E-09 | Gap Junction Proteins-Connexins | -6.2689 | 3.64E-10 | Gap Junction Proteins-Connexins | -6.6162 | 3.69E-11 |
Figure 2. Comparison of different microarray data sets at the gene set level shows better congruence than comparison at the gene level. A. Comparison of two different microarray data sets at the gene level. Two microarray data sets, GDS 287 (Muscle function and aging-Male) and GDS 472 (Muscle function and aging-Female), were analyzed; significantly changed genes (|fold change| > 1.5 and t-test p < 0.05) were selected from each data set, and the percentage of genes common to both data sets was calculated. B. Comparison at the gene set level. We first performed PAGE on the two microarray data sets, selected significant gene sets (p < 0.05), and calculated the percentage of gene sets common to both data sets.
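The gene-level comparison in panel A can be sketched as a filter followed by a set overlap. The caption does not say which denominator was used for the percentage, so this sketch assumes the union of the two significant lists. The gene names, values, and both function names are hypothetical.

```python
def select_significant(results, fc_cutoff=1.5, p_cutoff=0.05):
    """Return genes with |fold change| > fc_cutoff and p < p_cutoff.

    `results` maps a gene name to a (fold_change, p_value) pair.
    """
    return {
        gene
        for gene, (fold_change, p_value) in results.items()
        if abs(fold_change) > fc_cutoff and p_value < p_cutoff
    }


def percent_common(items_a, items_b):
    """Percentage of items shared by both collections, relative to their union
    (the denominator is an assumption, not taken from the caption)."""
    a, b = set(items_a), set(items_b)
    union = a | b
    if not union:
        return 0.0
    return 100.0 * len(a & b) / len(union)


# Hypothetical per-gene results for two data sets.
dataset_1 = {"G1": (2.0, 0.01), "G2": (-1.8, 0.03), "G3": (1.2, 0.001), "G4": (1.7, 0.2)}
dataset_2 = {"G1": (1.9, 0.02), "G2": (-1.1, 0.04), "G5": (-2.5, 0.004)}

sig_1 = select_significant(dataset_1)
sig_2 = select_significant(dataset_2)
pct = percent_common(sig_1, sig_2)
print(sorted(sig_1), sorted(sig_2), f"{pct:.1f}% in common")
```

The same `percent_common` helper applies unchanged to panel B if it is given the names of gene sets with p < 0.05 instead of gene names.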
Comparison of two microarray data sets at gene set level
| | GDS 287 | | GDS 472 | |
| Gene Set | Z score | p-value | Z score | p-value |
| mRNA processing | 4.0322 | 0.0001 | 3.2288 | 0.0012 |
| cell cycle | 3.0715 | 0.0021 | 3.1127 | 0.0019 |
| mRNA catabolism | 2.2233 | 0.0262 | 2.9762 | 0.0029 |
| mRNA splicing | 5.9613 | 3.00E-09 | 2.8041 | 0.0050 |
| nuclear mRNA splicing_via spliceosome | 6.3123 | 2.75E-10 | 2.6642 | 0.0077 |
| regulation of cyclin dependent protein kinase activity | 2.1343 | 0.0328 | 2.5703 | 0.0102 |
| G1 phase of mitotic cell cycle | 2.5482 | 0.0108 | 2.5361 | 0.0112 |
| negative regulation of cell proliferation | 3.0116 | 0.0026 | 2.0812 | 0.0374 |
| cholesterol metabolism | 2.2758 | 0.0229 | 2.0312 | 0.0422 |
| blood coagulation | -2.6705 | 0.0076 | -2.0119 | 0.0442 |
| protein folding | -2.1739 | 0.0297 | -2.1045 | 0.0353 |
| regulation of blood pressure | -3.0984 | 0.0019 | -2.3543 | 0.0186 |
| carboxylic acid transport | -2.9366 | 0.0033 | -2.8088 | 0.0050 |
| signal transduction | -2.5917 | 0.0095 | -3.0115 | 0.0026 |
| glycolysis | -4.0631 | 4.84E-05 | -5.3157 | 1.06E-07 |
| tricarboxylic acid cycle | -2.2914 | 0.0219 | -6.3019 | 2.94E-10 |
| electron transport | -3.9150 | 0.0001 | -6.9442 | 3.81E-12 |
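One simple way to read the table above is to check whether each gene set changes in the same direction in both data sets. The sketch below does this for a handful of rows copied from the table; the helper name and the choice of rows are illustrative only.

```python
# (gene set, Z score in GDS 287, Z score in GDS 472), copied from the table above
rows = [
    ("mRNA processing", 4.0322, 3.2288),
    ("cell cycle", 3.0715, 3.1127),
    ("blood coagulation", -2.6705, -2.0119),
    ("glycolysis", -4.0631, -5.3157),
    ("electron transport", -3.9150, -6.9442),
]


def same_direction(z_a, z_b):
    """True when both Z scores have the same sign."""
    return (z_a > 0) == (z_b > 0)


concordant = [name for name, z_a, z_b in rows if same_direction(z_a, z_b)]
fraction_concordant = len(concordant) / len(rows)
print(f"{len(concordant)} of {len(rows)} gene sets change in the same direction")
```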