Stephen W Tanner, Pankaj Agarwal.
Abstract
BACKGROUND: Microarray experiments measure changes in the expression of thousands of genes. The resulting lists of genes with changes in expression are then searched for biologically related sets using several divergent methods such as the Fisher Exact Test (as used in multiple GO enrichment tools), Parametric Analysis of Gene Expression (PAGE), Gene Set Enrichment Analysis (GSEA), and the connectivity map.
Year: 2008 PMID: 18721468 PMCID: PMC2543031 DOI: 10.1186/1471-2105-9-348
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Figure 1. Cumulative distribution functions of Pearson enrichment scores for two gene sets across the CMAP corpus. The variances of the two distributions clearly differ. However, for most gene sets the empirical distribution of scores across the corpus fits a normal distribution well. For each gene set, we calculated a p-value based on the normal distribution specific to that gene set.
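The per-gene-set calibration described in the Figure 1 caption can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the Pearson enrichment score is the correlation between an experiment's gene values and a 0/1 gene set membership vector, that the normal distribution is fitted by the sample mean and standard deviation of the corpus scores, and that the p-value is two-sided. The function names are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # correlation is undefined for a constant vector
    return sxy / math.sqrt(sxx * syy)


def enrichment_score(gene_values, gene_set):
    """Assumed score: correlate gene values with 0/1 set membership."""
    genes = sorted(gene_values)
    values = [gene_values[g] for g in genes]
    membership = [1.0 if g in gene_set else 0.0 for g in genes]
    return pearson(values, membership)


def calibrated_p_value(score, corpus_scores):
    """Two-sided p-value of `score` under a normal distribution fitted
    to this gene set's scores across the corpus (assumed calibration)."""
    n = len(corpus_scores)
    mu = sum(corpus_scores) / n
    sd = math.sqrt(sum((s - mu) ** 2 for s in corpus_scores) / (n - 1))
    z = (score - mu) / sd
    return math.erfc(abs(z) / math.sqrt(2))
```

Because the mean and standard deviation are estimated separately for each gene set, the same raw score can be highly significant for a gene set with a narrow corpus distribution and unremarkable for one with a wide distribution, which is the difference in variance the figure illustrates.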
Figure 2. Precision (1 - FDR) of gene set query methods with Pearson-based p-values calibrated on GEO and CMAP, compared with permutation-based p-values and with no permutation. Also included for comparison are the GSEA q-value (based on FDR) and the GSEA p-value (based on FWER). We compared precision across the various queries for each threshold number of gene sets (N), plotted on the x-axis (as described in Methods).
Figure 3. Comparison of query accuracy on the evaluation set, with p-values calibrated against the GEO corpus using Pearson correlation. Queries based on signed p-values were more effective than those based on p-values alone. Cyber-T was also highly effective, especially for N > 60. Using log fold changes as gene values was consistently the least effective, perhaps because of noise in the log fold change for genes with low expression.
Figure 4. Comparison of query accuracy on the evaluation set using various enrichment models. The comparison is based on the GEO corpus using Cyber-T. PAGE produces the best results, with Pearson a very close second; an additional advantage of Pearson correlation is that it is effective for queries against both vectors and gene sets.
Top-scoring differentially expressed gene sets found for pairs of related microarray experiments (from the category in column 1) using Geneva.
| Category | Rank | p-value | Gene set | Source |
|---|---|---|---|---|
| Muscle | 1 | 7.89E-10 | Glycolysis_and_Gluconeogenesis | GenMAPP |
| Muscle | 2 | 6.93E-09 | Costamere: CC | GOA |
| Muscle | 3 | 4.37E-07 | superpathway of glycolysis, pyruvate dehydrogenase, TCA, and glyoxylate bypass | HumanCyc |
| Muscle | 4 | 4.86E-07 | Contractile Fiber Part: CC | GOA |
| Muscle | 5 | 6.54E-07 | Z Disc: CC | GOA |
| Muscle | 6 | 9.20E-07 | Small Leucine-Rich Proteoglycan (SLRP) Molecules | BioCarta |
| Muscle | 7 | 1.69E-06 | aspartate degradation II | HumanCyc |
| Muscle | 8 | 4.80E-06 | Myofibril: CC | GOA |
| Muscle | 9 | 4.87E-06 | gluconeogenesis | HumanCyc |
| Muscle | 10 | 5.38E-06 | Contractile Fiber: CC | GOA |
| Malaria | 1 | 1.60E-08 | Immune Response-Regulating Signal Transduction: BP | GOA |
| Malaria | 2 | 1.60E-08 | Immune Response-Regulating Cell Surface Receptor Signaling Pathway: BP | GOA |
| Malaria | 3 | 1.60E-08 | Immune Response-Activating Signal Transduction: BP | GOA |
| Malaria | 4 | 1.60E-08 | Immune Response-Activating Cell Surface Receptor Signaling Pathway: BP | GOA |
| Malaria | 5 | 1.60E-08 | Antigen Receptor-Mediated Signaling Pathway: BP | GOA |
| Malaria | 6 | 1.67E-08 | T Cell Receptor Signaling Pathway: BP | GOA |
| Malaria | 7 | 1.76E-08 | Regulation Of T Cell Receptor Signaling Pathway: BP | GOA |
| Malaria | 8 | 2.69E-08 | Regulation Of Antigen Receptor-Mediated Signaling Pathway: BP | GOA |
| Malaria | 9 | 2.64E-07 | Activation Of Csk By cAMP-Dependent Protein Kinase Inhibits Signaling Through The T Cell Receptor | BioCarta |
| Malaria | 10 | 5.02E-07 | Locomotion: BP | GOA |
| AD | 1 | 1.13E-11 | Proton-Transporting Two-Sector ATPase Complex: CC | GOA |
| AD | 2 | 1.13E-11 | Hydrogen-Translocating V-Type ATPase Complex: CC | GOA |
| AD | 3 | 9.31E-11 | Long-Term Memory: BP | GOA |
| AD | 4 | 4.91E-10 | aspartate degradation II | HumanCyc |
| AD | 5 | 1.78E-09 | Proton-Transporting ATP Synthase Complex: CC | GOA |
| AD | 6 | 1.78E-09 | Proton-Transporting ATP Synthase Complex (sensu Eukaryota): CC | GOA |
| AD | 7 | 1.78E-09 | Hydrogen-Translocating F-Type ATPase Complex: CC | GOA |
| AD | 8 | 1.95E-09 | Hydrogen Ion Transporter Activity: MF | GOA |
| AD | 9 | 6.44E-09 | Monovalent Inorganic Cation Transporter Activity: MF | GOA |
| AD | 10 | 7.75E-09 | Ubiquinol-Cytochrome-C Reductase Activity: MF | GOA |
Pearson correlation was used, and was calibrated against a corpus of experiments from GEO (see Methods).
The p-value reported is the product of the p-values for the two related experiments.
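As a small illustration of the footnote above, the sketch below multiplies the p-values a gene set receives in two related experiments and ranks gene sets by that product. The function name and the numeric inputs are hypothetical placeholders, not values from the paper. Note that a product of two p-values is used here only as a ranking score; it is not itself uniformly distributed under the null hypothesis.

```python
def rank_by_combined_p(p_first, p_second):
    """Rank gene sets by the product of their p-values in two experiments.

    p_first, p_second: dicts mapping gene set name -> p-value, one dict
    per related experiment. Only gene sets scored in both are ranked.
    """
    shared = set(p_first) & set(p_second)
    combined = {name: p_first[name] * p_second[name] for name in shared}
    # Smallest product first, i.e. most consistently significant gene set.
    return sorted(combined.items(), key=lambda item: item[1])


# Placeholder p-values for two hypothetical related experiments.
experiment_a = {"set_X": 1e-5, "set_Y": 1e-2, "set_Z": 0.5}
experiment_b = {"set_X": 1e-4, "set_Y": 1e-6, "set_W": 0.1}

ranking = rank_by_combined_p(experiment_a, experiment_b)
```

A gene set ranks highly under this score only when it is significant in both experiments, or overwhelmingly significant in one.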
Treatment pairs considered unrelated for the purposes of the evaluation experiments.
| Treatment 1 | Treatment 2 |
|---|---|
| Muscle | Malaria |
| Muscle | Glioma |
| Malaria | Glioma |
| Malaria | Obesity |
| AD | Malaria |
| AD | Obesity |
| Glioma | Obesity |
Each treatment has two experiments, for a total of 28 unrelated experiment pairs.
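The count of 28 follows from 7 unrelated treatment pairs, each contributing 2 × 2 experiment combinations. A minimal sketch of that enumeration is shown below; the experiment labels (`Muscle_1`, `Muscle_2`, and so on) are hypothetical placeholders, since the individual experiments are not named here.

```python
from itertools import product

# The seven unrelated treatment pairs listed in the table.
UNRELATED_TREATMENTS = [
    ("Muscle", "Malaria"),
    ("Muscle", "Glioma"),
    ("Malaria", "Glioma"),
    ("Malaria", "Obesity"),
    ("AD", "Malaria"),
    ("AD", "Obesity"),
    ("Glioma", "Obesity"),
]


def unrelated_experiment_pairs(treatment_pairs, experiments_per_treatment=2):
    """Enumerate every cross-treatment experiment pair."""
    pairs = []
    for first, second in treatment_pairs:
        # Placeholder labels such as "Muscle_1"; real experiment IDs differ.
        first_exps = [f"{first}_{i}" for i in range(1, experiments_per_treatment + 1)]
        second_exps = [f"{second}_{i}" for i in range(1, experiments_per_treatment + 1)]
        pairs.extend(product(first_exps, second_exps))
    return pairs


experiment_pairs = unrelated_experiment_pairs(UNRELATED_TREATMENTS)
```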