Irina Dinu, Qi Liu, John D Potter, Adeniyi J Adewale, Gian S Jhangri, Thomas Mueller, Gunilla Einecke, Konrad Famulsky, Philip Halloran, Yutaka Yasui.
Abstract
Gene-set analysis of microarray data evaluates biological pathways, or gene sets, for their differential expression by a phenotype of interest. In contrast to the analysis of individual genes, gene-set analysis utilizes existing biological knowledge of genes and their pathways in assessing differential expression. This paper evaluates the biological performance of five gene-set analysis methods testing "self-contained null hypotheses" via subject sampling, along with the most popular gene-set analysis method, Gene Set Enrichment Analysis (GSEA). We use three real microarray analyses in which differentially expressed gene sets are predictable biologically from the phenotype. Two types of gene sets are considered for this empirical evaluation: one type contains "truly positive" sets that should be identified as differentially expressed; and the other type contains "truly negative" sets that should not be identified as differentially expressed. Our evaluation suggests advantages of SAM-GS, Global, and ANCOVA Global methods over GSEA and the other two methods.
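The idea of testing a self-contained null hypothesis by subject sampling can be illustrated with a small permutation test: the phenotype labels are shuffled across subjects and a set-level statistic is recomputed each time. The sketch below is not the implementation of any method evaluated in this paper. The statistic (a sum of squared Welch t-statistics over the genes in a set) and the names `set_statistic` and `permutation_p_value` are assumptions chosen only for illustration.

```python
import random
from statistics import mean, variance


def set_statistic(expr, labels):
    """Sum of squared Welch t-statistics over the genes in one set.

    expr:   list of genes, each a list of expression values (one per subject).
    labels: list of 0/1 phenotype labels, one per subject.
    Each phenotype group needs at least two subjects.
    """
    total = 0.0
    for gene in expr:
        a = [x for x, g in zip(gene, labels) if g == 0]
        b = [x for x, g in zip(gene, labels) if g == 1]
        se2 = variance(a) / len(a) + variance(b) / len(b)
        if se2 > 0:  # skip genes with no variation in either group
            total += (mean(a) - mean(b)) ** 2 / se2
    return total


def permutation_p_value(expr, labels, n_perm=1000, seed=0):
    """P-value for 'no gene in the set is associated with the phenotype',
    obtained by shuffling phenotype labels across subjects."""
    rng = random.Random(seed)
    observed = set_statistic(expr, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # group sizes are preserved
        if set_statistic(expr, shuffled) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_perm + 1)
```

Because only the labels are shuffled, the correlation among genes within each subject is left intact, which is one reason subject sampling is commonly preferred when the null hypothesis concerns the gene set itself.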
Year: 2008 PMID: 19259416 PMCID: PMC2623289 DOI: 10.4137/cin.s867
Source DB: PubMed Journal: Cancer Inform ISSN: 1176-9351
P-values of the 17 pathways that include CDKN2A gene by the six methods (with p-value or FDR < 0.05 in bold).
| Gene set | SAM-GS | Global | ANCOVA global | Tian | Tomfohr | GSEA P | GSEA FDR |
|---|---|---|---|---|---|---|---|
| arfPathway | < | 0.284 | 1.000 | 0.143 | 0.807 | ||
| breast_cancer_estrogen_signalling | < | 0.506 | 0.750 | 0.443 | 0.824 | ||
| cell_cycle_arrest | < | 0.053 | 0.056 | 0.174 | 1.000 | 0.904 | 0.892 |
| cellcyclePathway | < | < | 0.926 | 1.000 | 0.371 | 0.638 | |
| g1Pathway | < | < | < | 0.476 | 1.000 | 0.190 | 0.549 |
| SA_G1_AND_S_PHASES | < | 0.164 | 1.000 | 0.163 | 0.824 | ||
| SIG_PIP3_signaling_in_B_lymphocytes | < | 0.898 | 1.000 | 0.497 | 0.815 | ||
| SIG_PIP3SIGINCARDIACMYOCTES | < | 0.712 | 1.000 | 0.285 | 0.605 | ||
| ST_Phosphoinositide_3_Kinase_Pathway | < | 0.822 | 1.000 | 0.122 | 0.743 | ||
| SIG_InsulinReceptorPathwayInCardiacMyocytes | 0.218 | 0.999 | 0.249 | 0.608 | |||
| p53_signalling | 0.002 | 0.042 | 0.036 | 0.010 | 1.000 | 0.038 | 0.594 |
| ST_Integrin_Signaling_Pathway | 1.000 | 0.175 | 0.515 | ||||
| HTERT_UP | 0.056 | 0.056 | 0.164 | 1.000 | 0.510 | 0.703 | |
| drug_resistance_and_metabolism | 0.104 | 0.102 | 0.822 | 1.000 | 0.526 | 0.789 | |
| CR_CELL_CYCLE | 0.116 | 0.109 | 0.600 | 1.000 | 0.623 | 0.757 | |
| Cell_Cycle | 0.103 | 0.091 | 0.160 | 1.000 | 0.720 | 0.852 | |
| PROLIF_GENES | 0.097 | 0.070 | 0.246 | 1.000 | 0.404 | 0.645 | |
P-values of the 18 pathways that include PTEN gene by the six methods (with p-value or FDR < 0.05 in bold).
| Gene set | SAM-GS | Global | ANCOVA global | Tian | Tomfohr | GSEA P | GSEA FDR |
|---|---|---|---|---|---|---|---|
| igf1mtorPathway | < | 0.258 | 0.808 | 0.141 | 0.987 | ||
| ptenPathway | <0.001 | 0.008 | 0.006 | 0.012 | 1.000 | 0.000 | 0.106 |
| SA_PTEN_PATHWAY | <0.001 | 0.003 | 0.008 | 0.014 | 1.000 | 0.023 | 0.734 |
| tumor_supressor | < | 0.876 | 1.000 | 0.482 | 0.956 | ||
| INS | 0.002 | 0.031 | 0.022 | 0.004 | 1.000 | 0.038 | 0.664 |
| eif4Pathway | 0.280 | 0.987 | 0.717 | 0.979 | |||
| mtorPathway | 0.113 | 0.123 | 0.302 | 1.000 | 0.204 | 0.974 | |
| SIG_PIP3_signaling_in_B_lymphocytes | 0.061 | 0.058 | 0.222 | 1.000 | 0.548 | 0.936 | |
| metPathway | 0.806 | 1.000 | 0.732 | 0.977 | |||
| ST_Phosphoinositide_3_Kinase_Pathway | 0.080 | 0.066 | 0.276 | 1.000 | 0.116 | 1.000 | |
| SIG_CHEMOTAXIS | 0.085 | 0.073 | 0.588 | 1.000 | 0.721 | 0.964 | |
| SIG_InsulinReceptorPathwayInCardiacMyocytes | 0.076 | 0.072 | 0.120 | 1.000 | 1.000 | ||
| ST_Integrin_Signaling_Pathway | 0.076 | 0.066 | 0.538 | 1.000 | 0.915 | 0.976 | |
| SIG_PIP3SIGINCARDIACMYOCTES | 0.092 | 0.318 | 0.320 | 0.130 | 1.000 | 0.216 | 1.000 |
| cell_proliferation | 0.127 | 0.170 | 0.171 | 0.950 | 1.000 | 0.409 | 0.894 |
| PROLIF_GENES | 0.143 | 0.222 | 0.225 | 0.310 | 1.000 | 0.377 | 0.914 |
| CR_SIGNALLING | 0.154 | 0.299 | 0.302 | 0.042 | 1.000 | 0.278 | 1.000 |
| breast_cancer_estrogen_signalling | 0.207 | 0.438 | 0.423 | 0.472 | 1.000 | 0.780 | 1.000 |
P-values of the 25 pathways that include TP53 gene by the six methods (with p-value or FDR < 0.05 in bold).
| Gene set | SAM-GS | Global | ANCOVA global | Tian | Tomfohr | GSEA P | GSEA FDR |
|---|---|---|---|---|---|---|---|
| atmPathway | < | 0.204 | 1.000 | 0.534 | 0.928 | ||
| Cell_Cycle | < | < | < | 0.359 | 0.120 | 0.620 | |
| DNA_DAMAGE_SIGNALLING | < | < | < | 0.928 | 0.207 | 0.339 | 1.000 |
| g1Pathway | < | < | < | 0.428 | 0.360 | 0.948 | |
| g2Pathway | < | < | < | 0.145 | 0.476 | 0.882 | |
| p53hypoxiaPathway | < | < | 1.000 | 0.323 | 1.000 | ||
| p53Pathway | < | < | < | 0.321 | 0.065 | 1.000 | |
| radiation_sensitivity | < | 0.006 | 0.270 | 1.000 | 0.342 | 1.000 | |
| SA_G1_AND_S_PHASES | < | < | < | 0.346 | 0.035 | 0.136 | 1.000 |
| drug_resistance_and_metabolism | < | 0.020 | 1.000 | 0.128 | 1.000 | ||
| p53_signalling | < | 0.818 | 1.000 | 0.855 | 0.956 | ||
| CR_CELL_CYCLE | < | 0.399 | 0.340 | 0.886 | |||
| chemicalPathway | 0.052 | 0.980 | 1.000 | 0.291 | 0.899 | ||
| breast_cancer_estrogen_signalling | 0.111 | 0.123 | 0.564 | 1.000 | 0.540 | 0.904 | |
| ST_Fas_Signaling_Pathway | 0.103 | 0.090 | 0.114 | 1.000 | 0.485 | 0.964 | |
| atrbrcaPathway | 0.049 | 0.152 | 0.619 | 0.160 | 0.678 | ||
| mitochondr | 0.065 | 0.104 | 0.095 | 1.000 | 0.758 | 0.976 | |
| CR_DEATH | 0.068 | 0.221 | 0.229 | 0.140 | 1.000 | 0.376 | 0.935 |
| cell_cycle_checkpoint | 0.122 | 0.119 | 0.106 | 0.855 | 0.420 | ||
| tumor_supressor | 0.187 | 0.173 | 0.177 | 1.000 | 0.167 | 0.883 | |
| RAP_UP | 0.248 | 0.308 | 0.277 | 0.120 | 1.000 | 0.178 | 0.932 |
| arfPathway | 0.300 | 0.469 | 0.431 | 0.384 | 1.000 | 0.098 | 0.673 |
| ST_JNK_MAPK_Pathway | 0.579 | 0.707 | 0.705 | 0.362 | 1.000 | 0.790 | 0.951 |
| telPathway | 0.674 | 0.719 | 0.713 | 0.790 | 1.000 | 0.563 | 0.919 |
| tidPathway | 0.767 | 0.803 | 0.781 | 0.582 | 1.000 | 0.597 | 0.935 |
P-values of the 17 pathways that include PRKACB gene by the six methods (with p-value or FDR < 0.05 in bold).
| Gene set | SAM-GS | Global | ANCOVA global | Tian | Tomfohr | GSEA P | GSEA FDR |
|---|---|---|---|---|---|---|---|
| crebPathway | 0.022 | 0.026 | 0.032 | 0.006 | 1.000 | 0.009 | 0.409 |
| gpcrPathway | 0.057 | 0.058 | 0.122 | 1.000 | 0.495 | ||
| pparaPathway | 0.070 | 0.087 | 0.083 | 1.000 | 0.079 | 0.471 | |
| nos1Pathway | 0.119 | 0.138 | 0.132 | 0.350 | 1.000 | 0.428 | 0.649 |
| badPathway | 0.124 | 0.148 | 0.134 | 0.144 | 1.000 | 0.082 | 0.617 |
| gata3Pathway | 0.130 | 0.224 | 0.204 | 0.446 | 1.000 | 0.292 | 0.605 |
| amiPathway | 0.131 | 0.133 | 0.119 | 0.998 | 0.285 | 0.533 | |
| cskPathway | 0.131 | 0.133 | 0.119 | 0.998 | 0.285 | 0.525 | |
| chrebpPathway | 0.135 | 0.209 | 0.193 | 0.286 | 1.000 | 0.353 | 0.597 |
| no1Pathway | 0.166 | 0.260 | 0.283 | 1.000 | 0.078 | 0.557 | |
| ck1Pathway | 0.243 | 0.372 | 0.373 | 0.428 | 1.000 | 0.291 | 0.564 |
| mprPathway | 0.243 | 0.285 | 0.263 | 0.886 | 0.999 | 0.591 | 0.760 |
| mcalpainPathway | 0.283 | 0.383 | 0.369 | 0.060 | 1.000 | 0.146 | 0.509 |
| shh_lisa | 0.317 | 0.361 | 0.362 | 0.880 | 1.000 | 0.431 | 0.656 |
| CR_PROTEIN_MOD | 0.368 | 0.405 | 0.392 | 0.362 | 1.000 | 0.395 | 0.647 |
| nfatPathway | 0.424 | 0.534 | 0.556 | 0.352 | 1.000 | 0.198 | 0.529 |
| vipPathway | 0.449 | 0.546 | 0.569 | 1.000 | 0.309 | 0.601 | |
P-values of the 18 pathways that include PRKAR2B gene by the six methods (with p-value or FDR < 0.05 in bold).
| Gene set | SAM-GS | Global | ANCOVA global | Tian | Tomfohr | GSEA P | GSEA FDR |
|---|---|---|---|---|---|---|---|
| INS | 0.002 | 0.031 | 0.022 | 0.004 | 1.000 | 0.038 | 0.664 |
| amiPathway | 0.054 | 1.000 | 0.117 | 0.777 | |||
| cskPathway | 0.054 | 1.000 | 0.117 | 0.743 | |||
| mprPathway | 0.128 | 0.180 | 0.161 | 0.148 | 0.984 | 0.138 | 0.756 |
| gpcrPathway | 0.217 | 0.202 | 0.169 | 0.386 | 1.000 | 0.063 | 0.661 |
| no1Pathway | 0.355 | 0.270 | 0.248 | 0.528 | 1.000 | 0.642 | 0.931 |
| nfatPathway | 0.386 | 0.335 | 0.347 | 0.396 | 1.000 | 0.898 | 0.948 |
| vipPathway | 0.403 | 0.350 | 0.328 | 0.462 | 1.000 | 0.394 | 0.970 |
| nos1Pathway | 0.413 | 0.454 | 0.431 | 0.926 | 1.000 | 0.153 | 0.744 |
| mcalpainPathway | 0.621 | 0.629 | 0.601 | 0.190 | 1.000 | 0.625 | 1.000 |
| pparaPathway | 0.665 | 0.674 | 0.668 | 0.130 | 1.000 | 0.549 | 0.934 |
| shh_lisa | 0.671 | 0.689 | 0.661 | 0.294 | 1.000 | 0.496 | 0.963 |
| ck1Pathway | 0.704 | 0.758 | 0.762 | 0.174 | 1.000 | 0.123 | 0.793 |
| crebPathway | 0.732 | 0.798 | 0.806 | 0.708 | 1.000 | 0.119 | 0.756 |
| chrebpPathway | 0.745 | 0.809 | 0.804 | 0.232 | 1.000 | 0.155 | 0.753 |
| CR_PROTEIN_MOD | 0.760 | 0.805 | 0.788 | 0.852 | 1.000 | 0.608 | 0.920 |
| badPathway | 0.830 | 0.876 | 0.854 | 0.480 | 1.000 | 0.747 | 0.948 |
| gata3Pathway | 0.921 | 0.883 | 0.880 | 0.522 | 1.000 | 0.555 | 0.933 |
P-values of the 25 pathways that include RAC1 gene by the six methods (with p-value or FDR < 0.05 in bold).
| Gene set | SAM-GS | Global | ANCOVA global | Tian | Tomfohr | GSEA P | GSEA FDR |
|---|---|---|---|---|---|---|---|
| Raccycd Pathway | < | 0.560 | 0.997 | 0.192 | 0.968 | ||
| actinY Pathway | 0.074 | 0.067 | 0.092 | 0.970 | 0.119 | 1.000 | |
| NFKB_INDUCED | 0.082 | 0.082 | 0.054 | 1.000 | 0.327 | 1.000 | |
| Ptdins Pathway | 0.058 | 0.058 | 0.069 | 0.004 | 1.000 | 0.006 | 0.405 |
| Creb Pathway | 0.066 | 0.058 | 0.069 | 0.922 | 0.990 | 0.290 | 0.873 |
| edg1 Pathway | 0.134 | 0.155 | 0.148 | 0.558 | 1.000 | 0.658 | 0.934 |
| Fmlp Pathway | 0.135 | 0.200 | 0.193 | 0.006 | 1.000 | 0.023 | 0.578 |
| Bcr Pathway | 0.278 | 0.359 | 0.356 | 0.032 | 1.000 | 0.013 | 0.309 |
| Arf Pathway | 0.300 | 0.469 | 0.431 | 0.384 | 1.000 | 0.098 | 0.673 |
| Cell_motility | 0.369 | 0.430 | 0.455 | 0.456 | 1.000 | 0.205 | 0.891 |
| G13_Signaling_Pathway | 0.386 | 0.425 | 0.449 | 0.298 | 1.000 | 0.246 | 1.000 |
| Wnt_Signaling | 0.511 | 0.539 | 0.554 | 0.548 | 1.000 | 0.461 | 0.850 |
| Mapk Pathway | 0.609 | 0.683 | 0.649 | 0.750 | 1.000 | 0.940 | 0.981 |
| SA_B_CELL_RECEPTOR_COMPLEXES | 0.615 | 0.604 | 0.595 | 0.382 | 1.000 | 0.761 | 0.958 |
| cell_adhesion | 0.679 | 0.668 | 0.675 | 0.264 | 1.000 | 0.964 | 0.970 |
| Tcr Pathway | 0.709 | 0.779 | 0.772 | 0.036 | 1.000 | 0.453 | 0.853 |
| p38mapk Pathway | 0.715 | 0.762 | 0.765 | 0.376 | 1.000 | 0.222 | 0.996 |
| Ucalpain Pathway | 0.750 | 0.760 | 0.758 | 0.488 | 1.000 | 0.825 | 0.983 |
| pyk2 Pathway | 0.779 | 0.863 | 0.846 | 0.230 | 1.000 | 0.013 | 0.477 |
| rac1 Pathway | 0.785 | 0.720 | 0.707 | 0.264 | 1.000 | 0.697 | 0.965 |
| CR_CAM | 0.827 | 0.880 | 0.884 | 0.448 | 1.000 | 0.917 | 0.968 |
| ST_MONOCYTE_AD_PATHWAY | 0.884 | 0.889 | 0.901 | 0.878 | 1.000 | 0.359 | 0.851 |
| Ras Pathway | 0.911 | 0.921 | 0.912 | 0.642 | 1.000 | 0.895 | 1.000 |
| at1rPathway | 0.954 | 0.971 | 0.974 | 0.624 | 1.000 | 0.165 | 0.887 |
| nkcellsPathway | 0.974 | 0.992 | 0.986 | 0.872 | 1.000 | 0.939 | 0.991 |
ROC analysis comparing the six methods for each of the three microarray datasets.
| Phenotype | Gene-set analysis method | Area under the ROC curve (95% Confidence Interval) |
|---|---|---|
| | SAM-GS | 0.993 (0.977, 1.000) |
| | Global | 0.945 (0.873, 1.000) |
| | ANCOVA Global | 0.948 (0.879, 1.000) |
| | Tian | 0.360 (0.163, 0.557) |
| | Tomfohr | 0.545 (0.396, 0.694) |
| | GSEA p-value | 0.395 (0.195, 0.594) |
| | GSEA FDR | 0.180 (0.038, 0.321) |
| | SAM-GS | 0.918 (0.822, 1.000) |
| | Global | 0.847 (0.715, 0.979) |
| | ANCOVA Global | 0.832 (0.691, 0.972) |
| | Tian | 0.506 (0.309, 0.703) |
| | Tomfohr | 0.528 (0.435, 0.620) |
| | GSEA p-value | 0.500 (0.301, 0.699) |
| | GSEA FDR | 0.265 (0.086, 0.445) |
| | SAM-GS | 0.854 (0.752, 0.957) |
| | Global | 0.828 (0.714, 0.942) |
| | ANCOVA Global | 0.831 (0.718, 0.944) |
| | Tian | 0.625 (0.464, 0.785) |
| | Tomfohr | 0.647 (0.525, 0.769) |
| | GSEA p-value | 0.529 (0.361, 0.697) |
| | GSEA FDR | 0.482 (0.315, 0.649) |
These methods have a significantly smaller area under the ROC curve than the SAM-GS, Global, or ANCOVA Global methods (p < 0.05).
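The area under the ROC curve reported above can be read as the probability that a randomly chosen truly positive set is ranked ahead of a randomly chosen truly negative set. A minimal sketch of that calculation follows, assuming each method's p-value (or FDR) is used as the ranking score, with smaller values ranking higher. The function name `auc_from_p_values` and the numbers in the usage example are hypothetical and are not taken from the tables.

```python
def auc_from_p_values(p_positive, p_negative):
    """Area under the ROC curve in its Mann-Whitney form.

    Returns the fraction of (positive, negative) pairs in which the truly
    positive set has the smaller p-value, counting ties as one half.
    """
    wins = 0.0
    for p_pos in p_positive:
        for p_neg in p_negative:
            if p_pos < p_neg:
                wins += 1.0
            elif p_pos == p_neg:
                wins += 0.5
    return wins / (len(p_positive) * len(p_negative))


# Hypothetical p-values for three truly positive and three truly negative sets.
example_auc = auc_from_p_values([0.001, 0.02, 0.30], [0.25, 0.40, 0.80])
print(round(example_auc, 3))  # 8 of 9 pairs are ordered correctly -> 0.889
```

Under this convention a value near 0.5 means the method does not separate the two types of sets, and a value below 0.5 would mean that truly negative sets tended to receive the smaller p-values.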