Seungyeoun Lee, Jinheum Kim, Sunho Lee.
Abstract
BACKGROUND: Many gene-set analysis methods have been proposed and compared, through simulation studies and analyses of real datasets, for binary phenotypes. We focused on the survival phenotype and compared the performance of Gene Set Enrichment Analysis (GSEA), the Global Test (GT), the Wald-type Test (WT) and the Global Boost Test (GBST) in a simulation study and on two ovarian cancer datasets. We considered two versions of GSEA that differ in their weights: GSEA1 uses equal weights, yielding results similar to the Kolmogorov-Smirnov test, while GSEA2 uses weights based on the correlation between genes and the phenotype.
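The difference between the two GSEA versions can be illustrated with a running-sum enrichment score. The following is a minimal sketch, assuming the usual GSEA convention in which a weight exponent of 0 gives equal steps (the Kolmogorov-Smirnov-like case, GSEA1 here) and an exponent of 1 weights each in-set gene by the magnitude of its association score (GSEA2 here). The function name `enrichment_score`, its parameters, and the toy scores are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def enrichment_score(scores, in_set, weight=1.0):
    """Signed maximum deviation of the GSEA running sum.

    scores : hypothetical per-gene association statistics (assumed input)
    in_set : boolean mask marking genes that belong to the gene set
    weight : 0 -> equal weights (KS-like); 1 -> weights proportional to |score|
    """
    scores = np.asarray(scores, dtype=float)
    in_set = np.asarray(in_set, dtype=bool)
    order = np.argsort(-scores)                  # rank genes from highest to lowest score
    s, hit = scores[order], in_set[order]
    hit_w = np.where(hit, np.abs(s) ** weight, 0.0)
    p_hit = np.cumsum(hit_w) / hit_w.sum()       # weighted fraction of set genes seen so far
    p_miss = np.cumsum(~hit) / (~hit).sum()      # fraction of non-set genes seen so far
    running = p_hit - p_miss
    return running[np.argmax(np.abs(running))]

# Toy example: six genes, two of them in the gene set.
toy_scores = [5.0, 4.0, 3.0, 2.0, 1.0, 0.5]
toy_set = [True, False, False, True, False, False]
es_equal = enrichment_score(toy_scores, toy_set, weight=0.0)
es_weighted = enrichment_score(toy_scores, toy_set, weight=1.0)
```

With equal weights both set genes contribute the same step, so the score depends only on their ranks; with score-based weights the top-ranked set gene dominates the running sum. Significance would still have to be assessed by permutation, which this sketch does not cover.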
Year: 2011 PMID: 21943316 PMCID: PMC3196970 DOI: 10.1186/1471-2105-12-377
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
The estimated size of the five tests for p = 200 based on 500 iterations
| n | m | c | GSEA1 | GSEA2 | GT | WT | GBST |
|---|---|---|---|---|---|---|---|
| 80 | 20 | 0.0 | 0.058 | 0.052 | 0.050 | 0.050 | 0.050 |
|  |  | 0.1 | 0.060 | 0.052 | 0.058 | 0.058 | 0.062 |
|  |  | 0.3 | 0.034 | 0.054 | 0.054 | 0.066 | 0.046 |
|  |  | 0.5 | 0.062 | 0.042 | 0.040 | 0.032 | 0.052 |
|  | 50 | 0.0 | 0.050 | 0.048 | 0.044 | 0.042 | 0.052 |
|  |  | 0.1 | 0.050 | 0.056 | 0.058 | 0.058 | 0.036 |
|  |  | 0.3 | 0.054 | 0.040 | 0.054 | 0.048 | 0.052 |
|  |  | 0.5 | 0.068 | 0.060 | 0.058 | 0.058 | 0.060 |
| 50 | 20 | 0.0 | 0.054 | 0.042 | 0.036 | 0.032 | 0.044 |
|  |  | 0.1 | 0.062 | 0.056 | 0.046 | 0.046 | 0.058 |
|  |  | 0.3 | 0.056 | 0.036 | 0.044 | 0.038 | 0.052 |
|  |  | 0.5 | 0.046 | 0.042 | 0.038 | 0.046 | 0.046 |
|  | 50 | 0.0 | 0.070 | 0.046 | 0.052 | 0.044 | 0.046 |
|  |  | 0.1 | 0.044 | 0.048 | 0.042 | 0.046 | 0.038 |
|  |  | 0.3 | 0.050 | 0.038 | 0.060 | 0.058 | 0.060 |
|  |  | 0.5 | 0.046 | 0.046 | 0.058 | 0.040 | 0.050 |
Note: n = sample size; m = size of the gene set; c = censoring proportion. GSEA1 = GSEA with equal weights; GSEA2 = weighted GSEA; GT = Global Test; WT = Wald-type Test; GBST = Global Boost Test.
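When reading the estimated sizes above, it helps to know how much an estimate from 500 iterations can vary by chance alone. The sketch below computes an approximate 95% Monte Carlo band around the 0.05 level using the binomial standard error. The 0.05 target and the helper name `mc_band` are assumptions made for illustration; they are not stated in the table itself.

```python
import math

def mc_band(p, reps, z=1.96):
    """Approximate 95% band for a rejection rate estimated from `reps` runs."""
    se = math.sqrt(p * (1.0 - p) / reps)   # binomial standard error of the estimate
    return p - z * se, p + z * se

# Assumed nominal level of 0.05 with the 500 iterations reported for the size table.
lo, hi = mc_band(0.05, 500)
print(f"{lo:.3f} to {hi:.3f}")   # 0.031 to 0.069
```

Under that assumption, estimated sizes between roughly 0.031 and 0.069 are compatible with a correctly sized test. The same helper applies to power estimates from fewer replications, where the band is wider.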
The estimated power of the five tests for p = 200, n = 80, and m = 50 based on 200 replications
| Case | c | Proportion of significant genes | (A) GSEA1 | (A) GSEA2 | (A) GT | (A) WT | (A) GBST | (B) GSEA1 | (B) GSEA2 | (B) GT | (B) WT | (B) GBST |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| (I) | 0.0 | 0.1 | 0.145 | 0.185 | 0.295 | 0.275 | 0.250 | 0.120 | 0.185 | 0.380 | 0.380 | 0.345 |
|  |  | 0.3 | 0.290 | 0.370 | 0.650 | 0.630 | 0.635 | 0.250 | 0.355 | 0.750 | 0.745 | 0.805 |
|  |  | 0.5 | 0.400 | 0.450 | 0.825 | 0.805 | 0.820 | 0.400 | 0.525 | 0.900 | 0.905 | 0.950 |
|  | 0.3 | 0.1 | 0.145 | 0.155 | 0.155 | 0.165 | 0.155 | 0..65 | 0.110 | 0.240 | 0.270 | 0.265 |
|  |  | 0.3 | 0.245 | 0.225 | 0.430 | 0.420 | 0.415 | 0.205 | 0.230 | 0.510 | 0.520 | 0.540 |
|  |  | 0.5 | 0.335 | 0.330 | 0.615 | 0.605 | 0.620 | 0.275 | 0.395 | 0.725 | 0.695 | 0.765 |
| (II) | 0.0 | 0.1 | 0.165 | 0.225 | 0.425 | 0.485 | 0.320 | 0.075 | 0.135 | 0.330 | 0.320 | 0.295 |
|  |  | 0.3 | 0.890 | 0.965 | 1.000 | 1.000 | 0.980 | 0.235 | 0.375 | 0.755 | 0.745 | 0.755 |
|  |  | 0.5 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 0.430 | 0.605 | 0.900 | 0.890 | 0.925 |
|  | 0.3 | 0.1 | 0.140 | 0.190 | 0.310 | 0.330 | 0.215 | 0.100 | 0.080 | 0.270 | 0.285 | 0.255 |
|  |  | 0.3 | 0.730 | 0.815 | 0.955 | 0.970 | 0.880 | 0.215 | 0.295 | 0.520 | 0.520 | 0.570 |
|  |  | 0.5 | 0.985 | 0.995 | 1.000 | 1.000 | 1.000 | 0.265 | 0.395 | 0.670 | 0.640 | 0.745 |
| (III) | 0.0 | 0.1 | 0.120 | 0.115 | 0.310 | 0.335 | 0.245 | 0.125 | 0.135 | 0.270 | 0.310 | 0.280 |
|  |  | 0.3 | 0.355 | 0.460 | 0.790 | 0.765 | 0.735 | 0.270 | 0.375 | 0.745 | 0.725 | 0.805 |
|  |  | 0.5 | 0.610 | 0.615 | 0.895 | 0.885 | 0.885 | 0.385 | 0.495 | 0.885 | 0.890 | 0.925 |
|  | 0.3 | 0.1 | 0.085 | 0.095 | 0.215 | 0.195 | 0.190 | 0.070 | 0.075 | 0.215 | 0.220 | 0.225 |
|  |  | 0.3 | 0.260 | 0.330 | 0.610 | 0.585 | 0.535 | 0.165 | 0.275 | 0.550 | 0.505 | 0.595 |
|  |  | 0.5 | 0.390 | 0.445 | 0.710 | 0.700 | 0.725 | 0.305 | 0.350 | 0.735 | 0.740 | 0.810 |
| (IV) | 0.0 | 0.1 | 0.105 | 0.205 | 0.440 | 0.460 | 0.300 | 0.100 | 0.170 | 0.385 | 0.415 | 0.360 |
|  |  | 0.3 | 0.815 | 0.885 | 0.990 | 0.990 | 0.965 | 0.255 | 0.340 | 0.755 | 0.755 | 0.760 |
|  |  | 0.5 | 0.995 | 1.000 | 1.000 | 1.000 | 1.000 | 0.440 | 0.550 | 0.875 | 0.890 | 0.930 |
|  | 0.3 | 0.1 | 0.160 | 0.215 | 0.305 | 0.310 | 0.265 | 0.135 | 0.165 | 0.240 | 0.255 | 0.235 |
|  |  | 0.3 | 0.585 | 0.760 | 0.925 | 0.920 | 0.830 | 0.185 | 0.285 | 0.560 | 0.550 | 0.555 |
|  |  | 0.5 | 0.960 | 0.970 | 1.000 | 1.000 | 0.985 | 0.320 | 0.410 | 0.680 | 0.685 | 0.760 |
Note: m = size of the gene set; c = censoring proportion; the values 0.1, 0.3 and 0.5 are the proportions of significant genes in a gene set. GSEA1 = GSEA with equal weights; GSEA2 = weighted GSEA; GT = Global Test; WT = Wald-type Test; GBST = Global Boost Test. For Case (I), all genes are independent; for Case (II), there is within-correlation among significant genes; for Case (III), there is an autoregressive correlation between significant genes; and for Case (IV), there is an unstructured correlation between significant genes.
Figure 1. The estimated size and power of the five tests, GSEA1, GSEA2, GT, WT and GBST, over the proportions of significant genes in each gene set, using four different correlation structures, two different scenarios for generating the Cox regression coefficient vector, and censoring level. The solid and dashed lines correspond to Case (A), in which genes are positively associated with survival time, and to Case (B), in which genes are randomly associated with survival time, respectively. The circle, square, diamond, triangle and inverted triangle symbols correspond to GSEA1, GSEA2, GT, WT and GBST, respectively.
Figure 2. Plots of the estimated size and power of the five tests, GSEA1, GSEA2, GT, WT and GBST, over the proportions of significant genes in each gene set, using four different correlation structures, two different scenarios for generating the Cox regression coefficient vector, and censoring level. The solid and dashed lines correspond to Case (A), in which genes are positively associated with survival time, and to Case (B), in which genes are randomly associated with survival time, respectively. The circle, square, diamond, triangle and inverted triangle symbols correspond to GSEA1, GSEA2, GT, WT and GBST, respectively.
Pathways with more genes associated with overall survival as identified by either GT or WT at the nominal 0.01 level of the permutation test
| Pathway name | Gene set size | p-value GSEA1 | p-value GSEA2 | p-value GT | p-value WT | p-value GBST | q-value GSEA1 | q-value GSEA2 | q-value GT | q-value WT | q-value GBST |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Pentose phosphate pathway | 39 | 0.010 | 0.003 | 0.000 | 0.000 | 0.003 | 0.921 | 0.612 | 0.000 | 0.000 | 0.087 |
| Histidine metabolism | 54 | 0.070 | 0.012 | 0.000 | 0.000 | 0.000 | 0.921 | 0.638 | 0.000 | 0.000 | 0.000 |
| Tryptophan metabolism | 86 | 0.034 | 0.024 | 0.002 | 0.003 | 0.009 | 0.921 | 0.701 | 0.061 | 0.085 | 0.140 |
| One carbon pool by folate | 28 | 0.065 | 0.027 | 0.010 | 0.000 | 0.010 | 0.921 | 0.701 | 0.108 | 0.000 | 0.140 |
| DNA replication | 52 | 0.021 | 0.013 | 0.009 | 0.002 | 0.037 | 0.921 | 0.638 | 0.108 | 0.077 | 0.207 |
| Colorectal cancer | 165 | 0.139 | 0.145 | 0.003 | 0.010 | 0.000 | 0.921 | 0.811 | 0.073 | 0.136 | 0.000 |
| Nucleotide excision repair | 56 | 0.047 | 0.068 | 0.014 | 0.003 | 0.054 | 0.921 | 0.811 | 0.129 | 0.085 | 0.253 |
| Urea cycle and metabolism of amino groups | 40 | 0.031 | 0.259 | 0.010 | 0.036 | 0.003 | 0.921 | 0.878 | 0.108 | 0.191 | 0.087 |
|  | 201 | 0.081 | 0.026 | 0.003 | 0.009 | 0.072 | 0.921 | 0.701 | 0.074 | 0.136 | 0.275 |
| Aminophosphonate metabolism | 21 | 0.267 | 0.028 | 0.000 | 0.009 | 0.056 | 0.921 | 0.701 | 0.000 | 0.136 | 0.253 |
| Pantothenate and CoA biosynthesis | 19 | 0.252 | 0.061 | 0.189 | 0.010 | 0.003 | 0.921 | 0.811 | 0.337 | 0.136 | 0.087 |
|  | 197 | 0.332 | 0.161 | 0.004 | 0.013 | 0.003 | 0.921 | 0.811 | 0.074 | 0.166 | 0.087 |
| Inositol metabolism | 7 | 0.234 | 0.274 | 0.035 | 0.044 | 0.004 | 0.921 | 0.878 | 0.207 | 0.204 | 0.089 |
| Tyrosine metabolism | 81 | 0.531 | 0.147 | 0.017 | 0.018 | 0.003 | 0.970 | 0.811 | 0.133 | 0.171 | 0.087 |
| Lysine degradation | 56 | 0.124 | 0.225 | 0.004 | 0.018 | 0.137 | 0.921 | 0.878 | 0.074 | 0.171 | 0.362 |
| Starch and sucrose metabolism | 72 | 0.373 | 0.133 | 0.002 | 0.033 | 0.250 | 0.928 | 0.811 | 0.068 | 0.191 | 0.427 |
| Glycerophospholipid metabolism | 89 | 0.535 | 0.255 | 0.004 | 0.028 | 0.036 | 0.970 | 0.878 | 0.074 | 0.190 | 0.207 |
| Androgen and estrogen metabolism | 53 | 0.386 | 0.168 | 0.005 | 0.035 | 0.147 | 0.935 | 0.811 | 0.085 | 0.191 | 0.372 |
|  | 240 | 0.258 | 0.606 | 0.001 | 0.036 | 0.121 | 0.921 | 0.892 | 0.051 | 0.191 | 0.358 |
Note: The underlined pathways are those identified by Crijns et al. (2009).
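The table reports q-values alongside permutation p-values, but this record does not say how the q-values were computed. As a stand-in, the sketch below applies the Benjamini-Hochberg step-up adjustment, a common way to turn a full list of p-values into false-discovery-rate-adjusted values. The function name `bh_adjust` and the toy p-values are hypothetical. Because only a subset of pathways is listed, the table's q-values cannot be reproduced from the rows shown.

```python
import numpy as np

def bh_adjust(pvals):
    """Benjamini-Hochberg adjusted p-values (an assumed stand-in for q-values)."""
    p = np.asarray(pvals, dtype=float)
    n = p.size
    order = np.argsort(p)                                 # ascending p-values
    scaled = p[order] * n / np.arange(1, n + 1)           # p_(i) * n / i
    scaled = np.minimum.accumulate(scaled[::-1])[::-1]    # enforce monotonicity
    adjusted = np.empty(n)
    adjusted[order] = np.clip(scaled, 0.0, 1.0)           # restore the input order
    return adjusted

# Toy p-values, not taken from the table.
toy_q = bh_adjust([0.01, 0.04, 0.03, 0.005])
```

The monotonicity step is what makes many adjusted values tie at the same number, a pattern that also appears in adjusted columns where most raw p-values are large.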