Joanna M Biernacka, Gregory D Jenkins, Liewei Wang, Ann M Moyer, Brooke L Fridley.
Abstract
Gene-set analysis (GSA) evaluates the overall evidence of association between a phenotype and all genotyped single nucleotide polymorphisms (SNPs) in a set of genes (a gene set, GS), as opposed to testing for association between a phenotype and each SNP individually. We propose using the Gamma Method (GM) to combine gene-level P-values for assessing the significance of GS association. We performed simulations to compare the GM with several other self-contained GSA strategies, including both one-step and two-step GSA approaches, in a variety of scenarios. We define a 'one-step' GSA approach as one in which all SNPs in a GS are used to derive a test of GS association without consideration of gene-level effects, and a 'two-step' approach as one in which all genotyped SNPs in a gene are first used to evaluate association of the phenotype with all measured variation in the gene, and the gene-level tests of association are then aggregated to assess the GS association with the phenotype. The simulations suggest that, overall, two-step methods provide higher power than one-step approaches and that combining gene-level P-values using the GM with a soft truncation threshold (STT) between 0.05 and 0.20 is a powerful approach for conducting GSA, relative to the competing approaches assessed. We also applied all of the considered GSA methods to data from a pharmacogenomic study of cisplatin, and obtained evidence suggesting that the glutathione metabolism GS is associated with cisplatin drug response.
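To make the combination step concrete, the following is a minimal sketch of combining gene-level P-values with the GM. It is not the authors' implementation. It assumes the usual formulation in which each P-value is mapped through the inverse CDF of a Gamma(ω, 1) distribution, the transformed values are summed, and the sum is referred to a Gamma(kω, 1) distribution, which is valid only if the k P-values are independent under the null. It further assumes that the shape ω is tied to the STT by STT = 1 − G_ω(ω), a relation that reproduces the ω≈0.07654 quoted for STT = 0.15 in the Figure 1 caption. The function names (`gamma_sf`, `gamma_isf`, `shape_from_stt`, `gamma_method`) and the example P-values are illustrative, and the solver assumes the STT is monotone in ω on the search interval.

```python
import math

def gamma_sf(x, a):
    """Upper tail P(X > x) for X ~ Gamma(shape=a, scale=1)."""
    if x <= 0.0:
        return 1.0
    log_front = -x + a * math.log(x) - math.lgamma(a)
    if x < a + 1.0:
        # Power series for the lower tail; return its complement.
        term = total = 1.0 / a
        n = a
        while abs(term) > abs(total) * 1e-16:
            n += 1.0
            term *= x / n
            total += term
        return max(0.0, 1.0 - total * math.exp(log_front))
    # Modified Lentz continued fraction for the upper tail.
    tiny = 1e-300
    b = x + 1.0 - a
    c = 1.0 / tiny
    d = 1.0 / b
    h = d
    for i in range(1, 10000):
        an = -i * (i - a)
        b += 2.0
        d = an * d + b
        d = tiny if abs(d) < tiny else d
        c = b + an / c
        c = tiny if abs(c) < tiny else c
        d = 1.0 / d
        delta = d * c
        h *= delta
        if abs(delta - 1.0) < 1e-15:
            break
    return h * math.exp(log_front)

def gamma_isf(p, a):
    """Solve gamma_sf(x, a) = p for x by bisection (0 < p <= 1)."""
    if p >= 1.0:
        return 0.0
    lo, hi = 0.0, 1.0
    while gamma_sf(hi, a) > p:
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if gamma_sf(mid, a) > p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def shape_from_stt(stt):
    """Assumed relation: find the shape a with P(Gamma(a, 1) > a) = stt."""
    lo, hi = 1e-6, 50.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if gamma_sf(mid, mid) < stt:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def gamma_method(pvalues, stt=0.15):
    """Combined P-value for independent P-values at a given STT."""
    a = shape_from_stt(stt)
    statistic = sum(gamma_isf(p, a) for p in pvalues)
    return gamma_sf(statistic, a * len(pvalues))

# Illustrative gene-level P-values (made up, not taken from the study).
gene_pvalues = [0.008, 0.17, 0.31, 0.55, 0.90]
print(gamma_method(gene_pvalues, stt=0.15))
```

With `stt=math.exp(-1)` the solver returns a shape of 1, and the procedure reduces to Fisher's method. If gene-level P-values are correlated, for example because of LD between neighbouring genes, the Gamma(kω, 1) reference distribution no longer applies and an empirical null would be needed.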
Year: 2011 PMID: 22166939 PMCID: PMC3330217 DOI: 10.1038/ejhg.2011.236
Source DB: PubMed Journal: Eur J Hum Genet ISSN: 1018-4813 Impact factor: 4.246
Single SNP and gene-level results for the CDDP pharmacogenomic study
| 16 | 200 | 14 | 0.0099 | 0.543 | 0.184 | 0.504 | 1.000 | |
| 10 | 44 | 4 | 0.0440 | 0.468 | 0.218 | 0.307 | 0.495 | |
| 17 | 37 | 10 | 0.0889 | 0.904 | 0.902 | 0.817 | 0.371 | |
| 13 | 526 | 34 | 0.0032 | 0.552 | 0.663 | 0.329 | NA | |
| 6 | 115 | 16 | 0.0228 | 0.782 | 0.780 | 0.513 | 9.75E-04 | |
| 1 | 25 | 3 | 0.0729 | 0.533 | 0.870 | 0.622 | 0.219 | |
| 3 | 2 | 1 | 0.9353 | 0.995 | 0.995 | 0.755 | 0.934 | |
| 14 | 4 | 2 | 0.0965 | 0.313 | 0.579 | 0.355 | 0.653 | |
| 5 | 61 | 7 | 0.0055 | 0.153 | 0.636 | 0.596 | 0.256 | |
| 19 | 7 | 2 | 0.3785 | 0.896 | 0.908 | 0.677 | 0.937 | |
| 6 | 16 | 3 | 0.0794 | 0.460 | 0.610 | 0.586 | 0.102 | |
| 1 | 21 | 4 | 0.1503 | 0.853 | 0.756 | 0.914 | 0.047 | |
| 8 | 18 | 5 | 0.0147 | 0.171 | 0.330 | 0.173 | 0.218 | |
| 20 | 13 | 3 | 0.0874 | 0.537 | 0.485 | 0.417 | 0.255 | |
| 6 | 14 | 2 | 0.3756 | 0.848 | 0.757 | 0.977 | 0.361 | |
| 6 | 40 | 4 | 0.0670 | 0.677 | 0.778 | 0.911 | 0.410 | |
| 6 | 64 | 6 | 0.0461 | 0.652 | 0.312 | 0.392 | 0.021 | |
| 1 | 5 | 3 | 0.0227 | 0.113 | 0.055 | 0.08 | 0.131 | |
| 1 | 4 | 2 | 0.1858 | 0.554 | 0.411 | 0.354 | 0.452 | |
| 1 | 11 | 2 | 0.0645 | 0.288 | 0.430 | 0.215 | 0.193 | |
| 1 | 7 | 2 | 0.0779 | 0.310 | 0.161 | 0.248 | 0.046 | |
| 1 | 7 | 2 | 0.0085 | 0.049 | 0.094 | 0.111 | 0.133 | |
| 10 | 28 | 2 | 0.0506 | 0.304 | 0.134 | 0.07 | 0.143 | |
| 10 | 27 | 3 | 0.0244 | 0.237 | 0.315 | 0.09 | 0.104 | |
| 11 | 15 | 2 | 0.0015 | 0.008 | 0.008 | 0.001 | 0.011 | |
| 22 | 2 | 2 | 0.5955 | 0.831 | 0.835 | 0.929 | 0.929 | |
| 14 | 16 | 3 | 0.0188 | 0.170 | 0.192 | 0.216 | 0.670 | |
Only computed for genes with P
Simulation scenarios
| Chr | Gene | Tag SNPs (r2=0.6) | Tag SNPs (r2=0.9) | All SNPs | Scenario 1 | Scenario 2 | Scenario 3 | Scenario 4 | Scenario 5 | |
|---|---|---|---|---|---|---|---|---|---|---|
| 16 | ABCC1 | 43 | 83 | 200 | | | | | | Y |
| 10 | ABCC2 | 10 | 16 | 44 | | | | | | Y |
| 17 | ABCC3 | 21 | 28 | 37 | | | | | | Y |
| 13 | ABCC4 | 139 | 254 | 526 | | | | | | Y |
| 6 | GCLC | 46 | 76 | 115 | S | S | M | S | 2 S | Y |
| 1 | GCLM | 8 | 11 | 25 | | | | | | Y |
| 3 | GPX1 | 2 | 2 | 2 | S | | | | | Y |
| 14 | GPX2 | 3 | 4 | 4 | | | | | | Y |
| 5 | GPX3 | 16 | 24 | 61 | | | | | | Y |
| 19 | GPX4 | 4 | 6 | 7 | S | S | M | | 2 S | Y |
| 6 | GPX5 | 5 | 10 | 16 | | | | | | Y |
| 1 | GPX7 | 10 | 14 | 21 | | | | | | Y |
| 8 | GSR | 7 | 10 | 18 | | | | | | Y |
| 20 | GSS | 7 | 8 | 13 | | | | | | Y |
| 6 | GSTA1 | 1 | 3 | 14 | | | | S | | Y |
| 6 | GSTA3 | 7 | 14 | 40 | | | | S | | Y |
| 6 | GSTA4 | 16 | 25 | 64 | | | | | | Y |
| 1 | GSTM1 | 3 | 3 | 5 | | | | | | N |
| 1 | GSTM2 | 1 | 2 | 4 | | | | | | N |
| 1 | GSTM3 | 3 | 5 | 11 | | | | | | N |
| 1 | GSTM4 | 2 | 3 | 7 | | | | | | N |
| 1 | GSTM5 | 3 | 6 | 7 | S | | | | | N |
| 10 | GSTO1 | 3 | 4 | 28 | | | | | | N |
| 10 | GSTO2 | 0 | 3 | 27 | | | | | | N |
| 11 | GSTP1 | 2 | 7 | 15 | | | | | | N |
| 22 | GSTT2 | 2 | 2 | 2 | | | | | | N |
| 14 | GSTZ1 | 5 | 8 | 16 | S | S | M | | 2 S | N |
The number of SNPs per gene shows the total number of SNPs for each gene available in the original data, as well as the number of SNPs after tag SNP selection with an r2 threshold of 0.6 or 0.9. Data sets analyzed in the simulations were those based on tag SNP selection with these two thresholds.
Scenarios are described in terms of the number of small (S, odds ratio = 1.2) or medium (M, odds ratio = 1.5) SNP effects simulated in each gene:
Scenario 1: one small effect in each of five different genes (five causal SNPs).
Scenario 2: one small effect in each of one large gene and two small genes (three causal SNPs).
Scenario 3: one moderate effect in each of one large gene and two small genes (three causal SNPs).
Scenario 4: one small effect in each of three genes that are on the same chromosome (three causal SNPs).
Scenario 5: two small effects in each of three genes (six causal SNPs).
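As a rough illustration of how such effect sizes translate into case-control data, the sketch below turns causal-SNP genotypes into a disease probability. The odds ratios (1.2 and 1.5) come from the scenario description; everything else is an assumption made for the example, not the authors' simulation procedure: an additive per-allele effect on the log-odds scale, a baseline disease probability of 0.5 for a subject carrying no risk alleles, and the hypothetical function names `disease_probability` and `simulate_status`.

```python
import math
import random

# Effect sizes stated in the scenario description.
ODDS_RATIO = {"S": 1.2, "M": 1.5}

def disease_probability(genotypes, effects, baseline_prob=0.5):
    """Disease probability under an assumed additive logistic model.

    genotypes: risk-allele counts (0, 1 or 2) at the causal SNPs.
    effects:   'S' or 'M' for each causal SNP, in the same order.
    """
    log_odds = math.log(baseline_prob / (1.0 - baseline_prob))
    for count, effect in zip(genotypes, effects):
        log_odds += count * math.log(ODDS_RATIO[effect])
    return 1.0 / (1.0 + math.exp(-log_odds))

def simulate_status(genotypes, effects, rng, baseline_prob=0.5):
    """Draw a 0/1 case status for one subject."""
    p = disease_probability(genotypes, effects, baseline_prob)
    return int(rng.random() < p)

# Scenario 3 style example: one moderate effect in each of three genes.
rng = random.Random(2011)
subject_genotypes = [1, 0, 2]
print(disease_probability(subject_genotypes, ["M", "M", "M"]))
print(simulate_status(subject_genotypes, ["M", "M", "M"], rng))
```

Under these assumptions a single small-effect risk allele moves the disease probability from 0.5 to about 0.545, which gives a sense of how weak the simulated signals are.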
Summary of power for PC-GM and other GSA methods
| Two-step | PC-GM | ||||||
| STT=0.01 | 0.770 | 0.838 | 0.875 | 0.866 | 0.893 | 0.960 | |
| STT=0.05 | 0.780 | 0.838 | 0.890 | 0.882 | 0.915 | 0.980 | |
| STT=0.10 | 0.770 | 0.845 | 0.890 | 0.888 | 0.943 | 0.980 | |
| STT=0.15 | 0.780 | 0.830 | 0.895 | 0.889 | 0.950 | 0.990 | |
| STT=0.20 | 0.770 | 0.820 | 0.895 | 0.884 | 0.950 | 0.990 | |
| STT=1/e | 0.610 | 0.700 | 0.810 | 0.800 | 0.900 | 0.940 | |
| GMRE-GM | |||||||
| STT=0.01 | 0.720 | 0.770 | 0.880 | 0.850 | 0.933 | 0.960 | |
| STT=0.05 | 0.730 | 0.798 | 0.895 | 0.873 | 0.943 | 0.980 | |
| STT=0.10 | 0.760 | 0.800 | 0.890 | 0.879 | 0.953 | 0.980 | |
| STT=0.15 | 0.740 | 0.798 | 0.880 | 0.878 | 0.963 | 0.980 | |
| STT=0.20 | 0.690 | 0.785 | 0.900 | 0.863 | 0.953 | 0.970 | |
| STT=1/e | 0.540 | 0.630 | 0.780 | 0.770 | 0.910 | 0.960 | |
| GMFE-GM | |||||||
| STT=0.01 | 0.710 | 0.745 | 0.815 | 0.810 | 0.870 | 0.940 | |
| STT=0.05 | 0.730 | 0.785 | 0.845 | 0.836 | 0.893 | 0.970 | |
| STT=0.10 | 0.730 | 0.800 | 0.860 | 0.848 | 0.903 | 0.970 | |
| STT=0.15 | 0.720 | 0.795 | 0.865 | 0.855 | 0.913 | 0.980 | |
| STT=0.20 | 0.710 | 0.770 | 0.875 | 0.848 | 0.905 | 0.980 | |
| STT=1/e | 0.610 | 0.660 | 0.800 | 0.780 | 0.880 | 0.960 | |
| MinP-GM | |||||||
| STT=0.01 | 0.208 | 0.782 | 0.901 | 0.816 | 1.000 | 1.000 | |
| STT=0.05 | 0.239 | 0.783 | 0.927 | 0.832 | 1.000 | 1.000 | |
| STT=0.10 | 0.239 | 0.757 | 0.926 | 0.828 | 1.000 | 1.000 | |
| STT=0.15 | 0.246 | 0.727 | 0.916 | 0.823 | 1.000 | 1.000 | |
| STT=0.20 | 0.249 | 0.704 | 0.904 | 0.816 | 0.999 | 1.000 | |
| STT=1/e | 0.229 | 0.621 | 0.857 | 0.785 | 0.997 | 1.000 | |
| One-step | PC | 0.100 | 0.290 | 0.500 | 0.580 | 0.970 | 1.000 |
| GMRE | 0.070 | 0.290 | 0.620 | 0.600 | 0.990 | 1.000 | |
| GM | |||||||
| STT=0.01 | 0.187 | 0.782 | 0.908 | 0.810 | 1.000 | 1.000 | |
| STT=0.05 | 0.137 | 0.712 | 0.902 | 0.786 | 1.000 | 1.000 | |
| STT=0.10 | 0.122 | 0.587 | 0.862 | 0.742 | 1.000 | 1.000 | |
| STT=0.15 | 0.107 | 0.492 | 0.812 | 0.706 | 0.999 | 1.000 | |
| STT=0.20 | 0.097 | 0.419 | 0.752 | 0.674 | 0.998 | 1.000 | |
| STT=1/e | 0.082 | 0.251 | 0.529 | 0.569 | 0.949 | 0.987 |
Abbreviations: GM, Gamma Method; GMFE, global model with fixed effects; GMRE, global model with random effects; GSA, gene-set analysis; MinP, minimum SNP P-value for gene; STT, soft truncation threshold.
For approaches that use the GM, the STT is listed after the name of the method.
For each GSA method, the distribution of power over all investigated scenarios (disease association models 1–5 with different levels of LD and gene set size) is summarized.
Power for GSA methods under the five disease-model scenarios of the simulation study, and P-values from application of the methods to the CDDP pharmacogenomic study
| Approach | Method / STT | Scenario 1 | Scenario 2 | Scenario 3 | Scenario 4 | Scenario 5 | CDDP study P-value |
|---|---|---|---|---|---|---|---|
| Two-step | PC-GM | | | | | | |
| | STT=0.01 | 0.883 | 0.858 | 0.863 | 0.858 | 0.868 | 0.023 |
| | STT=0.05 | 0.893 | 0.868 | 0.875 | 0.895 | 0.878 | 0.043 |
| | STT=0.10 | 0.893 | 0.865 | 0.888 | 0.908 | 0.888 | 0.080 |
| | STT=0.15 | 0.888 | 0.868 | 0.898 | 0.903 | 0.888 | 0.106 |
| | STT=0.20 | 0.883 | 0.855 | 0.888 | 0.900 | 0.893 | 0.135 |
| | STT=1/e | 0.780 | 0.765 | 0.810 | 0.820 | 0.808 | 0.210 |
| | GMRE-GM | | | | | | |
| | STT=0.01 | 0.858 | 0.855 | 0.835 | 0.853 | 0.848 | 0.176 |
| | STT=0.05 | 0.870 | 0.880 | 0.858 | 0.880 | 0.875 | 0.223 |
| | STT=0.10 | 0.883 | 0.885 | 0.860 | 0.878 | 0.890 | 0.279 |
| | STT=0.15 | 0.878 | 0.880 | 0.860 | 0.880 | 0.890 | 0.310 |
| | STT=0.20 | 0.848 | 0.855 | 0.848 | 0.870 | 0.893 | 0.322 |
| | STT=1/e | 0.755 | 0.763 | 0.763 | 0.765 | 0.785 | 0.358 |
| | GMFE-GM | | | | | | |
| | STT=0.01 | 0.830 | 0.805 | 0.820 | 0.775 | 0.818 | 0.596 |
| | STT=0.05 | 0.845 | 0.815 | 0.835 | 0.828 | 0.855 | 0.626 |
| | STT=0.10 | 0.848 | 0.830 | 0.843 | 0.845 | 0.875 | 0.627 |
| | STT=0.15 | 0.850 | 0.838 | 0.848 | 0.850 | 0.890 | 0.608 |
| | STT=0.20 | 0.835 | 0.825 | 0.853 | 0.850 | 0.878 | 0.569 |
| | STT=1/e | 0.765 | 0.773 | 0.775 | 0.785 | 0.808 | 0.443 |
| | MinP-GM | | | | | | |
| | STT=0.01 | 0.902 | 0.789 | 1.0 | 0.387 | 1.00 | 0.655 |
| | STT=0.05 | 0.929 | 0.792 | 1.0 | 0.438 | 1.00 | 0.600 |
| | STT=0.10 | 0.928 | 0.769 | 1.0 | 0.445 | 1.00 | 0.515 |
| | STT=0.15 | 0.920 | 0.748 | 1.0 | 0.445 | 0.999 | 0.448 |
| | STT=0.20 | 0.907 | 0.730 | 1.0 | 0.447 | 0.999 | 0.413 |
| | STT=1/e | 0.862 | 0.656 | 1.0 | 0.413 | 0.993 | 0.402 |
| One-step | PC | 0.497 | 0.307 | 0.995 | 0.141 | 0.949 | 0.294 |
| | GMRE | 0.622 | 0.300 | 0.997 | 0.114 | 0.980 | 0.230 |
| | GM | | | | | | |
| | STT=0.01 | 0.908 | 0.789 | 1.000 | 0.352 | 1.000 | 1.000 |
| | STT=0.05 | 0.901 | 0.728 | 1.000 | 0.299 | 1.000 | 1.000 |
| | STT=0.10 | 0.860 | 0.606 | 1.000 | 0.246 | 1.000 | 1.000 |
| | STT=0.15 | 0.811 | 0.506 | 1.000 | 0.215 | 0.998 | 1.000 |
| | STT=0.20 | 0.756 | 0.428 | 1.000 | 0.192 | 0.993 | 0.991 |
| | STT=1/e | 0.535 | 0.254 | 0.986 | 0.143 | 0.925 | 0.432 |
For the simulation results, for each disease model (scenarios 1–5) power is averaged over the scenarios with different LD and gene set size.
Figure 1. Comparison of power between the various two-step and one-step methods across all the simulation scenarios. All methods using the GM used the STT value of 0.15 (ω≈0.07654).
Figure 2. Plot of mean power (average across LD and gene-set size) by STT for the two-step GSA method PC-GM. Note that STT = 1/e ≈ 0.368 corresponds to the commonly used Fisher's method (FM) for combining P-values.
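The correspondence between STT = 1/e and Fisher's method can be checked in a few lines. The derivation below assumes the standard Gamma Method formulation, which this record does not spell out: each P-value p_i is transformed by the inverse CDF G_ω^{-1} of a Gamma(ω, 1) distribution, the k P-values are independent under the null, and the STT is defined as 1 − G_ω(ω).

```latex
\begin{align*}
T_\omega &= \sum_{i=1}^{k} G_\omega^{-1}(1 - p_i) \sim \mathrm{Gamma}(k\omega,\, 1)
  && \text{(null distribution, independent } p_i\text{)} \\
G_1(x) &= 1 - e^{-x} \;\Longrightarrow\; G_1^{-1}(1 - p_i) = -\ln p_i
  && \text{(shape } \omega = 1\text{)} \\
2\,T_1 &= -2 \sum_{i=1}^{k} \ln p_i \sim \chi^2_{2k}
  && \text{(Fisher's statistic)} \\
\mathrm{STT} &= 1 - G_1(1) = e^{-1} \approx 0.368
\end{align*}
```

Shapes ω < 1 give STT values below 1/e and place relatively more weight on the smallest P-values, which is the range (0.05 to 0.20) that the simulations favour.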