Marina Evangelou, Augusto Rendon, Willem H Ouwehand, Lorenz Wernisch, Frank Dudbridge.
Abstract
It has been suggested that pathway analysis can complement single-SNP analysis in exploring genome-wide association data. Pathway analysis incorporates the available biological knowledge of genes and SNPs and is expected to improve the chances of revealing the underlying genetic architecture of complex traits. Methods for pathway analysis can be classified as competitive (enrichment) or self-contained (association) according to the hypothesis tested. Although association tests are statistically more powerful than enrichment tests, they can be difficult to calibrate because biases in analysis accumulate across multiple SNPs or genes. Furthermore, enrichment tests can be more scientifically relevant than association tests, as they detect pathways with relatively more evidence for association than the remaining genes. Here we show how some well-known association tests can be simply adapted to test for enrichment, and compare their performance to some established enrichment tests. We propose versions of the Adaptive Rank Truncated Product (ARTP), Tail Strength Measure (TSM) and Fisher's combination of p-values (FM) for testing the enrichment null hypothesis. We compare the behaviour of these proposed methods with the established Hypergeometric Test and Gene-Set Enrichment Analysis (GSEA). The results of the simulation study show that the modified version of the ARTP method generally has the best performance across the situations considered. The methods were also applied to find enriched pathways for body mass index (BMI) and platelet function phenotypes. The pathway analysis of BMI identified the Vasoactive Intestinal Peptide pathway as significantly associated with BMI. This pathway has been previously reported as associated with BMI and the risk of obesity. The ARTP method identified the largest number of enriched pathways across all tested pathway databases and phenotypes.
The simulation and data application results are in agreement with previous work on association tests and suggest that the ARTP should be preferred for both enrichment and association testing.
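To make the distinction between the two hypotheses concrete, the sketch below implements textbook forms of one self-contained test (Fisher's combination of p-values) and one competitive test (the Hypergeometric Test) in Python. These are the standard versions, not the modified enrichment versions proposed in the paper; the function names, arguments, and example numbers are illustrative assumptions.

```python
import math


def fisher_combination(pvalues):
    """Self-contained (association) test: Fisher's combination of p-values.

    Assumes the p-values are independent and strictly positive. The statistic
    -2 * sum(log p) follows a chi-square distribution with 2k degrees of
    freedom under the null, whose upper tail has a closed form for even
    degrees of freedom.
    """
    k = len(pvalues)
    half_stat = -sum(math.log(p) for p in pvalues)  # statistic / 2
    # P(chi2_{2k} > x) = exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
    term = 1.0
    total = 1.0
    for i in range(1, k):
        term *= half_stat / i
        total += term
    return math.exp(-half_stat) * total


def hypergeometric_enrichment(n_total, n_significant, pathway_size, pathway_significant):
    """Competitive (enrichment) test: upper-tail hypergeometric probability.

    Returns the probability of seeing at least `pathway_significant`
    significant genes in a pathway of `pathway_size` genes drawn at random
    from `n_total` genes, of which `n_significant` are significant.
    """
    upper = min(pathway_size, n_significant)
    tail = sum(
        math.comb(n_significant, x)
        * math.comb(n_total - n_significant, pathway_size - x)
        for x in range(pathway_significant, upper + 1)
    )
    return tail / math.comb(n_total, pathway_size)


if __name__ == "__main__":
    # Hypothetical inputs, chosen only to exercise the two functions.
    print(fisher_combination([0.01, 0.04, 0.20]))
    print(hypergeometric_enrichment(1000, 50, 20, 4))
```

The first function asks whether the pathway shows any association at all, whereas the second asks whether the pathway holds more significant genes than expected given the rest of the genome, which is the enrichment question discussed above.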
Year: 2012 PMID: 22859961 PMCID: PMC3409204 DOI: 10.1371/journal.pone.0041018
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Mean type-I error of the methods.
| Method | Mean Type-I Error |
| FM | 0.050 |
| FM | 0.051 |
| Hypergeometric | 0.028 |
| Hypergeometric | 0.023 |
| Hypergeometric | 0.019 |
| Hypergeometric | 0.038 |
| TSM | 0.028 |
| TSM | 0.057 |
| TSM | 0.051 |
| GSEA | 0.049 |
| ARTP | 0.048 |
| ARTP | 0.046 |
Mean type-I error of the methods across all null scenarios of the simulation study. TSM refers to the approximate Normal distribution of the TSM. FM and TSM refer to the permutation procedures for estimating the significance of the FM and TSM statistic. ARTP and TSM are the empirical distributions of ARTP and TSM respectively.
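The caption above refers to permutation procedures for estimating significance. The exact procedure is not given in this section, so the following Python sketch shows only one common way to calibrate a combination statistic against a competitive null: resampling random gene sets of the same size as the pathway. The function names, the default of 2000 resamples, and the choice of Fisher's statistic are assumptions made for illustration.

```python
import math
import random


def fisher_statistic(pvalues):
    """Fisher's combination statistic, -2 * sum(log p)."""
    return -2.0 * sum(math.log(p) for p in pvalues)


def competitive_permutation_pvalue(all_pvalues, pathway_indices, n_perm=2000, seed=0):
    """Empirical p-value of a pathway against random gene sets of equal size.

    all_pvalues     : per-gene p-values for every gene in the study
    pathway_indices : positions in `all_pvalues` that belong to the pathway
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    size = len(pathway_indices)
    observed = fisher_statistic([all_pvalues[i] for i in pathway_indices])
    hits = 0
    for _ in range(n_perm):
        sample = rng.sample(range(len(all_pvalues)), size)
        if fisher_statistic([all_pvalues[i] for i in sample]) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (n_perm + 1)
```

Because the reference distribution is built from the other genes in the same study, a pathway is flagged only when it carries more signal than a typical gene set of its size, which matches the competitive hypothesis rather than the self-contained one.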
Mean power of the methods for the different pathway sizes.
| Method | Pathway Size | Mean Power |
| ARTP | 20 | 0.743 |
| ARTP | 60 | 0.892 |
| ARTP | 100 | 0.925 |
| FM | 20 | 0.730 |
| FM | 60 | 0.889 |
| FM | 100 | 0.925 |
| GSEA | 20 | 0.639 |
| GSEA | 60 | 0.826 |
| GSEA | 100 | 0.867 |
| TSM | 20 | 0.619 |
| TSM | 60 | 0.837 |
| TSM | 100 | 0.894 |
| Hypergeometric | 20 | 0.560 |
| Hypergeometric | 60 | 0.729 |
| Hypergeometric | 100 | 0.803 |
The mean power of the methods is computed for all the scenarios for the three different tested pathway sizes across all other variables.
Figure 1. Power of the five methods for different pathway sizes.
Plots illustrate the power of the methods when the total number of associated SNPs equals 100, the proportion of associated SNPs within each pathway is 0.4 and the effect sizes are and .
Power of the methods as the proportion of pathway SNPs with effects changes.
| Proportion | ARTP | FM | GSEA | TSM | Hypergeometric |
| 40% | 0.940 | 0.909 | 0.881 | 0.691 | 0.659 |
| 60% | 0.985 | 0.979 | 0.970 | 0.878 | 0.852 |
| 100% | 1 | 1 | 0.990 | 0.996 | 0.985 |
Power of the methods for a pathway of size 20. 50 genes in total have effects. The effect size of the pathway genes is 4 and the effect size of the rest of the genes is 1.
Figure 2. Power of the five methods for different proportions of associated SNPs within a pathway of size 60.
Plots illustrate the power of the methods when the total number of associated SNPs equals 100 (plot (A)) and 200 (plot(B)), the effect sizes are and for both plots.
Power of the methods for the different pathway sizes.
| Pathway size | Effect size (pathway) | Effect size (rest) | ARTP | FM | GSEA | TSM | Hypergeometric |
| 20 | 4 | 2 | 0.550 | 0.520 | 0.370 | 0.360 | 0.317 |
| 20 | 4 | 1 | 0.803 | 0.760 | 0.620 | 0.539 | 0.464 |
| 20 | 2 | 1 | 0.571 | 0.511 | 0.460 | 0.389 | 0.326 |
| 60 | 4 | 2 | 0.851 | 0.825 | 0.730 | 0.713 | 0.528 |
| 60 | 4 | 1 | 0.974 | 0.959 | 0.880 | 0.868 | 0.725 |
| 60 | 2 | 1 | 0.857 | 0.828 | 0.690 | 0.707 | 0.546 |
| 100 | 4 | 2 | 0.928 | 0.915 | 0.820 | 0.837 | 0.690 |
| 100 | 4 | 1 | 0.994 | 0.980 | 0.860 | 0.925 | 0.804 |
| 100 | 2 | 1 | 0.925 | 0.901 | 0.820 | 0.826 | 0.674 |
Mean power of the methods that have a type-I error 5% across all simulated scenarios.
| Method | Power |
| ARTP | 0.846 |
| FM | 0.840 |
| GSEA | 0.768 |
| TSM | 0.772 |
| Hypergeometric | 0.687 |
Mean power of the methods across all simulated scenarios.
Figure 3. Pairwise scatterplot of power for the five methods across all simulated non-null scenarios.
Pathway Analysis of BMI.
| Pathway Name | Size | ARTP | FM | GSEA | TSM | Hypergeometric |
| Biocarta:VIP Pathway | 19 | 0.028 | 0.242 | 0.024 | 0.523 | 0.552 |
The table shows the nominal p-values of all five methods for the Biocarta VIP pathway. The Biocarta VIP pathway has been reported as being significantly associated with BMI and the risk of obesity.
Performance of the methods when applied on the data of the GWAS.
| Response | KEGG | Biocarta | Reactome |
| BMI | FM ( | ARTP = GSEA ( | TSM ( |
| Fibrinogen response to ADP | TSM ( | FM = GSEA ( | FM ( |
| Fibrinogen response to collagen | FM = TSM ( | ARTP ( | ARTP ( |
| P-selectin response to collagen | GSEA ( | ARTP = FM = TSM ( | ARTP ( |
| P-selectin response to ADP | GSEA ( | GSEA ( | ARTP ( |
The table shows, for each phenotype and database, the method that identifies the largest number of pathways with a nominal p-value less than 0.05. The numbers in the brackets represent the number of enriched pathways identified by that method divided by the total number of enriched pathways identified by all the tested methods.
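The summary described in this caption can be sketched in a few lines of Python. The function name and the nested-dictionary input layout are assumptions for illustration; the only p-values used are those reported above for the Biocarta VIP pathway, so the single-pathway input demonstrates the bookkeeping and does not reproduce the full table.

```python
def best_methods(pvalues_by_method, alpha=0.05):
    """Find the method(s) that flag the most pathways at nominal level alpha.

    pvalues_by_method : {method name: {pathway name: nominal p-value}}
    Returns (winning methods, pathways flagged by a winner,
             pathways flagged by at least one method).
    """
    flagged = {
        method: {pathway for pathway, p in pvals.items() if p < alpha}
        for method, pvals in pvalues_by_method.items()
    }
    # Total = pathways identified by any of the tested methods.
    identified_by_any = set().union(*flagged.values())
    top = max(len(pathways) for pathways in flagged.values())
    winners = sorted(m for m, pathways in flagged.items() if len(pathways) == top)
    return winners, top, len(identified_by_any)


if __name__ == "__main__":
    # Nominal p-values for the Biocarta VIP pathway, taken from the BMI table.
    vip = {
        "ARTP": {"Biocarta:VIP Pathway": 0.028},
        "FM": {"Biocarta:VIP Pathway": 0.242},
        "GSEA": {"Biocarta:VIP Pathway": 0.024},
        "TSM": {"Biocarta:VIP Pathway": 0.523},
        "Hypergeometric": {"Biocarta:VIP Pathway": 0.552},
    }
    winners, top, total = best_methods(vip)
    print(" = ".join(winners), f"({top}/{total})")
```

Ties are joined with "=" to mirror the notation used in the table cells.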