Ashley Petersen, Alexandra Sitarik, Alexander Luedtke, Scott Powers, Airat Bekmetjev, Nathan L Tintle.
Abstract
Analyzing sets of genes in genome-wide association studies is a relatively new approach that aims to capitalize on biological knowledge about the interactions of genes in biological pathways. This approach, called pathway analysis or gene set analysis, has not yet been applied to the analysis of rare variants. There are two competing ways to apply pathway analysis to rare variants. In the first approach, rare variant statistics (e.g., combined multivariate collapsing [CMC] or weighted-sum [WS]) are used to generate a p-value for each gene, and the gene-level p-values are then combined using standard pathway analysis methods (e.g., gene set enrichment analysis or Fisher's combined probability method). In the second approach, rare variant methods (e.g., CMC and WS) are applied directly to the set of all single-nucleotide polymorphisms (SNPs) within the genes in a pathway. In this paper we use simulated phenotype data and real next-generation sequencing data from Genetic Analysis Workshop 17 to analyze sets of rare variants using these two competing approaches. The initial results suggest substantial differences among the methods, with Fisher's combined probability method and the direct application of the WS method yielding the best power. Evidence suggests that the WS method works well in most situations, although Fisher's method was more likely to be optimal when the number of causal SNPs in the set was low but the risk of the causal SNPs was high.
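The gene-level aggregation step of the first approach can be illustrated with Fisher's combined probability method. The following is a minimal sketch, not the authors' implementation: it assumes the gene-level p-values are independent, the function name `fisher_combined_pvalue` is hypothetical, and the example p-values are made up for illustration. It uses the closed-form chi-square survival function for even degrees of freedom, so only the standard library is needed.

```python
import math


def fisher_combined_pvalue(p_values):
    """Combine independent p-values with Fisher's combined probability method."""
    if not p_values:
        raise ValueError("need at least one p-value")
    if any(not (0.0 < p <= 1.0) for p in p_values):
        raise ValueError("p-values must lie in (0, 1]")

    k = len(p_values)
    # Half of Fisher's statistic X = -2 * sum(ln p_i).
    half_x = -sum(math.log(p) for p in p_values)

    # Under the null, X follows a chi-square distribution with 2k degrees of
    # freedom. For even degrees of freedom the survival function is
    #   P(X >= x) = exp(-x/2) * sum_{j=0}^{k-1} (x/2)^j / j!
    term = 1.0
    total = 1.0
    for j in range(1, k):
        term *= half_x / j
        total += term
    return math.exp(-half_x) * total


# Hypothetical gene-level p-values for one pathway (illustrative only).
gene_p_values = [0.03, 0.20, 0.45, 0.008]
set_p = fisher_combined_pvalue(gene_p_values)
print(f"pathway-level p-value: {set_p:.4f}")
```

With a single gene the combined p-value equals the input p-value, which is a quick sanity check on the formula.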
Year: 2011 PMID: 22373429 PMCID: PMC3287885 DOI: 10.1186/1753-6561-5-S9-S48
Source DB: PubMed Journal: BMC Proc ISSN: 1753-6561
Type I error rates across the five approaches for the 500 null sets and the 500 nonspurious gene sets
| Pathway method | Across 500 null sets (Nominal) | Across 500 null sets (Nominal) | Across 500 nonspurious gene sets (Nominal) | Across 500 nonspurious gene sets (Nominal) |
|---|---|---|---|---|
| No gene-level aggregation | | | | |
| WS | 0.492 | 0.190 | 0.043 | 0.004 |
| CMC | 0.232 | 0.054 | 0.037 | 0.003 |
| Gene-level aggregation | | | | |
| WS-GSEA | 0.048 | 0.004 | 0.001 | 0.000 |
| WS-KS | 0.040 | 0.003 | 0.000 | 0.000 |
| WS-Fisher | 0.429 | 0.244 | 0.010 | 0.004 |
| CMC-GSEA | 0.063 | 0.007 | 0.002 | 0.000 |
| CMC-KS | 0.044 | 0.004 | 0.004 | 0.000 |
| CMC-Fisher | 0.484 | 0.235 | 0.048 | 0.006 |
WS, weighted sum; CMC, combined multivariate collapsing; GSEA, gene set enrichment analysis; KS, Kolmogorov-Smirnov test; Fisher, Fisher’s combined probability test.
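An empirical type I error rate of the kind tabulated here is typically the fraction of gene sets whose p-value falls at or below the nominal level. The sketch below shows that calculation under stated assumptions: the function name `empirical_type1_error` is hypothetical, the p-values are invented, and the nominal level of 0.05 is illustrative rather than taken from the table.

```python
def empirical_type1_error(p_values, alpha):
    """Fraction of tests rejected at nominal level alpha."""
    if not p_values:
        raise ValueError("need at least one p-value")
    if not (0.0 < alpha < 1.0):
        raise ValueError("alpha must lie in (0, 1)")
    # Count gene sets whose p-value is at or below the nominal level.
    rejections = sum(1 for p in p_values if p <= alpha)
    return rejections / len(p_values)


# Hypothetical p-values for a handful of gene sets (illustrative only).
null_set_p_values = [0.01, 0.20, 0.04, 0.50]
rate = empirical_type1_error(null_set_p_values, alpha=0.05)
print(f"empirical type I error rate: {rate:.3f}")
```

A well-calibrated method should give a rate close to `alpha` when applied to sets with no true association; rates well above it indicate inflation, and rates well below it indicate a conservative test.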
Figure 1. Power of pathway analysis methods across gene sets with varying numbers of associated genes