Tian-Xiao Zhang, Yi-Ran Xie, John P Rice.
Abstract
Rare variants have been proposed to play a significant role in the onset and development of common diseases. However, traditional analysis methods have difficulty detecting association signals for rare causal variants because they lack statistical power. We propose a two-stage, gene-based method for association mapping of rare variants that applies four different noncollapsing algorithms. Using the Genetic Analysis Workshop 18 whole genome sequencing data set with simulated blood pressure phenotypes, we compared the false-positive rates of the algorithms using receiver operating characteristic curves. We also evaluated and compared the statistical power of these methods by analyzing 200 simulated replicates in a smaller genotype data set. Fisher's method was superior to the other three noncollapsing methods but was no better than the standard method implemented in famSKAT. Further investigation is needed to explore the statistical properties of these approaches.
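As an illustration of what a noncollapsing, gene-based combination can look like, the sketch below combines variant-level p values into one gene-level p value with Fisher's method and with Simes' method, both of which appear in the Figure 2 legend. This record does not spell out the exact formulas the authors used, so the textbook forms are assumed here. Both functions also assume independent variant-level p values, which linkage disequilibrium within a gene would violate. The function names are hypothetical.

```python
import math


def fisher_combine(p_values):
    """Textbook Fisher combination of k independent p values (assumed form)."""
    k = len(p_values)
    if k == 0:
        raise ValueError("need at least one p value")
    # half_stat is X/2, where X = -2 * sum(ln p_i) ~ chi-square with 2k df.
    # The 1e-300 floor only guards against log(0); it is an arbitrary choice.
    half_stat = -sum(math.log(max(p, 1e-300)) for p in p_values)
    if half_stat <= 0:
        return 1.0  # every p value equals 1
    # Closed-form survival function for even degrees of freedom:
    # P(X > x) = exp(-x/2) * sum_{j=0}^{k-1} (x/2)^j / j!
    # Each term is evaluated in log space to avoid overflow for large k.
    log_h = math.log(half_stat)
    tail = sum(
        math.exp(-half_stat + j * log_h - math.lgamma(j + 1))
        for j in range(k)
    )
    return min(1.0, tail)


def simes_combine(p_values):
    """Textbook Simes combination: min over ranks i of k * p_(i) / i."""
    k = len(p_values)
    if k == 0:
        raise ValueError("need at least one p value")
    ranked = sorted(p_values)
    return min(1.0, min(k * p / i for i, p in enumerate(ranked, start=1)))


# Hypothetical variant-level p values for a single gene.
variant_p = [0.01, 0.04, 0.5]
gene_p_fisher = fisher_combine(variant_p)
gene_p_simes = simes_combine(variant_p)
```

Fisher's method pools evidence from all variants in the gene, whereas Simes' method is driven by the smallest rank-adjusted p values, so the two can rank the same genes differently.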
Year: 2014 PMID: 25519333 PMCID: PMC4143635 DOI: 10.1186/1753-6561-8-S1-S53
Source DB: PubMed Journal: BMC Proc ISSN: 1753-6561
Figure 1. Q-Q plot and histogram for the mixed effects model. Q-Q plot (left) of −log10 scaled p values and histogram (right) for the mixed effects model, based on 1237 genes (87,190 variants) from 849 subjects. In the Q-Q plot: black line, expected; blue dots, observed.
Figure 2. ROC curves for 4 noncollapsing algorithms and 2 standard methods. ROC curves for 4 different pathway algorithms based on 1237 genes from 849 subjects for the trait SBP (first visit). In the left plot, the false-positive rate (FPR) ranges from 0 to 1. In the right plot, the FPR axis is restricted to values below 0.1 because only the true-positive rate (TPR) at a low FPR is of interest. Black curve, naïve method; blue curve, Fisher's method; red curve, Simes' method; green curve, GSEA method; purple curve, SKAT; yellow curve, famSKAT.
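For readers who want to see how such curves are built, the following minimal sketch derives ROC points from gene-level p values and known causal labels, then reads off the best TPR within a low-FPR window like the one in the right panel. It is not the authors' code. The function names, the toy inputs, and the rule of calling a gene positive when its p value is at or below a threshold are assumptions made for illustration.

```python
def roc_points(p_values, is_causal):
    """Return (FPR, TPR) pairs obtained by sweeping a p value threshold."""
    n_pos = sum(1 for c in is_causal if c)
    n_neg = len(is_causal) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one causal and one noncausal gene")
    ranked = sorted(zip(p_values, is_causal))
    points = [(0.0, 0.0)]
    tp = fp = 0
    for i, (p, causal) in enumerate(ranked):
        if causal:
            tp += 1
        else:
            fp += 1
        # Emit a point only after the last gene sharing this p value,
        # so tied genes are treated as crossing the threshold together.
        if i + 1 == len(ranked) or ranked[i + 1][0] != p:
            points.append((fp / n_neg, tp / n_pos))
    return points


def best_tpr_within(points, max_fpr):
    """Largest TPR among ROC points whose FPR does not exceed max_fpr."""
    return max(tpr for fpr, tpr in points if fpr <= max_fpr)


# Toy example: four genes, two of them simulated as causal.
toy_p = [0.001, 0.2, 0.03, 0.8]
toy_truth = [True, False, True, False]
curve = roc_points(toy_p, toy_truth)
low_fpr_tpr = best_tpr_within(curve, 0.1)
```

In this toy example both causal genes have the smallest p values, so the curve reaches a TPR of 1 before any false positive is admitted.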
Comparison of the power of the 4 noncollapsing and 2 standard methods
| 3 | 0.015 | 0.025 | 0 | 0.075 | 0.01 | ||
| 3 | 0 | 0 | 0 | 0.005 | 0.005 | ||
| 3 | 0.015 | 0 | 0.015 | 0.01 | 0.015 | ||
| 3 | 0 | 0 | 0 | 0 | 0 | ||
| 3 | 0 | 0 | 0 | 0 | 0 | ||
| 3 | 0.005 | 0.005 | 0.005 | 0.005 | 0.04 | ||
| 3 | 0.005 | 0 | 0 | 0 | 0 | ||
| 3 | 0.01 | 0.015 | 0 | 0 | 0 | ||
| 3 | 0.09 | 0.145 | 0.135 | 0 | 0.04 | ||
| 3 | |||||||
| 3 | 0.005 | 0.005 | 0 | 0 | 0 | ||
| 3 | 0 | 0.05 | 0 | 0 | 0 | ||
| 3 | 0.005 | 0 | 0.005 | 0.005 | 0.04 | ||
| 3 | 0.01 | 0.02 | 0 | 0.005 | 0.005 | ||
| 3 | 0 | 0 | 0 | 0.005 | 0 | ||
| 3 | 0.025 | 0.005 | 0.04 | 0 | 0.045 | ||
| 3 | 0 | 0 | 0 | 0 | 0 | 0 | |
| 3 | 0 | 0.02 | 0.01 | 0.01 | 0.005 | ||
| 3 | 0.005 | 0.06 | 0.01 | 0.015 | 0.005 | ||
| 3 | 0 | 0 | 0 | 0 | 0 | ||
| 3 | 0.005 | 0 | 0 | 0.02 | 0 | ||
| 3 | 0.01 | 0.005 | 0.01 | 0.02 | 0 | ||
Power is calculated based on the analysis of the 200 simulated phenotypic replicates. The largest power for each gene is highlighted in bold.
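The nonzero table entries are multiples of 0.005, which matches power computed as a fraction of 200 replicates. A minimal sketch of that calculation follows. The significance threshold `alpha` is not stated in this record, so the value used below is purely an assumption, and the function name is hypothetical.

```python
def empirical_power(replicate_p_values, alpha):
    """Fraction of replicates in which the gene-level p value is below alpha."""
    if not replicate_p_values:
        raise ValueError("need at least one replicate")
    hits = sum(1 for p in replicate_p_values if p < alpha)
    return hits / len(replicate_p_values)


# Hypothetical gene: significant in 15 of 200 replicates.
# alpha = 0.05 is an assumed threshold, not one taken from the source.
replicate_p = [0.001] * 15 + [0.5] * 185
power = empirical_power(replicate_p, alpha=0.05)
```

With 15 significant replicates out of 200, the estimate is 0.075, the same form as the values reported in the table.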