Peng Wei, Ying Cao, Yiwei Zhang, Zhiyuan Xu, Il-Youp Kwak, Eric Boerwinkle, Wei Pan.
Abstract
With the advance of sequencing technologies, it has become routine practice to test for association between a quantitative trait and a set of rare variants (RVs). While a number of RV association tests have been proposed, there is a dearth of studies on the robustness of RV association testing for nonnormally distributed traits, e.g., due to skewness, which is ubiquitous in cohort studies. By extensive simulations, we demonstrate that commonly used RV tests, including the sequence kernel association test (SKAT) and optimal unified SKAT (SKAT-O), are not robust to heavy-tailed or right-skewed trait distributions and show inflated type I error rates; in contrast, the adaptive sum of powered score (aSPU) test is much more robust. Here we further propose a robust version of the aSPU test, called aSPUr. We conduct extensive simulations to evaluate the power of the tests, finding that for a larger number of RVs, aSPU is often more powerful than SKAT and SKAT-O, owing to its high data-adaptivity. We also compare different tests by conducting association analysis of triglyceride levels using the NHLBI ESP whole-exome sequencing data. The QQ plots for SKAT and SKAT-O were severely inflated (λ = 1.89 and 1.78, respectively), while those for aSPU and aSPUr behaved normally. Because the aSPU test combines relatively high robustness to outliers with high power, we recommend using it as a complement to SKAT and SKAT-O. If there is evidence of an inflated type I error rate from the aSPU test, we would recommend the use of the more robust, but less powerful, aSPUr test.
Keywords: SKAT; association testing; next-generation sequencing; rare variants; robustness
Year: 2016 PMID: 27678522 PMCID: PMC5144964 DOI: 10.1534/g3.116.035485
Source DB: PubMed Journal: G3 (Bethesda) ISSN: 2160-1836 Impact factor: 3.154
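The abstract names the adaptive sum of powered score (aSPU) test without spelling out how it is computed. The sketch below follows the usual permutation formulation of the test as generally described in the literature, not code from this record: each SPU(γ) statistic sums the per-variant scores raised to the power γ, and aSPU takes the smallest SPU p-value and calibrates it with the same permutations. It is a simplified illustration under several assumptions: no covariates (the null model is an intercept only), residual shuffling as the resampling scheme, and the power set {1, 2, 3, 4, ∞}. The function names `spu_statistics` and `aspu_test` are hypothetical.

```python
import random

INF = float("inf")

def spu_statistics(genotypes, residuals, gammas):
    """SPU(gamma) = sum_j U_j**gamma; SPU(inf) = max_j |U_j|."""
    n_variants = len(genotypes[0])
    # Score for variant j: U_j = sum_i residual_i * genotype_ij
    scores = [sum(r * row[j] for r, row in zip(residuals, genotypes))
              for j in range(n_variants)]
    return [max(abs(u) for u in scores) if g == INF
            else sum(u ** g for u in scores)
            for g in gammas]

def aspu_test(genotypes, trait, gammas=(1, 2, 3, 4, INF), n_perm=200, seed=0):
    """Return ({gamma: SPU p-value}, aSPU p-value) from one set of permutations."""
    rng = random.Random(seed)
    mean = sum(trait) / len(trait)
    residuals = [y - mean for y in trait]  # assumption: intercept-only null model
    observed = [abs(t) for t in spu_statistics(genotypes, residuals, gammas)]

    # Null statistics: shuffle residuals across subjects, keep genotypes fixed.
    null = []
    for _ in range(n_perm):
        shuffled = residuals[:]
        rng.shuffle(shuffled)
        null.append([abs(t) for t in spu_statistics(genotypes, shuffled, gammas)])

    k_range = range(len(gammas))
    # Permutation p-value of each SPU(gamma) statistic.
    p_spu = [(1 + sum(row[k] >= observed[k] for row in null)) / (n_perm + 1)
             for k in k_range]
    min_p_observed = min(p_spu)

    # Minimum p-value of each permuted data set, ranked against the other permutations.
    min_p_null = []
    for b, row in enumerate(null):
        p_b = [(1 + sum(other[k] >= row[k]
                        for i, other in enumerate(null) if i != b)) / n_perm
               for k in k_range]
        min_p_null.append(min(p_b))

    # aSPU p-value: how often a null minimum p-value is as small as the observed one.
    p_aspu = (1 + sum(m <= min_p_observed for m in min_p_null)) / (n_perm + 1)
    return dict(zip(gammas, p_spu)), p_aspu
```

Here `genotypes` is a list of per-subject rows of variant counts and `trait` is the matching list of trait values. Reusing one set of permutations for both the SPU p-values and the aSPU p-value avoids a nested resampling loop; the pairwise ranking is quadratic in `n_perm`, which is acceptable for a sketch but would be replaced by sorting for large permutation counts.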
Empirical type I error rates of various tests at the significance level of 0.05 for a quantitative trait with an error distribution (Distr), a number of independent SNVs (#SNVs), and with two covariates
| Distr | #SNVs | SKAT | SKAT-O | SPU(1) | SPU(2) | SPU(3) | SPU(4) | SPU(∞) | aSPU | aSPUr |
|---|---|---|---|---|---|---|---|---|---|---|
| | 8 | 0.044 | 0.053 | 0.048 | 0.050 | 0.055 | 0.055 | 0.059 | 0.057 | 0.055 |
| | 32 | 0.064 | 0.058 | 0.065 | 0.063 | 0.051 | 0.061 | 0.056 | 0.063 | 0.056 |
| | 64 | 0.050 | 0.047 | 0.047 | 0.053 | 0.049 | 0.055 | 0.054 | 0.047 | 0.052 |
| | 128 | 0.044 | 0.041 | 0.049 | 0.052 | 0.051 | 0.053 | 0.047 | 0.047 | 0.053 |
| | 192 | 0.031 | 0.032 | 0.049 | 0.039 | 0.048 | 0.053 | 0.054 | 0.049 | 0.048 |
| | 256 | 0.019 | 0.031 | 0.051 | 0.025 | 0.040 | 0.035 | 0.037 | 0.041 | 0.033 |
| | 8 | 0.076 | 0.072 | 0.051 | 0.042 | 0.047 | 0.043 | 0.047 | 0.048 | 0.046 |
| | 32 | 0.113 | 0.105 | 0.050 | 0.046 | 0.050 | 0.055 | 0.052 | 0.045 | 0.042 |
| | 64 | 0.132 | 0.109 | 0.039 | 0.034 | 0.039 | 0.049 | 0.051 | 0.040 | 0.047 |
| | 128 | 0.114 | 0.105 | 0.048 | 0.027 | 0.045 | 0.042 | 0.057 | 0.048 | 0.047 |
| | 192 | 0.104 | 0.101 | 0.065 | 0.019 | 0.032 | 0.032 | 0.050 | 0.044 | 0.048 |
| | 256 | 0.087 | 0.074 | 0.042 | 0.007 | 0.013 | 0.022 | 0.043 | 0.026 | 0.062 |
| | 8 | 0.082 | 0.081 | 0.052 | 0.048 | 0.051 | 0.049 | 0.050 | 0.049 | 0.031 |
| | 32 | 0.190 | 0.186 | 0.060 | 0.064 | 0.051 | 0.062 | 0.078 | 0.061 | 0.040 |
| | 64 | 0.289 | 0.268 | 0.043 | 0.036 | 0.033 | 0.036 | 0.100 | 0.062 | 0.042 |
| | 128 | 0.310 | 0.276 | 0.038 | 0.025 | 0.027 | 0.028 | 0.085 | 0.050 | 0.033 |
| | 192 | 0.269 | 0.251 | 0.036 | 0.011 | 0.015 | 0.019 | 0.054 | 0.031 | 0.032 |
| | 256 | 0.310 | 0.282 | 0.036 | 0.006 | 0.013 | 0.016 | 0.064 | 0.037 | 0.036 |
| | 8 | 0.107 | 0.093 | 0.056 | 0.065 | 0.063 | 0.062 | 0.061 | 0.067 | 0.053 |
| | 32 | 0.160 | 0.137 | 0.052 | 0.041 | 0.052 | 0.053 | 0.061 | 0.052 | 0.047 |
| | 64 | 0.165 | 0.144 | 0.052 | 0.038 | 0.037 | 0.043 | 0.057 | 0.045 | 0.038 |
| | 128 | 0.176 | 0.147 | 0.053 | 0.030 | 0.049 | 0.050 | 0.059 | 0.048 | 0.052 |
| | 192 | 0.173 | 0.142 | 0.045 | 0.012 | 0.029 | 0.035 | 0.049 | 0.033 | 0.050 |
| | 256 | 0.151 | 0.115 | 0.043 | 0.007 | 0.025 | 0.027 | 0.047 | 0.039 | 0.050 |
| | 8 | 0.113 | 0.103 | 0.063 | 0.056 | 0.059 | 0.059 | 0.061 | 0.058 | 0.057 |
| | 32 | 0.209 | 0.197 | 0.043 | 0.043 | 0.058 | 0.060 | 0.075 | 0.058 | 0.051 |
| | 64 | 0.276 | 0.259 | 0.045 | 0.038 | 0.040 | 0.044 | 0.062 | 0.053 | 0.059 |
| | 128 | 0.277 | 0.251 | 0.052 | 0.032 | 0.039 | 0.044 | 0.069 | 0.045 | 0.064 |
| | 192 | 0.269 | 0.241 | 0.033 | 0.013 | 0.022 | 0.025 | 0.054 | 0.026 | 0.063 |
| | 256 | 0.287 | 0.249 | 0.035 | 0.010 | 0.019 | 0.023 | 0.051 | 0.034 | 0.056 |
| | 8 | 0.371 | 0.316 | 0.177 | 0.333 | 0.327 | 0.340 | 0.337 | 0.290 | 0.060 |
| | 32 | 0.226 | 0.187 | 0.078 | 0.139 | 0.136 | 0.143 | 0.155 | 0.121 | 0.054 |
| | 64 | 0.147 | 0.120 | 0.058 | 0.077 | 0.083 | 0.084 | 0.083 | 0.080 | 0.055 |
| | 128 | 0.089 | 0.089 | 0.061 | 0.048 | 0.054 | 0.054 | 0.069 | 0.068 | 0.060 |
| | 192 | 0.060 | 0.055 | 0.049 | 0.035 | 0.049 | 0.040 | 0.055 | 0.050 | 0.039 |
| | 256 | 0.045 | 0.041 | 0.041 | 0.027 | 0.045 | 0.035 | 0.060 | 0.040 | 0.047 |
| | 8 | 0.605 | 0.582 | 0.365 | 0.563 | 0.566 | 0.572 | 0.564 | 0.516 | 0.061 |
| | 32 | 0.477 | 0.444 | 0.118 | 0.201 | 0.209 | 0.211 | 0.230 | 0.174 | 0.054 |
| | 64 | 0.349 | 0.298 | 0.089 | 0.096 | 0.117 | 0.118 | 0.142 | 0.104 | 0.057 |
| | 128 | 0.178 | 0.155 | 0.067 | 0.043 | 0.056 | 0.054 | 0.086 | 0.064 | 0.060 |
| | 192 | 0.142 | 0.131 | 0.047 | 0.033 | 0.046 | 0.040 | 0.051 | 0.044 | 0.041 |
| | 256 | 0.112 | 0.099 | 0.040 | 0.020 | 0.043 | 0.034 | 0.068 | 0.037 | 0.048 |
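Each entry in the table above is an empirical type I error rate: the fraction of simulated null data sets in which a test rejects at the 0.05 level. A minimal sketch of that calculation follows, together with a rough band for how far an estimate can drift from the nominal level through Monte Carlo noise alone. The helper names are hypothetical, and the number of simulation replicates is not stated in this record, so any replicate count passed to `nominal_band` is an assumption made for illustration.

```python
import math

def empirical_type1_error(null_pvalues, alpha=0.05):
    """Fraction of null replicates rejected at level alpha, with its Monte Carlo SE."""
    n = len(null_pvalues)
    rate = sum(p <= alpha for p in null_pvalues) / n
    se = math.sqrt(rate * (1 - rate) / n)
    return rate, se

def nominal_band(n_replicates, alpha=0.05, z=1.96):
    """Approximate 95% range of the estimate when the true error rate equals alpha."""
    half_width = z * math.sqrt(alpha * (1 - alpha) / n_replicates)
    return alpha - half_width, alpha + half_width
```

With a hypothetical 1,000 replicates, `nominal_band(1000)` spans roughly 0.037 to 0.064, so estimates inside that range are compatible with a well-calibrated test, while values far above it indicate inflation and values far below it indicate conservativeness.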
Figure 1. Simulation results for a skewed error distribution. The first row shows type I error rates, and the next two rows show power in set-up I and set-up II.
Figure 2. Simulation results for a heavy-tailed (and nonskewed) error distribution. The first row shows type I error rates, and the next two rows show power in set-up I and set-up II.
Figure 3. QQ plots for the analysis of triglyceride across 13,978 genes passing the MAC filter: (A) SKAT (genomic control λ = 1.89), (B) SKAT-O (λ = 1.78), (C) T1 (λ = 1.13), (D) aSPU (λ = 1.02), and (E) aSPUr (λ = 1.04). (F) Histogram of covariate-adjusted triglyceride residuals with variant carriers of APOC3 highlighted.
RV association testing results for the positive control gene APOC3 (among 13,978 genes passing the MAC filter)
| Phenotype | | SKAT | SKAT-O | T1 | aSPU | aSPUr |
|---|---|---|---|---|---|---|
| TG | GC | 1.89 | 1.78 | 1.13 | 1.02 | 1.04 |
| | | 0.018 | 0.021 | 0.018 | 0.035 | 0.0036 |
| | | 642 | 620 | 297 | 501 | 62 |
| Ln(TG) | GC | 1.05 | 1.06 | 1.03 | 1.01 | 1.03 |
| | | 6 | 1 | 1 | 2 | 1 |
| INV(TG) | GC | 1.03 | 1.05 | 1.03 | 1.02 | 1.04 |
| | | 6 | 1 | 1 | 2 | 2 |
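The GC rows above report genomic control inflation factors. As a rough illustration, the conventional median-based definition can be computed from a list of gene-level p-values as follows. This assumes the record uses that standard definition (median observed 1-df chi-square statistic divided by its expected median under the null, about 0.455); the function name is hypothetical.

```python
from statistics import NormalDist, median

def genomic_control_lambda(pvalues):
    """Median-based genomic control factor from p-values in (0, 1]."""
    nd = NormalDist()
    # A 1-df chi-square statistic with upper-tail probability p equals the
    # squared standard normal quantile at p / 2.
    chisq = [nd.inv_cdf(p / 2) ** 2 for p in pvalues]
    expected_median = nd.inv_cdf(0.25) ** 2  # median of a 1-df chi-square, about 0.455
    return median(chisq) / expected_median
```

A value near 1 means the bulk of the p-value distribution matches the null expectation, while values well above 1, such as those reported for SKAT and SKAT-O on untransformed TG, indicate systematic inflation.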