Andriy Derkach, Jerry F Lawless, Daniele Merico, Andrew D Paterson, Lei Sun.
Abstract
The focus of our work is to evaluate several recently developed pooled association tests for rare variants and to assess the impact of different gene annotation methods and binning strategies on the analysis of rare variants under Genetic Analysis Workshop 18 real and simulated data settings. We considered the sample of 103 unrelated individuals with sequence data, genotypes of rare variants from chromosome 3, the real phenotype of hypertension status, simulated phenotypes of systolic blood pressure (SBP) and diastolic blood pressure (DBP), and covariates of age, sex, and the interaction between age and sex. In the analysis of real phenotype data, we did not obtain significant results for any binning strategy; however, we observed a slight deviation of the p-values from the uniform distribution under the protein-damaging variant grouping strategy. Evaluation of methods using simulated data showed a lack of power even at the liberal level of 0.05 for most of the causal genes on chromosome 3. Nevertheless, analysis of MAP4 produced good power for all tests at various test levels for both DBP and SBP. Our results also confirmed that Fisher's method is not only robust but can also improve power over individual pooled linear and quadratic tests and is often better than other robust tests such as SKAT-O.
Year: 2014 PMID: 25519417 PMCID: PMC4143759 DOI: 10.1186/1753-6561-8-S1-S9
Source DB: PubMed Journal: BMC Proc ISSN: 1753-6561
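The abstract refers to Fisher's method for combining the pooled linear and quadratic tests. The following is a minimal, generic sketch of Fisher's combination of p-values, not the authors' implementation. It assumes the p-values being combined are independent under the null hypothesis, which the abstract does not spell out; the function names and the example p-values are hypothetical.

```python
import math


def chi2_sf_even_df(x: float, df: int) -> float:
    """Survival function P(X > x) of a chi-square variable with even df.

    For even df = 2k the tail has the closed form
    exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!.
    """
    if df <= 0 or df % 2:
        raise ValueError("df must be a positive even integer")
    half = x / 2.0
    term, total = 1.0, 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


def fisher_combine(p_values):
    """Fisher's method: -2 * sum(log p_i) is chi-square with 2k df under
    the null when the k p-values are independent (an assumption here)."""
    if not p_values or any(not 0.0 < p <= 1.0 for p in p_values):
        raise ValueError("p-values must lie in (0, 1]")
    stat = -2.0 * sum(math.log(p) for p in p_values)
    return chi2_sf_even_df(stat, 2 * len(p_values))


# Hypothetical p-values from a linear and a quadratic pooled test.
p_linear, p_quadratic = 0.04, 0.20
p_fisher = fisher_combine([p_linear, p_quadratic])
```

With two p-values the combined value reduces to `p1 * p2 * (1 - log(p1 * p2))`, which is a convenient check on the closed-form tail.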
Table 1. Descriptive statistics of different grouping or binning strategies based on annotations of sequence variants on chromosome 3
| Strategy of grouping variants in a gene | Total # of variants | # of genes with ≥1 variant (MAF ≤ 0.05) | # of genes with ≥2 variants (MAF ≤ 0.05) | Average # of variants per gene with ≥1 variant (MAF ≤ 0.05) | Average MAF (MAF ≤ 0.05) |
|---|---|---|---|---|---|
| a. Coding | 7435 | 900 | 690 | 4.34 | 0.012 |
| b. Protein change | 4099 | 729 | 479 | 2.95 | 0.011 |
| c. Protein damage | 1791 | 462 | 210 | 1.94 | 0.011 |
| | 15326 (4099 + 11227) | 1034 | 841 | 8.17 | 0.011 |
| | 5987 (1791 + 4196) | 735 | 438 | 4.31 | 0.012 |
Descriptive statistics for rare variants with minor allele frequencies of 0.05 or less were constructed from the sample of 103 unrelated individuals. The number of genes and average number of variants per gene were slightly reduced when we analyzed the real data because the number of individuals was reduced to 96 after consideration of missing phenotype and covariates.
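As an illustration of how per-strategy summaries of this kind can be computed, here is a small sketch. It assumes diploid autosomal genotypes, so that MAF is the minor allele count divided by twice the number of individuals, and it drops monomorphic sites; the input layout, function name, and toy counts are hypothetical and do not reproduce the study data.

```python
from collections import Counter


def bin_summary(variants, n_individuals, maf_max=0.05):
    """Summarize gene bins after restricting to rare variants.

    variants: iterable of (gene, minor_allele_count) pairs (hypothetical layout).
    MAF is taken as count / (2 * n_individuals), assuming diploid genotypes.
    """
    rare = []
    for gene, count in variants:
        maf = count / (2 * n_individuals)
        if 0 < maf <= maf_max:  # keep polymorphic variants at or below the cutoff
            rare.append((gene, maf))
    per_gene = Counter(gene for gene, _ in rare)
    n_genes = len(per_gene)
    return {
        "genes_ge1": n_genes,
        "genes_ge2": sum(1 for c in per_gene.values() if c >= 2),
        "avg_variants_per_gene": len(rare) / n_genes if n_genes else 0.0,
        "avg_maf": sum(m for _, m in rare) / len(rare) if rare else 0.0,
    }


# Toy input: gene labels and minor allele counts are made up.
toy = [("G1", 1), ("G1", 3), ("G2", 2), ("G3", 40), ("G3", 0)]
summary = bin_summary(toy, n_individuals=103)
```

In the toy input, the variant with count 40 exceeds the MAF cutoff and the variant with count 0 is monomorphic, so only G1 and G2 contribute to the summary.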
Figure 1. Quantile-quantile plots of -log10(p-values). (A) Coding (reduced). (B) Protein change (reduced). (C) Protein damage (reduced). See Table 1 for detailed variant annotation and binning strategies A to C. The p-values were obtained using the parametric bootstrap method as described in the text for hypertension status.
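The coordinates of a quantile-quantile plot on the -log10 scale can be sketched as follows. The plotting position i/(n + 1) for the expected uniform quantiles is an assumption made here; the source does not say which convention was used, and the function name is hypothetical.

```python
import math


def qq_points(p_values):
    """Return (expected, observed) pairs of -log10 p-values.

    Expected values use the uniform plotting positions i / (n + 1),
    which is one common convention (an assumption, not from the source).
    """
    n = len(p_values)
    if n == 0 or any(not 0.0 < p <= 1.0 for p in p_values):
        raise ValueError("need at least one p-value in (0, 1]")
    observed = sorted(-math.log10(p) for p in p_values)
    expected = sorted(-math.log10(i / (n + 1)) for i in range(1, n + 1))
    # Both lists are ascending, so the k-th smallest observed value is
    # paired with the k-th smallest expected value.
    return list(zip(expected, observed))


# p-values that sit exactly on the uniform quantiles fall on the diagonal.
points = qq_points([0.5, 0.25, 0.75])
```

Points above the diagonal in the upper tail indicate p-values smaller than expected under the uniform distribution.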
Table 2. Empirical power for the 7 association tests using simulated phenotype data
| Gene | Total # of rare variants | # of causal rare variants | Methods | | | | | | |
|---|---|---|---|---|---|---|---|---|---|
| | 6 | 2 | 0.29 | 0.15 | 0.15 | 0.24 | 0.27 | 0.25 | |
| | 8 | 4 | 0.96 | 0.93 | 0.96 | 0.91 | 0.97 | 0.96 | |
| | 6 | 0¹ | 0.05 | 0.05 | 0.09 | 0.12 | 0.12 | 0.08 | |
| | 8 | 2 | 0.25 | 0.18 | 0.21 | 0.13 | 0.28 | 0.19 | |
| | 1 | 0¹ | 0.19 | 0.19 | 0.19 | 0.19 | 0.19 | 0.19 | 0.19 |
| | 5 | 2 | 0.35 | 0.16 | 0.11 | 0.29 | 0.31 | 0.26 | |
| | 8 | 4 | 0.90 | 0.88 | 0.86 | 0.73 | 0.90 | 0.91 | |
| | 6 | 0¹ | 0.10 | 0.06 | 0.05 | 0.11 | 0.11 | 0.08 | |
| | 8 | 2 | 0.06 | 0.04 | 0.11 | 0.14 | 0.15 | 0.09 | |
| | 1 | 0¹ | 0.27 | 0.27 | 0.27 | 0.27 | 0.27 | 0.27 | 0.27 |
| | 6 | 0¹ | 0.17 | 0.14 | 0.17 | 0.13 | 0.19 | 0.18 | |
| | 13 | 4 | 0.07 | 0.07 | 0.10 | 0.08 | 0.08 | | |
| | 4 | 1 | 0.06 | 0.05 | 0.07 | 0.08 | 0.07 | 0.06 | |
Two continuous phenotypes, systolic blood pressure (SBP) and diastolic blood pressure (DBP), and one binary phenotype, hypertension status, were analyzed. Rare variants (minor allele frequency ≤ 0.05) were grouped by gene and annotated as coding (strategy a in Table 1). By the Genetic Analysis Workshop 18 simulation design, all causal variants have the same direction of effect. The test level was set to 0.05 because of a lack of power at more stringent levels. The genes presented are those for which the maximum power among the 7 tests (bolded) exceeds 10% at the 0.05 level.
¹ "Power" for these genes with no causal variants is attributable to linkage disequilibrium between noncausal rare variants in these genes and causal variants in other genes (see text for details).
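Empirical power at a given level is the proportion of simulated replicates in which a test rejects. A minimal sketch follows, with a hypothetical function name and made-up replicate p-values rather than the study's results.

```python
def empirical_power(p_values, alpha=0.05):
    """Proportion of replicate p-values at or below the test level alpha."""
    if not p_values:
        raise ValueError("at least one replicate p-value is required")
    return sum(p <= alpha for p in p_values) / len(p_values)


# Hypothetical p-values for one gene and one test across four replicates.
replicate_p = [0.01, 0.20, 0.04, 0.50]
power_at_005 = empirical_power(replicate_p)          # 2 of 4 replicates reject
power_at_001 = empirical_power(replicate_p, 0.001)   # none reject
```

Applied to a gene with no causal variants, the same proportion measures rejections driven by other sources, such as the linkage disequilibrium described in the table footnote.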
Figure 2. Receiver operating characteristic curve examining the relationship between power and type I error for the 7 tests. (A) . (B) The phenotype is systolic blood pressure (SBP). For the MAP4 gene, 4 of the 8 rare variants have a causal effect on SBP; in total, they explain 5.8% of variation in SBP.
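A curve of power against type I error can be traced by sweeping the rejection threshold and estimating both rates empirically. The sketch below assumes that p-values from null replicates are available alongside p-values from replicates with causal effects; how the null p-values were obtained is not stated here, and the function name and all numbers are hypothetical.

```python
def roc_points(null_p, alt_p, thresholds):
    """Return (type I error, power) pairs, one per rejection threshold.

    null_p: p-values from replicates with no causal effect (assumed available).
    alt_p:  p-values from replicates with a causal effect.
    """
    if not null_p or not alt_p:
        raise ValueError("both p-value lists must be non-empty")
    points = []
    for t in sorted(thresholds):
        type1 = sum(p <= t for p in null_p) / len(null_p)
        power = sum(p <= t for p in alt_p) / len(alt_p)
        points.append((type1, power))
    return points


# Made-up p-values for a single test.
null_p = [0.10, 0.40, 0.60, 0.90]
alt_p = [0.01, 0.03, 0.20, 0.70]
curve = roc_points(null_p, alt_p, thresholds=[0.05, 0.5])
```

Computing one such curve per test and overlaying them allows the tests to be compared at matched empirical type I error rather than at nominal levels.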