Qunyuan Zhang, Doyoung Chung, Aldi Kraja, Ingrid I Borecki, Michael A Province.
Abstract
Because of the low frequency of rare genetic variants in observed data, the statistical power to detect their associations with target traits is usually low. A collapsing test of the collective effect of multiple rare variants is an important and useful strategy to increase power; in addition, family data may be enriched with causal rare variants and therefore provide extra power. However, when family data are used, both population structure and familial relatedness must be adjusted for to avoid possible inflation of false positives. Using a unified mixed linear model and family data, we compared six methods to detect the association between multiple rare variants and quantitative traits. Through the analysis of 200 replications of the quantitative trait Q2 from the Genetic Analysis Workshop 17 (GAW17) data set simulated for 697 subjects from 8 extended families, and based on quantile-quantile (Q-Q) plots under the null and receiver operating characteristic (ROC) curves, we compared the false-positive rate and power of these methods. We observed that adjusting for pedigree-based kinship gives the best control of the false-positive rate, whereas adjusting for marker-based identity by state (IBS) performs slightly better in terms of power. An adjustment based on a principal components analysis slightly improves the false-positive rate and power. Taking into account type 1 error, power, and computational efficiency, we find that adjusting for pedigree-based kinship seems to be a good choice for the collective test of association between multiple rare variants and quantitative traits using family data.
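The collapsing step mentioned in the abstract can be illustrated with a minimal sketch. It is an assumption-laden toy, not the authors' implementation: the function name `collapse_rare_variants`, the 0/1/2 minor-allele coding, the frequency threshold, and the use of a simple allele count as the burden score are all illustrative choices, and the sample allele frequency used here ignores relatedness among family members.

```python
def collapse_rare_variants(genotypes, maf_threshold=0.01):
    """Collapse the rare variants of one gene into a per-subject burden score.

    genotypes: list of rows, one per subject; each row holds minor-allele
               counts (0, 1, or 2), one entry per variant in the gene.
    maf_threshold: hypothetical cutoff below which a variant counts as rare.
    Returns a list with one burden score (rare minor-allele count) per subject.
    """
    n_subjects = len(genotypes)
    n_variants = len(genotypes[0])
    # Naive sample allele frequency per variant (assumption: ignores kinship).
    freq = [
        sum(row[j] for row in genotypes) / (2 * n_subjects)
        for j in range(n_variants)
    ]
    # Keep variants that are observed at least once but fall below the cutoff.
    rare = [j for j in range(n_variants) if 0 < freq[j] < maf_threshold]
    # Burden score: total rare minor alleles carried by each subject.
    return [sum(row[j] for j in rare) for row in genotypes]


# Toy data: 4 subjects, 3 variants; the third variant is common.
toy = [[0, 1, 1], [0, 0, 1], [1, 0, 1], [0, 0, 0]]
burden = collapse_rare_variants(toy, maf_threshold=0.2)
```

In an analysis of this kind, a score like `burden` would typically enter the mixed linear model as a single predictor for the gene, so that one test covers all of its rare variants.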
Year: 2011 PMID: 22373066 PMCID: PMC3287871 DOI: 10.1186/1753-6561-5-S9-S35
Source DB: PubMed Journal: BMC Proc ISSN: 1753-6561
Six methods for comparison
| Method | Model | Detail |
|---|---|---|
| REG | | Ignoring population structure and familial relatedness, with no adjustment |
| PC | | Top 10 eigenvectors from principal components analysis are used as the input of |
| KIN | | Kinship matrix based on pedigree data is used as |
| PC-KIN | | Combining PC and KIN methods |
| IBS | | IBS matrix based on genotype data is used as |
| PC-IBS | | Combining PC and IBS methods |
Figure 1. Q-Q plots for six different methods. Q-Q plots of −log10-scaled p-values for six different methods based on 1,940 genes from 697 subjects (8 extended families) and 200 replications of quantitative trait Q2 simulated by GAW17 under the null hypothesis. Red curves, observed; black curves, expected.
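As a rough illustration of how the points of such a Q-Q plot can be computed, the sketch below pairs sorted observed −log10 p-values with the values expected under a uniform null. The function name `qq_points` and the plotting position `(i + 0.5) / n` are assumptions made for this example; the source does not say which convention was used.

```python
import math


def qq_points(p_values):
    """Return (expected, observed) pairs of -log10 p-values for a Q-Q plot.

    Assumes every p-value is strictly greater than 0 (log10(0) is undefined).
    Under the null, p-values are uniform on (0, 1), so the i-th smallest of
    n p-values is expected near (i + 0.5) / n for i = 0 .. n-1 (an assumed
    plotting position; other conventions exist).
    """
    n = len(p_values)
    # Smallest p-value first, i.e. largest -log10 value first.
    observed = sorted((-math.log10(p) for p in p_values), reverse=True)
    expected = [-math.log10((i + 0.5) / n) for i in range(n)]
    return list(zip(expected, observed))


# P-values that sit exactly on the expected quantiles lie on the diagonal.
points = qq_points([0.75, 0.25])
```

Points above the diagonal in the upper tail indicate more small p-values than the null predicts, which is the kind of false-positive inflation the adjustments are meant to control.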
Figure 2. ROC curves for six different methods. ROC curves for six different methods based on 13 genes from 697 subjects (8 extended families) and 200 replications of quantitative trait Q2 simulated by GAW17. The false-positive rate (FPR) is limited to less than 0.1 because in practice only the true-positive rate (TPR) at a low FPR is of interest.
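The truncated ROC curve described in this caption can be sketched as follows. This is a minimal illustration under assumptions not stated in the source: each gene gets one score (for example a −log10 p-value), labels mark genes as causal (1) or non-causal (0), and the helper name `roc_points` is hypothetical.

```python
def roc_points(scores, labels, max_fpr=0.1):
    """Return ROC points (FPR, TPR) with FPR strictly below max_fpr.

    scores: larger means stronger evidence of association.
    labels: 1 for a truly associated (causal) gene, 0 otherwise.
    Assumes at least one positive and one negative label.
    """
    pairs = sorted(zip(scores, labels), key=lambda t: -t[0])
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    tp = fp = 0
    points = [(0.0, 0.0)]
    i = 0
    while i < len(pairs):
        # Lower the threshold one distinct score at a time; ties move together.
        j = i
        while j < len(pairs) and pairs[j][0] == pairs[i][0]:
            if pairs[j][1]:
                tp += 1
            else:
                fp += 1
            j += 1
        fpr = fp / n_neg
        if fpr >= max_fpr:
            break  # Stop once the curve leaves the low-FPR region.
        points.append((fpr, tp / n_pos))
        i = j
    return points


# Toy example: 2 causal genes and 20 non-causal genes.
toy_scores = [10, 5] + [7, 4] + [3 - 0.1 * k for k in range(18)]
toy_labels = [1, 1] + [0] * 20
curve = roc_points(toy_scores, toy_labels)
```

Restricting the curve this way compares methods only where the false-positive rate is low enough to be of practical interest, as the caption notes.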