George Tucker, Alkes L Price, Bonnie Berger.
Abstract
Using a reduced subset of SNPs in a linear mixed model can improve power for genome-wide association studies, yet this can result in insufficient correction for population stratification. We propose a hybrid approach using principal components that does not inflate statistics in the presence of population stratification and improves power over standard linear mixed models.
Keywords: GWAS; mixed models; population stratification
Year: 2014 PMID: 24788602 PMCID: PMC4096359 DOI: 10.1534/genetics.114.164285
Source DB: PubMed Journal: Genetics ISSN: 0016-6731 Impact factor: 4.562
Extent of null statistic inflation as measured by λ [median Wald statistic on test null SNPs divided by the theoretical median under the null distribution (Devlin and Roeder 1999)]
| Method | Pop. strat., P = 0.05 | Pop. strat., P = 0.005 | No pop. strat., P = 0.05 | No pop. strat., P = 0.005 |
|---|---|---|---|---|
| Linear regression | 3.8 (0.4) | 4.5 (0.5) | 1.01 (0.01) | 1.01 (0.01) |
| Linear regression with PCs | 1.02 (0.01) | 1.03 (0.01) | 1.01 (0.01) | 1.02 (0.01) |
| LMM | 1.01 (0.01) | 1.02 (0.01) | 1.01 (0.01) | 1.01 (0.01) |
| FaST-LMM Select | 1.04 (0.01) | 1.26 (0.03) | 1.01 (0.01) | 0.99 (0.01) |
| PC-Select | 1.01 (0.01) | 1.01 (0.01) | 1.01 (0.01) | 0.99 (0.01) |
We tabulate λ for linear regression, linear regression with PCs, standard LMM, FaST-LMM Select, and PC-Select on simulated genotypes and phenotypes with and without population stratification as the fraction of causal SNPs (P = 0.05, 0.005) varies. Values shown are mean λ over 100 simulations with standard errors (SE) in parentheses. FaST-LMM Select inflates statistics in the presence of population stratification when few SNPs are causal (P = 0.005), which may result in false positives. Pop. strat., population stratification.
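The definition of λ used in these tables (median observed statistic on null SNPs divided by the theoretical null median) can be sketched in a few lines. This is a minimal illustration, not the authors' code: it assumes the Wald statistic follows a chi-square distribution with 1 degree of freedom under the null, and the function name `genomic_inflation` and the example values are hypothetical.

```python
from statistics import NormalDist, median

# Assumption: under the null, each Wald statistic is chi-square with 1 degree
# of freedom. Its median equals the squared 75th percentile of a standard
# normal, which is roughly 0.455.
CHI2_1DF_MEDIAN = NormalDist().inv_cdf(0.75) ** 2


def genomic_inflation(null_wald_stats):
    """Return lambda: median observed null statistic / theoretical null median."""
    if not null_wald_stats:
        raise ValueError("need at least one test statistic")
    return median(null_wald_stats) / CHI2_1DF_MEDIAN


# Hypothetical statistics for illustration only (not data from the paper).
example_stats = [0.1, 0.3, 0.5, 0.9, 2.4]
lam = genomic_inflation(example_stats)
```

Values of λ close to 1 indicate well-calibrated null statistics; values well above 1, such as those for uncorrected linear regression under population stratification, indicate inflation.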
Extent of null statistic inflation measured by λ
| Method | Pop. strat., P = 0.05 | Pop. strat., P = 0.005 | No pop. strat., P = 0.05 | No pop. strat., P = 0.005 |
|---|---|---|---|---|
| Linear regression | 1.58 (0.02) | 1.55 (0.02) | 1.03 (0.01) | 1.04 (0.01) |
| Linear regression with PCs | 1.01 (0.01) | 1.00 (0.01) | 1.01 (0.01) | 1.02 (0.01) |
| LMM | 1.02 (0.01) | 1.01 (0.01) | 1.00 (0.01) | 1.02 (0.01) |
| FaST-LMM Select | 1.02 (0.01) | 1.06 (0.01) | 1.00 (0.01) | 1.02 (0.01) |
| PC-Select | 1.01 (0.01) | 1.01 (0.01) | 1.00 (0.01) | 1.01 (0.01) |
We tabulate λ for linear regression, linear regression with PCs, standard LMM, FaST-LMM Select, and PC-Select on real genotypes and simulated phenotypes with and without population stratification as the fraction of causal SNPs (P = 0.05, 0.005) varies. Values shown are mean λ over 200 simulations with standard errors (SE) in parentheses. FaST-LMM Select inflates statistics in the presence of population stratification when few SNPs are causal (P = 0.005), which may result in false positives.
Figure 1 (A and B) Comparison of power for linear regression, linear regression with PCs, standard LMM, FaST-LMM Select, and PC-Select on simulated genotypes and phenotypes (A) and real genotypes and simulated phenotypes (B) with and without population stratification as the fraction of causal SNPs (P = 0.05, 0.005) varies. To measure power, we plot the mean Wald statistic on test causal SNPs. In all cases, PC-Select has the highest power of the methods that do not inflate statistics.
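The power metric in the caption, the mean Wald statistic over causal SNPs, can be illustrated for the simplest method compared, unadjusted linear regression. The sketch below is an assumption-laden illustration rather than the authors' implementation: it omits PCs and mixed-model terms, and the helper names `wald_statistic` and `mean_causal_wald` and the toy data are hypothetical.

```python
from statistics import fmean


def wald_statistic(genotypes, phenotypes):
    """Wald statistic (beta / SE)^2 for a single-SNP simple linear regression."""
    n = len(genotypes)
    if n != len(phenotypes) or n < 3:
        raise ValueError("need matching vectors with at least 3 samples")
    g_mean, y_mean = fmean(genotypes), fmean(phenotypes)
    # Centered sums of squares and cross-products.
    sxx = sum((g - g_mean) ** 2 for g in genotypes)
    sxy = sum((g - g_mean) * (y - y_mean) for g, y in zip(genotypes, phenotypes))
    syy = sum((y - y_mean) ** 2 for y in phenotypes)
    if sxx == 0:
        raise ValueError("SNP is monomorphic in this sample")
    beta = sxy / sxx
    # Residual variance with n - 2 degrees of freedom (intercept and slope).
    resid_var = (syy - beta * sxy) / (n - 2)
    if resid_var <= 0:
        raise ValueError("perfect fit; the Wald statistic is undefined")
    return beta ** 2 / (resid_var / sxx)


def mean_causal_wald(causal_genotype_columns, phenotypes):
    """Average the Wald statistic over SNPs known (by simulation) to be causal."""
    return fmean(wald_statistic(col, phenotypes) for col in causal_genotype_columns)


# Toy data for illustration only (not from the paper).
toy_snp = [0, 1, 2, 0, 1, 2]
toy_pheno = [0.1, 0.9, 2.2, -0.2, 1.1, 1.8]
toy_power = mean_causal_wald([toy_snp, toy_snp], toy_pheno)
```

Because causal SNPs are known in a simulation, a larger mean statistic on them indicates higher power, provided the same method does not also inflate the statistics on null SNPs.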