Hua He, Xue Zhang, Lili Ding, Tesfaye M Baye, Brad G Kurowski, Lisa J Martin.
Abstract
Principal components analysis (PCA) has been successfully used to correct for population stratification in genome-wide association studies of common variants. However, rare variants also have a role in common disease etiology. Whether PCA successfully controls population stratification for rare variants has not been addressed. Thus we evaluate the effect of population stratification analysis on false-positive rates for common and rare variants at the single-nucleotide polymorphism (SNP) and gene level. We use the simulation data from Genetic Analysis Workshop 17 and compare false-positive rates with and without PCA at the SNP and gene level. We found that SNPs' minor allele frequency (MAF) influenced the ability of PCA to effectively control false discovery. Specifically, PCA reduced false-positive rates more effectively in common SNPs (MAF > 0.05) than in rare SNPs (MAF < 0.01). Furthermore, at the gene level, although false-positive rates were reduced, power to detect true associations was also reduced using PCA. Taken together, these results suggest that sequence-level data should be interpreted with caution, because extremely rare SNPs may exhibit sporadic association that is not controlled using PCA.
Year: 2011 PMID: 22373282 PMCID: PMC3287840 DOI: 10.1186/1753-6561-5-S9-S116
Source DB: PubMed Journal: BMC Proc ISSN: 1753-6561
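The comparison described in the abstract (association tests with and without principal components as covariates) can be illustrated with a toy simulation. This is only a minimal sketch under stated assumptions, not the authors' pipeline: it does not use the Genetic Analysis Workshop 17 data, it simulates common variants only, every function name is hypothetical, and significance is judged with a normal approximation (|t| > 1.96 at a nominal 5% level).

```python
import numpy as np

def simulate(n_per_pop=300, n_snps=400, seed=0):
    """Toy data (assumption): two subpopulations, no causal SNPs."""
    rng = np.random.default_rng(seed)
    freq_a = rng.uniform(0.1, 0.9, n_snps)
    # Second population drifts away from the first at every SNP.
    freq_b = np.clip(freq_a + rng.normal(0, 0.15, n_snps), 0.05, 0.95)
    geno = np.vstack([
        rng.binomial(2, freq_a, (n_per_pop, n_snps)),
        rng.binomial(2, freq_b, (n_per_pop, n_snps)),
    ]).astype(float)
    pop = np.repeat([0.0, 1.0], n_per_pop)
    # Phenotype depends on population only, so any SNP hit is a false positive.
    pheno = pop + rng.normal(0, 1, 2 * n_per_pop)
    return geno, pheno

def top_pcs(geno, k=2):
    """Scores of the first k principal components of the standardized genotypes."""
    centered = geno - geno.mean(axis=0)
    sd = centered.std(axis=0)
    sd[sd == 0] = 1.0
    u, s, _ = np.linalg.svd(centered / sd, full_matrices=False)
    return u[:, :k] * s[:k]

def snp_t_stats(geno, pheno, covars=None):
    """Per-SNP t statistics from linear regression, optionally adjusted for covariates."""
    n = len(pheno)
    base = np.ones((n, 1)) if covars is None else np.column_stack([np.ones(n), covars])
    # Residualize phenotype and genotypes on the covariates (Frisch-Waugh-Lovell).
    resid = lambda m: m - base @ np.linalg.lstsq(base, m, rcond=None)[0]
    y, g = resid(pheno), resid(geno)
    dof = n - base.shape[1] - 1
    gg = (g * g).sum(axis=0)
    beta = (g * y[:, None]).sum(axis=0) / gg
    rss = y @ y - beta**2 * gg
    return beta / np.sqrt(rss / dof / gg)

geno, pheno = simulate()
t_raw = snp_t_stats(geno, pheno)
t_adj = snp_t_stats(geno, pheno, covars=top_pcs(geno, k=2))
fpr_raw = np.mean(np.abs(t_raw) > 1.96)  # false-positive rate without PCA
fpr_adj = np.mean(np.abs(t_adj) > 1.96)  # false-positive rate with PCA adjustment
print(f"false-positive rate before PCA: {fpr_raw:.3f}, after PCA: {fpr_adj:.3f}")
```

Because the simulated phenotype has no causal SNP, the fraction of SNPs passing the threshold estimates the false-positive rate directly. The sketch does not reproduce the rare-variant behaviour reported in the abstract; that would require simulating SNPs with MAF < 0.01.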
Figure 1. Scatterplot of the first two principal components.
Figure 2. Manhattan plot of 1,379 common nonsynonymous SNPs (MAF > 0.05). Top panel: before PCA adjustment. Bottom panel: after PCA adjustment. Dashed line corresponds to the linkage-disequilibrium-adjusted Bonferroni significance level of 9.8 × 10⁻⁵.
Figure 3. Manhattan plot of 10,648 rare nonsynonymous SNPs (MAF < 0.01). Top panel: before PCA adjustment. Bottom panel: after PCA adjustment. Dashed line corresponds to the linkage-disequilibrium-adjusted Bonferroni significance level of 1.7 × 10⁻⁵.
Figure 4. Boxplot of the absolute difference in −log₁₀(p-value) before and after PCA by MAF.
Figure 5. Manhattan plot of genes for the three collapsing methods. Left panels: before PCA adjustment. Right panels: after PCA adjustment. Dashed line corresponds to the linkage-disequilibrium-adjusted Bonferroni significance level of 7.86 × 10⁻⁵.
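The significance levels in the captions of Figures 2, 3 and 5 are Bonferroni thresholds, which have the form alpha divided by the number of tests. As a rough illustration, the sketch below back-calculates the number of effective (linkage-disequilibrium-adjusted) tests each threshold would imply. The family-wise alpha of 0.05 is an assumption, since the record does not state it, and the function names are hypothetical.

```python
ALPHA = 0.05  # assumed family-wise error rate; not stated in the record

def bonferroni_threshold(alpha, n_effective_tests):
    """Per-test significance level for a given number of effective tests."""
    return alpha / n_effective_tests

def implied_effective_tests(alpha, threshold):
    """Number of effective tests implied by a reported Bonferroni threshold."""
    return alpha / threshold

# Thresholds as reported in the figure captions.
reported = {
    "common SNPs (Figure 2)": 9.8e-5,
    "rare SNPs (Figure 3)": 1.7e-5,
    "genes (Figure 5)": 7.86e-5,
}

implied = {label: implied_effective_tests(ALPHA, thr) for label, thr in reported.items()}
for label, m_eff in implied.items():
    print(f"{label}: about {m_eff:.0f} effective tests")
```

Under that assumed alpha, the implied numbers of effective tests are well below the raw SNP counts given in the captions (1,379 common and 10,648 rare), which is consistent with an adjustment for correlated markers.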
Number of replicates with true discovery for the causal genes before and after PCA adjustment
| Gene | Indicator method, before PCA | Indicator method, after PCA | Proportion method, before PCA | Proportion method, after PCA | Adaptive-sum test method, before PCA | Adaptive-sum test method, after PCA |
|---|---|---|---|---|---|---|
| | 0 | 0 | 0 | 0 | 0 | 0 |
| | 0 | 0 | 0 | 0 | 0 | 0 |
| | 33 | 12 | 33 | 12 | 67 | 15 |
| | 2 | 0 | 1 | 0 | 3 | 0 |
| | 0 | 0 | 0 | 0 | 0 | 0 |
| | 0 | 0 | 0 | 0 | 0 | 0 |
| | 94 | 17 | 160 | 50 | 163 | 53 |
| | 0 | 1 | 0 | 1 | 0 | 1 |
| | 15 | 8 | 15 | 8 | 15 | 8 |
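To summarize the loss of power visible in the table, the counts can be totalled per method before and after PCA adjustment. The sketch below only adds up the values listed above. The variable names are hypothetical, and the rows are kept in table order without gene labels.

```python
# Rows copied from the table: (indicator before, indicator after,
# proportion before, proportion after, adaptive-sum before, adaptive-sum after).
rows = [
    (0, 0, 0, 0, 0, 0),
    (0, 0, 0, 0, 0, 0),
    (33, 12, 33, 12, 67, 15),
    (2, 0, 1, 0, 3, 0),
    (0, 0, 0, 0, 0, 0),
    (0, 0, 0, 0, 0, 0),
    (94, 17, 160, 50, 163, 53),
    (0, 1, 0, 1, 0, 1),
    (15, 8, 15, 8, 15, 8),
]
methods = ["indicator", "proportion", "adaptive-sum"]

# Column sums, one per (method, before/after) pair.
totals = [sum(col) for col in zip(*rows)]
summary = {m: (totals[2 * i], totals[2 * i + 1]) for i, m in enumerate(methods)}

for method, (before, after) in summary.items():
    print(f"{method}: {before} true discoveries before PCA, {after} after")
```

For every method the total number of replicate-level true discoveries drops after adjustment, which matches the abstract's statement that power at the gene level was reduced.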