Laura L Faye, Shelley B Bull.
Abstract
Genome-wide association studies (GWAS) test for disease-trait associations and estimate effect sizes at tag single-nucleotide polymorphisms (SNPs), which imperfectly capture variation at causal SNPs. Sequencing studies can examine potential causal SNPs directly; however, sequencing the whole genome or exome can be prohibitively expensive. Costs can be limited by using a GWAS to detect the associated region(s) at tag SNPs followed by targeted sequencing to identify and estimate the effect size of the causal variant. Genetic effect estimates obtained from association studies can be inflated because of a form of selection bias known as the winner's curse. Conversely, estimates at tag SNPs can be attenuated compared to the causal SNP because of incomplete linkage disequilibrium. These two effects oppose each other. Analysis of rare SNPs further complicates our understanding of the winner's curse because rare SNPs are difficult to tag and analysis can involve collapsing over multiple rare variants. In two-stage analysis of Genetic Analysis Workshop 17 simulated data sets, we find that selection at the tag SNP produces upward bias in the estimate of effect at the causal SNP, even when the tag and causal SNPs are not well correlated. The bias similarly carries through to effect estimates for rare variant summary measures. Replication studies designed with sample sizes computed using biased estimates will be under-powered to detect a disease-causing variant. Accounting for bias in the original study is critical to avoid discarding disease-associated SNPs at follow-up.
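The two-stage effect described in the abstract can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the authors' GAW17 analysis: the sample size, minor allele frequency, linkage parameter `ld`, true effect `beta`, and the helper names `slope_and_z` and `simulate` are all hypothetical, the genotype model is a simple two-haplotype draw, and the test uses a normal approximation. Each replicate tests the tag SNP at p < 0.05 and then estimates the causal SNP effect in the same sample.

```python
import math
import random
from statistics import NormalDist, mean


def slope_and_z(x, y):
    """Least-squares slope of y on x and its z statistic (slope / SE)."""
    n = len(x)
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    if sxx == 0:
        return 0.0, 0.0  # monomorphic SNP: no estimable effect
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    beta = sxy / sxx
    rss = sum((yi - my - beta * (xi - mx)) ** 2 for xi, yi in zip(x, y))
    se = math.sqrt(rss / (n - 2) / sxx)
    return beta, beta / se


def simulate(n_reps=400, n=300, maf=0.2, ld=0.5, beta=0.15, alpha=0.05, seed=1):
    """Return (mean causal estimate over all replicates,
    mean causal estimate over replicates with a significant tag SNP,
    fraction of replicates selected). All defaults are illustrative."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    all_est, selected_est = [], []
    for _ in range(n_reps):
        causal, tag, trait = [], [], []
        for _ in range(n):
            g_causal = g_tag = 0
            for _ in range(2):  # two haplotypes per individual
                c = int(rng.random() < maf)
                # With probability ld the tag allele copies the causal allele;
                # otherwise it is drawn independently at the same frequency.
                t = c if rng.random() < ld else int(rng.random() < maf)
                g_causal += c
                g_tag += t
            causal.append(g_causal)
            tag.append(g_tag)
            trait.append(beta * g_causal + rng.gauss(0, 1))
        _, z_tag = slope_and_z(tag, trait)        # stage 1: test the tag SNP
        b_causal, _ = slope_and_z(causal, trait)  # stage 2: causal SNP effect
        all_est.append(b_causal)
        if abs(z_tag) > z_crit:
            selected_est.append(b_causal)
    return mean(all_est), mean(selected_est), len(selected_est) / n_reps


if __name__ == "__main__":
    mean_all, mean_selected, rate = simulate()
    print(f"mean causal estimate, all replicates:      {mean_all:.3f}")
    print(f"mean causal estimate, selected replicates: {mean_selected:.3f}")
    print(f"fraction of replicates selected:           {rate:.3f}")
```

Under these assumed settings the tag and causal SNPs are only moderately correlated, yet conditioning on a significant tag SNP should still raise the mean causal estimate above the unselected mean, which is the pattern the abstract reports.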
Year: 2011 PMID: 22373407 PMCID: PMC3287903 DOI: 10.1186/1753-6561-5-S9-S64
Source DB: PubMed Journal: BMC Proc ISSN: 1753-6561
Figure 1. Distribution of estimates for a common tag SNP and a common causal SNP. Boxplot of estimates of genetic effect at a tag SNP (GWAS stage 1) and causal SNP C6S5380 (sequencing stage 2) on quantitative trait Q2 over 200 replicates, with and without selection at a stage 1 tag SNP for additive genetic effect using a selection threshold p < 0.05. In the figure labels, p denotes minor allele frequency (MAF), reported separately for the causal SNP and for the tag SNP, and ρ is the correlation between the tag SNP and the causal SNP. Horizontal lines mark the null effect size (zero) and the mean of the causal SNP genetic effect estimates without selection. Because of sampling variation, the mean differs from the median (the band in the middle of each boxplot).
Bias in genetic effect estimates for a common tag SNP and multiple rare causal SNPs
| Trait | Gene | Population | Tag SNP | Significance level for additive test (tag SNP) | Estimated power for additive test (tag SNP) | Tag SNP mean effect estimate, over all data sets | Tag SNP mean effect estimate, over data sets with significant tag effect | Tag SNP relative bias (%) | Correlation between tag SNP and rare SNP collapsing statistic | Collapsing statistic mean effect estimate, over all data sets | Collapsing statistic mean effect estimate, over data sets with significant tag effect | Collapsing statistic relative bias (%) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Q1 | | CEPH | C4S1878 | 0.001 | 0.15 | 0.63 | 0.93 | 49 | 0.41 | 5.51 | 6.19 | 12 |
| DS | | Tuscan | C1S9170 | 0.05 | 0.10 | 0.87 | 2.26 | 160 | 0.40 | 6.69 | 7.08 | 6 |
| DS | | Tuscan | C1S9171 | 0.05 | 0.10 | 0.87 | 2.26 | 160 | 0.40 | 6.69 | 7.08 | 6 |
| DS | | CEPH | C8S911 | 0.05 | 0.10 | 0.67 | 1.49 | 122 | 0.51 | 1.99 | 2.91 | 46 |
| DS | | Chinese | C8S925 | 0.05 | 0.06 | 0.21 | 1.23 | 476 | 0.47 | 0.34 | 1.53 | 350 |
Results for scenario 2. Values computed as described in the Results section. DS is disease status.
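The relative bias columns in the table above appear consistent with the percentage increase of the selected mean over the overall mean. The sketch below checks that reading against three rows. The formula and the function name `relative_bias_pct` are assumptions made for illustration, because the Results section that defines the calculation is not part of this record. Recomputing from the rounded table entries can differ from the published percentages by a few points.

```python
def relative_bias_pct(mean_all, mean_selected):
    """Assumed definition: percent change of the mean estimate under
    selection relative to the mean estimate over all data sets."""
    return 100.0 * (mean_selected - mean_all) / mean_all


# (mean over all data sets, mean over data sets with significant tag effect,
#  relative bias reported in the table), taken from the scenario 2 rows.
rows = {
    "DS, Tuscan, C1S9170, tag SNP": (0.87, 2.26, 160),
    "DS, CEPH, C8S911, tag SNP": (0.67, 1.49, 122),
    "DS, CEPH, C8S911, collapsing statistic": (1.99, 2.91, 46),
}

for label, (mean_all, mean_selected, reported) in rows.items():
    recomputed = relative_bias_pct(mean_all, mean_selected)
    print(f"{label}: recomputed {recomputed:.1f}%, reported {reported}%")
```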
Bias in genetic effect estimates for a rare tag SNP and multiple rare causal SNPs
| Trait | Gene | Population | Tag SNP | Significance level for additive test (tag SNP) | Estimated power for additive test (tag SNP) | Tag SNP mean effect estimate, over all data sets | Tag SNP mean effect estimate, over data sets with significant tag effect | Tag SNP relative bias (%) | Correlation between tag SNP and rare SNP collapsing statistic | Collapsing statistic mean effect estimate, over all data sets | Collapsing statistic mean effect estimate, over data sets with significant tag effect | Collapsing statistic relative bias (%) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Q1 | | Chinese | C5S5141 | 0.01 | 0.075 | 0.52 | 1.00 | 94 | 0.46 | 3.26 | 4.05 | 24 |
| Q2 | | Luhya | C8S1797 | 0.05 | 0.105 | 0.29 | 0.98 | 241 | 0.46 | 0.37 | 1.71 | 356 |
| Q2 | | CEPH | C17S1023 | 0.05 | 0.135 | 0.59 | 1.63 | 177 | 0.81 | 2.00 | 3.97 | 98 |
| Q2 | | Tuscan | C9S404 | 0.01 | 0.055 | 0.84 | 2.28 | 173 | 0.51 | 0.99 | 2.53 | 155 |
| Q2 | | Tuscan | C9S471 | 0.01 | 0.055 | 0.84 | 2.28 | 173 | 0.51 | 0.99 | 2.53 | 155 |
| Q2 | | CEPH | C12S193 | 0.01 | 0.070 | 0.34 | 0.93 | 171 | 0.65 | 0.61 | 1.12 | 85 |
| Q2 | | CEPH | C12S200 | 0.01 | 0.080 | 0.41 | 1.15 | 182 | 0.61 | 0.57 | 1.04 | 84 |
| Q2 | | CEPH | C12S203 | 0.05 | 0.075 | 0.21 | 0.53 | 157 | 0.48 | 0.63 | 0.93 | 47 |
| Q2 | | CEPH | C12S211 | 0.01 | 0.070 | 0.34 | 0.87 | 152 | 0.78 | 0.65 | 1.39 | 113 |
| Q2 | | Japanese | C12S193 | 0.05 | 0.070 | 0.21 | 0.72 | 244 | 0.56 | -0.12 | 0.36 | 403 |
| Q2 | | European | C12S193 | 0.05 | 0.200 | 0.31 | 0.79 | 153 | 0.62 | 0.55 | 1.03 | 87 |
| Q2 | | European | C12S200 | 0.01 | 0.070 | 0.30 | 0.92 | 203 | 0.45 | 0.49 | 0.95 | 93 |
| Q2 | | European | C12S203 | 0.05 | 0.070 | 0.18 | 0.50 | 180 | 0.46 | 0.58 | 0.85 | 47 |
| Q2 | | European | C12S211 | 0.01 | 0.055 | 0.32 | 0.90 | 186 | 0.74 | 0.59 | 1.22 | 107 |
| Q2 | | All | C12S203 | 0.05 | 0.140 | 0.16 | 0.41 | 160 | 0.48 | 0.69 | 1.06 | 54 |
| DS | | CEPH | C8S883 | 0.05 | 0.075 | 0.95 | 2.40 | 153 | 0.53 | 2.35 | 13.83 | 489 |
| DS | | Chinese | C8S885 | 0.05 | 0.050 | 0.16 | 1.24 | 692 | 0.87 | 0.38 | 2.68 | 610 |
| DS | | European | C8S911 | 0.05 | 0.130 | 0.63 | 1.27 | 102 | 0.50 | 2.04 | 2.86 | 40 |
Results for scenario 3. Values computed as described in the Results section. DS is disease status.