Jian Li, Yan-fang Guo, Yufang Pei, Hong-Wen Deng.
Abstract
Genotype imputation is often used in the meta-analysis of genome-wide association studies (GWAS) to combine data from different studies and/or genotyping platforms, in order to improve the ability to detect disease variants with small to moderate effects. However, how genotype imputation affects the performance of GWAS meta-analysis is largely unknown. In this study, we investigated the effects of genotype imputation on the performance of meta-analysis through simulations based on empirical data from the Framingham Heart Study. We found that when fixed-effects models were used, considerable between-study heterogeneity was detected when causal variants were typed in some but not all individual studies, resulting in up to ∼25% reduction in detection power. In certain situations, the power of the meta-analysis can be even lower than that of individual studies. Additional analyses showed that detection power improved slightly when between-study heterogeneity was partially controlled through the random-effects model, relative to the fixed-effects model. Our study may aid in the planning, data analysis, and interpretation of GWAS meta-analysis results when genotype imputation is necessary.
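The abstract contrasts fixed-effects and random-effects meta-analysis, but this record does not say which estimators were used. The following is a minimal sketch, assuming inverse-variance weighting for the fixed-effects model, Cochran's Q and I² as heterogeneity measures, and the DerSimonian-Laird estimator for the random-effects model. These choices, along with the function name `meta_analyze` and the result keys, are illustrative assumptions rather than the authors' implementation.

```python
import math


def meta_analyze(betas, ses):
    """Combine per-study effect estimates (betas) and standard errors (ses).

    Assumed methods (not confirmed by the source): inverse-variance
    fixed effects, Cochran's Q / I^2, DerSimonian-Laird random effects.
    """
    if len(betas) != len(ses) or len(betas) < 2:
        raise ValueError("need matching beta/SE lists from at least two studies")

    # Fixed-effects model: weight each study by the inverse of its variance.
    w = [1.0 / se ** 2 for se in ses]
    sw = sum(w)
    beta_fe = sum(wi * b for wi, b in zip(w, betas)) / sw
    se_fe = math.sqrt(1.0 / sw)

    # Cochran's Q and I^2 quantify between-study heterogeneity.
    q = sum(wi * (b - beta_fe) ** 2 for wi, b in zip(w, betas))
    df = len(betas) - 1
    i2 = max(0.0, (q - df) / q) if q > 0 else 0.0

    # DerSimonian-Laird estimate of the between-study variance tau^2.
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    # Random-effects model: add tau^2 to every study's variance.
    w_re = [1.0 / (se ** 2 + tau2) for se in ses]
    sw_re = sum(w_re)
    beta_re = sum(wi * b for wi, b in zip(w_re, betas)) / sw_re
    se_re = math.sqrt(1.0 / sw_re)

    def two_sided_p(beta, se):
        # Two-sided p-value of a z-test under the standard normal.
        return math.erfc(abs(beta / se) / math.sqrt(2.0))

    return {
        "beta_fixed": beta_fe, "se_fixed": se_fe,
        "p_fixed": two_sided_p(beta_fe, se_fe),
        "beta_random": beta_re, "se_random": se_re,
        "p_random": two_sided_p(beta_re, se_re),
        "Q": q, "I2": i2, "tau2": tau2,
    }
```

When the per-study estimates agree, tau² is zero and the two models coincide. When they disagree, for example because a causal variant is typed in one study and imputed in another, the random-effects standard error grows to reflect the extra between-study variance.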
Year: 2012 PMID: 22496814 PMCID: PMC3320624 DOI: 10.1371/journal.pone.0034486
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Table 1. Selection strategy and quality control parameters for sub-population construction.
| Sub-sample | Selection strategy | Subjects (pre-QC) | Subjects (after QC) | No. of SNPs |
|---|---|---|---|---|
| Sample 1 | Singletons, all unrelated subjects from the 1st generation (two at most) in each pedigree, plus married-ins in the 2nd and the 3rd generations | 2,200 | 2,023 | 412,432 |
| Sample 2 | One subject from the 2nd generation in each pedigree | 1,071 | 1,055 | 416,800 |
| Sample 3 | One subject from the 3rd generation in each pedigree | 812 | 806 | 417,532 |
Parameter values for quality control (QC): minor allele frequency > 0.01, Hardy-Weinberg equilibrium test p-value > 0.0001, sample call rate > 0.90, and SNP call rate > 0.90.
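The QC thresholds above can be expressed as a per-SNP filter. This is a rough sketch only: the record does not state which Hardy-Weinberg test or software was used, so a 1-df chi-square goodness-of-fit test is assumed here. The genotype coding (0/1/2 copies of one allele, `None` for a missing call) and all function names are hypothetical.

```python
import math

# Thresholds taken from the QC note above.
MAF_MIN = 0.01
HWE_P_MIN = 1e-4
CALL_RATE_MIN = 0.90


def call_rate(genotypes):
    """Fraction of non-missing calls; works for one SNP or one sample."""
    return sum(g is not None for g in genotypes) / len(genotypes)


def minor_allele_frequency(genotypes):
    """Minor allele frequency among called genotypes coded 0/1/2."""
    called = [g for g in genotypes if g is not None]
    freq = sum(called) / (2 * len(called))
    return min(freq, 1.0 - freq)


def hwe_p_value(genotypes):
    """Chi-square (1 df) test of Hardy-Weinberg proportions (assumed test)."""
    called = [g for g in genotypes if g is not None]
    n = len(called)
    p = sum(called) / (2 * n)
    q = 1.0 - p
    if p == 0.0 or q == 0.0:
        return 1.0  # monomorphic SNP: no departure can be tested
    expected = [n * q * q, 2 * n * p * q, n * p * p]
    observed = [called.count(0), called.count(1), called.count(2)]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of the chi-square distribution with 1 df.
    return math.erfc(math.sqrt(chi2 / 2.0))


def snp_passes_qc(genotypes):
    """Apply the call-rate, MAF, and HWE thresholds to one SNP."""
    return (call_rate(genotypes) > CALL_RATE_MIN
            and minor_allele_frequency(genotypes) > MAF_MIN
            and hwe_p_value(genotypes) > HWE_P_MIN)


def sample_passes_qc(genotypes):
    """Apply the sample call-rate threshold to one subject's genotypes."""
    return call_rate(genotypes) > CALL_RATE_MIN
```

The call-rate check runs first, so a SNP with no called genotypes is rejected before the frequency and HWE calculations are attempted.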
Table 2. Simulation schemes and parameters.
| Scenario | Genotype status of causal SNPs |
|---|---|
| Scenario 1 | Directly-typed in all three sub-samples |
| Scenario 2 | Imputed in all three sub-samples |
| Scenario 3 | Imputed in Samples 1 & 2 but typed in Sample 3 |
| Scenario 4 | Imputed in Sample 1 but typed in Samples 2 & 3 |
Figure 1. Assessment of between-study heterogeneity under various scenarios.
The plots in the left column (A, C, E, G) show the mean values of , and those in the right column (B, D, F, H) show the average percentage of simulations with (large between-study heterogeneity). The plots in rows 1–4 are for scenarios 1–4, respectively. Descriptions of scenarios and RAFs are given in Table 2, and “var” values indicate the simulated QTL variance.
Table 3. Mean power and type-I error rate of (α = 10⁻⁷).
| Scenario | QTL variance (%) | RAF1 | RAF2 | RAF3 | RAF4 | RAF1 | RAF2 | RAF3 | RAF4 |
|---|---|---|---|---|---|---|---|---|---|
| Scenario 1 | 0 | 3.01 | 2.98 | 3.10 | 2.97 | 2.86 | 3.05 | 2.89 | 2.89 |
| | 0.5 | 18.77 | 20.93 | 19.38 | 19.41 | 16.35 | 20.93 | 19.34 | 19.40 |
| | 1.0 | 84.96 | 82.32 | 81.66 | 80.43 | 78.39 | 81.81 | 81.36 | 80.23 |
| | 1.5 | 98.79 | 97.76 | 95.69 | 95.90 | 93.24 | 94.66 | 93.89 | 92.29 |
| | 2.0 | 100.00 | 98.88 | 96.69 | 96.72 | 96.17 | 95.77 | 95.59 | 95.09 |
| Scenario 2 | 0 | 3.27 | 3.33 | 3.25 | 3.42 | 3.13 | 3.12 | 3.04 | 3.05 |
| | 0.5 | 17.20 | 14.01 | 13.71 | 8.91 | 17.20 | 14.01 | 13.60 | 7.72 |
| | 1.0 | 69.90 | 67.77 | 56.32 | 49.29 | 69.90 | 67.77 | 56.00 | 44.66 |
| | 1.5 | 88.20 | 84.35 | 81.19 | 68.76 | 88.09 | 84.35 | 80.77 | 62.83 |
| | 2.0 | 91.51 | 90.22 | 86.61 | 73.87 | 91.29 | 90.11 | 85.87 | 66.15 |
| Scenario 3 | 0 | 5.57 | 5.49 | 5.48 | 5.50 | 3.55 | 3.52 | 3.53 | 3.52 |
| | 0.5 | 10.13 | 7.42 | 7.65 | 4.04 | 10.92 | 8.14 | 7.65 | 5.70 |
| | 1.0 | 37.26 | 34.81 | 29.76 | 26.48 | 38.18 | 37.70 | 36.69 | 33.27 |
| | 1.5 | 46.98 | 44.80 | 41.34 | 34.92 | 46.08 | 47.14 | 46.09 | 42.62 |
| | 2.0 | 49.13 | 47.27 | 44.00 | 38.12 | 53.33 | 52.54 | 51.26 | 49.17 |
| Scenario 4 | 0 | 5.56 | 5.53 | 5.52 | 5.56 | 3.54 | 3.53 | 3.56 | 3.54 |
| | 0.5 | 8.60 | 7.55 | 7.21 | 4.40 | 8.72 | 7.83 | 7.61 | 4.99 |
| | 1.0 | 32.77 | 35.02 | 35.02 | 31.12 | 33.25 | 32.41 | 32.12 | 30.22 |
| | 1.5 | 45.27 | 45.73 | 45.73 | 39.61 | 46.76 | 46.34 | 45.01 | 40.63 |
| | 2.0 | 47.50 | 47.89 | 47.89 | 40.94 | 51.43 | 50.20 | 49.74 | 45.55 |
Descriptions for Scenarios 1–4 and RAF ranges are given in Table 2. Mean power and type-I error rates were estimated based on 1,000 simulations.
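Power and type-I error in a simulation study are commonly estimated as the fraction of replicates whose test p-value falls below the significance threshold, with power computed from replicates that contain a true effect and type-I error from replicates that do not. The record does not spell out the authors' exact computation, so the sketch below is an assumption, and the function name `rejection_rate` is hypothetical. It also returns the binomial Monte Carlo standard error, which indicates how precisely a rate can be estimated from a given number of replicates.

```python
import math


def rejection_rate(p_values, alpha=1e-7):
    """Fraction of simulation replicates with p < alpha, plus its Monte Carlo SE.

    Applied to replicates simulated with a true effect, this estimates power;
    applied to null replicates (QTL variance 0), it estimates type-I error.
    """
    n = len(p_values)
    if n == 0:
        raise ValueError("no replicates supplied")
    rate = sum(p < alpha for p in p_values) / n
    # Binomial standard error of an estimated proportion.
    mc_se = math.sqrt(rate * (1.0 - rate) / n)
    return rate, mc_se
```

For a rate near 0.5 estimated from 1,000 replicates, the Monte Carlo standard error is about 0.016, or roughly 1.6 percentage points.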
Figure 2. Comparison of effect size and standard error estimated by meta-analysis to simulated true values.
The simulated QTLs explain 2.0% of the total trait variance.
Figure 3. Power comparison between meta-analysis of different scenarios and association analysis in individual Sample 1.
Sample1_geno and Sample1_impu refer to situations where causal SNPs are typed and imputed, respectively, in Sample 1. QTL variation of 2.0% is used.