Hae-Won Uh, Joris Deelen, Marian Beekman, Quinta Helmer, Fernando Rivadeneira, Jouke-Jan Hottenga, Dorret I Boomsma, Albert Hofman, André G Uitterlinden, P E Slagboom, Stefan Böhringer, Jeanine J Houwing-Duistermaat.
Abstract
Genotype imputation has become an essential tool in the analysis of genome-wide association scans. This technique allows investigators to test association at ungenotyped genetic markers, and to combine results across studies that rely on different genotyping platforms. In addition, imputation is used within long-running studies to reuse genotypes produced across generations of platforms. Typically, genotypes of controls are reused and cases are genotyped on more novel platforms yielding a case-control study that is not matched for genotyping platforms. In this study, we scrutinize such a situation and validate GWAS results by actually retyping top-ranking SNPs with the Sequenom MassArray platform. We discuss the needed quality controls (QCs). In doing so, we report a considerable discrepancy between the results from imputed and retyped data when applying recommended QCs from the literature. These discrepancies appear to be caused by extrapolating differences between arrays by the process of imputation. To avoid false positive results, we recommend that more stringent QCs should be applied. We also advocate reporting the imputation quality measure (R_T^2) for the post-imputation QCs in publications.
Year: 2011 PMID: 22189269 PMCID: PMC3330212 DOI: 10.1038/ejhg.2011.231
Source DB: PubMed Journal: Eur J Hum Genet ISSN: 1018-4813 Impact factor: 4.246
Figure 1. Study samples and arrays used. Affy500 stands for the first-generation Affymetrix Gene Chip Human Mapping 500K Array, Illumina660 for Illumina Infinium HD Human660W-Quad BeadChips, and Illumina550 for Illumina Infinium II HumanHap 550K and HumanHap550-Duo BeadChips. Sib 2 and controls were all genotyped; for Sib 1, in addition to the 60K overlapping genotyped SNPs, the remaining 457K SNPs were imputed. After post-imputation QC, 451K SNPs were analyzed using the ASP–control design.
Table 1. Study designs and arrays used in Figure 3
| Panel | Design | Samples | Genotyped SNPs¹ | Overlapping SNPs | Imputed SNPs | Analyzed SNPs | Inflation factor |
| --- | --- | --- | --- | --- | --- | --- | --- |
| a | ASP–control | Sib 2 and control; Sib 1 | 517K; 350K | 60K | 457K | 451K | 1.16 |
| b | Case–control | Sib 2 and control | 517K | 517K | | 517K | 1.03 |
| c | ASP–control | Sib 2 and control; Sib 1 | 517K; 350K | 60K | | 60K | 1.06 |
| d | ASP–control | Sib 2 and control; Sib 1 | 517K; 350K | 60K | 97K | 157K² | 1.05 |
¹ No. of SNPs that passed QC at the pre-imputation stage.
² No. of SNPs with R² ≥ 0.98.
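The stricter post-imputation QC behind panel (d) amounts to a threshold on the imputation quality measure. A minimal sketch, assuming that genotyped SNPs are always retained and that imputed SNPs are kept only when R² ≥ 0.98 (which matches the 60K + 97K = 157K count in the table); the SNP identifiers, values, and the function name are hypothetical and not taken from the study:

```python
# Hypothetical post-imputation QC filter; SNP IDs and R2 values are made up.
R2_THRESHOLD = 0.98  # threshold quoted in the table footnote


def passes_post_imputation_qc(r2, genotyped, threshold=R2_THRESHOLD):
    """Keep genotyped SNPs as they are; keep imputed SNPs only if R2 >= threshold."""
    return genotyped or r2 >= threshold


snps = [
    {"id": "snp_a", "genotyped": True, "r2": 1.00},   # typed on all arrays
    {"id": "snp_b", "genotyped": False, "r2": 0.99},  # imputed, high quality
    {"id": "snp_c", "genotyped": False, "r2": 0.95},  # imputed, below threshold
]

kept = [s["id"] for s in snps if passes_post_imputation_qc(s["r2"], s["genotyped"])]
print(kept)
```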
Figure 2. Comparison of the pre- and post-analysis imputation information measures. The x axis shows the pre-analysis information measure (r²), and the y axis the post-analysis information measure (R²). The blue points indicate SNPs with no association (P-value > 0.05); for these, case–control status has little effect and the two measures agree. The red points are SNPs that show strong association (P-value < 0.001), and the green points are intermediate cases.
Figure 3. Quantile–quantile plots obtained from LLS GWAS analyses. The triangles indicate the SNPs at which the test statistic exceeds 30 (corresponding to a P-value < 5 × 10⁻⁸). The 95% concentration bands (shaded gray) are included. (a) ASP–control design: combined data of imputed Affy500 (Sib 1), typed Illumina660 (Sib 2), and typed Illumina550 (control). Deviation from the dashed line indicates inflation of test statistics. (b) Case–control design: genotyped with Illumina660 (Sib 2) and Illumina550 (control). (c) ASP–control design: 60K overlap using combined typed data of Affy500 (Sib 1), Illumina660 (Sib 2), and Illumina550 (control). (d) ASP–control design: as in (a), but only SNPs with R² > 0.98. Details are provided in Table 1.
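The link between a test statistic of 30 and a P-value below 5 × 10⁻⁸ can be checked numerically. The sketch below assumes the test statistic follows a chi-square distribution with 1 degree of freedom under the null, which the caption does not state explicitly. It also shows the median-based genomic inflation factor, a common way to summarize the kind of inflation visible in a quantile–quantile plot; whether the study computed its inflation values exactly this way is likewise an assumption.

```python
import math
import statistics

# Median of a chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364


def chi2_1df_pvalue(stat):
    """Upper-tail P-value of a chi-square statistic, assuming 1 degree of freedom."""
    return math.erfc(math.sqrt(stat / 2.0))


def inflation_factor(stats):
    """Median-based inflation factor: observed median over the expected null median."""
    return statistics.median(stats) / CHI2_1DF_MEDIAN


# A statistic of 30 falls below the 5e-8 genome-wide threshold under this assumption.
print(chi2_1df_pvalue(30.0) < 5e-8)
```

An inflation factor close to 1 indicates that the bulk of the test statistics behaves as expected under the null; values well above 1 indicate systematic inflation.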
Figure 4. Comparison of the MAF between GWAS and replication data. Top: the x axis shows the MAF of imputed Sib 1 data using Affy500, and the y axis the MAF of the same SNPs replicated with Sequenom. SNPs shown in green did not pass the threshold R² > 0.98. Bottom: the x axis shows the MAF of (genotyped) Sib 2 data using Illumina660, and the y axis the MAF of the same SNPs replicated with Sequenom. The red-filled circle in both panels indicates the same SNP.
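A comparison of this kind can be sketched by computing the MAF of one SNP from imputed allele dosages and from retyped genotypes in the same samples, then flagging a large difference. The dosage values and the tolerance of 0.02 below are hypothetical illustrations, not values from the study:

```python
def minor_allele_frequency(dosages):
    """MAF from per-sample allele dosages in [0, 2]; imputed dosages may be fractional."""
    freq = sum(dosages) / (2.0 * len(dosages))
    return min(freq, 1.0 - freq)


# Made-up data for a single SNP in the same five samples.
imputed_dosages = [0.6, 1.2, 1.9, 0.5, 1.3]  # hypothetical imputation output
retyped_genotypes = [0, 1, 2, 0, 1]          # hypothetical retyped allele counts

maf_imputed = minor_allele_frequency(imputed_dosages)
maf_retyped = minor_allele_frequency(retyped_genotypes)

TOLERANCE = 0.02  # hypothetical cut-off for calling the two MAFs discordant
discordant = abs(maf_imputed - maf_retyped) > TOLERANCE
print(round(maf_imputed, 3), round(maf_retyped, 3), discordant)
```

For SNPs with allele frequencies near 0.5 the minor allele can differ between data sets, so comparing the frequency of a fixed reference allele may be safer than comparing MAFs in that range.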