Benjamin F Voight, Hyun Min Kang, Jun Ding, Cameron D Palmer, Carlo Sidore, Peter S Chines, Noël P Burtt, Christian Fuchsberger, Yanming Li, Jeanette Erdmann, Timothy M Frayling, Iris M Heid, Anne U Jackson, Toby Johnson, Tuomas O Kilpeläinen, Cecilia M Lindgren, Andrew P Morris, Inga Prokopenko, Joshua C Randall, Richa Saxena, Nicole Soranzo, Elizabeth K Speliotes, Tanya M Teslovich, Eleanor Wheeler, Jared Maguire, Melissa Parkin, Simon Potter, N William Rayner, Neil Robertson, Kathleen Stirrups, Wendy Winckler, Serena Sanna, Antonella Mulas, Ramaiah Nagaraja, Francesco Cucca, Inês Barroso, Panos Deloukas, Ruth J F Loos, Sekar Kathiresan, Patricia B Munroe, Christopher Newton-Cheh, Arne Pfeufer, Nilesh J Samani, Heribert Schunkert, Joel N Hirschhorn, David Altshuler, Mark I McCarthy, Gonçalo R Abecasis, Michael Boehnke.
Abstract
Genome-wide association studies have identified hundreds of loci for type 2 diabetes, coronary artery disease and myocardial infarction, as well as for related traits such as body mass index, glucose and insulin levels, lipid levels, and blood pressure. These studies have also pointed to thousands of loci with promising but not yet compelling association evidence. To establish association at additional loci and to characterize the genome-wide significant loci by fine-mapping, we designed the "Metabochip," a custom genotyping array that assays nearly 200,000 SNP markers. Here, we describe the Metabochip and its component SNP sets, evaluate its performance in capturing variation across the allele-frequency spectrum, describe solutions to methodological challenges commonly encountered in its analysis, and evaluate its performance as a platform for genotype imputation. The Metabochip achieves dramatic cost efficiencies compared to designing single-trait follow-up reagents, and provides the opportunity to compare results across a range of related traits. The Metabochip and similar custom genotyping arrays offer a powerful and cost-effective approach to follow up large-scale genotyping and sequencing studies and advance our understanding of the genetic basis of complex human diseases and traits.
Year: 2012 PMID: 22876189 PMCID: PMC3410907 DOI: 10.1371/journal.pgen.1002793
Source DB: PubMed Journal: PLoS Genet ISSN: 1553-7390 Impact factor: 5.917
Table 1. Summary of Metabochip SNPs by trait: Fine-mapping and replication.
| Consortium | Trait Name | Fine-Mapping: # Loci | Fine-Mapping: Size (Mb) | Fine-Mapping: # SNPs | Replication SNPs |
| DIAGRAM | Type 2 Diabetes | 34 | 6.56 | 16,717 | 5,057 |
| CARDIoGRAM | MI and CAD | 30 | 9.60 | 19,558 | 6,485 |
| Lipids | HDL Cholesterol | 23 | 4.62 | 12,150 | 5,024 |
| | LDL Cholesterol | 21 | 4.06 | 9,981 | 5,060 |
| | Triglyceride | 20 | 4.68 | 9,784 | 5,057 |
| GIANT | Body Mass Index | 24 | 7.48 | 18,211 | 5,055 |
| | Waist-to-Hip Ratio | 15 | 2.25 | 5,464 | 5,056 |
| MAGIC | Fasting Glucose | 19 | 5.05 | 13,644 | 5,058 |
| ICBP | Diastolic Blood Pressure | 20 | 8.34 | 13,239 | 5,060 |
| | Systolic Blood Pressure | 21 | 6.01 | 10,641 | 5,059 |
| QT-IGC | QT Interval | 18 | 4.08 | 10,910 | 5,041 |
| | | | | | |
| DIAGRAM | T2D Age of Diagnosis | 0 | 0.00 | 0 | 1,039 |
| | T2D Early Onset | 0 | 0.00 | 0 | 1,040 |
| HaemGen | Mean Platelet Volume | 0 | 0.00 | 0 | 657 |
| | Platelet Count | 0 | 0.00 | 0 | 577 |
| | White Blood Cell | 0 | 0.00 | 0 | 598 |
| Lipids | Total Cholesterol | 0 | 0.00 | 0 | 941 |
| Body Fat | Body Fat Percentage | 0 | 0.00 | 0 | 1,035 |
| GIANT | Height | 0 | 0.00 | 0 | 1,050 |
| | Waist Circumference | 2 | 0.50 | 1,374 | 1,048 |
| MAGIC | 2-Hour Glucose | 3 | 0.61 | 1,249 | 1,038 |
| | Glycated Hemoglobin | 5 | 0.46 | 2,181 | 1,045 |
| | Fasting Insulin | 2 | 0.67 | 1,309 | 1,046 |
| TOTAL | With Redundancy | 257 | 64.97 | 146,453 | 68,126 |
| | Unique Regions/SNPs | 257 | 45.52 | 122,241 | 63,450 |
SNP counts are numbers of SNPs successfully manufactured on the Metabochip array.
Waist-to-hip ratio and waist circumference were adjusted for body mass index.
Table 2. Summary of Metabochip SNPs by SNP category.
| SNP Category | Chosen for Array | Passed Manufacture | >95% Called (67 HapMap samples) | MAF>0 (67 HapMap samples) | MAF<.05 (67 HapMap samples) |
| Replication | 66,130 | 63,450 (95.9%) | 61,386 (96.7%) | 60,585 (98.7%) | 6,121 (10.1%) |
| Fine-Mapping | 139,877 | 122,241 (87.4%) | 116,779 (95.5%) | 92,731 (79.4%) | 37,552 (40.5%) |
| Prior Trait Association | 2,210 | 2,116 (95.7%) | 2,043 (96.5%) | 2,039 (99.8%) | 235 (11.5%) |
| CNP tags | 6,888 | 6,626 (96.2%) | 6,250 (94.3%) | 6,160 (98.6%) | 941 (15.3%) |
| MHC | 3,203 | 2,909 (90.8%) | 2,550 (87.7%) | 2,537 (99.5%) | 185 (7.3%) |
| Mitochondrial | 144 | 135 (93.8%) | 102 (75.6%) | 66 (64.7%) | 28 (42.4%) |
| Chromosome X/Y | 112 | 107 (95.5%) | 106 (99.1%) | 104 (98.1%) | 0 (0%) |
| Fingerprint | 46 | 43 (93.5%) | 40 (93.0%) | 40 (100%) | 0 (0%) |
| Wildcard | 5,323 | 5,056 (95.0%) | 4,847 (95.9%) | 4,108 (84.8%) | 493 (12.0%) |
| TOTAL (without redundancy) | 217,695 | 196,725 (90.4%) | 188,395 (95.8%) | 163,107 (86.6%) | 44,967 (27.6%) |
Numbers in parentheses give each count as a proportion of the SNPs in the previous column. A SNP may fall into multiple categories.
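The chained percentages in the table can be reproduced with a few lines of Python. This is only an illustrative sketch: the counts are copied from three rows of the table above, and the function name `chained_percentages` is a hypothetical helper, not something from the paper.

```python
# Counts copied from the table above, in column order:
# chosen for array, passed manufacture, >95% called, MAF>0, MAF<.05
counts = {
    "Replication": [66130, 63450, 61386, 60585, 6121],
    "Fine-Mapping": [139877, 122241, 116779, 92731, 37552],
    "TOTAL": [217695, 196725, 188395, 163107, 44967],
}


def chained_percentages(row):
    """Express each count as a percentage of the count in the previous column."""
    return [round(100.0 * cur / prev, 1) for prev, cur in zip(row, row[1:])]


for category, row in counts.items():
    print(category, chained_percentages(row))
```

Each percentage uses the preceding column as its denominator, so the final column (MAF<.05) is a share of the polymorphic SNPs, not of all SNPs chosen for the array.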
Figure 1. Example of signal fine mapping (SFM) and locus fine mapping (LFM) regions.
A SFM region seeks to map the initial association signal. SFM regions were designed using linkage disequilibrium (LD) r² estimates from the 1000 Genomes Project and HapMap CEU data. Initial boundaries were determined by identifying all SNPs satisfying r² ≥ 0.5 with the index SNP, and were then expanded to the nearest flanking recombination hotspot; the expansion was not applied if there was no hotspot nearby. LFM regions (blue) were similarly designed but expanded to capture functional units of interest such as nearby coding genes. The figure plots LD r² for SNPs (red dots) within the region and recombination rate (blue lines) as a function of position on the chromosome. Gene positions and structures are displayed in the lower panel. MI = myocardial infarction; CAD = coronary artery disease; HDL = high-density lipoprotein; LDL = low-density lipoprotein; T2D = type 2 diabetes.
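The boundary rule described in the caption above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function name `sfm_region`, the input layout, and especially the `max_extend` distance (standing in for "nearby", which the caption does not quantify) are assumptions.

```python
def sfm_region(index_pos, snps, hotspots, r2_min=0.5, max_extend=250_000):
    """Sketch of SFM boundary selection.

    snps: list of (position, r2_with_index_snp) pairs.
    hotspots: list of recombination hotspot positions.
    max_extend: hypothetical cutoff for what counts as a "nearby" hotspot.
    """
    # Step 1: initial boundaries span all SNPs in LD (r2 >= r2_min) with the index SNP.
    linked = [pos for pos, r2 in snps if r2 >= r2_min] + [index_pos]
    left, right = min(linked), max(linked)

    # Step 2: on each side, expand to the nearest flanking hotspot if one is close enough.
    left_hs = [h for h in hotspots if h <= left and left - h <= max_extend]
    right_hs = [h for h in hotspots if h >= right and h - right <= max_extend]
    if left_hs:
        left = max(left_hs)    # closest hotspot to the left
    if right_hs:
        right = min(right_hs)  # closest hotspot to the right
    return left, right


# Toy example: a hotspot lies close on the left but not on the right.
region = sfm_region(
    index_pos=1_000_000,
    snps=[(950_000, 0.7), (1_020_000, 0.55), (1_200_000, 0.1)],
    hotspots=[900_000, 1_600_000],
)
print(region)
```

In the toy example the left boundary moves out to the hotspot at 900,000, while the right boundary stays at the last linked SNP because the only hotspot on that side is farther away than the assumed cutoff.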
Figure 2. Allele frequency spectrum for Metabochip SNPs by design category.
Blue dots, red squares, and green triangles display fractions of replication, fine-mapping, and all other SNPs (see Table 2) in each of the tabulated minor allele-frequency bins. CNP = copy number polymorphism.
Figure 3. Coverage of 257 Metabochip fine-mapping regions.
Fraction of 1000 Genomes Project SNPs in strong linkage disequilibrium (r² ≥ 0.8) with HapMap 3 (green squares) or Metabochip (blue dots) SNPs as a function of minor allele frequency: (A) 1000 Genomes Pilot 1 SNPs, (B) 1000 Genomes Phase 1 SNPs (May 2011 release).
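A coverage statistic of this kind could be computed along the following lines. The sketch assumes phased biallelic haplotypes coded as 0/1 lists and counts a reference SNP as covered when its r² with at least one array SNP reaches the threshold; the function names, the input layout, and the bin edges are illustrative assumptions, not the paper's pipeline.

```python
def r2(a, b):
    """Squared LD correlation between two biallelic SNPs given 0/1 haplotype alleles."""
    n = len(a)
    pa, pb = sum(a) / n, sum(b) / n
    pab = sum(x & y for x, y in zip(a, b)) / n
    denom = pa * (1 - pa) * pb * (1 - pb)
    if denom == 0:  # at least one SNP is monomorphic
        return 0.0
    d = pab - pa * pb
    return d * d / denom


def maf(a):
    """Minor allele frequency of a 0/1 haplotype vector."""
    p = sum(a) / len(a)
    return min(p, 1 - p)


def coverage_by_bin(reference, array, bins, r2_min=0.8):
    """Fraction of reference SNPs per MAF bin (lo, hi] tagged by any array SNP."""
    result = {}
    for lo, hi in bins:
        in_bin = [h for h in reference.values() if lo < maf(h) <= hi]
        if not in_bin:
            result[(lo, hi)] = None  # no reference SNPs fall in this bin
            continue
        tagged = sum(any(r2(h, t) >= r2_min for t in array.values()) for h in in_bin)
        result[(lo, hi)] = tagged / len(in_bin)
    return result


# Toy data: eight haplotypes, one array SNP, three reference SNPs.
array = {"t1": [1, 1, 1, 1, 0, 0, 0, 0]}
reference = {
    "r1": [1, 1, 1, 1, 0, 0, 0, 0],  # identical to t1
    "r2": [1, 0, 1, 0, 1, 0, 1, 0],  # uncorrelated with t1
    "r3": [1, 0, 0, 0, 0, 0, 0, 0],  # rare, weakly correlated with t1
}
coverage = coverage_by_bin(reference, array, bins=[(0.0, 0.25), (0.25, 0.5)])
print(coverage)
```

Binning by minor allele frequency is what makes the comparison informative: rare variants are typically harder to tag than common ones, so a single genome-wide fraction would hide the difference between arrays.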
Figure 4. Imputation accuracy (estimated r²) in fine-mapping regions.
Imputation accuracy for differing numbers of Sardinian individuals, measured by the estimated r² value across the 257 Metabochip fine-mapping regions, for Metabochip (red squares), Affymetrix 6.0 GWAS SNPs (green triangles), and Affymetrix 500K GWAS SNPs (blue circles) as a function of minor allele frequency bin.
Figure 5. Regional association plots for LDL cholesterol association in the SardiNIA study.
Association plots for a study of 2,432 Sardinian individuals for five Metabochip fine-mapping regions using 1000 Genomes data as the reference set and Affymetrix genotypes (left panels: A, C, E, G, I) or Metabochip genotypes (right panels: B, D, F, H, J) as target sets. The figures plot −log10 of the association p-value within the region and recombination rate (blue lines) as a function of position on the chromosome. Blue, green, and red dots and triangles indicate genotyped and imputed SNPs with minor allele frequencies less than 0.02, greater than or equal to 0.02 and less than 0.05, and greater than or equal to 0.05, respectively. Gene positions and structures are displayed in the lower panel.