The metabochip, a custom genotyping array for genetic studies of metabolic, cardiovascular, and anthropometric traits.

Benjamin F Voight¹, Hyun Min Kang, Jun Ding, Cameron D Palmer, Carlo Sidore, Peter S Chines, Noël P Burtt, Christian Fuchsberger, Yanming Li, Jeanette Erdmann, Timothy M Frayling, Iris M Heid, Anne U Jackson, Toby Johnson, Tuomas O Kilpeläinen, Cecilia M Lindgren, Andrew P Morris, Inga Prokopenko, Joshua C Randall, Richa Saxena, Nicole Soranzo, Elizabeth K Speliotes, Tanya M Teslovich, Eleanor Wheeler, Jared Maguire, Melissa Parkin, Simon Potter, N William Rayner, Neil Robertson, Kathleen Stirrups, Wendy Winckler, Serena Sanna, Antonella Mulas, Ramaiah Nagaraja, Francesco Cucca, Inês Barroso, Panos Deloukas, Ruth J F Loos, Sekar Kathiresan, Patricia B Munroe, Christopher Newton-Cheh, Arne Pfeufer, Nilesh J Samani, Heribert Schunkert, Joel N Hirschhorn, David Altshuler, Mark I McCarthy, Gonçalo R Abecasis, Michael Boehnke.

Abstract

Genome-wide association studies have identified hundreds of loci for type 2 diabetes, coronary artery disease and myocardial infarction, as well as for related traits such as body mass index, glucose and insulin levels, lipid levels, and blood pressure. These studies have also pointed to thousands of loci with promising but not yet compelling association evidence. To establish association at additional loci and to characterize the genome-wide significant loci by fine-mapping, we designed the "Metabochip," a custom genotyping array that assays nearly 200,000 SNP markers. Here, we describe the Metabochip and its component SNP sets, evaluate its performance in capturing variation across the allele-frequency spectrum, describe solutions to methodological challenges commonly encountered in its analysis, and evaluate its performance as a platform for genotype imputation. The Metabochip achieves dramatic cost efficiencies compared to designing single-trait follow-up reagents, and provides the opportunity to compare results across a range of related traits. The Metabochip and similar custom genotyping arrays offer a powerful and cost-effective approach to follow up large-scale genotyping and sequencing studies and advance our understanding of the genetic basis of complex human diseases and traits.

Year: 2012 PMID: 22876189 PMCID: PMC3410907 DOI: 10.1371/journal.pgen.1002793

Source DB: PubMed Journal: PLoS Genet ISSN: 1553-7390 Impact factor: 5.917

Introduction

Recent data emerging from theoretical models [1], [2] and empirical observation through genome-wide association studies (GWAS) (for example [3], [4]) demonstrate that hundreds of genetic loci contribute to complex traits in humans. These data prompt two questions: (1) can additional genetic loci be identified by follow-up of the most significantly associated variants after initial GWAS meta-analysis? and (2) can further investigation via genetic fine-mapping refine association signals at established genetic loci? Systematically addressing these two questions should help improve understanding of the genetic architecture of complex traits and their shared genetic determinants, and suggest hypotheses and disease mechanisms that can be tested in functional experiments or model systems [5]. Addressing these two questions requires genotyping thousands of individuals at many genetic markers. For most currently available genotyping technologies, this kind of characterization is cost-prohibitive. To address this need in the context of type 2 diabetes, coronary artery disease and myocardial infarction, and quantitative traits related to these diseases, we designed the Metabochip, a custom genotyping array that provides accurate and cost-effective genotyping of nearly 200,000 single nucleotide polymorphisms (SNPs) chosen based on GWAS meta-analyses of 23 traits (Table 1). Metabochip SNPs were selected from the catalogs developed by the International HapMap [6] and 1000 Genomes [7] Projects, allowing inclusion of SNPs across a wide range of the allele frequency spectrum. These included 63,450 SNPs to follow-up the top ∼5,000 or ∼1,000 (see Methods) independent association signals for each of the 23 traits, 122,241 SNPs to fine-map 257 loci which showed genome-wide significant evidence for association with one or more of the 23 traits, and 16,992 SNPs chosen for a variety of other reasons (see Methods and Table 2). 
In designing the array, we sought to maximize assay success rates as well as the number of variants that could be assayed; Illumina custom arrays include a fixed number of “beads” and some sites can be assayed with a single bead while others require two [8].

Table 1

Summary of Metabochip SNPs by trait: Fine-mapping and replication.

Consortium	Trait Name	Fine-Mapping # Loci	Fine-Mapping Size (Mb)	Fine-Mapping # SNPs	Replication # SNPs
Tier 1
DIAGRAM	Type 2 Diabetes	34	6.56	16,717	5,057
CARDIoGRAM	MI and CAD	30	9.60	19,558	6,485
Lipids	HDL Cholesterol	23	4.62	12,150	5,024
	LDL Cholesterol	21	4.06	9,981	5,060
	Triglyceride	20	4.68	9,784	5,057
GIANT	Body Mass Index	24	7.48	18,211	5,055
	Waist-to-Hip Ratio*	15	2.25	5,464	5,056
MAGIC	Fasting Glucose	19	5.05	13,644	5,058
ICBP	Diastolic Blood Pressure	20	8.34	13,239	5,060
	Systolic Blood Pressure	21	6.01	10,641	5,059
QT-IGC	QT Interval	18	4.08	10,910	5,041
Tier 2
DIAGRAM	T2D Age of Diagnosis	0	0.00	0	1,039
	T2D Early Onset	0	0.00	0	1,040
HaemGen	Mean Platelet Volume	0	0.00	0	657
	Platelet Count	0	0.00	0	577
	White Blood Cell	0	0.00	0	598
Lipids	Total Cholesterol	0	0.00	0	941
Body Fat	Body Fat Percentage	0	0.00	0	1,035
GIANT	Height	0	0.00	0	1,050
	Waist Circumference*	2	0.50	1,374	1,048
MAGIC	2-Hour Glucose	3	0.61	1,249	1,038
	Glycated Hemoglobin	5	0.46	2,181	1,045
	Fasting Insulin	2	0.67	1,309	1,046
TOTAL	With Redundancy	257	64.97	146,453	68,126
	Unique Regions/SNPs	257	45.52	122,241	63,450

SNP counts are numbers of SNPs successfully manufactured on the Metabochip array.

* Waist-to-hip ratio and waist circumference were adjusted for body mass index.

Table 2

Summary of Metabochip SNPs by SNP category.

SNP Category	Chosen for Array	Passed Manufacture	>95% Called (67 HapMap samples)	MAF>0 (67 HapMap samples)	MAF<.05 (67 HapMap samples)
Replication	66,130	63,450 (95.9%)	61,386 (96.7%)	60,585 (98.7%)	6,121 (10.1%)
Fine-Mapping	139,877	122,241 (87.4%)	116,779 (95.5%)	92,731 (79.4%)	37,552 (40.5%)
Prior Trait Association	2,210	2,116 (95.7%)	2,043 (96.5%)	2,039 (99.8%)	235 (11.5%)
CNP tags	6,888	6,626 (96.2%)	6,250 (94.3%)	6,160 (98.6%)	941 (15.3%)
MHC	3,203	2,909 (90.8%)	2,550 (87.7%)	2,537 (99.5%)	185 (7.3%)
Mitochondrial	144	135 (93.8%)	102 (75.6%)	66 (64.7%)	28 (42.4%)
Chromosome X/Y	112	107 (95.5%)	106 (99.1%)	104 (98.1%)	0 (0%)
Fingerprint	46	43 (93.5%)	40 (93.0%)	40 (100%)	0 (0%)
Wildcard	5,323	5,056 (95.0%)	4,847 (95.9%)	4,108 (84.8%)	493 (12.0%)
TOTAL (without redundancy)	217,695	196,725 (90.4%)	188,395 (95.8%)	163,107 (86.6%)	44,967 (27.6%)

Numbers in parentheses give each count as a proportion of the count in the previous column. A SNP may fall into multiple categories.

Here, we describe Metabochip array design, and evaluate performance of the array in common genetic analysis steps, including quality control steps such as genomic control calculations, identification of related individuals, and fine-mapping of known disease susceptibility loci. Our results provide practical guidance to investigators and show that for fine-mapping loci the Metabochip provides much greater resolution than prior GWAS arrays.

Methods

Core Features of the Metabochip: Traits and SNPs

The Metabochip was designed by representatives of the Body Fat Percentage [9], CARDIoGRAM (coronary artery disease and myocardial infarction) [10], DIAGRAM (type 2 diabetes) [11], GIANT (anthropometric traits) [3], [12], [13], Global Lipids Genetics (lipids) [4], HaemGen (hematological measures) [14], ICBP (blood pressure) [15], MAGIC (glucose and insulin) [16]–[18], and QT-IGC (QT interval) [19], [20] GWAS meta-analysis consortia. The array is composed of SNPs selected across two tiers of traits (Table 1). Tier 1 comprises eleven traits deemed to be of primary interest: type 2 diabetes (T2D), fasting glucose, coronary artery disease and myocardial infarction (CAD/MI), low density lipoprotein (LDL) cholesterol, high density lipoprotein (HDL) cholesterol, triglycerides, body mass index (BMI), systolic and diastolic blood pressure, QT interval, and waist-to-hip ratio adjusted for BMI (WHR). Tier 2 comprises twelve traits of secondary interest: fasting insulin, 2-hour glucose, glycated hemoglobin (HbA1c), T2D age of diagnosis, early onset T2D (diagnosis age<45 years), waist circumference adjusted for BMI, height, body fat percentage, total cholesterol, platelet count, mean platelet volume, and white blood cell count.

We included three design classes of SNPs on the Metabochip (Table 2):

Replication SNPs: ∼5,000 (Tier 1) or ∼1,000 (Tier 2) SNPs were selected to follow up the top independent association signals from the largest available GWAS meta-analysis for each of the 23 traits (Supplementary Table S1).

Fine-mapping SNPs: SNPs were selected from the catalogs of the International HapMap Project [6] and the August 2009 release of the 1000 Genomes Project [7] to fine-map 257 loci associated at genome-wide significance (P<5×10⁻⁸) in preliminary analyses of one or more of the 23 traits (see Figure 1, Supplementary Tables S2 and S3, and Supplementary Text for details).

Figure 1

Example of signal fine mapping (SFM) and locus fine mapping (LFM) regions.

Other SNPs: These were comprised of independent SNPs for which genome-wide significant associations had been reported for any trait, SNP tags for copy number polymorphisms (CNPs), the MHC region, and the mitochondrial genome, fingerprint SNPs from GWA array products, a set of chromosome X and Y markers for sex verification, and “wild-card” SNPs based on consortium-specific hypotheses and interests (for example, based on a known pathway or early deep-sequencing studies). A detailed description of how SNPs were selected in each of these categories can be found in the Supplementary Text [21]–[25].

An SFM region seeks to map the initial association signal. SFM regions were designed using linkage disequilibrium (LD) r2 estimates from the 1000 Genomes Project and HapMap CEU data. Initial boundaries were determined by identifying all SNPs satisfying r2≥.5 with the index SNP, and were then expanded to the nearest flanking recombination hotspot, with expansion halted when no hotspot was nearby. LFM regions (blue) were similarly designed but expanded to capture functional units of interest such as nearby coding genes. The figure plots LD r2 for SNPs (red dots) within the region and recombination rate (blue lines) as a function of position on the chromosome. Gene positions and structures are displayed in the lower panel. MI = myocardial infarction; CAD = coronary artery disease; HDL = high-density lipoprotein; LDL = low-density lipoprotein; T2D = type 2 diabetes.

In total, 217,695 SNPs were chosen for the array (Table 2). Of these, 20,970 SNPs (9.6%) failed during the assay manufacturing process, resulting in 196,725 SNPs available for genotyping. A summary file annotating each Metabochip SNP with ascertainment criteria, SNP assay, a list of unintended duplicate SNPs (Supplementary Table S4), and reference strand orientation for alleles is provided at http://www.sph.umich.edu/csg/kang/MetaboChip/.

Data Generation and Quality Control (QC)

We evaluated the utility of the Metabochip and accuracy of its genotype calls in three sample sets: (1) 15,896 northern European individuals from the FUSION, METSIM, HUNT, Tromsø, and Diagen studies [26]–[30] together with 67 HapMap samples genotyped at least two times each and called using Illumina GenomeStudio software by re-clustering these data; (2) 6,614 Sardinian individuals organized in 1,243 extended families from the SardiNIA study [31], [32] called by GenomeStudio software using default cluster data; and (3) 9,715 Nordic individuals from the Malmø Preventive Project, the Scania Diabetes Registry, and the Botnia Study [33]–[35] genotyped using a modified version of the BIRDSEED genotype calling algorithm [36]. We applied standard SNP- and sample-based QC filters based on call rate, Hardy-Weinberg equilibrium deviations, duplicate genotype inconsistencies, and failures of Mendelian inheritance; in the Nordic sample, we also carried out checks based on plate-specific characteristics. These filters resulted in final data sets of 163,222 polymorphic SNPs genotyped in 67 HapMap samples, 142,812 polymorphic SNPs genotyped in 6,164 Sardinians, and 179,165 polymorphic SNPs genotyped in 8,473 Nordic individuals.

Statistical Analysis Using Metabochip: Genomic Control, PCA, and Kinship Estimation

Since Metabochip SNPs were selected to be associated with our 23 traits of interest, performing genomic control correction [37] requires some care. To select a set of (near)-independent SNPs that are not associated with an analysis trait of interest, we focused on SNPs selected to replicate signals unrelated to the trait of interest (for example, QT interval SNPs for a T2D association analysis), also removing SNPs within 250 kb of SNPs previously associated with the trait of interest, and then LD-pruning the remaining SNPs so that no SNP pair is in strong LD (r2>.3). To estimate kinship coefficients or to correct for population stratification using principal components analysis (PCA) or multidimensional scaling (MDS) covariates, we require SNPs that are not too rare and are not in strong pairwise LD. We found that taking SNPs with MAF>.05 and LD-pruning them so that no SNP pair has r2>.3 works well for PCA and MDS (data not shown). The same subset of SNPs can be used for pairwise IBD estimation using the maximum-likelihood method of Milligan [38] implemented in PLINK [39] or the variance-components method of Balding and Nichols [40] implemented in EMMAX [41].
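The SNP selection described above (MAF>.05, then LD-pruning so that no retained pair has r2>.3) can be illustrated with a minimal sketch. This is not the pipeline used in the study: the function names, the 0/1/2 genotype coding with None for missing calls, and the use of the squared genotype correlation as a stand-in for LD r2 are all assumptions made for illustration. In practice a windowed procedure such as the `--indep-pairwise` option in PLINK would normally be used rather than an all-pairs comparison.

```python
def maf(genotypes):
    """Minor allele frequency from 0/1/2 allele counts (None = missing call)."""
    called = [g for g in genotypes if g is not None]
    if not called:
        return 0.0
    p = sum(called) / (2 * len(called))
    return min(p, 1 - p)


def genotype_r2(x, y):
    """Squared Pearson correlation of two genotype vectors.

    Used here as an approximation to LD r2; only samples called at
    both SNPs contribute.
    """
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    n = len(pairs)
    if n < 2:
        return 0.0
    mean_x = sum(a for a, _ in pairs) / n
    mean_y = sum(b for _, b in pairs) / n
    sxx = sum((a - mean_x) ** 2 for a, _ in pairs)
    syy = sum((b - mean_y) ** 2 for _, b in pairs)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in pairs)
    if sxx == 0 or syy == 0:
        return 0.0  # a monomorphic SNP carries no LD information
    return sxy * sxy / (sxx * syy)


def prune(snps, min_maf=0.05, max_r2=0.3):
    """Greedy pruning over (name, genotypes) pairs, in input order.

    A SNP is kept if its MAF exceeds min_maf and its r2 with every
    previously kept SNP is at most max_r2.
    """
    kept = []
    for name, geno in snps:
        if maf(geno) <= min_maf:
            continue
        if all(genotype_r2(geno, other) <= max_r2 for _, other in kept):
            kept.append((name, geno))
    return [name for name, _ in kept]


# Toy data: rsB nearly duplicates rsA, rsD is monomorphic.
example = [
    ("rsA", [0, 1, 2, 0, 1, 2, 0, 1]),
    ("rsB", [0, 1, 2, 0, 1, 2, 0, 2]),
    ("rsC", [1, 1, 1, 0, 2, 0, 2, 1]),
    ("rsD", [0, 0, 0, 0, 0, 0, 0, 0]),
]
print(prune(example))
```

Because the pruning is greedy, the retained set depends on input order; sorting SNPs by position or by call rate beforehand is one way to make the result reproducible.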

Imputation Preparation and Evaluation

We carried out genotype imputation in the Sardinian data. We imputed variants observed in a reference set of 280 Europeans from the August 2010 1000 Genomes Project data into: (a) 6,164 individuals genotyped on the Metabochip [32], (b) 1,097 individuals genotyped on the Affymetrix 6.0 array, and (c) 1,412 individuals genotyped on the Affymetrix 500 K array [42]. We evaluated mean estimated r2 within fine-mapping regions using minimac ([43]; www.genome.sph.umich.edu/wiki/minimac), and empirically compared the imputation quality using the published Sanger sequencing data in five fine mapping loci [32]. In addition, we evaluated mean estimated r2 across different continental populations by leaving one individual out from the 1000 Genomes reference panel and imputing them using markers present in each platform across the fine mapping regions and a 1 Mb window flanking each region. We also compared association power obtained by imputation into GWAS and Metabochip samples in Metabochip fine-mapping regions by comparing LDL cholesterol association evidence in 2,342 of these individuals genotyped using both the Metabochip and one of the Affymetrix arrays.

Results

Evaluation of Array Design and Genotype Quality

Of 217,695 SNPs chosen for the Metabochip across all design categories, 196,725 (90.4%) were successfully manufactured on the array (Table 2). The 48,846 previously manufactured SNPs had higher success rate (95.4%) than the 168,849 new SNP assays (88.7%). Illumina design score was predictive of the quality of manufactured SNP assays. For example, 25% of SNPs with design score<0.6 failed to produce genotype calls due to poor clustering of the intensity data, compared to 3.1% of SNPs with design score between 0.6 and 1.0 (Supplementary Figure S1). We evaluated genotype calling accuracy for 67 HapMap samples genotyped multiple times using three different calling strategies: (a) Illumina GenomeStudio with reclustering the intensity data using >15,000 samples; (b) Illumina GenomeStudio based on default clusters provided by Illumina; and (c) GenoSNP [44], which calls genotypes based on a within-sample-between-markers analysis of intensity data rather than a between-sample-within-marker analysis. The large majority of Metabochip SNPs yielded high quality genotypes. For the 67 HapMap samples called using GenomeStudio with reclustering, only 8,344 (4.2%) of the 196,725 SNP assays had genotype call rates <95%, while another 25,958 SNPs (13.2%) were monomorphic. Using GenomeStudio and default clusters, these numbers were 12,131 (6.2%) and 25,311 (12.9%), while using GenoSNP, they were 18,107 (9.2%) and 25,532 (13.0%). Using GenomeStudio with reclustering, genotype concordance between Metabochip genotypes for duplicate pairs was 99.998% overall and 99.990% for heterozygotes. Comparing Metabochip genotypes to HapMap 3 genotypes for the 59,935 SNPs in common, genotype concordance was 99.93% overall and 99.84% for heterozygotes, similar to the 99.87% Mendelian consistency rate reported in the HapMap3 data [45]. 
We observed similar concordance rates for these sample sets using the Illumina caller with default clusters (99.93% overall, 99.84% for heterozygotes), or using GenoSNP [44] (99.85% overall, 99.81% for heterozygotes). Genotype concordance for less common variants was slightly lower than for common variants. For example, among the singleton SNPs in the 67 HapMap samples, 98.9% of heterozygous genotypes were concordant with HapMap3 for the two GenomeStudio call sets and 97.8% for the GenoSNP set. Heterozygous genotype concordances for singleton SNPs between duplicate pairs were 99.76%, 99.70%, and 99.83% for the three call sets.
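As an illustration of the concordance measures reported above, the following sketch computes overall and heterozygote concordance for two sets of calls on the same samples. The function name and the genotype coding (0/1/2, None for no-call) are hypothetical, and the definition of heterozygote concordance used here (agreement among pairs in which at least one call is heterozygous) is an assumption; the text does not spell out the exact definition used.

```python
def concordance(calls_a, calls_b):
    """Return (overall, heterozygote) concordance for paired genotype calls.

    Pairs with a missing call (None) on either side are skipped.
    A rate is None when no pairs are available for it.
    """
    n_pairs = n_agree = n_het = n_het_agree = 0
    for a, b in zip(calls_a, calls_b):
        if a is None or b is None:
            continue
        n_pairs += 1
        agree = a == b
        n_agree += agree
        if a == 1 or b == 1:  # at least one heterozygous call
            n_het += 1
            n_het_agree += agree
    overall = n_agree / n_pairs if n_pairs else None
    het = n_het_agree / n_het if n_het else None
    return overall, het


# Duplicate genotyping of one sample at six SNPs (toy data).
first_run = [0, 1, 2, 1, None, 0]
second_run = [0, 1, 2, 2, 1, 0]
print(concordance(first_run, second_run))
```

Reporting the heterozygote rate separately matters because, for rare variants, almost all genotypes are homozygous for the major allele, so overall concordance can stay high even when heterozygous calls are unreliable.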

Frequency Spectrum and Coverage

We evaluated the allele frequency spectrum for Metabochip SNPs in the 67 HapMap samples (Figure 2). Mean MAF of Metabochip SNPs was .152 overall, .109 among fine-mapping SNPs, and .224 among replication SNPs. Among these three SNP sets, 38%, 53%, and 12% of SNPs had MAF<.05, and 14%, 21%, and 2% were monomorphic.

Figure 2

Allele frequency spectrum for Metabochip SNPs by design category.

Blue dots, red squares, and green triangles display fractions of replication, fine-mapping, and all other SNPs (see Table 2) in each of the tabulated minor allele-frequency bins. CNP = copy number polymorphism.

Within the 257 fine-mapping regions (45.52 Mb), 109,855 SNPs were catalogued by the 1000 Genomes Project [7] pilot studies and 240,805 SNPs are in the current Phase 1 release (as of November 2011). Of these, 122,241 fine-mapping SNPs were genotyped on the Metabochip (Supplementary Table S2). In the 1000 Genomes European samples, Metabochip SNPs tag 82.0% and 54.5% of all Pilot and Phase 1 1000 Genomes variants in these regions at r2≥.8, compared to 61.3% and 40.3% coverage using HapMap 3 SNPs (Figure 3). Among SNPs with MAF<.05, Metabochip SNPs tag 61.9% and 33.8% at r2≥.8, compared to 24.3% and 17.0% using HapMap 3. Using genotype imputation, we can impute 82% of 1000 Genomes Phase 1 European SNPs with MAF>0.5% with an estimated r2≥0.8.
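The tagging coverage quoted above is the fraction of catalogued SNPs that are either on the array or in strong LD (r2≥.8) with at least one array SNP. A minimal sketch of that calculation follows; the function name and the representation of pairwise LD as a dictionary keyed by unordered SNP pairs are illustrative assumptions, not the data structures used in the study.

```python
def tagging_coverage(catalog_snps, array_snps, pairwise_r2, threshold=0.8):
    """Fraction of catalog SNPs that are tagged by the array.

    pairwise_r2 maps frozenset({snp_a, snp_b}) to the LD r2 of that pair;
    pairs absent from the mapping are treated as r2 = 0.
    """
    catalog_snps = list(catalog_snps)
    if not catalog_snps:
        return 0.0
    array_snps = set(array_snps)
    tagged = 0
    for snp in catalog_snps:
        if snp in array_snps:
            tagged += 1  # directly genotyped SNPs tag themselves
        elif any(
            pairwise_r2.get(frozenset((snp, a)), 0.0) >= threshold
            for a in array_snps
        ):
            tagged += 1
    return tagged / len(catalog_snps)


# Toy region with four catalogued SNPs and two array SNPs.
catalog = ["s1", "s2", "s3", "s4"]
array = {"s1", "a1"}
ld = {
    frozenset(("s2", "a1")): 0.9,  # tagged at r2 >= .8
    frozenset(("s3", "a1")): 0.5,  # in LD, but below threshold
}
print(tagging_coverage(catalog, array, ld))
```

Restricting `catalog_snps` to a MAF bin before calling the function gives per-frequency coverage of the kind plotted in Figure 3.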

Figure 3

Coverage of 257 Metabochip fine-mapping regions.

Fraction of 1000 Genomes Project SNPs in strong linkage disequilibrium (r2≥.8) with HapMap 3 (green squares) or Metabochip (blue dots) SNPs as a function of minor allele frequencies: (A) 1000 Genomes Pilot 1 SNPs, (B) 1000 Genomes Phase 1 SNPs (May 2011 release).

Genotype Imputation within the Metabochip Fine-Mapping Regions

We next investigated accuracy of genotype imputation into the 257 Metabochip fine-mapping regions using the 280 Europeans from the 1000 Genomes Project [7] as reference set and the 6,164 individuals in the Sardinian Metabochip sample as target. Figure 4 displays estimated r2 values in the Metabochip fine-mapping regions as a function of MAF. Also displayed are estimated r2 values for SNPs in these regions using the 280 European 1000 Genomes Project samples as reference set and 1,412 Sardinians genotyped on the Affymetrix 500 K and 1,097 Sardinians genotyped on the Affymetrix 6.0 chips as targets. Imputation accuracy into the Sardinian Metabochip sample is greater in all allele frequency ranges than for the samples genotyped using the GWAS arrays. For example, among SNPs with .02≤MAF<.05, mean estimated r2 for the Affymetrix 500 K, Affymetrix 6.0, and Metabochip samples were .47, .62, and .84, respectively (Figure 4). The improved imputation accuracy for the Metabochip compared to the GWAS arrays is primarily due to the increased marker density of the Metabochip in these regions.

Figure 4

Imputation accuracy (estimated r2) in fine mapping regions.

Imputation accuracy for differing numbers of Sardinian individuals as measured by estimated r2 value across the 257 Metabochip fine mapping regions for Metabochip (red squares), Affymetrix 6.0 GWAS SNPs (green triangles), and Affymetrix 500 K GWAS SNPs (blue circles) as a function of minor allele frequency bin.

Imputation quality in the Metabochip fine-mapping regions using Metabochip is also improved for non-European individuals compared to imputation using GWAS platforms. Using a leave-one-sample-out approach, we evaluated the average r2 for imputation from the 1000 Genomes reference panel using the markers present on the Affymetrix 500 K, Affymetrix 6.0, and Metabochip arrays. For example, among SNPs with .02 […]

In addition, we empirically evaluated the quality of experimentally determined and imputed SNPs within the five fine mapping regions by comparing individual genotypes with those obtained by Sanger sequencing. For 126 SNPs evaluated, the average r2 in analyses based on the Affymetrix 500 K and 6.0 arrays was .46 and .55, respectively. Analyses based on Metabochip showed average r2 = .79. Focusing on 48 SNPs that were imputed in all three analyses, the average r2 was .31 (Affymetrix 500 K), .41 (Affymetrix 6.0), and .57 (Metabochip) (Supplementary Figure S3).
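The empirical comparison with Sanger sequencing can be summarized per SNP as the squared correlation between imputed allele dosages and the sequenced genotypes. The sketch below shows one way to compute it; the function name and the dosage coding (expected allele count between 0 and 2) are assumptions for illustration. Note that this empirical r2 requires known genotypes, unlike the estimated r2 reported by the imputation software, which is computed from the imputed data alone.

```python
def empirical_r2(imputed_dosages, true_genotypes):
    """Squared Pearson correlation between imputed dosages and true genotypes.

    Returns 0.0 if either vector has no variance (e.g. a SNP that is
    monomorphic in the sequenced samples).
    """
    n = len(imputed_dosages)
    if n != len(true_genotypes):
        raise ValueError("dosage and genotype vectors differ in length")
    if n < 2:
        raise ValueError("need at least two samples")
    mean_d = sum(imputed_dosages) / n
    mean_g = sum(true_genotypes) / n
    sdd = sum((d - mean_d) ** 2 for d in imputed_dosages)
    sgg = sum((g - mean_g) ** 2 for g in true_genotypes)
    sdg = sum(
        (d - mean_d) * (g - mean_g)
        for d, g in zip(imputed_dosages, true_genotypes)
    )
    if sdd == 0 or sgg == 0:
        return 0.0
    return sdg * sdg / (sdd * sgg)


# Toy example: four samples at one SNP.
dosages = [0.1, 0.9, 1.8, 0.2]    # imputed expected allele counts
sequenced = [0, 1, 2, 0]          # genotypes from sequencing
print(round(empirical_r2(dosages, sequenced), 3))
```

Averaging this quantity over SNPs, as in the comparison above, weights every SNP equally regardless of allele frequency, so results are usually also reported by MAF bin.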

High-Resolution Association Analysis within Metabochip Fine-Mapping Regions

To compare the power and resolution for association testing in the Metabochip fine-mapping regions to that of standard GWAS arrays, we revisited the LDL cholesterol association analysis from the SardiNIA study [32] in 2,342 individuals genotyped for both Metabochip and an Affymetrix (6.0 or 500 k) GWAS chip. Here, we focus on five of the six most strongly associated loci from Willer et al. [46], in and around PCSK9, LDLR, APOE/APOC1/APOC2, SORT1, and APOB (Figure 5A–J), all of which were designated for locus fine mapping by the Global Lipids Genetics Consortium.

Figure 5

Regional association plots for LDL cholesterol association in the SardiNIA study.

Association plots for a study of 2,342 Sardinian individuals for five Metabochip fine-mapping regions using 1000 Genomes data as reference set and Affymetrix genotypes (left panels: A, C, E, G, I) or Metabochip genotypes (right panels: B, D, F, H, J) as target sets. The figures plot −log10 of the association p-value within the region and recombination rate (blue lines) as a function of position on the chromosome. Blue, green, and red dots and triangles indicate genotyped and imputed SNPs with minor allele frequencies less than 0.02, greater than or equal to 0.02 and less than 0.05, and greater than or equal to 0.05, respectively. Gene positions and structures are displayed in the lower panel.

In the SORT1 and APOB regions, the peak association signals for the two data sets are similar (Figure 5A–D). For PCSK9, LDLR, and APOE/APOC1/APOC2, Metabochip-based analysis resulted in considerably stronger association signals. For PCSK9 and APOE/APOC1/APOC2, the most strongly associated variants were low-frequency SNPs (MAF = 1.1% for PCSK9, MAF = 3.4% for APOE) that were directly genotyped on the Metabochip but not on the Affymetrix chips (Figure 5E–J). Although the signals from common variants are similar, the peak SNPs were not imputed accurately in the Affymetrix data (estimated r2 = .04 and .08, respectively). Within the LDLR region, there are 165 SNPs in the 1000 Genomes European panel. None of these SNPs are on the Affymetrix chips and only eight could be imputed at estimated r2≥.3 using the Affymetrix data; the locus is also hard to impute using HapMap 2 as a reference, with the peak association signals corresponding to r2 of ∼.40. In contrast, 36 of the 165 SNPs were directly genotyped on the Metabochip, and 122 were imputed at estimated r2≥.3. As a result, imputation into the Metabochip data resulted in a substantial association signal (p = 7.3×10⁻⁶), while for the Affymetrix data, p>.02 at all markers (Figure 5I–J).
These results demonstrate that dense genotyping may substantially improve imputation accuracy, increasing association power even for common variants.

Performing Standard Statistical Analyses Using Metabochip Genotype Data

We carried out kinship estimation between pairs of individuals and calculated genotype-based principal components for inclusion as covariates in genetic association analysis using all Metabochip SNPs that passed QC, and then using the pruned subset of SNPs described in the Methods section. When using all QC-passing SNPs, estimates of pairwise kinship coefficients in the Sardinia sample had inflated variance (Supplementary Table S5), and kinship coefficient estimates for the Nordic sample calculated using PLINK suggested (incorrectly) that essentially all pairs of individuals were related (Supplementary Figure S4). For each analysis, using the pruned set of SNPs gave sensible results, reducing variance in estimated kinship coefficients in the Sardinia sample and removing the artifactual estimates of close relatedness in the Nordic sample. Because many Metabochip SNPs were included specifically due to prior evidence for association of T2D, CAD/MI and related traits, controlling for potential population stratification in Metabochip analysis requires some care. Not surprisingly, carrying out T2D association analysis in the Nordic sample on all SNPs passing QC without inclusion of genotype-based principal components resulted in a large genomic control inflation factor (λGC = 1.44). Including all SNPs that passed QC to estimate principal components (PCs), and then including those PCs as covariates in the association analysis gave reduced but still substantial inflation (λGC = 1.13). When we instead estimated test statistic inflation based only on the 3,772 LD-pruned QT interval replication SNPs (not expected to associate with T2D) we obtained a genomic control inflation factor near unity (λGC = 1.01).
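The genomic control inflation factor discussed above is conventionally computed as the median of the observed 1-df chi-square association statistics divided by the median of the chi-square distribution with 1 df under the null (about 0.455). A minimal sketch follows, assuming that two-sided p-values from 1-df tests are available for the chosen null SNP set (for example, the LD-pruned QT interval replication SNPs in a T2D analysis); the function name is hypothetical.

```python
from statistics import NormalDist, median

# Median of a chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364


def lambda_gc(p_values):
    """Genomic control inflation factor from two-sided 1-df test p-values.

    Each p-value is converted back to a chi-square statistic via the
    squared normal quantile; p-values must lie in (0, 1].
    """
    std_normal = NormalDist()
    chi2 = [std_normal.inv_cdf(p / 2) ** 2 for p in p_values]
    return median(chi2) / CHI2_1DF_MEDIAN


# Uniformly spaced p-values mimic a well-calibrated null set of SNPs.
null_p = [(i + 0.5) / 1000 for i in range(1000)]
print(round(lambda_gc(null_p), 3))

# Squaring the p-values shifts them toward zero, mimicking inflation.
inflated_p = [p * p for p in null_p]
print(lambda_gc(inflated_p) > 1.0)
```

The value obtained depends entirely on which SNPs are supplied: on the Metabochip, passing all QC-passing SNPs mixes true association signal into the estimate, which is why a trait-unrelated, LD-pruned subset is used.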

Assessing Overlap among SNPs across Traits

We were interested in whether the replication SNP sets submitted by the GWAS consortia for the different traits showed more or less overlap than expected by chance. To address this question, we counted the number of SNPs in common across pairs of traits, and used simulation to test whether the observed overlaps differed from those expected under the null hypothesis of genetic independence of pairs of traits (Supplementary Table S6). Not surprisingly, we observed substantial SNP set overlaps (greater than expected assuming independence) for multiple pairs of correlated traits, notably SBP and DBP (38% of the maximum possible overlap), HDL and TG (17%), and TC and LDL (87%). We also observed substantial genetic overlap (4%) between LDL and SBP, which are nearly uncorrelated traits. Overall, we observed an excess of nominally significant SNP set overlaps, consistent with (but in no way proof of) the hypothesis of a shared genetic etiology between these cardiometabolic traits.
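The logic of a simulation-based overlap test can be sketched as follows. This is only an illustration under simplifying assumptions, not the procedure described in the Supplementary Text: here each trait's SNP set is modeled as a uniform random draw from a common pool of independent signals, and the function name, pool size, and set sizes are all hypothetical.

```python
import random


def overlap_p_value(pool_size, n_a, n_b, observed, n_sim=2000, seed=0):
    """One-sided empirical p-value for an observed SNP-set overlap.

    Under the null, two sets of sizes n_a and n_b are drawn independently
    and uniformly from pool_size candidate signals. The p-value is the
    share of simulations with overlap >= observed, with the usual +1
    correction so that it is never exactly zero.
    """
    rng = random.Random(seed)
    pool = range(pool_size)
    hits = 0
    for _ in range(n_sim):
        set_a = set(rng.sample(pool, n_a))
        set_b = rng.sample(pool, n_b)
        overlap = sum(1 for snp in set_b if snp in set_a)
        if overlap >= observed:
            hits += 1
    return (hits + 1) / (n_sim + 1)


# Two sets of 100 signals from a pool of 1,000 share about 10 by chance,
# so an observed overlap of 40 is far in the tail.
print(overlap_p_value(pool_size=1000, n_a=100, n_b=100, observed=40))
```

A realistic null would also need to account for LD between nearby SNPs and for differences in how each consortium ranked its signals, both of which this uniform-draw sketch ignores.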

Discussion

We designed the Metabochip, a custom genotyping array for replication of the top association signals from the largest available GWAS meta-analysis for 23 T2D and CAD/MI related traits and for fine-mapping 257 genome-wide significant association signals for 15 of these traits (Table 1). The Metabochip also includes a set of SNPs representing genome-wide significant associations across a range of human traits; SNPs that tag known copy number polymorphisms, the MHC, and mitochondrial variants; X and Y chromosome SNPs for sex verification, fingerprint SNPs for sample tracking, and “wildcard” SNPs selected by the participating GWAS consortia (Table 2). The array has already been genotyped on DNA samples from hundreds of thousands of individuals and preliminary analyses across the contributing GWAS consortia have identified hundreds of new genome-wide association signals (manuscripts being prepared by each of the consortia). In designing the Metabochip, 90.4% of chosen SNPs were successfully designed and manufactured onto the array, and of these, ∼82% passed QC filters in our three example studies, resulting in very complete coverage of variation in our 257 fine-mapping regions. Of course, as time passes and catalogs of SNPs expand, potential shortcomings in coverage should become apparent. Currently, coverage of 1000 Genomes Pilot Study European SNPs in the fine-mapping regions is 82.0% at a tagging threshold of r2≥.8. Coverage of Phase 1 European SNPs in these regions is 54.5%, and the number increases to 73.7% for SNPs at MAF>0.5%. Using genotype imputation, we can impute 82% of 1000 Genomes Phase 1 European SNPs with MAF>0.5% with estimated r2≥0.8. The resulting data are of high quality, with 99.99% duplicate consistency in heterozygotes and 99.77% Mendelian consistency in heterozygotes in our studies. 
Further, Metabochip fine-mapping regions provide an excellent target for genotype imputation from relevant reference sets, and in our experience can provide more complete coverage than provided by standard HapMap-based GWAS arrays (Figure 3) for both common and less common variants.

A key decision in the fine-mapping of any GWAS signal concerns the size of the region where genetic variation will be examined exhaustively. In designing the Metabochip, we focused on relatively small regions surrounding each lead SNP: these included all variants in strong linkage disequilibrium (r2>.5) and a small shoulder extending .02 cM beyond that (typically, ∼20 kb). This decision was informed by the observation that, in cases where GWAS signals and Mendelian disease loci overlap, they are typically very close together (typically within ∼10 kb of each other and nearly always within <100 kb; see [4] for a discussion of the issue), although there are exceptions to this rule (see [47], for example). Within each fine-mapping region, we selected variants identified by the HapMap consortium and early analyses of the 1000 Genomes Consortium data. The 1000 Genomes Project and other sequence-based catalogs of genetic variation are now more extensive than at the time of array design, but (as noted above) our analyses show that the SNPs selected for inclusion in the Metabochip form a useful reagent for genotype imputation, not only for the imputation of newly discovered SNPs in the fine-mapping regions (see above) but also for the imputation of other types of variants, such as indel polymorphisms, that have become part of newer 1000 Genomes Project analyses (unpublished data).

Several other design choices for Metabochip were to some degree arbitrary: which traits to include; balance in numbers of SNPs for replication, fine mapping, and other purposes; and how to prioritize among SNPs available for each purpose.
Were we to design a similar chip now, we would take advantage of the more extensive and deeply annotated SNP catalogs now available. In addition, we would likely include a set of randomly ascertained SNPs to facilitate analyses that control for population structure and other artifacts. Finally, with empirical evidence from this and other projects on the relationship between SNP design score and empirical probability of successful design, we would likely replace design score by probability of successful design. This approach would likely result in even higher call rates.

Because Metabochip SNPs are highly enriched for trait-associated SNPs and >60% are clustered in the ∼1.5% of the genome that comprises the fine-mapping regions, Metabochip genotype data present some challenges to standard analyses such as relationship estimation, principal components analysis, and genomic control determination. However, as we demonstrated, these challenges can be overcome by focusing on replication SNPs expected to be unrelated to the trait of interest. An alternative approach is to use SNPs that were not associated with the trait(s) of interest in the corresponding GWAS (for example, p-value>.50 for all such traits) and then to LD-prune the resulting set of SNPs to identify a near-independent set. An alternative that is also worthy of investigation in the analysis of case-control samples is the application of principal component factor loadings derived from a controls-only analysis to the combined sample of cases and controls. When this last alternative is considered, it is important to check that PCA axes derived from controls represent all relevant ancestries present in cases. The design of the array, focused on replication and fine-mapping and selecting SNPs from early releases of the HapMap and 1000 Genomes Projects, resulted in a highly non-random ascertainment of SNPs.
Thus, we cannot recommend use of Metabochip SNPs for population genetic analyses that rely on unbiased and/or comprehensive SNP ascertainment schemes. Follow-up genotyping is a frequent requirement of GWAS and sequencing studies of complex human traits. Approaching array design in a coordinated fashion across related studies and traits can be particularly cost-effective, since per-array costs often drop dramatically with increasing numbers of individuals to be genotyped and (given sufficient numbers of individuals) may increase only modestly with increasing numbers of SNPs. For example, a custom chip designed to genotype the ∼22,000 DIAGRAM-selected type 2 diabetes Metabochip SNPs in the ∼80,000 individuals genotyped on Metabochip by the DIAGRAM consortium studies would have cost ∼$55 compared to the Metabochip cost of $39, delivering only 1/9 as many genotypes at >40% greater cost. Furthermore, testing SNPs tentatively associated with one trait for association with other related traits can also be informative, highlighting pleiotropy across related traits and helping discover new association signals; for example, two of the ten novel type 2 diabetes loci identified to date by Metabochip analysis by the DIAGRAM consortium were placed on Metabochip for other traits [48]. In the case of the Metabochip, which is less expensive than many smaller trait-specific arrays, this opportunity to collect more information and investigate the effects of SNPs associated with other traits actually comes at reduced cost (compared to trait-specific arrays), although with the need to organize across multiple consortia and to share among them the number of SNPs that can be cost-effectively genotyped.
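The cost comparison quoted above can be checked with a few lines of arithmetic. The sketch below assumes that both prices are per array (that is, per genotyped sample) and approximates the Metabochip content as 200,000 SNPs, following the "nearly 200,000" figure in the abstract; the variable names are illustrative.

```python
# Figures quoted in the text (prices assumed to be per array).
custom_cost_usd = 55.0       # hypothetical single-trait custom chip
custom_snps = 22_000         # DIAGRAM-selected type 2 diabetes SNPs
metabochip_cost_usd = 39.0
metabochip_snps = 200_000    # approximation of "nearly 200,000"

# Fraction of the Metabochip genotypes the custom chip would deliver.
snp_fraction = custom_snps / metabochip_snps

# Extra cost of the custom chip relative to the Metabochip, in percent.
extra_cost_pct = (custom_cost_usd / metabochip_cost_usd - 1.0) * 100.0

# Ratio of cost per genotype (custom chip versus Metabochip).
per_genotype_ratio = (custom_cost_usd / custom_snps) / (
    metabochip_cost_usd / metabochip_snps
)

print(f"fraction of genotypes: {snp_fraction:.2f}")            # 0.11, about 1/9
print(f"extra cost: {extra_cost_pct:.0f}%")                    # 41%
print(f"cost per genotype ratio: {per_genotype_ratio:.1f}x")   # 12.8x
```

Under these assumptions the single-trait chip would cost roughly 13 times as much per genotype, which is consistent with the "1/9 as many genotypes at >40% greater cost" summary.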
The “Immunochip” [49] follows this same paradigm and supports genotyping of ∼200,000 SNPs identified on the basis of GWAS meta-analyses for immunological disorders, while the recently designed “exome chip” (Benjamin Neale, Gonçalo Abecasis, personal communication) supports genotyping of ∼250,000 exonic SNPs identified via large-scale exome sequencing studies totaling >12,000 individuals. These and other similar array products represent valuable tools in ongoing efforts to understand the genetic architecture of complex human traits.
Distribution of Illumina design scores by Metabochip SNP category. (TIFF)
Imputation accuracy in fine-mapping regions across three continental populations for (A) Europeans, (B) Africans, and (C) East Asians. (TIFF)
Empirical concordance between Sanger sequencing data and imputed genotypes. Empirical r² was evaluated between Sanger sequencing data and imputed genotypes from Metabochip or (A) Affymetrix 500K SNPs and (B) Affymetrix 6.0 SNPs across five loci in 256 Sardinians. (TIFF)
Distribution of estimates of pairwise genome-wide identity-by-descent (IBD) sharing generated by PLINK for all SNPs and for pruned SNPs. (TIFF)
Summary of replication SNP submission. (EPS)
Summary of fine-mapping regions. (EPS)
Summary of SNPs within fine-mapping loci. (EPS)
List of unintended duplicated SNPs. (EPS)
Estimation of pairwise kinship coefficients. (EPS)
Observed count of SNPs in common (upper) between Tier 1 and Tier 2 replication traits submissions and significance of observed overlap (lower). (EPS)
Technical details of SNP selection criteria. (DOC)

49 in total

1. Potential etiologic and functional implications of genome-wide association loci for human diseases and traits.

Authors: Lucia A Hindorff; Praveen Sethupathy; Heather A Junkins; Erin M Ramos; Jayashri P Mehta; Francis S Collins; Teri A Manolio
Journal: Proc Natl Acad Sci U S A Date: 2009-05-27 Impact factor: 11.205

2. Variance component model to account for sample structure in genome-wide association studies.

Authors: Hyun Min Kang; Jae Hoon Sul; Susan K Service; Noah A Zaitlen; Sit-Yee Kong; Nelson B Freimer; Chiara Sabatti; Eleazar Eskin
Journal: Nat Genet Date: 2010-03-07 Impact factor: 38.330

3. A method for quantifying differentiation between populations at multi-allelic loci and its implications for investigating identity and paternity.

Authors: D J Balding; R A Nichols
Journal: Genetica Date: 1995 Impact factor: 1.082

4. Common SNPs explain a large proportion of the heritability for human height.

Authors: Jian Yang; Beben Benyamin; Brian P McEvoy; Scott Gordon; Anjali K Henders; Dale R Nyholt; Pamela A Madden; Andrew C Heath; Nicholas G Martin; Grant W Montgomery; Michael E Goddard; Peter M Visscher
Journal: Nat Genet Date: 2010-06-20 Impact factor: 38.330

5. Clinical risk factors, DNA variants, and the development of type 2 diabetes.

Authors: Valeriya Lyssenko; Anna Jonsson; Peter Almgren; Nicoló Pulizzi; Bo Isomaa; Tiinamaija Tuomi; Göran Berglund; David Altshuler; Peter Nilsson; Leif Groop
Journal: N Engl J Med Date: 2008-11-20 Impact factor: 91.245

6. Common variants at 10 genomic loci influence hemoglobin A₁(C) levels via glycemic and nonglycemic pathways.

Authors: Nicole Soranzo; Serena Sanna; Eleanor Wheeler; Christian Gieger; Dörte Radke; Josée Dupuis; Nabila Bouatia-Naji; Claudia Langenberg; Inga Prokopenko; Elliot Stolerman; Manjinder S Sandhu; Matthew M Heeney; Joseph M Devaney; Muredach P Reilly; Sally L Ricketts; Alexandre F R Stewart; Benjamin F Voight; Christina Willenborg; Benjamin Wright; David Altshuler; Dan Arking; Beverley Balkau; Daniel Barnes; Eric Boerwinkle; Bernhard Böhm; Amélie Bonnefond; Lori L Bonnycastle; Dorret I Boomsma; Stefan R Bornstein; Yvonne Böttcher; Suzannah Bumpstead; Mary Susan Burnett-Miller; Harry Campbell; Antonio Cao; John Chambers; Robert Clark; Francis S Collins; Josef Coresh; Eco J C de Geus; Mariano Dei; Panos Deloukas; Angela Döring; Josephine M Egan; Roberto Elosua; Luigi Ferrucci; Nita Forouhi; Caroline S Fox; Christopher Franklin; Maria Grazia Franzosi; Sophie Gallina; Anuj Goel; Jürgen Graessler; Harald Grallert; Andreas Greinacher; David Hadley; Alistair Hall; Anders Hamsten; Caroline Hayward; Simon Heath; Christian Herder; Georg Homuth; Jouke-Jan Hottenga; Rachel Hunter-Merrill; Thomas Illig; Anne U Jackson; Antti Jula; Marcus Kleber; Christopher W Knouff; Augustine Kong; Jaspal Kooner; Anna Köttgen; Peter Kovacs; Knut Krohn; Brigitte Kühnel; Johanna Kuusisto; Markku Laakso; Mark Lathrop; Cécile Lecoeur; Man Li; Mingyao Li; Ruth J F Loos; Jian'an Luan; Valeriya Lyssenko; Reedik Mägi; Patrik K E Magnusson; Anders Mälarstig; Massimo Mangino; María Teresa Martínez-Larrad; Winfried März; Wendy L McArdle; Ruth McPherson; Christa Meisinger; Thomas Meitinger; Olle Melander; Karen L Mohlke; Vincent E Mooser; Mario A Morken; Narisu Narisu; David M Nathan; Matthias Nauck; Chris O'Donnell; Konrad Oexle; Nazario Olla; James S Pankow; Felicity Payne; John F Peden; Nancy L Pedersen; Leena Peltonen; Markus Perola; Ozren Polasek; Eleonora Porcu; Daniel J Rader; Wolfgang Rathmann; Samuli Ripatti; Ghislain Rocheleau; Michael Roden; Igor Rudan; Veikko Salomaa; Richa Saxena; David 
Schlessinger; Heribert Schunkert; Peter Schwarz; Udo Seedorf; Elizabeth Selvin; Manuel Serrano-Ríos; Peter Shrader; Angela Silveira; David Siscovick; Kjioung Song; Timothy D Spector; Kari Stefansson; Valgerdur Steinthorsdottir; David P Strachan; Rona Strawbridge; Michael Stumvoll; Ida Surakka; Amy J Swift; Toshiko Tanaka; Alexander Teumer; Gudmar Thorleifsson; Unnur Thorsteinsdottir; Anke Tönjes; Gianluca Usala; Veronique Vitart; Henry Völzke; Henri Wallaschofski; Dawn M Waterworth; Hugh Watkins; H-Erich Wichmann; Sarah H Wild; Gonneke Willemsen; Gordon H Williams; James F Wilson; Juliane Winkelmann; Alan F Wright; Carina Zabena; Jing Hua Zhao; Stephen E Epstein; Jeanette Erdmann; Hakon H Hakonarson; Sekar Kathiresan; Kay-Tee Khaw; Robert Roberts; Nilesh J Samani; Mark D Fleming; Robert Sladek; Gonçalo Abecasis; Michael Boehnke; Philippe Froguel; Leif Groop; Mark I McCarthy; W H Linda Kao; Jose C Florez; Manuela Uda; Nicholas J Wareham; Inês Barroso; James B Meigs
Journal: Diabetes Date: 2010-09-21 Impact factor: 9.461

7. Fine mapping of five loci associated with low-density lipoprotein cholesterol detects variants that double the explained heritability.

Authors: Serena Sanna; Bingshan Li; Antonella Mulas; Carlo Sidore; Hyun M Kang; Anne U Jackson; Maria Grazia Piras; Gianluca Usala; Giuseppe Maninchedda; Alessandro Sassu; Fabrizio Serra; Maria Antonietta Palmas; William H Wood; Inger Njølstad; Markku Laakso; Kristian Hveem; Jaakko Tuomilehto; Timo A Lakka; Rainer Rauramaa; Michael Boehnke; Francesco Cucca; Manuela Uda; David Schlessinger; Ramaiah Nagaraja; Gonçalo R Abecasis
Journal: PLoS Genet Date: 2011-07-28 Impact factor: 5.917

8. New genetic loci implicated in fasting glucose homeostasis and their impact on type 2 diabetes risk.

Authors: Josée Dupuis; Claudia Langenberg; Inga Prokopenko; Richa Saxena; Nicole Soranzo; Anne U Jackson; Eleanor Wheeler; Nicole L Glazer; Nabila Bouatia-Naji; Anna L Gloyn; Cecilia M Lindgren; Reedik Mägi; Andrew P Morris; Joshua Randall; Toby Johnson; Paul Elliott; Denis Rybin; Gudmar Thorleifsson; Valgerdur Steinthorsdottir; Peter Henneman; Harald Grallert; Abbas Dehghan; Jouke Jan Hottenga; Christopher S Franklin; Pau Navarro; Kijoung Song; Anuj Goel; John R B Perry; Josephine M Egan; Taina Lajunen; Niels Grarup; Thomas Sparsø; Alex Doney; Benjamin F Voight; Heather M Stringham; Man Li; Stavroula Kanoni; Peter Shrader; Christine Cavalcanti-Proença; Meena Kumari; Lu Qi; Nicholas J Timpson; Christian Gieger; Carina Zabena; Ghislain Rocheleau; Erik Ingelsson; Ping An; Jeffrey O'Connell; Jian'an Luan; Amanda Elliott; Steven A McCarroll; Felicity Payne; Rosa Maria Roccasecca; François Pattou; Praveen Sethupathy; Kristin Ardlie; Yavuz Ariyurek; Beverley Balkau; Philip Barter; John P Beilby; Yoav Ben-Shlomo; Rafn Benediktsson; Amanda J Bennett; Sven Bergmann; Murielle Bochud; Eric Boerwinkle; Amélie Bonnefond; Lori L Bonnycastle; Knut Borch-Johnsen; Yvonne Böttcher; Eric Brunner; Suzannah J Bumpstead; Guillaume Charpentier; Yii-Der Ida Chen; Peter Chines; Robert Clarke; Lachlan J M Coin; Matthew N Cooper; Marilyn Cornelis; Gabe Crawford; Laura Crisponi; Ian N M Day; Eco J C de Geus; Jerome Delplanque; Christian Dina; Michael R Erdos; Annette C Fedson; Antje Fischer-Rosinsky; Nita G Forouhi; Caroline S Fox; Rune Frants; Maria Grazia Franzosi; Pilar Galan; Mark O Goodarzi; Jürgen Graessler; Christopher J Groves; Scott Grundy; Rhian Gwilliam; Ulf Gyllensten; Samy Hadjadj; Göran Hallmans; Naomi Hammond; Xijing Han; Anna-Liisa Hartikainen; Neelam Hassanali; Caroline Hayward; Simon C Heath; Serge Hercberg; Christian Herder; Andrew A Hicks; David R Hillman; Aroon D Hingorani; Albert Hofman; Jennie Hui; Joe Hung; Bo Isomaa; Paul R V Johnson; Torben Jørgensen; Antti Jula; 
Marika Kaakinen; Jaakko Kaprio; Y Antero Kesaniemi; Mika Kivimaki; Beatrice Knight; Seppo Koskinen; Peter Kovacs; Kirsten Ohm Kyvik; G Mark Lathrop; Debbie A Lawlor; Olivier Le Bacquer; Cécile Lecoeur; Yun Li; Valeriya Lyssenko; Robert Mahley; Massimo Mangino; Alisa K Manning; María Teresa Martínez-Larrad; Jarred B McAteer; Laura J McCulloch; Ruth McPherson; Christa Meisinger; David Melzer; David Meyre; Braxton D Mitchell; Mario A Morken; Sutapa Mukherjee; Silvia Naitza; Narisu Narisu; Matthew J Neville; Ben A Oostra; Marco Orrù; Ruth Pakyz; Colin N A Palmer; Giuseppe Paolisso; Cristian Pattaro; Daniel Pearson; John F Peden; Nancy L Pedersen; Markus Perola; Andreas F H Pfeiffer; Irene Pichler; Ozren Polasek; Danielle Posthuma; Simon C Potter; Anneli Pouta; Michael A Province; Bruce M Psaty; Wolfgang Rathmann; Nigel W Rayner; Kenneth Rice; Samuli Ripatti; Fernando Rivadeneira; Michael Roden; Olov Rolandsson; Annelli Sandbaek; Manjinder Sandhu; Serena Sanna; Avan Aihie Sayer; Paul Scheet; Laura J Scott; Udo Seedorf; Stephen J Sharp; Beverley Shields; Gunnar Sigurethsson; Eric J G Sijbrands; Angela Silveira; Laila Simpson; Andrew Singleton; Nicholas L Smith; Ulla Sovio; Amy Swift; Holly Syddall; Ann-Christine Syvänen; Toshiko Tanaka; Barbara Thorand; Jean Tichet; Anke Tönjes; Tiinamaija Tuomi; André G Uitterlinden; Ko Willems van Dijk; Mandy van Hoek; Dhiraj Varma; Sophie Visvikis-Siest; Veronique Vitart; Nicole Vogelzangs; Gérard Waeber; Peter J Wagner; Andrew Walley; G Bragi Walters; Kim L Ward; Hugh Watkins; Michael N Weedon; Sarah H Wild; Gonneke Willemsen; Jaqueline C M Witteman; John W G Yarnell; Eleftheria Zeggini; Diana Zelenika; Björn Zethelius; Guangju Zhai; Jing Hua Zhao; M Carola Zillikens; Ingrid B Borecki; Ruth J F Loos; Pierre Meneton; Patrik K E Magnusson; David M Nathan; Gordon H Williams; Andrew T Hattersley; Kaisa Silander; Veikko Salomaa; George Davey Smith; Stefan R Bornstein; Peter Schwarz; Joachim Spranger; Fredrik Karpe; Alan R Shuldiner; Cyrus 
Cooper; George V Dedoussis; Manuel Serrano-Ríos; Andrew D Morris; Lars Lind; Lyle J Palmer; Frank B Hu; Paul W Franks; Shah Ebrahim; Michael Marmot; W H Linda Kao; James S Pankow; Michael J Sampson; Johanna Kuusisto; Markku Laakso; Torben Hansen; Oluf Pedersen; Peter Paul Pramstaller; H Erich Wichmann; Thomas Illig; Igor Rudan; Alan F Wright; Michael Stumvoll; Harry Campbell; James F Wilson; Richard N Bergman; Thomas A Buchanan; Francis S Collins; Karen L Mohlke; Jaakko Tuomilehto; Timo T Valle; David Altshuler; Jerome I Rotter; David S Siscovick; Brenda W J H Penninx; Dorret I Boomsma; Panos Deloukas; Timothy D Spector; Timothy M Frayling; Luigi Ferrucci; Augustine Kong; Unnur Thorsteinsdottir; Kari Stefansson; Cornelia M van Duijn; Yurii S Aulchenko; Antonio Cao; Angelo Scuteri; David Schlessinger; Manuela Uda; Aimo Ruokonen; Marjo-Riitta Jarvelin; Dawn M Waterworth; Peter Vollenweider; Leena Peltonen; Vincent Mooser; Goncalo R Abecasis; Nicholas J Wareham; Robert Sladek; Philippe Froguel; Richard M Watanabe; James B Meigs; Leif Groop; Michael Boehnke; Mark I McCarthy; Jose C Florez; Inês Barroso
Journal: Nat Genet Date: 2010-01-17 Impact factor: 38.330

9. Genome-wide association scan shows genetic variants in the FTO gene are associated with obesity-related traits.

Authors: Angelo Scuteri; Serena Sanna; Wei-Min Chen; Manuela Uda; Giuseppe Albai; James Strait; Samer Najjar; Ramaiah Nagaraja; Marco Orrú; Gianluca Usala; Mariano Dei; Sandra Lai; Andrea Maschio; Fabio Busonero; Antonella Mulas; Georg B Ehret; Ashley A Fink; Alan B Weder; Richard S Cooper; Pilar Galan; Aravinda Chakravarti; David Schlessinger; Antonio Cao; Edward Lakatta; Gonçalo R Abecasis
Journal: PLoS Genet Date: 2007-07 Impact factor: 5.917

10. Large-scale association analysis provides insights into the genetic architecture and pathophysiology of type 2 diabetes.

Authors: Andrew P Morris; Benjamin F Voight; Tanya M Teslovich; Teresa Ferreira; Ayellet V Segrè; Valgerdur Steinthorsdottir; Rona J Strawbridge; Hassan Khan; Harald Grallert; Anubha Mahajan; Inga Prokopenko; Hyun Min Kang; Christian Dina; Tonu Esko; Ross M Fraser; Stavroula Kanoni; Ashish Kumar; Vasiliki Lagou; Claudia Langenberg; Jian'an Luan; Cecilia M Lindgren; Martina Müller-Nurasyid; Sonali Pechlivanis; N William Rayner; Laura J Scott; Steven Wiltshire; Loic Yengo; Leena Kinnunen; Elizabeth J Rossin; Soumya Raychaudhuri; Andrew D Johnson; Antigone S Dimas; Ruth J F Loos; Sailaja Vedantam; Han Chen; Jose C Florez; Caroline Fox; Ching-Ti Liu; Denis Rybin; David J Couper; Wen Hong L Kao; Man Li; Marilyn C Cornelis; Peter Kraft; Qi Sun; Rob M van Dam; Heather M Stringham; Peter S Chines; Krista Fischer; Pierre Fontanillas; Oddgeir L Holmen; Sarah E Hunt; Anne U Jackson; Augustine Kong; Robert Lawrence; Julia Meyer; John R B Perry; Carl G P Platou; Simon Potter; Emil Rehnberg; Neil Robertson; Suthesh Sivapalaratnam; Alena Stančáková; Kathleen Stirrups; Gudmar Thorleifsson; Emmi Tikkanen; Andrew R Wood; Peter Almgren; Mustafa Atalay; Rafn Benediktsson; Lori L Bonnycastle; Noël Burtt; Jason Carey; Guillaume Charpentier; Andrew T Crenshaw; Alex S F Doney; Mozhgan Dorkhan; Sarah Edkins; Valur Emilsson; Elodie Eury; Tom Forsen; Karl Gertow; Bruna Gigante; George B Grant; Christopher J Groves; Candace Guiducci; Christian Herder; Astradur B Hreidarsson; Jennie Hui; Alan James; Anna Jonsson; Wolfgang Rathmann; Norman Klopp; Jasmina Kravic; Kaarel Krjutškov; Cordelia Langford; Karin Leander; Eero Lindholm; Stéphane Lobbens; Satu Männistö; Ghazala Mirza; Thomas W Mühleisen; Bill Musk; Melissa Parkin; Loukianos Rallidis; Jouko Saramies; Bengt Sennblad; Sonia Shah; Gunnar Sigurðsson; Angela Silveira; Gerald Steinbach; Barbara Thorand; Joseph Trakalo; Fabrizio Veglia; Roman Wennauer; Wendy Winckler; Delilah Zabaneh; Harry Campbell; Cornelia van Duijn; Andre G Uitterlinden; 
Albert Hofman; Eric Sijbrands; Goncalo R Abecasis; Katharine R Owen; Eleftheria Zeggini; Mieke D Trip; Nita G Forouhi; Ann-Christine Syvänen; Johan G Eriksson; Leena Peltonen; Markus M Nöthen; Beverley Balkau; Colin N A Palmer; Valeriya Lyssenko; Tiinamaija Tuomi; Bo Isomaa; David J Hunter; Lu Qi; Alan R Shuldiner; Michael Roden; Ines Barroso; Tom Wilsgaard; John Beilby; Kees Hovingh; Jackie F Price; James F Wilson; Rainer Rauramaa; Timo A Lakka; Lars Lind; George Dedoussis; Inger Njølstad; Nancy L Pedersen; Kay-Tee Khaw; Nicholas J Wareham; Sirkka M Keinanen-Kiukaanniemi; Timo E Saaristo; Eeva Korpi-Hyövälti; Juha Saltevo; Markku Laakso; Johanna Kuusisto; Andres Metspalu; Francis S Collins; Karen L Mohlke; Richard N Bergman; Jaakko Tuomilehto; Bernhard O Boehm; Christian Gieger; Kristian Hveem; Stephane Cauchi; Philippe Froguel; Damiano Baldassarre; Elena Tremoli; Steve E Humphries; Danish Saleheen; John Danesh; Erik Ingelsson; Samuli Ripatti; Veikko Salomaa; Raimund Erbel; Karl-Heinz Jöckel; Susanne Moebus; Annette Peters; Thomas Illig; Ulf de Faire; Anders Hamsten; Andrew D Morris; Peter J Donnelly; Timothy M Frayling; Andrew T Hattersley; Eric Boerwinkle; Olle Melander; Sekar Kathiresan; Peter M Nilsson; Panos Deloukas; Unnur Thorsteinsdottir; Leif C Groop; Kari Stefansson; Frank Hu; James S Pankow; Josée Dupuis; James B Meigs; David Altshuler; Michael Boehnke; Mark I McCarthy
Journal: Nat Genet Date: 2012-08-12 Impact factor: 38.330
