
Comprehensive association study of type 2 diabetes and related quantitative traits with 222 candidate genes.

Kyle J Gaulton, Cristen J Willer, Yun Li, Laura J Scott, Karen N Conneely, Anne U Jackson, William L Duren, Peter S Chines, Narisu Narisu, Lori L Bonnycastle, Jingchun Luo, Maurine Tong, Andrew G Sprau, Elizabeth W Pugh, Kimberly F Doheny, Timo T Valle, Gonçalo R Abecasis, Jaakko Tuomilehto, Richard N Bergman, Francis S Collins, Michael Boehnke, Karen L Mohlke.

Abstract

OBJECTIVE: Type 2 diabetes is a common complex disorder with environmental and genetic components. We used a candidate gene-based approach to identify single nucleotide polymorphism (SNP) variants in 222 candidate genes that influence susceptibility to type 2 diabetes. RESEARCH DESIGN AND METHODS: In a case-control study of 1,161 type 2 diabetic subjects and 1,174 control Finns who are normal glucose tolerant, we genotyped 3,531 tagSNPs and annotation-based SNPs and imputed an additional 7,498 SNPs, providing 99.9% coverage of common HapMap variants in the 222 candidate genes. Selected SNPs were genotyped in an additional 1,211 type 2 diabetic case subjects and 1,259 control subjects who are normal glucose tolerant, also from Finland.
RESULTS: Using SNP- and gene-based analysis methods, we replicated previously reported SNP-type 2 diabetes associations in PPARG, KCNJ11, and SLC2A2; identified significant SNPs in genes with previously reported associations (ENPP1 [rs2021966, P = 0.00026] and NRF1 [rs1882095, P = 0.00096]); and implicated novel genes, including RAPGEF1 (rs4740283, P = 0.00013) and TP53 (rs1042522, Arg72Pro, P = 0.00086), in type 2 diabetes susceptibility.
CONCLUSIONS: Our study provides an effective gene-based approach to association study design and analysis. One or more of the newly implicated genes may contribute to type 2 diabetes pathogenesis. Analysis of additional samples will be necessary to determine their effect on susceptibility.

Year: 2008 PMID: 18678618 PMCID: PMC2570412 DOI: 10.2337/db07-1731

Source DB: PubMed Journal: Diabetes ISSN: 0012-1797 Impact factor: 9.461

Type 2 diabetes is a metabolic disorder characterized by insulin resistance and pancreatic β-cell dysfunction and is a leading cause of morbidity and mortality in the U.S. and worldwide. The incidence of type 2 diabetes is rapidly increasing, with 1.6 million new cases of diabetes diagnosed in individuals aged ≥20 years in the U.S. in 2007 (available at http://www.diabetes.niddk.nih.gov/dm/pubs/statistics/). While environmental factors play a major role in predisposition to type 2 diabetes, substantial evidence supports the influence of genetic factors on disease susceptibility. For example, the twin concordance rate is an estimated 34% for monozygotic twins and 16% for dizygotic twins (1). However, the underlying genetic variants are just beginning to be identified (2). Numerous published reports (3–5) have identified associations between type 2 diabetes and common genetic variants in human populations; however, until very recently, variants in only a few genes had been consistently replicated across populations and with large sample sizes. Among these are the Pro12Ala (rs1801282) variant in peroxisome proliferator–activated receptor γ (PPARG) (6), the Glu23Lys (rs5219) variant in the potassium channel gene KCNJ11 (7), and several variants in the Wnt-receptor signaling pathway member TCF7L2 (8). Recent genome-wide studies have implicated many previously unreported genes in type 2 diabetes susceptibility. The first reported genome-wide association (GWA) scan implicated variants at five susceptibility loci that include TCF7L2 and novel loci near the genes SLC30A8, IDE-KIF11-HHEX, LOC387761, and EXT2-ALX4 (9). Three companion GWA studies (10–12), including one by our group, replicated evidence for PPARG, KCNJ11, TCF7L2, SLC30A8, and IDE-KIF11-HHEX and provided new evidence for CDKAL1, CDKN2A-CDKN2B, IGF2BP2, FTO, and a region of chromosome 11 with no annotated genes. Additional GWA studies (13–18) provided further evidence for TCF7L2, CDKAL1, and SLC30A8. 
The candidate genes WFS1 (19) and TCF2 (20,21) have also been confirmed in large samples, bringing the current list of type 2 diabetes susceptibility loci to at least 10. The recent discovery of these loci still explains only a small fraction (∼2.3%) of the overall risk of type 2 diabetes (12). Therefore, novel susceptibility genes remain to be identified through increasingly comprehensive analyses of both individual genes and the entire genome. The Finland-U.S. Investigation of Type 2 Diabetes Genetics (FUSION) study aims to identify variants influencing susceptibility to type 2 diabetes and related quantitative traits in the Finnish population (22). FUSION has previously identified modest type 2 diabetes association in Finns with variants in HNF4A (23); four genes known to cause maturity-onset diabetes of the young (5,23,24); PPARG, KCNJ11, ENPP1, SLC2A2, PCK1, TNF, IL6 (5), and TCF7L2 (25); and the loci identified in the GWA studies. As a complementary approach to GWA studies, which are conducted without a priori biological hypotheses, we sought to perform an in-depth analysis of >200 genes likely to influence susceptibility to type 2 diabetes and quantitative trait variation that we selected by applying CandidAtE Search And Rank (CAESAR), a text- and data-mining algorithm (26). We aimed to analyze the full spectrum of HapMap-based common variation in each of these candidate genes. The combination of high throughput genotyping, linkage disequilibrium (LD) information from HapMap (27), the ability to impute ungenotyped variants (28), and the improved functional annotation of the genome makes in-depth candidate gene–based association analysis possible.

RESEARCH DESIGN AND METHODS

The stage 1 sample set consisted of 2,335 Finnish individuals from the FUSION (22,29) and Finrisk 2002 (30) studies (Table 1) (online appendix Table 1A [available at http://dx.doi.org/10.2337/db07-1731]). The sample included 1,161 individuals with type 2 diabetes and 1,174 control subjects with normal glucose tolerance. Diabetes was defined according to 1999 World Health Organization criteria (fasting plasma glucose concentration ≥7.0 mmol/l or 2-h plasma glucose concentration ≥11.1 mmol/l), by report of diabetes medication use, or based on medical record review. Normal glucose tolerance was defined as having fasting glucose <6.1 mmol/l and 2-h glucose <7.8 mmol/l. A total of 120 FUSION offspring with genotyped parents were included for quantitative trait analysis; all offspring had normal glucose tolerance except one type 2 diabetic individual who was included in the case sample.
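The glucose thresholds quoted above can be expressed as a small classifier. This is a minimal sketch rather than the study's actual procedure: the function name `classify_glucose_status` and the label "intermediate" are hypothetical, and the sketch uses glucose values only, ignoring the medication-report and medical-record routes to a diabetes classification.

```python
def classify_glucose_status(fasting_mmol_l: float, two_hour_mmol_l: float) -> str:
    """Classify a subject from plasma glucose values in mmol/l.

    Thresholds are the ones quoted in the text. The function name and
    the "intermediate" label are illustrative assumptions.
    """
    # Diabetes: fasting >= 7.0 mmol/l OR 2-h value >= 11.1 mmol/l.
    if fasting_mmol_l >= 7.0 or two_hour_mmol_l >= 11.1:
        return "diabetes"
    # Normal glucose tolerance: fasting < 6.1 mmol/l AND 2-h value < 7.8 mmol/l.
    if fasting_mmol_l < 6.1 and two_hour_mmol_l < 7.8:
        return "normal glucose tolerance"
    # Anything in between meets neither definition.
    return "intermediate"
```

Under these definitions a subject with fasting glucose of 6.5 mmol/l and a 2-h value of 7.0 mmol/l is neither a case nor a control.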

TABLE 1

Characteristics of the stage 1 and 2 case and control samples

	Stage 1		Stage 2
	Case subjects	Control subjects	Case subjects	Control subjects
n	1,161	1,174	1,215	1,258
Male subjects	653	574	724	768
Female subjects	508	600	491	490
Age of diagnosis (years)	53.0 ± 12.0	N/A	56.0 ± 12.0	N/A
Study age (years)	63.4 ± 11.2	64 ± 11.7	60.0 ± 11.5	59.0 ± 10.6
BMI (kg/m²)	29.8 ± 6.1	26.8 ± 5.0	30.1 ± 6.7	26.4 ± 4.9
Fasting glucose (mmol/l)	8.4 ± 3.9	5.4 ± 0.7	7.2 ± 2.1*	5.4 ± 0.6†

Data are medians±interquartile ranges unless otherwise indicated.

*n = 204 and †n = 583 values converted from whole blood to plasma glucose equivalent using a prediction equation from the European Diabetes Epidemiology Group, of which †n = 262 fasted <8 h.

Stage 2 consisted of 2,473 Finnish individuals (Table 1) (online appendix Table 1B) and included 1,215 individuals with type 2 diabetes and 1,258 control subjects with normal glucose tolerance (10). A total of 56 duplicate samples were used for quality control. The sample sets are identical to those used in the FUSION GWA study (10). Study protocols were approved by local ethics committees and/or institutional review boards, and informed consent was obtained from all study participants.

Gene selection.

A total of 222 candidate genes were selected for study using two strategies. Two hundred and seventeen candidate genes were selected using CAESAR, an algorithm that prioritizes candidate genes for complex human traits based on trait-relevant functional annotation (26). Given a trait-relevant input text, CAESAR 1) uses text mining to extract gene symbols and to find and rank terms present in four biomedical ontologies (gene ontology biological process [31], gene ontology molecular function [31], eVOC anatomy [32], and mammalian phenotype ontology [33]) based on frequency of occurrence, 2) uses the ranked ontology terms and extracted gene symbols to data mine several public databases for human genes annotated with the ontology terms or extracted gene symbols, and 3) integrates the resulting gene annotation lists to provide a combined score and rank for each gene. Details of gene selection using custom parameters for CAESAR are provided in the online appendix. Five genes were not ranked high enough to have been included using CAESAR. ENPP1, HFE, WFS1, and ZNHIT3 were included because each had one or more single nucleotide polymorphisms (SNPs) associated with type 2 diabetes (P < 0.1) in a prior study of a subset of FUSION samples (6) (C.J.W., L.L.B., M.B., and K.L.M., unpublished data); in addition, ENPP1 and WFS1 had been previously studied as type 2 diabetes candidate genes. CAPN10 was included because it had been previously studied by FUSION (34) and others (35,36).

SNP selection.

We defined the “transcribed region” of each of the 222 candidate genes as the sequence including the first exon of any transcribed isoform through the last exon of any transcribed isoform, and we aimed to capture variation up to 10 kb upstream and 5 kb downstream of the transcribed region (−10 kb/+5 kb). In this process, we allowed SNPs to be located as far as 50 kb upstream and 50 kb downstream (−50 kb/+50 kb) of the transcribed region if they tagged a −10-kb/+5-kb SNP at r2 > 0.8. Briefly, 3,531 SNPs were selected for stage 1 genotyping as follows. We selected SNPs from the Illumina Infinium II HumanHap300 BeadChip that tagged one or more −10-kb/+5-kb SNPs (r2 > 0.8). Then, to evaluate each gene region more comprehensively, we selected 1) additional tagSNPs and 2) functionally annotated non-HapMap SNPs for genotyping on an Illumina GoldenGate panel. We also included eight SNPs that had been previously genotyped in candidate gene studies on a smaller subset of FUSION samples (5). Additional details of SNP selection are provided in the online appendix.
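The two window definitions above (−10 kb/+5 kb and −50 kb/+50 kb) can be illustrated with a short position check. This is a sketch under stated assumptions: the function name, the labels "core", "extended", and "outside", and the strand handling are hypothetical, and the check does not test the additional requirement that an extended-window SNP tag a −10-kb/+5-kb SNP at r2 > 0.8.

```python
def classify_snp_position(pos: int, tx_start: int, tx_end: int, strand: str = "+") -> str:
    """Place a SNP relative to a transcribed region (tx_start <= tx_end).

    Returns "core" for the -10 kb/+5 kb window, "extended" for the
    -50 kb/+50 kb window, and "outside" otherwise. Names and labels are
    illustrative, not taken from the study.
    """
    if strand == "+":
        upstream = tx_start - pos    # positive when the SNP lies 5' of the first exon
        downstream = pos - tx_end    # positive when the SNP lies 3' of the last exon
    else:
        upstream = pos - tx_end      # on the minus strand, 5' is at higher coordinates
        downstream = tx_start - pos
    if upstream <= 10_000 and downstream <= 5_000:
        return "core"
    if upstream <= 50_000 and downstream <= 50_000:
        return "extended"
    return "outside"
```

For a hypothetical plus-strand gene spanning 100,000 to 120,000 bp, a SNP at 95,000 bp falls in the core window, whereas a SNP at 126,000 bp (6 kb downstream) falls only in the extended window.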

Genotyping.

Stage 1 genotyping of 317,503 SNPs was performed at the Center for Inherited Disease Research on the HumanHap300 BeadChip using the Illumina Infinium II assay protocol (10), and 1,527 SNPs were genotyped in partnership with the Mammalian Genotyping Core at the University of North Carolina using the Illumina GoldenGate assay. We performed additional genotyping for eight previously reported SNPs (5) using the Sequenom homogeneous MassEXTEND assay and four imputed SNPs using Applied Biosystems TaqMan allelic discrimination assays. There was a genotype consistency rate of >99.88% between each platform, using 79 duplicate samples. Stage 2 genotyping of 31 SNPs was performed using the homogeneous MassEXTEND assay; there was a genotype consistency rate of 100%, using 56 duplicate samples. SNP and sample success rates and quality-control filters are described in the online appendix.

Imputation.

We used MACH, a computationally efficient hidden Markov model–based algorithm (available at http://www.sph.umich.edu/csg/abecasis/MACH/) (28), to impute genotypes in FUSION samples for 7,498 common (minor allele frequency [MAF] > 0.05) HapMap SNPs present in the target regions but not genotyped in our study. To improve the quality of imputation near the ends of the target regions, we used at least 1 Mb of flanking genotype information to impute SNPs in target regions.

Coverage of HapMap SNPs.

Coverage was calculated as the percentage of all common (MAF > 0.05) HapMap Release 21 CEU SNPs in the −10-kb/+5-kb gene regions that are tagged by a genotyped SNP at an r2 threshold of at least 0.8.
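The coverage definition above reduces to a count over target SNPs. The sketch below assumes the best pairwise r2 between each target SNP and any genotyped SNP has already been computed (with a value of 1.0 for a target SNP that is itself genotyped); the function name and input layout are hypothetical.

```python
def coverage_percent(best_r2_by_target: dict, threshold: float = 0.8) -> float:
    """Percentage of target SNPs tagged at r2 >= threshold.

    best_r2_by_target maps each target SNP ID to its maximum r2 with any
    genotyped SNP. This input format is an assumption for illustration.
    """
    if not best_r2_by_target:
        raise ValueError("no target SNPs supplied")
    captured = sum(1 for r2 in best_r2_by_target.values() if r2 >= threshold)
    return 100.0 * captured / len(best_r2_by_target)
```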

Type 2 diabetes association analysis.

Genotyped SNPs were tested for type 2 diabetes association using logistic regression under additive (Padd), dominant, and recessive genetic models with adjustment for 5-year age category, sex, and birth province. Imputed SNPs were tested for type 2 diabetes association using logistic regression under an additive model (Pimpute), with the expected allele count in place of the allele count and adjusted for the same covariates. This approach takes into account the degree of uncertainty of genotype imputation in a computationally efficient manner by replacing allele counts (0, 1, and 2) at the marker locus by predicted allele counts based on estimated probabilities of 0, 1, or 2 copies of a SNP allele (available at http://www.sph.umich.edu/csg/abecasis/MACH/) (28). We accounted for carrying out multiple correlated tests using the P value adjusted for correlated tests (PACT) method (37). The PACT method was used to correct the minimum P value among 1) tests of three genetic models for a single SNP (PSNP) and 2) multiple SNPs and models across a gene region (Pgene). Details are provided in the online appendix. We determined the independence of significant association signals in genes by including one SNP as a covariate in logistic regression and reassessing the evidence for association with the other SNPs.
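The expected allele count used for imputed SNPs is the probability-weighted number of copies of the allele. A minimal sketch follows; the function name and the input validation are illustrative and do not reproduce MACH's own output format.

```python
import math

def expected_allele_count(p0: float, p1: float, p2: float) -> float:
    """Expected copies of an allele given genotype probabilities.

    p0, p1, p2 are the estimated probabilities of carrying 0, 1, or 2
    copies. The result lies between 0 and 2 and replaces the integer
    allele count in the additive-model regression.
    """
    if not math.isclose(p0 + p1 + p2, 1.0, abs_tol=1e-6):
        raise ValueError("genotype probabilities must sum to 1")
    return 0.0 * p0 + 1.0 * p1 + 2.0 * p2
```

A confidently imputed heterozygote (probabilities 0, 1, 0) gives exactly 1, while an uncertain call such as (0.1, 0.3, 0.6) gives 1.5, so uncertainty is carried into the regression as a fractional count.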

Quantitative trait analysis.

We tested all genotyped and imputed SNPs for association with 20 type 2 diabetes–related quantitative traits, including, in control subjects only, fasting insulin, fasting glucose, homeostasis model assessment, and fasting free fatty acids; and, in all samples, BMI, weight, waist circumference, hip circumference, waist-to-hip ratio, waist-to-height² ratio, total cholesterol, HDL cholesterol, LDL cholesterol, triglyceride level, cholesterol-to-HDL ratio, triglyceride-to-HDL ratio, diastolic blood pressure, systolic blood pressure, pulse, and pulse pressure. For case and control subjects separately, we regressed the quantitative trait variables on age, age², sex, birth province, and study indicator and transformed the residuals of each quantitative trait to approximate normality using inverse normal scores, which involves ranking the residual values and then converting these to z-scores according to quantiles of the standard normal distribution. We then carried out association analysis on the residuals. To allow for relatedness, regression coefficients were estimated in the context of a variance component model that also accounted for background polygenic effects (38). For genotyped SNPs, we tested for association using the residuals under an additive model. For imputed SNPs, we tested for association using the residuals and the expected allele count in place of the allele count under an additive model. Case and control results were combined using meta-analysis, as described in the online appendix.
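The rank-based inverse normal transformation described above can be sketched with the Python standard library. The quantile offset (rank − 0.5)/n and the average-rank handling of ties are assumptions made for illustration; the text does not state which convention was used.

```python
from statistics import NormalDist

def inverse_normal_scores(residuals: list) -> list:
    """Convert residuals to z-scores via their ranks.

    Each value is ranked (ties share the average rank), the rank is
    mapped to a quantile (rank - 0.5) / n, and the quantile is converted
    to a standard normal z-score. The 0.5 offset is an assumed convention.
    """
    n = len(residuals)
    order = sorted(range(n), key=lambda i: residuals[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        # Extend j over any run of tied values.
        while j + 1 < n and residuals[order[j + 1]] == residuals[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1  # 1-based average rank for the run
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    standard_normal = NormalDist()
    return [standard_normal.inv_cdf((rank - 0.5) / n) for rank in ranks]
```

The output keeps the ordering of the input residuals but forces their distribution toward the standard normal, which limits the influence of outlying trait values on the association test.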

RESULTS

We studied 222 candidate genes for type 2 diabetes association in our stage 1 sample of 1,161 type 2 diabetic case subjects and 1,174 control subjects with normal glucose tolerance from the FUSION study (Table 1). Of 10,762 target HapMap SNPs (MAF > 0.05) in the −10-kb/+5-kb gene regions, 3,531 genotyped SNPs cover 10,299 (95.7%) SNPs at an r2 threshold of 0.8. This represents an improvement over the genome-wide HumanHap300 genotyped SNPs, which alone cover 79.0% of the target SNPs at r2 ≥ 0.8 (Table 2). A total of 3,187 of 3,531 genotyped SNPs are located in the −10-kb/+5-kb regions. Of the remaining 7,575 ungenotyped target SNPs, 7,498 were successfully imputed. Altogether, 99.9% of all target variation was genotyped, imputed, or tagged (r2 ≥ 0.8) by an analyzed SNP.

TABLE 2

Coverage of 10,762 HapMap SNPs (MAF > 0.05)* within −10 kb/+5 kb of 222 candidate genes

	SNPs −10 kb/+5 kb of gene
	Number of SNPs analyzed†	Number captured‡	Percent captured‡
SNPs genotyped on GWA panel only	2,150	8,507	79.04
All 3,531 genotyped SNPs	3,531	10,299	95.74
Genotyped and imputed SNPs from GWA panel only	10,596	10,647	98.93
All 3,531 genotyped and 7,498 imputed SNPs	11,029	10,752§	99.91

*MAF > 0.05 in HapMap CEU.

†Genotyped SNPs are located within −50 kb/+50 kb of a gene but may not be within −10 kb/+5 kb of a gene. Imputed SNPs are all located within −10 kb/+5 kb of a gene.

‡HapMap SNPs genotyped, imputed, or tagged (r2 > 0.8) by a genotyped SNP.

§A total of 10,752 includes 3,187 genotyped SNPs, 7,498 imputed SNPs, and 67 SNPs tagged (r2 > 0.8) by a genotyped SNP.
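The counts reported in the text and in Table 2 can be checked with simple arithmetic. The variable names below are illustrative; every number is taken from the text above.

```python
# Counts reported in the text and in Table 2.
target_snps = 10_762          # common HapMap SNPs within -10 kb/+5 kb
genotyped_in_region = 3_187   # genotyped SNPs located in those regions
imputed = 7_498               # successfully imputed SNPs
tagged_only = 67              # SNPs covered only through a tagging SNP

# Target SNPs that were not directly genotyped.
ungenotyped_targets = target_snps - genotyped_in_region

# Target SNPs that were genotyped, imputed, or tagged.
captured = genotyped_in_region + imputed + tagged_only

percent_captured = round(100 * captured / target_snps, 2)
print(ungenotyped_targets, captured, percent_captured)
```

The three components sum to the 10,752 captured SNPs in the last row of Table 2, which leaves 10 of the 10,762 target SNPs uncovered.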

We evaluated the significance of genotyped SNPs in each gene region after correcting for multiple SNPs tested while accounting for the LD between SNPs, designated Pgene (37). Given six pairs of adjacent genes (see online appendix), we analyzed 216 distinct gene regions for type 2 diabetes association (online appendix Table 2). SNPs in four gene regions (rs11183212 in ARID2 [Pgene = 0.0029], rs2235718 in FOXC1 [Pgene = 0.0028], rs8069976 in SOCS3 [Pgene = 0.0037], and rs222852 in SLC2A4 [Pgene = 0.0024]) were significantly associated with type 2 diabetes at Pgene < 0.005, although no Pgene result reached a study-wide significance of 0.00023, a threshold determined using a Bonferroni correction. SNPs in 19 genes were significant at Pgene < 0.05, including SNPs in three genes previously implicated in type 2 diabetes susceptibility in FUSION (5) (Table 3). There was an excess of significant Pgene results at both thresholds (4 at Pgene < 0.005 [P = 0.024]; 19 at Pgene < 0.05 [P = 0.013]). The excess of significant results at Pgene < 0.005 is maintained after excluding 1) seven genes showing prior evidence of association with any SNP in FUSION samples (P = 0.022) or 2) five genes not selected by CAESAR (P = 0.022), as no excluded genes were significant at that threshold (see online appendix).
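The Bonferroni threshold and the test for an excess of significant gene regions can be sketched as follows. The binomial tail used here assumes the 216 gene-region tests are independent and uniformly distributed under the null; that is an assumption made for illustration, since the method actually used is described only in the online appendix.

```python
from math import comb

def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests

def binomial_upper_tail(n: int, k: int, p: float) -> float:
    """P(X >= k) for X ~ Binomial(n, p).

    Used here as the chance of seeing at least k of n independent tests
    fall below a P value threshold p when no true association exists.
    """
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

threshold = bonferroni_threshold(0.05, 216)           # 216 distinct gene regions
excess_at_0005 = binomial_upper_tail(216, 4, 0.005)   # 4 regions at Pgene < 0.005
excess_at_005 = binomial_upper_tail(216, 19, 0.05)    # 19 regions at Pgene < 0.05
```

With 216 gene regions the Bonferroni threshold rounds to the 0.00023 quoted in the text, and under the independence assumption the tail probability for four regions at Pgene < 0.005 is close to the reported 0.024.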

TABLE 3

Gene regions (−10 kb/+5 kb) associated with type 2 diabetes (Pgene < 0.05) in stage 1 samples

Gene symbol	Chromosome	Start position* (bp)	End position* (bp)	Coverage (%)†	SNP‡	P_gene
SLC2A4	17	7,125,835	7,131,125	90.0	rs222852	0.0024
FOXC1	6	1,555,680	1,557,341	100.0	rs2235718	0.0028
ARID2	12	44,409,887	44,588,086	97.8	rs11183212	0.0029
SOCS3	17	73,864,459	73,867,753	100.0	rs8069976	0.0037
FOXC2	16	85,158,443	85,159,948	100.0	rs4843165	0.012
ENPP1§¶	6	132,170,853	132,254,043	94.8	rs9402346	0.014
PRKAA2	1	56,823,041	56,886,142	90.0	rs11206883	0.014
JAK3	19	17,797,961	17,819,800	85.7	rs11888	0.016
CBLB	3	106,859,799	107,070,577	98.8	rs17280845	0.017
SLC2A2§	3	172,196,839	172,227,470	100.0	rs10513684	0.023
PRKAR2B	7	106,279,129	106,396,206	97.0	rs2395836	0.027
EDF1	9	137,032,408	137,036,575	100.0	rs3739942	0.029
PCK2	14	23,633,323	23,643,177	100.0	rs2759407	0.034
PRKAG3	2	219,512,611	219,522,017	100.0	rs6436094	0.037
MECR	1	29,340,001	29,378,070	87.1	rs10915239	0.038
RXRA	9	134,519,422	134,558,376	85.7	rs3118526	0.040
PPARGC1A	4	23,469,914	23,567,969	94.4	rs2970871	0.041
PPARG§	3	12,304,359	12,450,840	99.1	rs1801282	0.042
NR1I3	1	158,012,528	158,021,028	100.0	rs2502807	0.049

*Start and end positions of transcribed region (see research design and methods). Positions based on hg17.

†Percentage of common (MAF > 0.05) SNPs within −10 kb/+5 kb of a gene and captured at r2 of at least 0.8.

‡SNP with minimum P value in given gene used to calculate Pgene value (see research design and methods).

§Gene has previous evidence of association in FUSION.

¶Selected for study only based on previous evidence of association in FUSION.

To evaluate all 3,531 genotyped SNPs (online appendix Table 3), we permuted the case/control status to estimate whether an excess of significant results was observed. A total of 214 SNPs showed significant type 2 diabetes association at a PSNP threshold of 0.05, and, of these, 26 were associated at a PSNP threshold of 0.005 (Table 4). There was modest, but not significant, excess at both of these PSNP thresholds (observed = 214, expected = 183.3, P = 0.09 and observed = 26, expected = 18.9, P = 0.12, respectively). The most significant PSNP value of 3.6 × 10⁻⁴ was observed for rs11183212, an intronic SNP in the ARID2 gene, but when compared with an empirical distribution of the most significant P values, this SNP does not reach a study-wide significance threshold of 6.3 × 10⁻⁵, based on 1,000 permutations. In the combined stage 1 and 2 sample, we have >99% power (80% in stage 1 alone) to detect the most strongly associated previously observed type 2 diabetes SNP, rs7903146 in TCF7L2 (9–12), at a study-wide significance level, and substantially less power to detect type 2 diabetes–associated SNPs with smaller effect sizes.
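A study-wide threshold of this kind can be read off the empirical distribution of the smallest P value from each permutation. The sketch below assumes a significance level of 0.05 and takes the permutation minima as given; the function name is hypothetical, and the code does not perform the case/control permutations themselves.

```python
import math

def study_wide_threshold(min_p_per_permutation: list, alpha: float = 0.05) -> float:
    """Empirical threshold from per-permutation minimum P values.

    Returns the value at or below which only a fraction alpha of the
    permutation minima fall. An observed P value at or below this
    threshold would be called study-wide significant. The default alpha
    is an assumption.
    """
    ordered = sorted(min_p_per_permutation)
    # Number of permutations allowed to be as extreme, e.g. 50 of 1,000.
    k = math.floor(alpha * len(ordered) + 1e-9)
    if k < 1:
        raise ValueError("too few permutations for this alpha")
    return ordered[k - 1]
```

With 1,000 permutations and a level of 0.05, the threshold is the 50th smallest of the 1,000 minimum P values.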

TABLE 4

Type 2 diabetes association for SNPs genotyped in FUSION stage 1 and 2 samples, sorted by combined stages 1 and 2 PSNP

SNP	Gene symbol	Chromosome	Position (bp)*	Risk/nonrisk allele	Risk allele frequency	Stage 1 P_SNP	Stage 2 P_SNP	Combined model	Combined P value	Combined odds ratio (95% CI)	Combined P_SNP
rs4740283	RAPGEF1	9	131,477,850	G/A	0.104	0.0042	0.030	REC	0.000052	3.12 (1.73–5.63)	0.00013
rs2021966†	ENPP1	6	132,192,132	A/G	0.608	0.00018	0.27	REC	0.00010	1.27 (1.13–1.43)	0.00026
rs1042522†‡	TP53	17	7,520,197	G/C	0.263	0.010	0.067	MUL	0.00037	1.18 (1.08–1.30)	0.00086
rs1882095	NRF1	7	128,991,595	T/C	0.381	0.0036	0.061	DOM	0.00043	1.24 (1.10–1.40)	0.00096
rs10513684	SLC2A2	3	172,206,912	C/T	0.918	0.0046	0.20	MUL	0.0010	1.28 (1.11–1.49)	0.0023
rs1801282	PPARG	3	12,368,125	C/G	0.836	0.0025	0.44	MUL	0.0014	1.20 (1.07–1.33)	0.0034
rs222852	SLC2A4	17	7,081,330	A/G	0.610	0.00048	0.18	MUL	0.0029	1.14 (1.04–1.23)	0.0070
rs4843165	FOXC2	16	85,162,542	C/T	0.706	0.0038	0.28§	MUL	0.0033	1.15 (1.05–1.25)	0.0078
rs5400‡	SLC2A2	3	172,215,002	G/A	0.871	0.0065	0.46	MUL	0.0045	1.19 (1.06–1.35)	0.010
rs858341	ENPP1	6	132,202,148	G/A	0.510	0.0039	0.70§	REC	0.0052	1.21 (1.06–1.39)	0.012
rs1349498	RAPGEF4	2	173,418,113	C/T	0.729	0.0015	0.68	DOM	0.0065	1.35 (1.09–1.67)	0.015
rs8069976	SOCS3	17	73,861,445	C/A	0.849	0.0011	0.90	MUL	0.0070	1.17 (1.04–1.31)	0.016
rs3769249	RAPGEF4	2	173,648,169	G/A	0.647	0.0040	0.79	DOM	0.0077	1.27 (1.06–1.51)	0.018
rs17280845	CBLB	3	106,927,226	T/C	0.238	0.00083	0.65	REC	0.010	1.37 (1.07–1.76)	0.027
rs5219‡	KCNJ11	11	17,366,148	T/C	0.476	0.0054	0.45	MUL	0.014	1.11 (1.02–1.20)	0.031
rs10915239	MECR	1	29,344,565	C/A	0.945	0.0046	0.60	REC	0.016	1.26 (1.04–1.51)	0.033
rs11206883	PRKAA2	1	56,815,240	A/G	0.095	0.0014	0.58	MUL	0.026	1.17 (1.02–1.34)	0.054
rs11183212	ARID2	12	44,500,134	G/A	0.200	0.00036	0.68	MUL	0.028	1.12 (1.01–1.24)	0.061
rs2395836	PRKAR2B	7	106,381,475	C/T	0.519	0.0022	0.26	DOM	0.034	1.16 (1.01–1.34)	0.072
rs2970871	PPARGC1A	4	23,566,851	C/T	0.424	0.0012	0.081	REC	0.042	1.17 (1.01–1.36)	0.088
rs11888	JAK3	19	17,796,626	C/T	0.315	0.0014	0.71	MUL	0.075	1.08 (0.99–1.18)	0.15
rs2235718	FOXC1	6	1,552,602	T/C	0.117	0.00068	0.28	REC	0.096	1.55 (0.92–2.59)	0.19
rs3118526	RXRA	9	134,563,302	C/T	0.922	0.0039	0.60	DOM	0.11	0.52 (0.23–1.18)	0.21
rs9313	SORBS1	10	97,061,862	G/T	0.919	0.0045	0.66	MUL	0.11	1.13 (0.97–1.32)	0.21
rs9402346	ENPP1	6	132,185,981	C/G	0.646	0.00062	¶	—	—	—	—
rs1830971	ENPP1	6	132,190,046	A/G	0.648	0.00072	¶	—	—	—	—
rs1409184	ENPP1	6	132,182,184	G/A	0.646	0.00072	¶	—	—	—	—
rs6802898	PPARG	3	12,366,207	C/T	0.835	0.0031	¶	—	—	—	—
rs7796553	NRF1	7	128,974,194	C/T	0.172	0.0039	¶	—	—	—	—
rs943852	RAPGEF1	9	131,480,747	T/C	0.111	0.0042	¶	—	—	—	—

*Positions based on hg17.

†SNP was originally imputed (see online appendix Table 4).

‡Nonsynonymous SNP selected for stage 2 genotyping.

§Included even though stage 2 sample success rate <90%.

¶SNP was not successfully genotyped in stage 2 or not selected for genotyping in stage 2 based on high LD with a selected SNP.

Nineteen of 216 gene regions have at least one SNP significantly associated with type 2 diabetes at PSNP < 0.005; among these, Pro12Ala (rs1801282) in PPARG (PSNP = 0.0025) was the only SNP that matched or was in high LD (r2 ≥ 0.8) with a previously reported variant, given the available HapMap LD information. Imputation identified 421 additional SNPs in 59 genes significantly associated with type 2 diabetes (Pimpute < 0.05) (online appendix Table 4), including SNPs in 10 genes that did not contain a significant genotyped SNP (PSNP > 0.05). We genotyped four of these initially imputed SNPs that were both significantly associated with type 2 diabetes (Pimpute < 0.05) and for which the imputation-based P value was at least five times more significant than that for any nearby genotyped SNP; three of four SNPs had highly concordant imputed and genotyped P values (online appendix Table 5). We selected for follow-up genotyping in stage 2 samples 24 SNPs that were either significant at PSNP < 0.005 or, if a nonsynonymous variant, significant at PSNP < 0.01 (Table 4). The most significant SNPs in the combined stage 1 and 2 samples were rs4740283 in RAPGEF1 (PSNP = 0.00013), rs2021966 in ENPP1 (PSNP = 0.00026), Arg72Pro (rs1042522) in TP53 (PSNP = 0.00086), and rs1882095 in NRF1 (PSNP = 0.00096). In total, 16 SNPs were significant at PSNP < 0.05 in the combined stage 1 and 2 samples (Table 4). To evaluate the effect of BMI, we included BMI as an additional covariate in an analysis of the additive model for all genotyped and imputed SNPs. Of 11 SNPs originally significant at Padd < 0.001, all P values were similar (Padd < 0.01) after adjustment (online appendix Table 6A). Of 16 SNPs significant at Padd < 0.001 after adjustment, two SNPs had notably less significant P values (Padd > 0.01) before adjustment; both SNPs are located at the TRIP10/C3 locus (online appendix Table 6B). 
Four genotyped and 30 imputed SNPs were strongly associated (P < 0.0001) with one or more of 20 quantitative traits after combining case and control subjects by meta-analysis (see research design and methods) (Table 5 and online appendix Table 7). Variants in APOE and PPARA showed strong evidence of association with serum lipid levels, confirming previous reports (39,40). Strong novel associations (P < 1 × 10⁻⁴) were observed for rs4912407 in PRKAA2 with triglyceride level (P = 3.68 × 10⁻⁶), rs10517844 in CPE with HDL level (P = 2.07 × 10⁻⁵), and rs4689388 in WFS1 with LDL level (P = 5.30 × 10⁻⁵). We followed up the genotyped SNPs significantly associated (P < 0.0001) with one or more quantitative traits by genotyping them in the stage 2 samples. No SNP showed study-wide significance in the combined stage 1 and 2 samples (Table 5).

TABLE 5

Quantitative trait association results for SNPs genotyped in FUSION stage 1 and 2 samples

SNP	Gene	Chromosome	Position (bp)	Major/minor allele	Trait	Samples*	Stage 1 P value†	Stage 2 P value†	Combined P value‡
rs9615264	PPARA	22	44,953,108	G/A	HDL level	4,682	1.06E-04	0.13	0.00013
rs10517844	CPE	4	166,691,996	T/C	Cholesterol-to-HDL ratio	4,682	4.00E-05	0.66	0.009
					HDL level	4,682	2.07E-05	0.098	0.065
rs4689388	WFS1	4	6,388,128	A/G	LDL level	4,067	5.30E-05	0.94	0.002
rs429358	APOE	19	50,103,781	T/C	Cholesterol-to-HDL ratio	2,327	1.78E-10	§	—
					LDL level	2,257	1.09E-06	§	—
					HDL level	2,327	2.36E-06	§	—
					Cholesterol level	2,327	1.51E-05	§	—
rs4912407	PRKAA2	1	56,825,022	G/A	Triglyceride level	2,339	3.68E-06	§	—
					Triglyceride-to-HDL ratio	2,339	2.77E-05	§	—

*Number of samples corrected to an effective sample size considering the relatedness of some samples.

†P value calculated under additive model.

‡Stage 1 and 2 P values combined by meta-analysis (see research design and methods).

§SNP was not successfully genotyped in stage 2.

DISCUSSION

In this study, we evaluated the evidence for type 2 diabetes association for SNPs in 222 candidate genes and provided a framework for thorough analysis of association of common variation to disease using gene-based functional annotation, HapMap LD information, and imputation of genotypes. This framework could be used in the context of a GWA study or an independent investigation of candidate genes. We replicated previous type 2 diabetes association with SNPs in PPARG, KCNJ11, and SLC2A2; identified significant SNPs in genes previously implicated in type 2 diabetes risk, NRF1 and ENPP1; and identified additional genes that may influence susceptibility to type 2 diabetes and related quantitative traits, including RAPGEF1 and TP53. While some of the genes may be significant by chance, one or more may represent true susceptibility genes. We expect that true susceptibility genes identified in our sample set will, in many cases, be shared in additional populations, as the FUSION GWA study identified many of the same risk alleles as other GWA studies of European populations (9–13). To assess the role of 222 genes in susceptibility to type 2 diabetes, we aimed for complete coverage of common (MAF > 0.05) SNPs in the HapMap CEU database. The coverage of common HapMap CEU SNPs across all 222 candidate genes using genotyped SNPs was 95.7%, a 16.7 percentage-point improvement over the coverage of 79.0% based on the Illumina HumanHap300 genome-wide panel (Table 2). HapMap provides excellent coverage of common variation in European samples; however, there are additional non-HapMap SNPs in these gene regions (27). Of 122 genotyped SNPs not in HapMap, 10 were not tagged at an r2 threshold of 0.8 by a HapMap SNP, indicating that some of the non-HapMap variation is better covered in our study than by the GWA study panel. 
The SNP most strongly associated with type 2 diabetes in the combined stage 1 and 2 samples was rs4740283 (PSNP = 0.00013), located 4 kb downstream of Rap guanine nucleotide exchange factor 1 (RAPGEF1). RAPGEF1 is a ubiquitously expressed gene involved in insulin signaling (41) and Ras-mediated tumor suppression (42). rs4740283 is in strong LD with SNPs in the coding region and may affect either a regulatory element or protein function. Variation in this gene may contribute to susceptibility through reduced ability of peripheral tissues to absorb glucose in response to insulin. The third most strongly associated SNP in the stage 1 and 2 samples (Table 4) was Arg72Pro in TP53 (rs1042522, PSNP = 0.00086), which was originally identified by imputation and subsequently genotyped and was not well tagged by any originally genotyped SNP (maximum r2 = 0.27 with rs2909430). TP53 encodes the tumor suppressor protein p53, and the Arg72Pro variant has a functional role in the efficiency of p53 in inducing apoptosis, possibly through reduced localization to the mitochondria (43). The risk allele Arg72 has higher apoptotic potential, which is consistent with a possible link between increased pancreatic β-cell apoptosis, impaired insulin secretion, and type 2 diabetes. We observed significant association with SNPs in two genes previously implicated in type 2 diabetes susceptibility, nuclear respiratory factor 1 (NRF1) and the facilitated glucose transporter SLC2A2. NRF1 helps regulate mitochondrial transcription and oxidative phosphorylation (44), which has a known role in insulin resistance, and the associated NRF1 variant, rs1882095, is located 1 kb downstream of the gene and not in modest LD (r2 > 0.6) with any HapMap SNP. In SLC2A2 we found supporting evidence in stage 1 for the nonsynonymous variant Thr110Ile (rs5400) (PSNP = 0.0065), as well as a previously unreported variant, rs10513684 (PSNP = 0.0046). 
The rs10513684 signal became slightly more significant after stage 2 genotyping (PSNP = 0.0023); however, the signal was attenuated (P = 0.18) after inclusion of Thr110Ile in the analysis. Among the most significant type 2 diabetes–associated SNPs is rs2021966 in ENPP1 (PSNP = 0.00026). SNPs in high LD with rs2021966 are located in intron 1, in a region of strong multispecies conservation containing a pseudogene but no known transcripts. Previous studies of ENPP1 have reported associations with rs1044498 and with a related three-SNP haplotype (rs1044498, rs1799774, and rs7754561) and support a modest role in type 2 diabetes susceptibility, possibly acting through obesity (45). In our study, rs1044498 (PSNP = 0.16) and rs7754859 (PSNP = 0.18, r2 = 1 with rs7754561) were not significantly associated with type 2 diabetes (rs1799774 was not tested). The newly identified variants are in very low LD with rs1044498 (r2 < 0.05). Although we observed significant quantitative trait associations in previously implicated genes (APOE and PPARA with serum lipid levels), no quantitative trait associations became more significant after addition of stage 2 samples (Table 5). This is likely due in part to the small number of SNPs selected for follow-up. Stage 2 genotyping of SNPs less significant in stage 1 samples will be necessary to establish whether any novel SNPs contribute to quantitative trait variability. In any gene-based study, the definition of gene boundaries is critical but, by necessity, somewhat arbitrary. We defined a gene region as 10 kb upstream of the first known exon through 5 kb downstream of the last known exon in an attempt to capture the majority of nearby regulatory elements influencing a gene. Regulatory elements, however, can often be found up to several hundred kilobases away from a gene (46). 
We evaluated whether a broader definition of a gene had a substantial effect on the Pgene results by testing extended gene regions, spanning 50 kb upstream and 50 kb downstream of the transcribed regions, and by including HumanHap300 SNPs from these regions in our analysis. Using the extended gene boundaries, the insulin gene INS would be the most significant gene in our study (Pgene = 0.0019), driven by SNP rs10743152 (PSNP = 0.00015), located 13 kb upstream of the first exon. Other genes that had significant SNPs (Pgene < 0.05) only in the extended gene region were MAP2K1, CDK4, and IRF4.

Even with the narrow gene boundaries, several SNPs in our study may influence the expression or function of other nearby or even more distant genes. Recent GWA studies have confirmed novel susceptibility variants downstream of HHEX, a gene selected for this study by CAESAR (9–12); the reported SNPs are located outside the narrow gene region (−10 kb/+5 kb) in a large LD block that includes KIF11 and IDE, and we detected only nominal significance in the narrow HHEX region (PSNP = 0.037 for rs12262390). For some genes, the extent of LD surrounding significant SNPs implicates flanking genes. For example, in ARID2, rs35115 (PSNP = 0.0067) is located in intron 7 but also tags the nonsynonymous variant rs7315731 in SFRS2IP (r² = 0.93). These examples demonstrate that defining a gene boundary requires a balance between capturing all possible SNPs influencing the gene and introducing SNPs that may be more functionally relevant to other genes. A more sophisticated approach that defines each gene's boundaries separately, based on the genomic context around the gene, may be helpful in future gene-based studies.
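Several of the statements above rest on pairwise LD, for example that one SNP "tags" another at a given r². As a minimal sketch of how that quantity is defined, the code below computes r² from the four haplotype counts at two biallelic SNPs. It assumes phased haplotype counts are available; the function name and the counts are hypothetical and are not data from this study.

```python
import math


def ld_r2(n_AB, n_Ab, n_aB, n_ab):
    """Squared allelic correlation (r^2) between two biallelic SNPs.

    Arguments are counts of the four haplotypes, where A/a are the
    alleles at the first SNP and B/b the alleles at the second.
    """
    total = n_AB + n_Ab + n_aB + n_ab
    if total == 0:
        raise ValueError("no haplotypes supplied")
    p_AB = n_AB / total
    p_A = (n_AB + n_Ab) / total
    p_B = (n_AB + n_aB) / total
    d = p_AB - p_A * p_B  # disequilibrium coefficient D
    denom = p_A * (1 - p_A) * p_B * (1 - p_B)
    if denom == 0:
        raise ValueError("r^2 is undefined when a SNP is monomorphic")
    return d * d / denom


# Hypothetical haplotype counts, for illustration only.
perfect = ld_r2(50, 0, 0, 50)         # alleles always travel together
independent = ld_r2(25, 25, 25, 25)   # alleles are uncorrelated
partial = ld_r2(45, 5, 5, 45)         # mostly, but not always, together
print(perfect, independent, round(partial, 2))
```

An r² near 1 means that genotyping one SNP captures nearly all of the information at the other, which is why a significant SNP in one gene can equally implicate a variant in a flanking gene.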
Gene-based approaches to interpreting the results of candidate gene and even genome-wide association studies are important because most variation influencing susceptibility to type 2 diabetes and other common complex traits is currently expected to be gene-centric, although the definition of a gene is constantly evolving. Detailed coverage of the common variation in these genes is a critical requirement for an effective and thorough gene-based study. Here, we have identified genes significantly associated with type 2 diabetes and related quantitative traits that are attractive targets for future replication studies. Confirmation in a larger sample set and meta-analyses across studies will be important to help determine the role of these genes.

45 in total

1. Variant of transcription factor 7-like 2 (TCF7L2) gene confers risk of type 2 diabetes.

Authors: Struan F A Grant; Gudmar Thorleifsson; Inga Reynisdottir; Rafn Benediktsson; Andrei Manolescu; Jesus Sainz; Agnar Helgason; Hreinn Stefansson; Valur Emilsson; Anna Helgadottir; Unnur Styrkarsdottir; Kristinn P Magnusson; G Bragi Walters; Ebba Palsdottir; Thorbjorg Jonsdottir; Thorunn Gudmundsdottir; Arnaldur Gylfason; Jona Saemundsdottir; Robert L Wilensky; Muredach P Reilly; Daniel J Rader; Yu Bagger; Claus Christiansen; Vilmundur Gudnason; Gunnar Sigurdsson; Unnur Thorsteinsdottir; Jeffrey R Gulcher; Augustine Kong; Kari Stefansson
Journal: Nat Genet Date: 2006-01-15 Impact factor: 38.330

2. Cross-sectional evaluation of the Finnish Diabetes Risk Score: a tool to identify undetected type 2 diabetes, abnormal glucose tolerance and metabolic syndrome.

Authors: Timo Saaristo; Markku Peltonen; Jaana Lindström; Liisa Saarikoski; Jouko Sundvall; Johan Gunnar Eriksson; Jaakko Tuomilehto
Journal: Diab Vasc Dis Res Date: 2005-05 Impact factor: 3.291

3. Mapping genes for NIDDM. Design of the Finland-United States Investigation of NIDDM Genetics (FUSION) Study.

Authors: T Valle; J Tuomilehto; R N Bergman; S Ghosh; E R Hauser; J Eriksson; S J Nylund; K Kohtamäki; L Toivanen; G Vidgren; E Tuomilehto-Wolf; C Ehnholm; J Blaschak; C D Langefeld; R M Watanabe; V Magnuson; D S Ally; W A Hagopian; E Ross; T A Buchanan; F Collins; M Boehnke
Journal: Diabetes Care Date: 1998-06 Impact factor: 19.112

4. Common variants in maturity-onset diabetes of the young genes contribute to risk of type 2 diabetes in Finns.

Authors: Lori L Bonnycastle; Cristen J Willer; Karen N Conneely; Anne U Jackson; Cecily P Burrill; Richard M Watanabe; Peter S Chines; Narisu Narisu; Laura J Scott; Sareena T Enloe; Amy J Swift; William L Duren; Heather M Stringham; Michael R Erdos; Nancy L Riebow; Thomas A Buchanan; Timo T Valle; Jaakko Tuomilehto; Richard N Bergman; Karen L Mohlke; Michael Boehnke; Francis S Collins
Journal: Diabetes Date: 2006-09 Impact factor: 9.461

5. Association of transcription factor 7-like 2 (TCF7L2) variants with type 2 diabetes in a Finnish sample.

Authors: Laura J Scott; Lori L Bonnycastle; Cristen J Willer; Andrew G Sprau; Anne U Jackson; Narisu Narisu; William L Duren; Peter S Chines; Heather M Stringham; Michael R Erdos; Timo T Valle; Jaakko Tuomilehto; Richard N Bergman; Karen L Mohlke; Francis S Collins; Michael Boehnke
Journal: Diabetes Date: 2006-09 Impact factor: 9.461

6. C3G-mediated suppression of oncogene-induced focus formation in fibroblasts involves inhibition of ERK activation, cyclin A expression and alterations of anchorage-independent growth.

Authors: Carmen Guerrero; Susana Martín-Encabo; Alberto Fernández-Medarde; Eugenio Santos
Journal: Oncogene Date: 2004-06-17 Impact factor: 9.867

7. Concordance for type 1 (insulin-dependent) and type 2 (non-insulin-dependent) diabetes mellitus in a population-based cohort of twins in Finland.

Authors: J Kaprio; J Tuomilehto; M Koskenvuo; K Romanov; A Reunanen; J Eriksson; J Stengård; Y A Kesäniemi
Journal: Diabetologia Date: 1992-11 Impact factor: 10.122

8. TC10 and insulin-stimulated glucose transport.

Authors: Shian-Huey Chiang; Louise Chang; Alan R Saltiel
Journal: Methods Enzymol Date: 2006 Impact factor: 1.600

9. Variants of ENPP1 are associated with childhood and adult obesity and increase the risk of glucose intolerance and type 2 diabetes.

Authors: David Meyre; Nabila Bouatia-Naji; Agnès Tounian; Chantal Samson; Cécile Lecoeur; Vincent Vatin; Maya Ghoussaini; Christophe Wachter; Serge Hercberg; Guillaume Charpentier; Wolfgang Patsch; François Pattou; Marie-Aline Charles; Patrick Tounian; Karine Clément; Béatrice Jouret; Jacques Weill; Betty A Maddux; Ira D Goldfine; Andrew Walley; Philippe Boutin; Christian Dina; Philippe Froguel
Journal: Nat Genet Date: 2005-07-17 Impact factor: 38.330

10. The Mammalian Phenotype Ontology as a tool for annotating, analyzing and comparing phenotypic information.

Authors: Cynthia L Smith; Carroll-Ann W Goldsmith; Janan T Eppig
Journal: Genome Biol Date: 2004-12-15 Impact factor: 13.583
