Guanjie Chen1, Daniel Shriner1, Jianhua Zhang2, Jie Zhou1, Poorni Adikaram3, Ayo P Doumatey1, Amy R Bentley1, Adebowale Adeyemo1, Charles N Rotimi1
1. Center for Research on Genomics and Global Health, National Human Genome Research Institute, Bethesda, Maryland, United States of America.
2. Metabolic Diseases Branch, National Institute of Diabetes and Digestive and Kidney Diseases, Bethesda, Maryland, United States of America.
3. Advanced BioScience Laboratories, Rockville, Maryland, United States of America.
Abstract
Impaired glucose tolerance is a major risk factor for type 2 diabetes (T2D) and several cardiometabolic disorders. To identify genetic loci underlying fasting glucose levels, we conducted an analysis of 9,232 individuals of European ancestry who at enrollment were either nondiabetic or had untreated type 2 diabetes. Multivariable linear mixed models were used to test for associations between fasting glucose and 7.9 million SNPs, with adjustment for age, body mass index (BMI), sex, significant principal components of the genotypes, and cryptic relatedness. Three previously discovered loci were genome-wide significant, with the lead SNPs being rs1260326, a missense variant in GCKR (p = 1.06×10−8); rs560887, an intronic variant in G6PC2 (p = 3.39×10−11); and rs13266634, a missense variant in SLC30A8 (p = 4.28×10−10). Fine mapping, genome-wide conditional analysis, and functional annotation indicated that the three loci were independently associated with fasting glucose. Each copy of an alternate allele at any of these three SNPs was associated with a reduction of 0.012 mmol/L in fasting glucose levels (p = 8.0×10−28), and this association was replicated in trans-ethnic analysis of 14,303 individuals (p = 2.2×10−16). The three SNPs were jointly associated with significantly reduced T2D risk, with an odds ratio (95% CI) of 0.93 (0.88, 0.98) per protective allele. Our findings implicate additive effects across pathophysiological pathways involved in type 2 diabetes, including glycolysis, gluconeogenesis, and insulin secretion. Since none of the individuals homozygous for the alternate alleles at all three loci has T2D, it might be possible to use a genetic predictor of fasting glucose levels to identify individuals at low vs. high risk of developing type 2 diabetes.
Introduction
Impaired fasting glucose, also referred to as prediabetes, is a risk factor for cardiovascular disease and type 2 diabetes (T2D) [1, 2]. Investigating the genetic architecture of fasting glucose will lead to a better understanding of the mechanisms involved in glucose homeostasis and subsequently the pathophysiology of T2D [3]. Genetic analysis of fasting glucose as a quantitative trait complements genetic analysis of T2D as a dichotomous trait.
Genome-wide association studies (GWAS) have been widely used in investigating the genetic architecture of fasting glucose levels. Genetic associations with fasting glucose have been reported at 17 loci in individuals of European ancestry [3-5]. There are more than 240 published loci associated with T2D [6, 7]. Only nine T2D loci (GCKR, GCK, SLC30A8, PROX1, ADCY5, DGKB, GLIS3, TCF7L2, and MTNR1B) overlap with fasting glucose loci, which appear to mediate impairment of the glucose-sensing machinery in pancreatic β islet cells [3]. One trivial explanation for this limited overlap is low statistical power. Alternatively, loci affecting physiological levels of fasting glucose among normoglycemic individuals need not be the same as loci that affect pathophysiological levels of fasting glucose when hyperglycemic individuals are also considered. As the genetic architectures of fasting glucose and T2D are incompletely known, we caution against overinterpreting this interim result.
The Atherosclerosis Risk in Communities (ARIC) study is a prospective study of atherosclerosis in middle-aged adults [8]. Previously, a GWAS for the average of four fasting glucose measurements taken over nine years was conducted in individuals without prevalent diabetes, and three known loci near MTNR1B (rs10830963), GCK (rs2971669), and G6PC2 (rs853787) were replicated [4]. Here, we defined the outcome as the first fasting glucose measurement from all untreated individuals, i.e., non-diabetic individuals as well as untreated diabetic individuals. We then performed a GWAS using a linear mixed model with a high-density imputation reference panel and identified three associations in loci previously reported to influence fasting glucose (GCKR, G6PC2, and SLC30A8). Associations at two missense variants in GCKR (rs1260326) and SLC30A8 (rs13266634) were identified in individuals with European ancestry, and all three associations replicated in trans-ethnic meta-analysis. These three associations also affect risk of T2D, indicating not just physiological relevance to fasting glucose levels but also pathophysiological relevance to T2D.
Materials and methods
The Atherosclerosis Risk in Communities study is a prospective study of clinical atherosclerotic diseases [8]. Individual-level genotype and phenotype data were obtained by authorized access to dbGaP (https://www.ncbi.nlm.nih.gov/gap/). T2D case status was defined as fasting glucose ≥7.0 mmol/L, self-report of a diagnosis by a physician, or current diabetic treatment. For fasting glucose analysis, individuals without T2D (8,902) and with untreated T2D (330) were used; individuals with untreated T2D were included because their fasting glucose values were unaffected by treatment. The inclusion of untreated cases makes our analysis more powerful than previous analyses of normoglycemic individuals. Selected variables included age, sex, body mass index (BMI), fasting glucose, and T2D status. Among individuals with a reported race of White, a total of 9,232 individuals without T2D or with untreated T2D were included and used for analysis of fasting glucose. Similarly, a total of 9,731 individuals were used for analysis of T2D.
Fasting serum samples were assayed for glucose on the Roche Hitachi 911 analyzer using the hexokinase method (Roche Diagnostics). Age, sex, race, and ethnicity were self-reported. BMI was calculated as body weight (in kilograms) divided by height (in meters) squared. Medication history over a period of two weeks prior to the visit was verified by review of medication containers that participants brought to the visit.
Genotyping and imputation
Genotyping was performed on the Affymetrix Genome-wide Human SNP Array 6.0. After quality control for minor allele frequency (MAF) ≥0.01, genotype call rate ≥0.95, per-individual missingness rate ≤0.05, and a Hardy-Weinberg equilibrium test p-value >10−6, we retained 800,099 autosomal SNPs. Imputation was performed using the Sanger Imputation Service (https://imputation.sanger.ac.uk/) with the IMPUTE2 software [9] and the 1000 Genomes Project Phase 3 reference panel [10]. The resulting imputed SNPs were filtered for MAF ≥0.01 and info score ≥0.7 [11]. After filtering, 7,896,808 SNPs were retained for association analysis. Coordinates were based on the hg19 build. All alleles are reported with respect to the positive strand.
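As an illustration of the variant-level filters described above, the following minimal Python sketch computes the call rate, minor allele frequency, and a Hardy-Weinberg p-value for a single biallelic SNP. It is not the pipeline used in this study: the function name `variant_qc`, the input encoding (alternate-allele counts, with `None` for missing calls), and the choice of a chi-square test with one degree of freedom for Hardy-Weinberg equilibrium are assumptions made for the example, and the per-individual missingness filter is omitted.

```python
import math


def variant_qc(genotypes, maf_min=0.01, call_rate_min=0.95, hwe_p_min=1e-6):
    """Apply call-rate, MAF, and Hardy-Weinberg filters to one biallelic SNP.

    genotypes: alternate-allele counts (0, 1, or 2), with None for missing calls.
    Returns (passes, stats), where stats holds the three computed quantities.
    """
    called = [g for g in genotypes if g is not None]
    n = len(called)
    call_rate = n / len(genotypes) if genotypes else 0.0
    if n == 0:
        return False, {"call_rate": call_rate, "maf": 0.0, "hwe_p": 1.0}

    n_hom_ref, n_het, n_hom_alt = (called.count(k) for k in (0, 1, 2))
    alt_freq = (n_het + 2 * n_hom_alt) / (2 * n)
    maf = min(alt_freq, 1.0 - alt_freq)

    # Chi-square goodness-of-fit test (1 df) against Hardy-Weinberg proportions.
    # Assumption: the study may have used a different (e.g. exact) test.
    observed = [n_hom_ref, n_het, n_hom_alt]
    expected = [
        n * (1.0 - alt_freq) ** 2,
        2.0 * n * alt_freq * (1.0 - alt_freq),
        n * alt_freq ** 2,
    ]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Upper-tail probability of a chi-square variable with 1 df.
    hwe_p = math.erfc(math.sqrt(chi2 / 2.0))

    passes = maf >= maf_min and call_rate >= call_rate_min and hwe_p > hwe_p_min
    return passes, {"call_rate": call_rate, "maf": maf, "hwe_p": hwe_p}
```

The default thresholds mirror those stated in the text (MAF ≥0.01, call rate ≥0.95, Hardy-Weinberg p-value >10−6).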
Association analysis
Fasting glucose levels from the first available measurement were included (S1 Fig). Association analyses were performed using a two-stage linear mixed model and an additive genetic model. In Stage 1, residuals were obtained from a regression of fasting glucose on age, sex, and BMI. The resulting residuals were ranked and inverse normalized. In Stage 2, SNP association was tested by regressing the values from Stage 1 on imputed dosages, adjusted for three significant principal components obtained from the R package SNPRelate (version 1.28.0) [12] as fixed effects and cryptic relatedness as a random effect using the emmax test in EPACTS (version 3.3.0) [13]. The genome-wide significance level α was declared to be 5×10−8. To test for secondary signals, the analysis in Stage 2 was repeated with the inclusion of genome-wide significant SNPs as covariates. R (version 4.0.3) was used in the analyses [14].
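The rank-based inverse normal transformation applied to the Stage 1 residuals can be sketched as follows. This is a minimal standard-library illustration rather than the code used in the study; the function name `inverse_normalize` and the Blom offset (c = 3/8) are assumptions, because the text does not state which offset was used. Tied values receive their average rank, and the residuals themselves are taken as given.

```python
from statistics import NormalDist


def inverse_normalize(values, c=3.0 / 8.0):
    """Rank-based inverse normal transform of a list of numbers.

    Each value is replaced by the standard normal quantile of
    (rank - c) / (n - 2c + 1), with average ranks assigned to ties.
    The offset c = 3/8 (Blom) is an assumption for this sketch.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        # Extend j over a run of tied values.
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2.0 + 1.0  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    standard_normal = NormalDist()
    return [standard_normal.inv_cdf((r - c) / (n - 2.0 * c + 1.0)) for r in ranks]
```

The transformed values preserve the ordering of the residuals and are symmetric about zero, which is why the effect sizes from Stage 2 are not in the original units of fasting glucose.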
Replication analysis
The Multi-Ethnic Study of Atherosclerosis (MESA) [15] and the Framingham Heart Study (FHS) [16] are prospective studies designed to identify risk factors for subclinical atherosclerosis. Individual-level genotype and phenotype data were obtained by authorized access to dbGaP. The China America Diabetes Mellitus (CADM) study is a case-control study of T2D in China [17]. The Africa America Diabetes Mellitus (AADM) study is a case-control study of type 2 diabetes in Africans [18, 19]. The Howard University Family Study (HUFS) is a population-based cross-sectional study of African Americans in Washington, D.C. [20].
For fasting glucose/type 2 diabetes analysis, we aggregated data from 2,204/2,314 European Americans, 632/697 Chinese Americans, 1,080/1,290 Hispanic Americans, and 1,206/1,407 African Americans from MESA; 2,211/4,378 West Africans from AADM; 1,548/1,754 African Americans from HUFS; 2,430/2,605 African Americans from ARIC; 985/1,883 Chinese from CADM; and 2,007/2,061 individuals from FHS, totaling 14,303/18,389 individuals, respectively. Genotype data comprised approximately one million SNPs using the Affymetrix Genome-wide Human SNP Array 6.0 (ARIC, MESA, and HUFS) or two million SNPs using the Affymetrix Axiom Genome-wide PanAFR Array (AADM). Affymetrix Axiom® Exome Genotyping arrays (~300,000 markers) were used in CADM. The Illumina HumanOmni5M array (~4.3M markers) was used in FHS. For in-house data sets (AADM, HUFS, and CADM) for which we collected individual-level data from study participants, we performed sex checks. For the data sets from dbGaP (ARIC, FHS, and MESA), we relied on documentation available within dbGaP. Quality control, genotype imputation, transformation of fasting glucose levels, covariates (including three significant principal components and cryptic relatedness), and association testing were the same as for the discovery analysis. We performed inverse variance-weighted fixed effects meta-analysis using METAL [21]. Coordinates were based on the hg19 build.
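The inverse variance-weighted fixed effects estimator used to combine per-study results can be written in a few lines. The sketch below is not METAL itself; the function name `fixed_effects_meta` is hypothetical, and the inputs are assumed to be per-study effect estimates and standard errors for the same effect allele.

```python
import math


def fixed_effects_meta(betas, standard_errors):
    """Inverse variance-weighted fixed effects meta-analysis.

    Each study is weighted by 1 / SE^2. Returns the pooled effect, its
    standard error, the z statistic, and a two-sided normal p-value.
    """
    weights = [1.0 / se ** 2 for se in standard_errors]
    weight_sum = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / weight_sum
    pooled_se = math.sqrt(1.0 / weight_sum)
    z = pooled_beta / pooled_se
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided
    return pooled_beta, pooled_se, z, p_value
```

Because the weights depend only on the standard errors, larger or more precise studies dominate the pooled estimate, and the approach assumes a single shared effect across studies.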
Variant annotation
We used SnpEff to annotate variant effects [22]. SnpEff integrates with other tools in sequencing data analysis pipelines and comprises two steps, variant annotation and effect prediction. Variant annotation datasets were built using a reference genome (hg19). Two methods, SNAP [23] and I-Mutant3 [24], were used to assess discriminative power, a raw numerical score reflecting direction and reliability of the prediction, for each SNP. Discriminative power is the distance of the actual prediction to the decision boundary (score = 0), which reflects the reliability of the prediction and the severity of the predicted effects [25].
PolyPhen-2 is a tool that predicts the possible impact of an amino acid substitution on the structure and function of a human protein [26]. SIFT is a tool that predicts amino acid changes that affect protein function, distinguishing between functionally neutral and deleterious amino acid changes [27]. Combined Annotation Dependent Depletion (CADD) is a tool for scoring the deleteriousness of variants in the human genome [28, 29]. CADD integrates multiple annotations to generate scores that strongly correlate with allelic diversity, pathogenicity of both coding and non-coding variants, and experimentally measured regulatory effects and that highly rank causal variants. PolyPhen-2, SIFT, and CADD scores were all retrieved from Ensembl 104 [30].
Fine mapping
Regional fine mapping was performed using the R package CAVIARBF (version 0.2.1), an approximate Bayesian method that can incorporate functional annotation [31]. Minimal data requirements are marginal statistical test results and linkage disequilibrium between SNPs. SNPs with MAF ≥0.05 within the gene region ±150 kb were selected. SNP annotations were coded for the absence (0) or presence (1) of promoter histone marks, enhancer histone marks, DNase I hypersensitive sites, or bound proteins as provided by HaploReg v4.1 [32]. Bayes factors were calculated conditional on a maximum number of causal SNPs. The estimated Bayes factors and prior probabilities were then used to estimate the posterior inclusion probabilities.
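To make the last step concrete, the sketch below converts per-SNP Bayes factors and prior probabilities into posterior inclusion probabilities for the simplest case, in which exactly one SNP in the region is assumed to be causal. This is a deliberate simplification and not the CAVIARBF algorithm, which also evaluates models with more than one causal SNP; the function name and the example values are hypothetical.

```python
def posterior_inclusion_probabilities(bayes_factors, priors=None):
    """Posterior inclusion probabilities under a single-causal-SNP assumption.

    bayes_factors: per-SNP Bayes factors against the null model.
    priors: per-SNP prior probabilities of being the causal SNP
            (uniform if omitted; annotation-informed priors could be passed).
    """
    m = len(bayes_factors)
    if priors is None:
        priors = [1.0 / m] * m
    weighted = [bf * prior for bf, prior in zip(bayes_factors, priors)]
    total = sum(weighted)
    return [w / total for w in weighted]
```

With uniform priors, a SNP whose Bayes factor is ten times that of its neighbors receives ten times their posterior probability, and the probabilities across the region sum to one.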
Additive association evaluation
Linear regression and logistic regression were used to determine the joint additive effect across associated independent loci for fasting glucose levels and T2D status, respectively. Because rank-based transformations cannot be back-transformed, we log-transformed fasting glucose levels in order to obtain effect sizes in original units. We regressed traits on the number of effect alleles, with adjustment for age, BMI, sex, and significant principal components (PCs), by study. The analysis was performed using SAS 9.4 (Cary, NC, USA). The R package meta (version 5.1) [33] was used for meta-analysis with an inverse variance-weighted fixed effects method.
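The additive model regresses the trait on the total number of effect alleles carried across the three loci (0 to 6). The following sketch shows that coding together with an unadjusted least-squares slope. It is only an illustration: the analysis in this study was run in SAS with adjustment for covariates, whereas this example omits all covariates, and the function names and inputs are hypothetical.

```python
def effect_allele_count(gckr, g6pc2, slc30a8):
    """Total number of effect alleles (0 to 6) across the three lead SNPs.

    Each argument is the effect-allele count (0, 1, or 2) at one SNP.
    """
    return gckr + g6pc2 + slc30a8


def per_allele_slope(allele_counts, trait_values):
    """Ordinary least-squares slope and intercept of trait on allele count.

    Unadjusted for covariates, which is a simplification for this sketch.
    """
    n = len(allele_counts)
    mean_x = sum(allele_counts) / n
    mean_y = sum(trait_values) / n
    sxx = sum((x - mean_x) ** 2 for x in allele_counts)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(allele_counts, trait_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept
```

Under a purely additive model, the slope is the expected change in the trait per additional effect allele, regardless of the locus at which the allele is carried.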
Trait loci annotation
An expression QTL (eQTL) is a genomic locus that affects expression levels of mRNA. A splicing QTL (sQTL) is a genomic locus that affects the expression of RNA isoforms generated by alternative splicing events. We retrieved data on eQTL and sQTL annotations from the Genotype-Tissue Expression (GTEx) Portal (https://gtexportal.org).
Protein structure and function predictions
Based on the protein sequence-to-structure-to-function paradigm, we uploaded translated sequences to the I-TASSER online server (https://zhanggroup.org//I-TASSER/) [34-36]. I-TASSER uses template-based fragment assembly simulations of amino acid sequences to predict three-dimensional protein structures, which are then used to find matches in a protein function database to predict protein functions. The predicted protein structures were viewed and analyzed using PyMOL [37].
Ethics statement
Ethical approval for the AADM study was obtained from the National Institutes of Health, the Howard University Institutional Review Board, and from ethics committees in Ghana (University of Ghana Medical School Research Ethics Committee and Kwame Nkrumah University of Science and Technology Committee on Human Research Publication and Ethics), Kenya (Moi Teaching & Referral Hospital/Moi University College of Health Sciences Institutional Research and Ethics Committee), and Nigeria (National Health Research Ethics Committee of Nigeria). Ethical approval for HUFS was obtained from the Howard University Institutional Review Board. Ethical approval for CADM was obtained from the institutional review boards of Howard University, National Institutes of Health, and Suizhou Central Hospital (Suizhou, China). Written informed consent was obtained from each participant. All clinical investigation was conducted according to the principles expressed in the Declaration of Helsinki.
Results
Phenotyping, genotyping, and imputation summaries for all discovery and replication studies are presented in S1 Table and S1 and S2 Figs. Within individuals of European ancestry, males had higher BMI than females. In contrast, BMI was higher in females than males among Africans, African Americans, and Hispanic Americans. Males had higher fasting glucose levels than females in all groups.
The discovery and replication analyses included totals of 9,232 and 14,303 individuals, respectively. Three loci reached genome-wide significance (Fig 1 and S2 Table). The genomic control variance inflation factor indicated no inflation due to population stratification (λ = 1.01; S3 Fig). Two of the lead SNPs were missense mutations and the third lead SNP was intronic (Table 1). Regional association and Bayesian fine mapping indicated that rs1260326 (GCKR), rs560887 (G6PC2), and rs13266634 (SLC30A8) had the highest marginal posterior inclusion probabilities (PIP) in their respective loci (S4 Fig). Conditional on the lead SNPs rs1260326 (GCKR), rs560887 (G6PC2), and rs13266634 (SLC30A8), no signal remained (S5 Fig). The effect alleles at all three lead SNPs were associated with lower fasting glucose (Table 1). The associations at all three lead SNPs were replicated in overall meta-analysis (S3 Table). Of the two suggestive loci (Fig 1, 5×10−8 ≤ p < 5×10−7), only the association at the locus on chromosome 11 was replicated (S4 Table).
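The genomic control inflation factor mentioned above is conventionally computed as the median of the observed 1-degree-of-freedom chi-square statistics divided by the median expected under the null hypothesis. The sketch below assumes that definition and takes p-values as input; the function name is hypothetical, and the text does not state how the factor was computed in this study.

```python
from statistics import NormalDist, median

# Median of a chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364


def genomic_inflation_factor(p_values):
    """Genomic control lambda from a list of two-sided p-values in (0, 1].

    Each p-value is converted to a 1-df chi-square statistic via the
    lower-tail normal quantile, which stays accurate for very small p.
    """
    standard_normal = NormalDist()
    chi2 = [standard_normal.inv_cdf(p / 2.0) ** 2 for p in p_values]
    return median(chi2) / CHI2_1DF_MEDIAN
```

A value close to 1 indicates that the bulk of the test statistics follows the null distribution; values well above 1 suggest residual stratification or relatedness.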
Fig 1
Manhattan plot for discovery analysis based on individuals with European ancestry.
The x-axis represents chromosomal positions, and the y-axis represents -log10(p-value). The two dotted lines represent -log10 (5×10−8) and -log10 (5×10−7), respectively.
Table 1
Lead SNPs from discovery GWAS.
| Chromosome | Position (bp) | Gene | SNP | Reference/Alternate Allele | Annotation | Alternate Allele Frequency | Beta | Standard Error | P-value | Variance explained | Impact | Sequence Change |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 2 | 27730940 | GCKR | rs1260326 | C/T | Nonsynonymous | 0.4082 | -0.0113 | 0.0020 | 1.06E-08 | 0.0035 | MODERATE | c.1337T>C, p.Leu446Pro |
| 2 | 169763148 | G6PC2 | rs560887 | C/T | Intron | 0.3026 | -0.0132 | 0.0020 | 4.39E-11 | 0.0047 | LOW | c.441-26T>C |
| 8 | 118184783 | SLC30A8 | rs13266634 | C/T | Nonsynonymous | 0.3159 | -0.0120 | 0.0019 | 4.28E-10 | 0.0042 | MODERATE | c.973C>T, p.Arg325Trp |
The variant rs1260326 (GCKR) is a missense mutation, resulting in a substitution from leucine to proline at position 446. The coding effect of rs1260326 was estimated by SnpEff as having moderate impact (Table 1) and annotated as tolerated by SIFT and benign by PolyPhen-2 [30]. Position 446 in GCKR is located at the interface with GCK (Fig 2). L446 is closer to the middle of the interface whereas L446P is closer to GCK (S6 Fig). The variant rs560887 (G6PC2) is intronic and estimated to have low impact (Table 1). The variant rs13266634 (SLC30A8) is a missense mutation, resulting in a substitution from arginine to tryptophan at position 325, annotated as a moderate change by SnpEff (Table 1) and as tolerated by SIFT and benign by PolyPhen-2. I-TASSER predicted four possible protein structures based on an amino acid sequence with R325W. The four predicted protein structures were similar to each other, but all differed from wild type (Fig 3), consistent with a moderate change in protein structure.
Fig 2
Wild type GCKR protein (pink) interacts with wild type GCK protein (blue).
The position of interaction in GCKR is L446 (rs1260326, red). The green dotted line indicates the proximity of the interface between GCK and GCKR.
Fig 3
The SLC30A8 protein structures for Wild Type (Wt, top) and R325W (rs13266634, bottom). Each amino acid sequence yielded four predicted protein structures called models 1 to 4 for Wt and mutant, respectively. Wt-Model 1 (top left) is the 1st 3D structure predicted by comparative molecular modeling through I-TASSER. Wt-Model 1–4 shows the overlap of the four predicted 3D structures for SLC30A8 wild type (top right). The mutant structures (bottom) are labeled correspondingly.
To determine a best fit model jointly across loci, the three loci and all possible interactions were specified in a full model. Regression with backward selection (SLSTAY = 0.10) was used to eliminate variables (S5 Table). The final model included the three lead SNPs without any interactions. After excluding possible interactions, we found that the effect alleles influence fasting glucose in an additive manner (Fig 4). For each copy of a T allele at any of the three SNPs, an additive effect of -0.012 mmol/L on fasting glucose was identified in the discovery sample (p = 3.0×10−28) and replicated in trans-ethnic meta-analysis (n = 14,303, β = -0.0088, SE = 0.0011, p = 2.15×10−16, S7 Fig). We also estimated the joint additive effect of the three SNPs on the risk of T2D in a total of 28,120 individuals with (n = 4,585) or without (n = 23,535) T2D. The three SNPs were associated with significantly reduced T2D risk, with an odds ratio of 0.93 (95% confidence interval [0.88, 0.98], p = 0.0062, S8 Fig). Notably, none of the individuals with 6 T alleles had T2D, compared to 27% of those with 0 T alleles (Fig 5).
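For readers less familiar with logistic regression output, the sketch below shows how a per-allele log-odds coefficient and its standard error translate into an odds ratio with a Wald 95% confidence interval and a percent change in odds. The function name is hypothetical, and the standard error in the usage note is an illustrative value chosen to be roughly compatible with the reported interval, not a figure from this study.

```python
import math
from statistics import NormalDist


def odds_ratio_with_ci(log_odds, standard_error, level=0.95):
    """Odds ratio and Wald confidence interval from a logistic coefficient.

    Returns (odds_ratio, lower, upper, percent_change_in_odds).
    """
    z_crit = NormalDist().inv_cdf(0.5 + level / 2.0)
    odds_ratio = math.exp(log_odds)
    lower = math.exp(log_odds - z_crit * standard_error)
    upper = math.exp(log_odds + z_crit * standard_error)
    percent_change = (odds_ratio - 1.0) * 100.0
    return odds_ratio, lower, upper, percent_change
```

For example, calling `odds_ratio_with_ci(math.log(0.93), 0.0274)` with the assumed standard error of 0.0274 returns an odds ratio of 0.93 with an interval that rounds to (0.88, 0.98) and a change in odds of about -7% per allele.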
Fig 4
Joint effect size and standard error for fasting glucose at rs1260326 (GCKR), rs560887 (G6PC2), and rs13266634 (SLC30A8).
The reference group is homozygous for the reference allele at all three SNPs. At each SNP, the T allele is the allele associated with lower fasting glucose.
Fig 5
Joint effect size for the prevalence of T2D at rs1260326 (GCKR), rs560887 (G6PC2), and rs13266634 (SLC30A8).
The label above each bar provides the number of individuals (% prevalence of T2D). At each SNP, the T allele is the allele associated with protection against T2D.
Discussion
Based on a genome-wide analysis of fasting glucose, we identified three loci (GCKR, G6PC2, and SLC30A8) that are involved in glucose regulation as previously reported [4, 38]. Here, we showed that the joint effect of these loci was associated with lower fasting glucose levels as well as lower risk of T2D. The missense SNP rs1260326 in GCKR is significantly associated with fasting glucose in non-diabetic and untreated diabetic individuals with European ancestry. This association was replicated in trans-ethnic meta-analysis of European Americans, Chinese, Chinese Americans, Hispanic Americans, African Americans, and Africans. The SNP rs1260326 has been associated with fatty liver, triglycerides, and very low-density lipoprotein cholesterol in obese children and adolescents [39]. The position in GCKR changed by rs1260326 interacts with GCK; the mutation leads to a reduced capacity to respond to fructose-6-phosphate, increased GCK activity in the liver, and reduced glucose levels [40-42].
G6PC2 (rs560887) has been reported to be associated with fasting glucose [40, 43] and with the 30-minute incremental insulin response in the oral glucose tolerance test [44]. The encoded protein allows the release of glucose into the bloodstream. rs560887 is an expression QTL (eQTL) for G6PC2 in several tissues but most strongly in subcutaneous adipose tissue, with the alternate allele associated with lower gene expression (S6 Table). It is also a splicing QTL (sQTL) for NOSTRIN in several tissues (S7 Table). NOSTRIN binds the enzyme responsible for production of nitric oxide, which is involved in neurotransmission, inflammatory responses, and vascular homeostasis [45]. An effect on NOSTRIN could explain the association of rs560887 with pulse pressure and other phenotypes [46]. The SNP rs560887 is in strong LD with rs573225 (r2 = 0.90) in EUR but weaker LD in AFR (r2 = 0.60) [32]. rs573225 is 207 bp upstream of G6PC2. Like rs560887, rs573225 was associated with lower fasting glucose (β = -0.010, SE = 0.002, p = 6.57×10−8) in our discovery study and was replicated (β = -0.005, SE = 0.0018, p = 0.0031). Also like rs560887, rs573225 is an eQTL for G6PC2 (S6 Table) and an sQTL for NOSTRIN (S7 Table). However, rs573225 has a phred-scaled CADD score of 15.97, compared to 0.210 for rs560887, indicating that rs573225 is more strongly deleterious than rs560887 [30]. rs573225 maps to the highly conserved 2nd position of a predicted regulatory motif for HNF4, with the alternate allele associated with weaker binding of HNF4 [32] and lower expression of G6PC2 (S6 Table). Thus, annotations not included in the fine mapping analysis (specifically, CADD scores and predicted regulatory motifs) provide evidence that rs573225 might be a better candidate causal variant and that rs560887 might simply be tagging rs573225.
We found that the missense variant rs13266634 in SLC30A8 was associated with fasting glucose levels; this variant was previously reported to be associated with T2D risk as well as glucose and proinsulin levels [3, 47]. The T allele at rs13266634 is associated with enhanced insulin secretion from pancreatic β cells and inhibited hepatic insulin clearance, leading to increased peripheral insulin levels and decreased peripheral glucose levels [48]. SLC30A8 is a transmembrane transporter, with the ligand zinc binding to a histidine-rich region from positions 197 to 205 [49]. The position in SLC30A8 changed by rs13266634, position 325, is located on the surface of the protein and maps to the cytoplasmic tail at a point where the protein bends back on itself [49]. Therefore, rs13266634 might not affect binding affinity but might affect either protein stability or interaction with other cytoplasmic components of the transport process.
Functional studies that follow up on findings of genetic associations are critical. One way to assess function is based on analysis of predicted amino acid sequences [50]. Two of the three genetic variants identified in our study were missense. Wild type and mutant amino acid sequences were uploaded onto the I-TASSER server, and the predicted protein structures were imported into PyMOL for viewing and analysis. Moderate protein structure differences were predicted at both rs1260326 (GCKR) and rs13266634 (SLC30A8), leading to predicted changes in protein function. The protein structures modeled by I-TASSER suggest that both rs1260326 and rs13266634 have the potential to change the corresponding protein structures and functions, which might result in altered glucose levels. The predicted structure of GCKR revealed that position 446 is located in proximity to the interface between GCK and GCKR; therefore, L446P could affect the relative positioning of GCK and GCKR at the interface. This alteration could potentially impact the interaction efficiency of the two proteins, which can be assessed in vitro through either immunoprecipitation or fluorescence resonance energy transfer. Mutations in mice can be created using CRISPR editing technology so that the functional impacts of both GCKR-L446P and SLC30A8-R325W mutations could be tested in vivo. Structural information can also facilitate the rational design and development of targeted drugs and antibodies.
An intergenic locus on chromosome 11, 33.4 kb upstream of MTNR1B, reached suggestive levels of significance in the discovery study and was replicated. There are two variants with r2≥0.8 in Europeans for the lead SNP rs6483204: rs3847554 and rs6483205 [32]. The variant rs3847554 has been previously reported as associated with fasting plasma glucose [51], but the association at rs3847554 did not replicate in our study due to heterogeneous effect sizes. The variants rs6483204 and rs3847554 are eQTLs for SLC36A4 in esophagus mucosa. SLC36A4 is a non-proton-coupled amino acid transporter. There is no evidence based on histone marks, proteins bound, or binding motifs that rs6483204 could be causal [32]. For rs3847554, the only evidence is a change in a binding motif for CDCL5 [32].
Conclusions
We analyzed GWAS data from 23,535 individuals who were either nondiabetic or had untreated diabetes, and we identified and replicated three independent SNPs in GCKR, G6PC2, and SLC30A8 associated with fasting glucose levels. Each copy of the alternate allele at any of these three SNPs was associated with a reduction of 0.012 mmol/L in fasting glucose. The alternate allele at rs1260326 (GCKR) is associated with increased glycolysis, the alternate allele at rs560887 (G6PC2) is associated with decreased gluconeogenesis, and the alternate allele at rs13266634 (SLC30A8) is associated with increased insulin secretion. Each copy of the alternate allele at any of the three SNPs was associated with a 7% reduced risk of T2D, indicating that the associations are not just physiologically relevant but also pathophysiologically relevant.
Density plots for fasting glucose (mmol/L) in the discovery study.
Untransformed (left) and log-transformed (right).(PDF)Click here for additional data file.
Density plots for log transformed fasting glucose (mmol/L) in the replication studies.
WHT, CHI, HIS, and AA refer to European Americans, Chinese, Hispanic Americans, and African Americans, respectively.(PDF)Click here for additional data file.
Quantile-quantile plot of p-values for fasting glucose levels in individuals with European ancestry in ARIC.
The x-axis represents expected p-values, and the y-axis represents observed p-values. All p-values are transformed as–log10(p-value).(PDF)Click here for additional data file.
Three panels represent GCKR, G6PC2, and SLC30A8, respectively.
(Top) Region association plot: The x-axis represents position in Mb. The y-axis represents -log10
p-values. Sky-blue lines represent recombination rates (cM/Mb) from the 1000 Genomes Project. (Bottom) Posterior inclusion probabilities (PIP) based on fine mapping. The x-axis represents position in Mb. The y-axis represents PIP values.(PDF)Click here for additional data file.
Genome-wide conditional analysis of fasting glucose in individuals with European ancestry.
Row 1: Conditioning on rs1260326 (GCKR) abolished the peak at GCKR. Row2: Conditioning on rs1260326 (GCKR) and rs13266634 (SLC30A8) abolished the peaks at GCKR and SLC30A8. Row 3: Conditioning on rs1260326 (GCKR), rs560887 (G6PC2), and rs13266634 (SLC30A8) eliminated all genome-wide significant signals.(PDF)Click here for additional data file.
Model structure of GCK with GCKR wild type (green) and L446P mutant (red) complexes.
(PDF)
Forest plot from meta-analysis of fasting glucose levels in nine replication studies (n = 14,303).
(PDF)
Forest plot from meta-analysis of risk of T2D in discovery and replication studies (n = 28,120).
(PDF)
Study characteristics for discovery and replication studies.
(XLSX)
Genome-wide significant association results in the discovery analysis.
(XLSX)
Trans-ethnic replication analysis results.
(XLSX)
Genome-wide suggestive association results.
(XLSX)
Parameter estimation in backward regression.
(XLSX)
GTEx expression QTL data.
(XLSX)
GTEx splicing QTL data.
(XLSX)
25 Nov 2021
PONE-D-21-24235
Additive genetic effect of GCKR, G6PC2, and SLC30A8 variants on fasting glucose levels and risk of type 2 diabetes
PLOS ONE
Dear Dr. Chen,
Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.
Specifically, the Reviewers have raised a number of concerns related to the methods and results that require further clarification in the text.
Please submit your revised manuscript by Jan 08 2022 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.
Please include the following items when submitting your revised manuscript:
A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.
A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.
An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.
If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.
If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: https://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols. Additionally, PLOS ONE offers an option for publishing peer-reviewed Lab Protocol articles, which describe protocols hosted on protocols.io.
Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols.
We look forward to receiving your revised manuscript.
Kind regards,
Nicholette D. Palmer, Ph.D.
Academic Editor
PLOS ONE
Journal Requirements:
When submitting your revision, we need you to address these additional requirements.
1. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at https://journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf and https://journals.plos.org/plosone/s/file?id=ba62/PLOSOne_formatting_sample_title_authors_affiliations.pdf
2. Thank you for including your ethics statement: "All in-house studies received ethics approval from Institutional Review Boards and written consent was obtained from each participant."
a. Please amend your current ethics statement to include 1) the full name of the ethics committee/institutional review board(s) that approved the specific study/studies for which data collection was conducted by the authors 2) details as to whether consent was informed
b. Once you have amended this/these statement(s) in the Methods section of the manuscript, please add the same text to the “Ethics Statement” field of the submission form (via “Edit Submission”).
For additional information about PLOS ONE ethical requirements for human subjects research, please refer to http://journals.plos.org/plosone/s/submission-guidelines#loc-human-subjects-research.
3.
Thank you for stating the following in the Acknowledgments Section of your manuscript:
“The Atherosclerosis Risk in Communities study has been funded in whole or in part with federal funds from the National Heart, Lung, and Blood Institute, National Institute of Health, Department of Health and Human Services, under contract numbers HHSN268201700001I, HHSN268201700002I, HHSN268201700003I, HHSN268201700004I, and HHSN268201700005I.”
We note that you have provided additional information within the Acknowledgements Section that is not currently declared in your Funding Statement. Please note that funding information should not appear in the Acknowledgments section or other areas of your manuscript. We will only publish funding information present in the Funding Statement section of the online submission form.
Please remove any funding-related text from the manuscript and let us know how you would like to update your Funding Statement. Currently, your Funding Statement reads as follows:
“This research was supported by the Intramural Research Program of the Center for Research on Genomics and Global Health (CRGGH). The CRGGH is supported by the National Human Genome Research Institute, the National Institute of Diabetes and Digestive and Kidney Diseases, the Center for Information Technology, and the Office of the Director at the National Institutes of Health (1ZIAHG200362).”
Please include your amended statements within your cover letter; we will change the online submission form on your behalf.
4. Your ethics statement should only appear in the Methods section of your manuscript. If your ethics statement is written in any section besides the Methods, please move it to the Methods section and delete it from any other section. Please ensure that your ethics statement is included in your manuscript, as the ethics statement entered into the online submission form will not be published alongside your manuscript.
Reviewers' comments:
Reviewer's Responses to Questions
Comments to the Author
1. Is the manuscript technically sound, and do the data support the conclusions?
The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.
Reviewer #1: Yes
Reviewer #2: Yes
2. Has the statistical analysis been performed appropriately and rigorously?
Reviewer #1: Yes
Reviewer #2: Yes
3. Have the authors made all data underlying the findings in their manuscript fully available?
The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data (e.g. participant privacy or use of data from a third party), those must be specified.
Reviewer #1: Yes
Reviewer #2: No
4. Is the manuscript presented in an intelligible fashion and written in standard English?
PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.
Reviewer #1: Yes
Reviewer #2: Yes
5. Review Comments to the Author
Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics.
(Please upload your review as an attachment if it exceeds 20,000 characters)
Reviewer #1: Chen et al conducted GWAS with 23,535 individuals, either nondiabetics or untreated diabetics, and identified and replicated three independent SNPs in GCKR, G6PC2, and SLC30A8 associated with fasting glucose levels.
Here are my comments:
1. In Abstract line 24, authors should clarify whether the they have multiple outcomes in the regression model. If it is only one outcome in the model, the 'multivariate' should be 'multivariable'.
2. In the association analysis part, The fasting glucose is log-transformed. It seems the density plot still have tail after transformation. I suggest authors check the normal test such as Shapiro–Wilk test before and after transformation.
3. If the log-transformation for FG works, the residual should follow normal distribution. If this is the case, author should provide the reason why they also need the inverse normal transformation for the residuals.
4. How was the distribution of FG in the replication?
5. Authors should describe the statistical method for T2D.
6. In line 203, author should include the backward selection criteria.
Reviewer #2: The Chen et al performed a GWAS of fasting glucose in non-diabetic and untreated diabetic individuals of European ancestry. 3 loci displayed significant associations (SNPs with P<5x10-8). The association of the 3 lead SNPs (after fine mapping) was replicated in trans-ethnic meta-analysis of European, Chinese, Hispanic and African ancestry. The authors further test the possible consequences of non-synonymous SNPs using various in silico tools. Finally, a genetic score combining the FPG reducing alleles was derived. The score was associated with FPG and T2D.
The manuscript is interesting and easy to read.
I have the following comments/suggestions:
Methods:
- in replications studies that are family studies (eg, FHS and HUFS), did you only keep unrelated individuals as in discovery ?
- I suppose all the alleles shown in the manuscript tables/text are on the + strand ? please specify.
- Replication studies : please clearly state the genome build used in methods and tables (same as discovery study so, hg19 ? ).
- All R packages: provide version.
- Reference the R software and provide version.
Row 82: Is there a missing "." here ? if not, the sentence is not really clear.
Row 83: "Among Europeans" : please specify how the EUR ancestry was determined (self report, genetic data ? or both ?).
Row 88: use "race and ethnicity" instead or "race" only.
Row 89: T2D definitions in row 79 and 91 are slightly different, please clarify.
Row 96: GWAS QC: no sex check control performed? no ethnicity check? Also maybe provide some references for the QC of replication studies?
Row 107: Can the author provide a brief justification as to why they used a two-stage mixed model? (i.e. first regressing on Age sex BMI on log FPG, then use ranked inverse normalized residuals for stage 2? Also, does this mean that all FPG related results (Beta, SE, etc) we see in the tables are back transformed and are in mmol/l? if yes, this should be mentioned.
Row 111: How many PCs adjusted for?
Row 118: What was the random factor in the mixed models?
Row 118-119: Please provide a full dbGAP accession number with the version (phsxxxxxx.vx.px format).
Row 121: Add citation to the dbGAP database itself.
Row 126: Consider giving the number of case controls by ethnicity form each study in a supplementary table.
Row 131: For each study (both stage 1 and replication studies) and for each ethnic group, provide a supplementary table with genotyping array, imputation panel used, filtering criteria (e.g.
info score and MAF) and the resulting number of SNPs used (in the analysis in addition to the paragraph lines 131 to 138).
Row 164: Add ref to R package Meta.
Row 152: How do you define the presence of "LD between SNP".
Results:
Row 188: "The associations at all three lead SNPs were replicated", please specify that this is for the overall Meta-A, and briefly describe the replication results by ethnic groups.
Row 188: "Of the two suggestive loci": Please specify what you mean by suggestive locus ( 5*10-7< P <5*10-8 ?)
Row 192 + 198: SIFT, Polyphen-2 and CADD not mentioned at all in the methods. Please correct. also Ensembl is referenced instead of the programs and original papers themselves, if SIFT, Polyphen-2 and CADD results were extracted from Ensembl, then this should be specified in the methods, and references should include the original SIFT, Polyphen-2 and CADD papers in addition to Ensembl.
row 193 + Figure S5: "The model protein structure of GCKR with L446P predicted an altered interaction between GCKR and GCK Figure S5)": not very clear from the figure S5 how the "interaction is altered", do you have a metric you use? a prediction P-value? also, maybe you could add arrows that point to where we're supposed to look on Figure S5 + specify what the purple color means?
Row 230: eQTL and sQTL analysis not described in methods and results.
Row 209: Please correct "nomarl".
Row 209: "compared with normal values are less ...". Please consider changing the phrasing.
Row 210: Please correct "mmol/".
Discussion:
Row 236: it is a bit confusing that authors describe rs560887 as lead SNP after fine mapping, having a low functional impact (row 195), but then discuss eQTL and sQTL effects in the discussion, and later suggest that rs573225 might be a better candidate. Was rs573225 included in fine mapping? what was its posterior probability? has the fine mapping performed in the different ethnic groups or restricted to EUR?
Row 258-261: " based on the sequence to structure...
for predicted protein function": should not be included in discussion, it reads more like a methods paragraph.
Introduction:
Row 37: Please replace ".." by "."
Abstract:
Row 34: The authors should not just mention the results of 0 vs 6 T alleles and emphasize the gradual change of outcome based on the number of risk alleles (linear trend) instead.
Row 37: "Since non of the individuals homozygous for the ... identify individuals at low vs. high risk of developing type 2 diabetes": Not sure about this statement... I think if you derive a genetic score using more than 3 SNPs it will have a far better predictive power than a score with 3 SNPs.
All Tables, Figures and Supplementary Figures: Please add abbreviations and units when applicable.
Table S3:
- Add an Ethnicity column.
- Show meta-analysis heterogeneity results.
- Show meta-A results by ethnic group (Europeans, Africans + African descent, Chinese, Latinos)
Figure S3: Figure Resolution too low.
Figure S6: The figure title does not reflect the fact that you are testing the number of protective alleles.
Figure S7: The figure title does not reflect the fact that you are re testing the number of protective alleles.
References: I believe I found a couple of references that do not match what is being said in the main text, please double check the references.
6. PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files.
If you choose “no”, your identity will remain anonymous but your review may still be made public.
Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.
Reviewer #1: No
Reviewer #2: Yes: Amel Lamri
[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site.
Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]
While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step.
26 Jan 2022
Response to editor
1. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at https://journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf and https://journals.plos.org/plosone/s/file?id=ba62/PLOSOne_formatting_sample_title_authors_affiliations.pdf
Response: We revised the manuscript following the PLOS ONE style templates.
2. Thank you for including your ethics statement: "All in-house studies received ethics approval from Institutional Review Boards and written consent was obtained from each participant."
a. Please amend your current ethics statement to include 1) the full name of the ethics committee/institutional review board(s) that approved the specific study/studies for which data collection was conducted by the authors 2) details as to whether consent was informed.
Response: The ethics statement has been amended as requested (lines 186-198).
b.
Once you have amended this/these statement(s) in the Methods section of the manuscript, please add the same text to the “Ethics Statement” field of the submission form (via “Edit Submission”).
Response: The amended ethics statement now appears in the Ethics Statement field of the submission form.
3. Thank you for stating the following in the Acknowledgments Section of your manuscript:
“The Atherosclerosis Risk in Communities study has been funded in whole or in part with federal funds from the National Heart, Lung, and Blood Institute, National Institute of Health, Department of Health and Human Services, under contract numbers HHSN268201700001I, HHSN268201700002I, HHSN268201700003I, HHSN268201700004I, and HHSN268201700005I.”
We note that you have provided additional information within the Acknowledgements Section that is not currently declared in your Funding Statement. Please note that funding information should not appear in the Acknowledgments section or other areas of your manuscript. We will only publish funding information present in the Funding Statement section of the online submission form.
Please remove any funding-related text from the manuscript and let us know how you would like to update your Funding Statement. Currently, your Funding Statement reads as follows:
“This research was supported by the Intramural Research Program of the Center for Research on Genomics and Global Health (CRGGH). The CRGGH is supported by the National Human Genome Research Institute, the National Institute of Diabetes and Digestive and Kidney Diseases, the Center for Information Technology, and the Office of the Director at the National Institutes of Health (grant 1ZIAHG200362 to C.N.R.).
The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.”
Please include your amended statements within your cover letter; we will change the online submission form on your behalf.
Response: All funding statements in the Acknowledgements section are acknowledgements required by dbGaP. We did nothing to obtain any of those grants nor did any of those grants fund our study. The current Funding Statement is correct, and the funding information provided in the current Funding Statement is the only information that should be published.
4. Your ethics statement should only appear in the Methods section of your manuscript. If your ethics statement is written in any section besides the Methods, please move it to the Methods section and delete it from any other section. Please ensure that your ethics statement is included in your manuscript, as the ethics statement entered into the online submission form will not be published alongside your manuscript.
Response: The ethics statement appears only in the Materials and methods section (lines 186-198).
Response to reviewers
Reviewer #1: Chen et al conducted GWAS with 23,535 individuals, either nondiabetics or untreated diabetics, and identified and replicated three independent SNPs in GCKR, G6PC2, and SLC30A8 associated with fasting glucose levels.
Here are my comments:
1. In Abstract line 24, authors should clarify whether the they have multiple outcomes in the regression model. If it is only one outcome in the model, the 'multivariate' should be 'multivariable'.
Response: As there is only one outcome, we have changed multivariate to multivariable (line 22).
2. In the association analysis part, The fasting glucose is log-transformed. It seems the density plot still have tail after transformation.
I suggest authors check the normal test such as Shapiro–Wilk test before and after transformation.
Response: The GWAS analysis utilized the rank-based inverse normalized transformation (line 102). This transformation does not depend on log transformation. The distribution of fasting glucose values is right-skewed because we included untreated T2D cases. Shapiro-Wilk normality testing is limited to a sample size of 5000 in R and 2000 in SAS. The Kolmogorov-Smirnov test is also used for normality testing, but we did not present results of that test because it has low power (PMID 23843808).
3. If the log-transformation for FG works, the residual should follow normal distribution. If this is the case, author should provide the reason why they also need the inverse normal transformation for the residuals.
Response: We used the rank-based inverse normal transformation of fasting glucose values in the GWAS analysis (line 102). We note that transformation of the dependent variable does not guarantee that residuals are normally distributed.
4. How was the distribution of FG in the replication?
Response: We added density plots for log-transformed fasting glucose for all replication studies in S2 Fig (line 201).
5. Authors should describe the statistical method for T2D.
Response: We detailed the statistical association method (logistic regression) for T2D (lines 163-164).
6. In line 203, author should include the backward selection criteria.
Response: We added that the select stay = 0.10 (SLSTAY = 0.10) criterion was used for backward selection (lines 247-248).
Reviewer #2: The Chen et al performed a GWAS of fasting glucose in non-diabetic and untreated diabetic individuals of European ancestry. 3 loci displayed significant associations (SNPs with P<5x10-8). The association of the 3 lead SNPs (after fine mapping) was replicated in trans-ethnic meta-analysis of European, Chinese, Hispanic and African ancestry.
The authors further test the possible consequences of non-synonymous SNPs using various in silico tools. Finally, a genetic score combining the FPG reducing alleles was derived. The score was associated with FPG and T2D.
The manuscript is interesting and easy to read. I have the following comments/suggestions:
Methods:
- in replications studies that are family studies (eg, FHS and HUFS), did you only keep unrelated individuals as in discovery?
Response: In the discovery analysis, we did not keep only unrelated individuals. We allowed for cryptic relatedness by using the mixed model in EPACTS for association testing (line 105). Similarly, in the replication analysis, we did not keep only unrelated individuals but rather we adjusted for relatedness (line 131).
- I suppose all the alleles shown in the manuscript tables/text are on the + strand? please specify.
Response: All alleles are reported with respect to the positive strand (lines 95-96).
- Replication studies: please clearly state the genome build used in methods and tables (same as discovery study so, hg19?).
Response: We used hg19 in all analyses (lines 95 and 133).
- All R packages: provide version.
Response: We provided version numbers for all R packages (lines 104, 153, and 169).
- Reference the R software and provide version.
Response: We have added a reference and a version (line 108).
Row 82: Is there a missing "." here ? if not, the sentence is not really clear.
Response: We added the missing punctuation (line 75).
Row 83: "Among Europeans" : please specify how the EUR ancestry was determined (self report, genetic data ? or both ?).
Response: We clarified the text (line 78).
Row 88: use "race and ethnicity" instead or "race" only.
Response: We have added ethnicity (line 82).
Row 89: T2D definitions in row 79 and 91 are slightly different, please clarify.
Response: We have clarified the definition (lines 71-73).
Row 96: GWAS QC: no sex check control performed? no ethnicity check?
Also maybe provide some references for the QC of replication studies?
Response: For in-house data sets (AADM, HUFS, and CADM) for which we collected individual-level data from study participants, we performed sex checks (lines 127-128). For the data sets from dbGaP (ARIC, FHS, and MESA), we relied on documentation available within dbGaP (lines 128-129). We do not know what an ethnicity check is, so we are certain we did not perform one. No additional references are required.
Row 107: Can the author provide a brief justification as to why they used a two-stage mixed model? (i.e. first regressing on Age sex BMI on log FPG, then use ranked inverse normalized residuals for stage 2? Also, does this mean that all FPG related results (Beta, SE, etc) we see in the tables are back transformed and are in mmol/l? if yes, this should be mentioned.
Response: For discovery and replication GWAS, fasting glucose values were ranked and inverse normalized, because skewness resulting from the inclusion of untreated cases made the log transformation insufficient to achieve approximate normality (lines 102 and 130). As rank-based transformations cannot be meaningfully back transformed, we relied on the log transformation and back transformation for the estimation of βs and SEs in the original units of mmol/l (lines 165-166). Obtaining residuals in stage 1 is done for computational convenience, as the performance of additional analyses is easier with precomputed residuals.
Row 111: How many PCs adjusted for?
Response: In the discovery GWAS, we adjusted for the top 3 PCs (line 104). For each replication study, we also adjusted for the top three PCs (line 131).
Row 118: What was the random factor in the mixed models?
Response: There is no mention of random factors or mixed models in line 118.
The random effect in the mixed model was the standard genetic relationship (kinship) matrix used in EPACTS (line 105).
Row 118-119: Please provide a full dbGAP accession number with the version (phsxxxxxx.vx.px format).
Response: The dbGaP accession numbers are provided in the Acknowledgements section (lines 354, 365, and 372).
Row 121: Add citation to the dbGAP database itself.
Response: We added the URL to dbGaP to the main text (line 71).
Row 126: Consider giving the number of case controls by ethnicity form each study in a supplementary table.
Response: We added this information to S1 Table (line 201).
Row 131: For each study (both stage 1 and replication studies) and for each ethnic group, provide a supplementary table with genotyping array, imputation panel used, filtering criteria (e.g. info score and MAF) and the resulting number of SNPs used (in the analysis in addition to the paragraph lines 131 to 138).
Response: We provided this information in S1 Table (line 201).
Row 164: Add ref to R package Meta.
Response: We added a citation for the package (line 169).
Row 152: How do you define the presence of "LD between SNP".
Response: For fine mapping, LD between SNPs is not defined in terms of presence. As is standard, we used a square matrix of pairwise covariance values.
Results:
Row 188: "The associations at all three lead SNPs were replicated", please specify that this is for the overall Meta-A, and briefly describe the replication results by ethnic groups.
Response: We clarified the text (line 214). Results by ethnic group are provided in S3 Table (line 214).
Row 188: "Of the two suggestive loci": Please specify what you mean by suggestive locus ( 5*10-7
Response: We clarified the text (line 215).
Row 192 + 198: SIFT, Polyphen-2 and CADD not mentioned at all in the methods. Please correct. also Ensembl is referenced instead of the programs and original papers themselves, if SIFT, Polyphen-2 and CADD results were extracted from Ensembl, then this should be specified in the methods, and references should include the original SIFT, Polyphen-2 and CADD papers in addition to Ensembl.
Response: The reviewer is correct that we accessed and retrieved information from Ensembl, and thus we cited Ensembl as appropriate. We did no formal analysis with SIFT, Polyphen-2, or CADD values, hence there are no methods or results to report. We discussed our results in the context of these preexisting data. Citing secondary references in this context is not the norm. However, we added descriptions and citations for SIFT, Polyphen-2, and CADD (lines 143-150).
row 193 + Figure S5: "The model protein structure of GCKR with L446P predicted an altered interaction between GCKR and GCK Figure S5)": not very clear from the figure S5 how the "interaction is altered", do you have a metric you use? a prediction P-value? also, maybe you could add arrows that point to where we're supposed to look on Figure S5 + specify what the purple color means?
Response: We redrew S6 Fig with the addition of labels. Also, we revised the text to indicate, as depicted in the figure, that L446 (yellow) and L446P (purple) are positioned differently (lines 227-228).
Row 230: eQTL and sQTL analysis not described in methods and results.
Response: There is no analysis to describe in either the Methods or Results sections. We simply retrieved and discussed existing information in the GTEx database. We provided eQTL and sQTL descriptions and accession information (lines 172-176 and 374-375).
Row 209: Please correct "nomarl".
Response: We deleted the sentence.
Row 209: "compared with normal values are less ...".
Please consider changing the phrasing.
Response: We deleted the sentence.
Row 210: Please correct "mmol/".
Response: We deleted the sentence.
Discussion:
Row 236: it is a bit confusing that authors describe rs560887 as lead SNP after fine mapping, having a low functional impact (row 195), but then discuss eQTL and sQTL effects in the discussion, and later suggest that rs573225 might be a better candidate. Was rs573225 included in fine mapping? what was its posterior probability? has the fine mapping performed in the different ethnic groups or restricted to EUR?
Response: It is correct that rs560887 was the lead SNP from the discovery analysis. The data are not conclusive whether rs560887, rs573225, both, or neither is/are likely causal. Regional fine mapping was performed using the discovery data (that is, restricted to EUR). The marginal posterior inclusion probability for rs560887 was 0.67 (S4 Fig). The lead SNP rs560887 is in strong LD with rs573225 in EUR (lines 287-288). The tagged marker rs573225 was included in the fine mapping analysis. The marginal posterior inclusion probability for rs573225 was 0.000404. The fine mapping analysis accounted for the absence or presence of promoter histone marks, enhancer histone marks, DNAse I hypersensitive sites, or bound proteins (lines 156-158). The fine mapping analysis did not account for eQTL annotation, sQTL annotation, predicted regulatory motifs, or CADD scores. Thus, we discuss these additional annotations. First, both rs560887 and rs573225 are annotated as both eQTL and sQTL (lines 281-284 and 290-292). Second, rs573225 had a higher CADD score than rs560887 (lines 292-293). Third, unlike rs560887, rs573225 was annotated with a predicted regulatory motif (lines 293-296). Therefore, there are two lines of evidence not accounted for by the CAVIARBF analysis suggesting that rs573225 might be a better candidate than rs560887 (lines 296-298).
Row 258-261: " based on the sequence to structure...
for predicted protein function": should not be included in discussion, it reads more like a methods paragraph.
Response: We deleted the sentence.

Introduction:

Row 37: Please replace ".." by "."
Response: We fixed the punctuation (line 35).

Abstract:

Row 34: The authors should not just mention the results of 0 vs 6 T alleles and emphasize the gradual change of outcome based on the number of risk alleles (linear trend) instead.
Response: We revised the sentence to emphasize the additive nature of the genetic protection (lines 32-33).

Row 37: "Since none of the individuals homozygous for the ... identify individuals at low vs. high risk of developing type 2 diabetes": Not sure about this statement... I think if you derive a genetic score using more than 3 SNPs it will have a far better predictive power than a score with 3 SNPs.
Response: We are sure about our statement. The reviewer’s claim may very well be true, but given the data presented in this manuscript, the claim must be viewed as speculative and unsubstantiated, and as such, inappropriate for the Abstract.

All Tables, Figures and Supplementary Figures: Please add abbreviations and units when applicable.
Response: We have verified that all tables, figures, and supplementary files include applicable abbreviations, footnotes, and units.

Table S3:
- Add an Ethnicity column.
Response: We added the column.
- Show meta-analysis heterogeneity results.
Response: We added one column to present heterogeneity (Cochran’s Q value).
- Show meta-analysis results by ethnic group (Europeans, Africans + African descent, Chinese, Latinos).
Response: We added meta-analysis results by group.

Figure S3: Figure resolution too low.
Response: We redrew the figure.

Figure S6: The figure title does not reflect the fact that you are testing the number of protective alleles.
Response: The meta-analysis is a priori agnostic as to direction of effect (i.e., the test is two-tailed).
In fact, we are not testing the number of protective alleles.

Figure S7: The figure title does not reflect the fact that you are testing the number of protective alleles.
Response: The meta-analysis is a priori agnostic as to direction of effect (i.e., the test is two-tailed). In fact, we are not testing the number of protective alleles.

References: I believe I found a couple of references that do not match what is being said in the main text, please double check the references.
Response: We do not know to which specific references the reviewer is referring. After adding the new references requested by the reviewer, we checked all references.

Submitted filename: Response letter.docx

20 May 2022

Additive genetic effect of GCKR, G6PC2, and SLC30A8 variants on fasting glucose levels and risk of type 2 diabetes
PONE-D-21-24235R1

Dear Dr. Chen,

We’re pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it meets all outstanding technical requirements.

Within one week, you’ll receive an e-mail detailing the required amendments. When these have been addressed, you’ll receive a formal acceptance letter and your manuscript will be scheduled for publication.

An invoice for payment will follow shortly after the formal acceptance. To ensure an efficient process, please log into Editorial Manager at http://www.editorialmanager.com/pone/, click the 'Update My Information' link at the top of the page, and double check that your user information is up-to-date. If you have any billing related questions, please contact our Author Billing department directly at authorbilling@plos.org.

If your institution or institutions have a press office, please notify them about your upcoming paper to help maximize its impact.
If they’ll be preparing press materials, please inform our press team as soon as possible -- no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

Kind regards,
Nicholette D. Palmer, Ph.D.
Academic Editor
PLOS ONE

Additional Editor Comments (optional):

Reviewers' comments:

Reviewer's Responses to Questions
Comments to the Author

1. If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass the “Comments to the Author” section, enter your conflict of interest statement in the “Confidential to Editor” section, and submit your "Accept" recommendation.
Reviewer #1: All comments have been addressed
Reviewer #2: All comments have been addressed

**********

2. Is the manuscript technically sound, and do the data support the conclusions?
The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.
Reviewer #1: Yes
Reviewer #2: Yes

**********

3. Has the statistical analysis been performed appropriately and rigorously?
Reviewer #1: Yes
Reviewer #2: Yes

**********

4. Have the authors made all data underlying the findings in their manuscript fully available?
The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.
Reviewer #1: Yes
Reviewer #2: (No Response)

**********

5.
Is the manuscript presented in an intelligible fashion and written in standard English?
PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.
Reviewer #1: Yes
Reviewer #2: Yes

**********

6. Review Comments to the Author
Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)
Reviewer #1: The authors addressed all my concerns. I have no further comments.
Reviewer #2: (No Response)

**********

7. PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.
If you choose “no”, your identity will remain anonymous but your review may still be made public.
Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.
Reviewer #1: No
Reviewer #2: No

25 May 2022

PONE-D-21-24235R1
Additive genetic effect of GCKR, G6PC2, and SLC30A8 variants on fasting glucose levels and risk of type 2 diabetes

Dear Dr. Chen:

I'm pleased to inform you that your manuscript has been deemed suitable for publication in PLOS ONE. Congratulations! Your manuscript is now with our production department.

If your institution or institutions have a press office, please let them know about your upcoming paper now to help maximize its impact. If they'll be preparing press materials, please inform our press team within the next 48 hours. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication.
For more information please contact onepress@plos.org.

If we can help with anything else, please email us at plosone@plos.org.

Thank you for submitting your work to PLOS ONE and supporting open access.

Kind regards,
PLOS ONE Editorial Office Staff
on behalf of
Dr. Nicholette D. Palmer
Academic Editor
PLOS ONE