
Integrating Genome-Wide Association and eQTLs Studies Identifies the Genes and Gene Sets Associated with Diabetes.

Xiao Liang, Awen He, Wenyu Wang, Li Liu, Yanan Du, Qianrui Fan, Ping Li, Yan Wen, Jingcan Hao, Xiong Guo, Feng Zhang.

Abstract

AIM: To identify novel candidate genes and gene sets for diabetes.
METHODS: We performed an integrative analysis of genome-wide association study (GWAS) and expression quantitative trait loci (eQTLs) data for diabetes. Summary data were derived from a large-scale GWAS of diabetes involving a total of 58,070 individuals. The eQTLs dataset included 923,021 cis-eQTLs for 14,329 genes and 4,732 trans-eQTLs for 2,612 genes. Integrative analysis of GWAS and eQTLs data was conducted by summary data-based Mendelian randomization (SMR). To identify the gene sets associated with diabetes, the SMR single gene analysis results were further subjected to gene set enrichment analysis (GSEA). A total of 13,311 annotated gene sets were analyzed in this study.
RESULTS: SMR analysis identified 6 genes significantly associated with fasting glucose, such as C11ORF10 (p value = 6.04 × 10−8), MRPL33 (p value = 1.24 × 10−7), and FADS1 (p value = 2.39 × 10−7). Gene set analysis identified HUANG_FOXA2_TARGETS_UP (false discovery rate = 0.047) as associated with fasting glucose.
CONCLUSION: Our study provides novel clues for clarifying the genetic mechanism of diabetes. This study also illustrates the good performance of the SMR approach and extends it to gene set association analysis for complex diseases.

Year:  2017        PMID: 28744461      PMCID: PMC5506468          DOI: 10.1155/2017/1758636

Source DB:  PubMed          Journal:  Biomed Res Int            Impact factor:   3.411


1. Introduction

Diabetes is a group of metabolic diseases characterized mainly by elevated blood glucose over a prolonged period. Without effective treatment, diabetes leads to serious secondary disorders, such as heart disease, stroke, chronic kidney failure, and foot ulcers. Over the past decades, the prevalence of diabetes has continued to increase, driven by aging, obesity, smoking, and other unhealthy lifestyle factors [1]. It was estimated that 334 million individuals would suffer from diabetes in 2025 [1]. Diabetes has become a major public health problem and imposes a heavy economic burden on society.
Genetic factors contribute greatly to the development of diabetes. Extensive genetic studies have identified a group of susceptibility genes for diabetes, such as PTEN [2], SREBF1 [3], JAZF1 [4], BCL2 [5], and FAM19A2 [5]. However, the identified loci explain only a limited part of the genetic risk of diabetes, suggesting the existence of undiscovered susceptibility loci. The missing heritability can partly be attributed to regulatory genetic variants, most of which lie outside genes and are ignored by traditional genetic studies. Expression quantitative trait loci (eQTLs) are an important group of regulatory loci that regulate gene expression levels. The disease-associated SNPs identified by GWAS are significantly enriched in eQTLs, supporting a role for eQTLs in the pathogenesis of complex diseases [6]. Genome-wide testing of associations between gene transcript abundance and genomic polymorphisms has identified a large number of eQTLs in the human genome [7, 8].
Recently, summary data-based Mendelian randomization (SMR) analysis was proposed to make use of the extensive published GWAS and eQTLs data. SMR integrates GWAS summary and eQTLs annotation data to identify novel causal genes, the expression levels of which are associated with target diseases [9]. SMR showed high power for identifying novel causal genes of complex diseases [9]. In this study, we conducted a genome-wide single gene and gene set expression association analysis for diabetes. SMR was first applied to large-scale GWAS data to screen for novel susceptibility genes of diabetes. To gain insight into the biological significance of the identified genes, we extended SMR to gene set enrichment analysis (GSEA). The SMR gene-level analysis results were subjected to GSEA to identify diabetes-associated gene sets with known functional information.

2. Methods

2.1. GWAS Summary Datasets

A large-scale GWAS meta-analysis summary dataset for diabetes was used in this study [10]. Briefly, this GWAS comprised 58,070 individuals from 29 studies in the Meta-Analysis of Glucose and Insulin related traits Consortium. Fasting glucose and fasting insulin were measured from whole blood, plasma, or serum samples. Detailed information on the measurement of fasting glucose and fasting insulin is summarized in Supplementary Tables S1 and S2 in the Supplementary Material available online at https://doi.org/10.1155/2017/1758636. Commercial platforms, such as the Affymetrix 500K SNP array, Illumina 550K, and Perlegen 600K, were used for genome-wide SNP genotyping. Imputation was conducted with MACH [11] or IMPUTE [12] against the HapMap CEU reference panel (build 36). The GWAS meta-analysis was conducted using a joint meta-analytical approach [13]. Detailed information on the cohorts, genotyping, imputation, meta-analysis, and quality control approaches can be found in the published study [10].

2.2. SMR Single Gene Analysis

The GWAS meta-analysis summary data for diabetes were input into SMR for single gene expression association analysis of fasting glucose and insulin resistance. SMR integrates GWAS results with eQTLs annotation information to evaluate the relationships between gene expression levels and complex traits [9]. We applied the eQTLs annotation dataset built by Westra et al. [14]. Briefly, this eQTLs dataset was derived from a meta-analysis of 5,311 peripheral blood samples and replicated in another 2,775 samples. Illumina whole-genome Expression BeadChips were used for gene expression profiling. SNP genotyping was conducted using commercial platforms, such as Illumina 610K quad arrays and Illumina HumanHap300 arrays. Imputation was conducted using MACH [11] or IMPUTE [12] against the HapMap 2 reference panels. In total, 923,021 cis-eQTLs for 14,329 gene expression probes and 4,732 trans-eQTLs for 2,612 gene expression probes were identified at a false discovery rate (FDR) < 0.05 [14]. An expression association testing p value for each gene was calculated by SMR. After Bonferroni correction, genes with SMR p values < 9.28 × 10−6 (0.05/5389) were considered significant in our study.
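The significance threshold and the per-gene test can be illustrated with a minimal sketch. It assumes the commonly described approximate chi-square form of the SMR statistic, T = z_gwas² z_eqtl² / (z_gwas² + z_eqtl²) with 1 degree of freedom, which is not spelled out in the text above. The function names and the input effect sizes are hypothetical and chosen only for illustration; this is not the SMR software itself.

```python
import math


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold after Bonferroni correction."""
    return alpha / n_tests


def smr_test(b_gwas: float, se_gwas: float, b_eqtl: float, se_eqtl: float):
    """Approximate SMR test for one gene, using its top cis-eQTL SNP.

    Assumed form (not stated in this paper):
      b_xy  = b_gwas / b_eqtl   (effect of expression on the trait)
      T_SMR = z_gwas^2 * z_eqtl^2 / (z_gwas^2 + z_eqtl^2), ~ chi-square, 1 df
    """
    z_gwas = b_gwas / se_gwas
    z_eqtl = b_eqtl / se_eqtl
    b_xy = b_gwas / b_eqtl
    t_smr = (z_gwas ** 2 * z_eqtl ** 2) / (z_gwas ** 2 + z_eqtl ** 2)
    # Upper tail of a chi-square distribution with 1 df: P(X > t) = erfc(sqrt(t / 2))
    p_value = math.erfc(math.sqrt(t_smr / 2.0))
    return b_xy, t_smr, p_value


# Threshold used in this study: 0.05 / 5,389 genes
threshold = bonferroni_threshold(0.05, 5389)

# Hypothetical summary statistics for a single SNP (illustrative values only)
b_xy, t_smr, p_smr = smr_test(b_gwas=0.05, se_gwas=0.01, b_eqtl=0.5, se_eqtl=0.05)
print(f"threshold = {threshold:.3g}, b_xy = {b_xy:.3g}, T = {t_smr:.3g}, p = {p_smr:.3g}")
```

Because T is bounded above by the smaller of the two squared z-scores, a gene can reach significance only when both the GWAS and the eQTL association at the top SNP are strong.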

2.3. Gene Set Enrichment Analysis

To reveal the functional significance of identified genes, the SMR single gene expression association testing results were further subjected to GSEA [15]. The gene set annotation database (msigdb.v5.1) was obtained from the GSEA Molecular Signatures Database (http://software.broadinstitute.org/gsea/msigdb/index.jsp). 5,000 permutations were conducted to calculate the FDR adjusted p value of each gene set [16]. Significant gene sets were identified at FDR adjusted p value < 0.05. Detailed GSEA procedures can be found in our previous studies [17].
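The enrichment step can be sketched as follows. This is a simplified illustration of a GSEA-style running-sum enrichment score with a gene-label permutation p value, not the exact procedure used in this study (which is described in [15-17]). The function names, the scoring of genes (for example, by −log10 of the SMR p value), and the toy gene names are assumptions, and the FDR adjustment across gene sets is omitted.

```python
import random


def enrichment_score(ranked_genes, scores, gene_set, weight=1.0):
    """Weighted Kolmogorov-Smirnov-like running-sum statistic.

    ranked_genes: genes sorted from most to least associated.
    scores: association score of each gene, in the same order.
    Returns the maximum deviation of the running sum from zero.
    """
    in_set = [g in gene_set for g in ranked_genes]
    n_miss = len(ranked_genes) - sum(in_set)
    hit_total = sum(abs(s) ** weight for s, hit in zip(scores, in_set) if hit)
    if hit_total == 0 or n_miss == 0:
        return 0.0
    running, best = 0.0, 0.0
    for s, hit in zip(scores, in_set):
        # Step up at genes in the set, step down at genes outside it
        running += abs(s) ** weight / hit_total if hit else -1.0 / n_miss
        if abs(running) > abs(best):
            best = running
    return best


def permutation_p_value(ranked_genes, scores, gene_set, n_perm=5000, seed=0):
    """Empirical p value from random gene sets of the same size."""
    rng = random.Random(seed)
    observed = enrichment_score(ranked_genes, scores, gene_set)
    size = len(set(gene_set) & set(ranked_genes))
    n_extreme = 0
    for _ in range(n_perm):
        random_set = set(rng.sample(ranked_genes, size))
        if enrichment_score(ranked_genes, scores, random_set) >= observed:
            n_extreme += 1
    # Add-one correction keeps the estimate strictly above zero
    return observed, (n_extreme + 1) / (n_perm + 1)


# Toy example: 10 hypothetical genes ranked by a made-up association score
genes = [f"g{i}" for i in range(10)]
gene_scores = [10.0, 9.0, 8.0, 7.0, 6.0, 5.0, 4.0, 3.0, 2.0, 1.0]
es_top, p_top = permutation_p_value(genes, gene_scores, {"g0", "g1"}, n_perm=200)
es_bottom = enrichment_score(genes, gene_scores, {"g8", "g9"})
print(es_top, p_top, es_bottom)
```

A set concentrated at the top of the ranking gives a score near 1, and a set concentrated at the bottom gives a score near −1. In a full analysis, the scores would be normalized and compared across all gene sets to obtain FDR-adjusted p values.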

3. Results

3.1. SMR Single Gene Expression Association Analysis

A total of 5,389 genes with both GWAS summary and eQTLs data were analyzed in this study. After strict Bonferroni correction, SMR identified 6 genes significantly associated with fasting glucose (Table 1), including C11ORF10 (p value = 6.04 × 10−8), MRPL33 (p value = 1.24 × 10−7), FADS1 (p value = 2.39 × 10−7), ACP2 (p value = 1.74 × 10−6), NR1H3 (p value = 1.78 × 10−6), and SNX17 (p value = 2.19 × 10−6).
Table 1

List of candidate genes identified by SMR for fasting glucose.

Gene        Top SNP      MAF      SMR β     SMR p value
C11ORF10    rs174547     0.331    −0.059    6.04 × 10−8
MRPL33      rs3736594    0.258    −0.118    1.24 × 10−7
FADS1       rs174548     0.301    −0.067    2.39 × 10−7
ACP2        rs901746     0.297    −0.050    1.74 × 10−6
NR1H3       rs901746     0.297    −0.051    1.78 × 10−6
SNX17       rs1260320    0.392    −0.072    2.19 × 10−6

Note. MAF, minor allele frequency.

For fasting insulin, SMR detected suggestive association signals for 7 genes (Table 2), including ATRIP (p value = 9.68 × 10−5), MRPL33 (p value = 9.75 × 10−5), ATRIP (p value = 1.90 × 10−4), POLR1E (p value = 2.60 × 10−4), AMT (p value = 3.44 × 10−4), TNFSF13 (p value = 4.55 × 10−4), and POLR1E (p value = 7.82 × 10−4).
Table 2

List of candidate genes identified by SMR for fasting insulin.

Gene        Top SNP       MAF      SMR β     SMR p value
ATRIP       rs2228561     0.129    −0.070    9.68 × 10−5
MRPL33      rs3736594     0.258    −0.067    9.75 × 10−5
ATRIP       rs2228561     0.129    −0.084    1.90 × 10−4
POLR1E      rs10758435    0.166    −0.026    2.60 × 10−4
AMT         rs1050088     0.429    0.031     3.44 × 10−4
TNFSF13     rs9898876     0.193    −0.037    4.55 × 10−4
POLR1E      rs10973396    0.168    −0.028    7.82 × 10−4

Note. MAF, minor allele frequency.
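The distinction between the significant fasting glucose signals and the suggestive fasting insulin signals can be checked directly against the Bonferroni threshold. The short sketch below uses the p values reported in Tables 1 and 2; the variable and function names are illustrative only.

```python
ALPHA = 0.05
N_GENES = 5389  # genes with both GWAS summary and eQTLs data
threshold = ALPHA / N_GENES  # about 9.28e-6

# SMR p values from Table 1 (fasting glucose)
fasting_glucose = [
    ("C11ORF10", 6.04e-8),
    ("MRPL33", 1.24e-7),
    ("FADS1", 2.39e-7),
    ("ACP2", 1.74e-6),
    ("NR1H3", 1.78e-6),
    ("SNX17", 2.19e-6),
]

# SMR p values from Table 2 (fasting insulin); a list of pairs is used
# because ATRIP and POLR1E each appear twice in the table
fasting_insulin = [
    ("ATRIP", 9.68e-5),
    ("MRPL33", 9.75e-5),
    ("ATRIP", 1.90e-4),
    ("POLR1E", 2.60e-4),
    ("AMT", 3.44e-4),
    ("TNFSF13", 4.55e-4),
    ("POLR1E", 7.82e-4),
]


def significant(rows, cutoff):
    """Return the genes whose p value is below the cutoff."""
    return [gene for gene, p in rows if p < cutoff]


sig_glucose = significant(fasting_glucose, threshold)
sig_insulin = significant(fasting_insulin, threshold)
print(len(sig_glucose), "significant for fasting glucose")
print(len(sig_insulin), "significant for fasting insulin")
```

All six fasting glucose genes fall below the threshold, whereas none of the fasting insulin signals do, which is why the latter are reported as suggestive.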

3.2. Gene Set Enrichment Analysis

A total of 10,987 annotated gene sets were analyzed in this study. GSEA observed a significant association between the HUANG_FOXA2_TARGETS_UP gene set and fasting glucose (FDR adjusted p value = 0.047). For fasting insulin, GSEA detected a suggestive association signal for the chr8p23 gene set (FDR adjusted p value = 0.063).

4. Discussion

Revealing the biological significance of loci identified by GWAS is a challenge, especially because a large proportion of significant loci lie outside genes [9]. To better understand the genetic basis of diabetes and make full use of published GWAS data, we conducted an eQTL-based single gene and gene set expression association analysis for diabetes. We identified multiple genes and gene sets associated with fasting glucose or fasting insulin.
SMR analysis observed the most significant association between fasting glucose and C11ORF10. C11ORF10 is located close to FADS1, another significant gene identified by SMR. It has been demonstrated that C11ORF10 plays an important role in fatty acid and glucose metabolism [18]. Zabaneh and Balding reported that C11ORF10 and FADS1 were significantly associated with metabolic syndrome [19]. Powell et al. observed that FADS1 knockout mice showed smaller glucose and insulin excursions during oral glucose tolerance tests, along with lower fasting glucose, insulin, triglyceride, and total cholesterol levels [20]. Yao et al. suggested that the FADS1-FADS2 gene cluster was significantly associated with type 2 diabetes [21]. Cormier et al. observed that the FADS gene cluster could modulate plasma fasting glucose and fasting insulin levels in response to n-3 polyunsaturated fatty acid supplementation [22].
SNX17 is another notable gene associated with fasting glucose. SNX17 encodes sorting nexin 17, which is involved in receptor binding and phosphatidylinositol binding. It has been demonstrated that eQTLs of SNX17 are significantly associated with glucometabolic phenotypes [23]. Adachi and Tsujimoto found that SNX17 directly interacted with FEEL-1/stabilin-1, which has been implicated in the development of diabetes [24]. TNFSF13 showed a suggestive association with fasting insulin in this study. Gao et al. reported that the serum TNFSF13 level was significantly associated with the diabetic status of patients with pancreatic ductal adenocarcinoma-associated diabetes [25].
Besides confirming the functional relevance of previously reported candidate genes to diabetes, SMR analysis also identified several novel candidate genes for diabetes, such as MRPL33, ACP2, and NR1H3. To the best of our knowledge, few efforts have been made to investigate the potential roles of these genes in the development of diabetes. Further biological studies are warranted to confirm our findings and clarify the potential roles of the novel candidate genes in the pathogenesis of diabetes.
Gene set analysis found that the HUANG_FOXA2_TARGETS_UP gene set was significantly associated with fasting glucose. HUANG_FOXA2_TARGETS_UP comprises 45 genes, some of which have been suggested to be implicated in the development of diabetes, such as KAT2B and TNFAIP3. Rabhi et al. found that disruption of KAT2B led to impaired insulin secretion and glucose intolerance in mice [26]. They suggested that KAT2B is a key transcriptional regulator for maintaining normal adaptive β cell function [26]. TNFAIP3 was suggested to be associated with type 1 diabetes [27].
In summary, we conducted a genome-wide integrative analysis of GWAS and eQTLs data for diabetes. We identified several novel candidate genes and gene sets associated with the risk of diabetes. Our results provide new clues for clarifying the genetic mechanism of diabetes. We also illustrated the good performance of the SMR approach and extended it to gene set association analysis for complex diseases.
Table S1: The details of analysis metrics and methods of fasting glucose for all cohorts.
Table S2: The details of analysis metrics and methods of fasting insulin for all cohorts.
References (10 of 27 listed)

1.  PAPA: a flexible tool for identifying pleiotropic pathways using genome-wide association study summaries.

Authors:  Yan Wen; Wenyu Wang; Xiong Guo; Feng Zhang
Journal:  Bioinformatics       Date:  2015-11-14       Impact factor: 6.937

2.  A new multipoint method for genome-wide association studies by imputation of genotypes.

Authors:  Jonathan Marchini; Bryan Howie; Simon Myers; Gil McVean; Peter Donnelly
Journal:  Nat Genet       Date:  2007-06-17       Impact factor: 38.330

3.  Pathway-based approaches for analysis of genomewide association studies.

Authors:  Kai Wang; Mingyao Li; Maja Bucan
Journal:  Am J Hum Genet       Date:  2007-12       Impact factor: 11.025

4.  Association of TNFAIP3 and TNFRSF1A variation with multiple sclerosis in a German case-control cohort.

Authors:  S Hoffjan; A Okur; J T Epplen; S Wieczorek; A Chan; D A Akkad
Journal:  Int J Immunogenet       Date:  2015-02-12       Impact factor: 1.466

5.  Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles.

Authors:  Aravind Subramanian; Pablo Tamayo; Vamsi K Mootha; Sayan Mukherjee; Benjamin L Ebert; Michael A Gillette; Amanda Paulovich; Scott L Pomeroy; Todd R Golub; Eric S Lander; Jill P Mesirov
Journal:  Proc Natl Acad Sci U S A       Date:  2005-09-30       Impact factor: 11.205

6.  Polymorphisms of rs174616 in the FADS1-FADS2 gene cluster is associated with a reduced risk of type 2 diabetes mellitus in northern Han Chinese people.

Authors:  Min Yao; Jing Li; Tian Xie; Tianbo He; Lijia Fang; Yun Shi; Lianguo Hou; Kaoqi Lian; Ruiying Wang; Lingling Jiang
Journal:  Diabetes Res Clin Pract       Date:  2015-03-25       Impact factor: 5.602

7.  Mapping adipose and muscle tissue expression quantitative trait loci in African Americans to identify genes for type 2 diabetes and obesity.

Authors:  Satria P Sajuthi; Neeraj K Sharma; Jeff W Chou; Nicholette D Palmer; David R McWilliams; John Beal; Mary E Comeau; Lijun Ma; Jorge Calles-Escandon; Jamehl Demons; Samantha Rogers; Kristina Cherry; Lata Menon; Ethel Kouba; Donna Davis; Marcie Burris; Sara J Byerly; Maggie C Y Ng; Nisa M Maruthur; Sanjay R Patel; Lawrence F Bielak; Leslie A Lange; Xiuqing Guo; Michèle M Sale; Kei Hang K Chan; Keri L Monda; Gary K Chen; Kira Taylor; Cameron Palmer; Todd L Edwards; Kari E North; Christopher A Haiman; Donald W Bowden; Barry I Freedman; Carl D Langefeld; Swapan K Das
Journal:  Hum Genet       Date:  2016-05-19       Impact factor: 4.132

8.  Global prevalence of diabetes: estimates for the year 2000 and projections for 2030.

Authors:  Sarah Wild; Gojka Roglic; Anders Green; Richard Sicree; Hilary King
Journal:  Diabetes Care       Date:  2004-05       Impact factor: 19.112

9.  Genome-Wide Association Study of the Modified Stumvoll Insulin Sensitivity Index Identifies BCL2 and FAM19A2 as Novel Insulin Sensitivity Loci.

Authors:  Geoffrey A Walford; Stefan Gustafsson; Denis Rybin; Alena Stančáková; Han Chen; Ching-Ti Liu; Jaeyoung Hong; Richard A Jensen; Ken Rice; Andrew P Morris; Reedik Mägi; Anke Tönjes; Inga Prokopenko; Marcus E Kleber; Graciela Delgado; Günther Silbernagel; Anne U Jackson; Emil V Appel; Niels Grarup; Joshua P Lewis; May E Montasser; Claes Landenvall; Harald Staiger; Jian'an Luan; Timothy M Frayling; Michael N Weedon; Weijia Xie; Sonsoles Morcillo; María Teresa Martínez-Larrad; Mary L Biggs; Yii-Der Ida Chen; Arturo Corbaton-Anchuelo; Kristine Færch; Juan Miguel Gómez-Zumaquero; Mark O Goodarzi; Jorge R Kizer; Heikki A Koistinen; Aaron Leong; Lars Lind; Cecilia Lindgren; Fausto Machicao; Alisa K Manning; Gracia María Martín-Núñez; Gemma Rojo-Martínez; Jerome I Rotter; David S Siscovick; Joseph M Zmuda; Zhongyang Zhang; Manuel Serrano-Rios; Ulf Smith; Federico Soriguer; Torben Hansen; Torben J Jørgensen; Allan Linnenberg; Oluf Pedersen; Mark Walker; Claudia Langenberg; Robert A Scott; Nicholas J Wareham; Andreas Fritsche; Hans-Ulrich Häring; Norbert Stefan; Leif Groop; Jeff R O'Connell; Michael Boehnke; Richard N Bergman; Francis S Collins; Karen L Mohlke; Jaakko Tuomilehto; Winfried März; Peter Kovacs; Michael Stumvoll; Bruce M Psaty; Johanna Kuusisto; Markku Laakso; James B Meigs; Josée Dupuis; Erik Ingelsson; Jose C Florez
Journal:  Diabetes       Date:  2016-07-14       Impact factor: 9.461

10.  Systematic identification of trans eQTLs as putative drivers of known disease associations.

Authors:  Harm-Jan Westra; Marjolein J Peters; Tõnu Esko; Hanieh Yaghootkar; Claudia Schurmann; Johannes Kettunen; Mark W Christiansen; Bruce M Psaty; Samuli Ripatti; Alexander Teumer; Timothy M Frayling; Andres Metspalu; Joyce B J van Meurs; Lude Franke; Benjamin P Fairfax; Katharina Schramm; Joseph E Powell; Alexandra Zhernakova; Daria V Zhernakova; Jan H Veldink; Leonard H Van den Berg; Juha Karjalainen; Sebo Withoff; André G Uitterlinden; Albert Hofman; Fernando Rivadeneira; Peter A C 't Hoen; Eva Reinmaa; Krista Fischer; Mari Nelis; Lili Milani; David Melzer; Luigi Ferrucci; Andrew B Singleton; Dena G Hernandez; Michael A Nalls; Georg Homuth; Matthias Nauck; Dörte Radke; Uwe Völker; Markus Perola; Veikko Salomaa; Jennifer Brody; Astrid Suchy-Dicey; Sina A Gharib; Daniel A Enquobahrie; Thomas Lumley; Grant W Montgomery; Seiko Makino; Holger Prokisch; Christian Herder; Michael Roden; Harald Grallert; Thomas Meitinger; Konstantin Strauch; Yang Li; Ritsert C Jansen; Peter M Visscher; Julian C Knight
Journal:  Nat Genet       Date:  2013-09-08       Impact factor: 38.330

