I-GSEA4GWAS v2: a web server for functional analysis of SNPs in trait-associated pathways identified from genome-wide association study.

Kunlin Zhang, Suhua Chang, Liyuan Guo, Jing Wang.

Year:  2015        PMID: 25407412      PMCID: PMC4348241          DOI: 10.1007/s13238-014-0114-4

Source DB:  PubMed          Journal:  Protein Cell        ISSN: 1674-800X            Impact factor:   14.870


Dear Editor,

The standard analysis of genome-wide association study (GWAS) data is based on single SNPs (single nucleotide polymorphisms) and therefore ignores the combined effect of SNPs/genes with modest effects. To address this problem, pathway-based analysis (PBA) has been introduced into GWAS data analysis. PBA aims to identify biological functions and mechanisms associated with complex traits (Wang et al., 2007; Wang et al., 2010) and has become one of the key ways to interpret GWAS data (Wang et al., 2010). The results of PBA are trait-associated pathways, which represent the combined effect of genes with modest effects. Follow-up validation studies then need to explore candidate causative SNPs in the PBA-identified pathways by annotating the SNPs of the genes in those pathways with genomic features, both protein-coding and non-coding.

For coding features, SNPs that affect protein function (such as deleterious non-synonymous sites) have been widely investigated, and the information is well curated in databases such as Ensembl (Flicek et al., 2014). Meanwhile, to assign biochemical function to the non-coding part of the human genome (in particular functional elements for gene expression regulation), the Encyclopedia of DNA Elements (ENCODE) project (Bernstein et al., 2012) has identified many regulatory regions, such as DNase I hypersensitive sites (DHSs) and transcription factor binding sites (TFBSs). In particular, the ENCODE results indicate that SNPs identified by GWASs are enriched within non-coding functional elements, with a majority residing in or near ENCODE-defined regions, including DHSs and TFBSs across several cell types (Bernstein et al., 2012). Several tools, such as GenomeRunner (Dozmorov et al., 2012) and GREAT (McLean et al., 2010), can annotate and analyze non-coding genomic features for input genomic regions.
However, these are general-purpose tools that are not specific to trait-associated pathways identified from GWAS. Furthermore, coding region annotation and linkage disequilibrium (LD), a basic concept of GWAS, need to be considered. Beyond genomic features, expression quantitative trait loci (eQTL) analysis is widely used to interpret the biological mechanisms of GWAS-identified variants (Cookson et al., 2009). Taken together, a PBA tool combined with functional analysis of the SNPs in trait-associated pathways, covering all of the issues above, would provide an objective and comprehensive way to interpret GWAS data.

In our previous work, we developed a PBA web server, i-GSEA4GWAS (improved gene set enrichment analysis for GWAS), which detects pathways associated with traits by applying an improved gene set enrichment analysis (i-GSEA) (Zhang et al., 2010). Here, we report a new version, i-GSEA4GWAS v2, which implements both i-GSEA and follow-up functional analysis of the SNPs in the trait-associated pathways identified by i-GSEA, as well as of their LD proxies. The functional analysis in i-GSEA4GWAS v2 is based on putative functional SNP annotation data from Ensembl, regulatory regions from ENCODE, and eQTL data. Both annotation analysis and enrichment analysis are conducted for each type of functional element. The data sources used for functional analysis are listed in Tables S1–3. Details of the annotation and enrichment analysis methods are given in the Materials and Methods section of the Supplementary Materials. Fig. 1 shows the analytical framework of i-GSEA4GWAS v2.
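As an illustration of what an enrichment analysis of functional elements can look like, the sketch below computes a one-sided hypergeometric P-value for whether the SNPs of a pathway overlap a functional element type more often than expected from the background. This is a generic over-representation test chosen for illustration; the actual method used by i-GSEA4GWAS v2 is described in the Supplementary Materials and may differ. The function name and all counts are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(total_snps, total_functional,
                           pathway_snps, pathway_functional):
    """Upper-tail hypergeometric P-value, P(X >= pathway_functional).

    total_snps:         number of background SNPs
    total_functional:   background SNPs overlapping the functional element
    pathway_snps:       SNPs belonging to the pathway
    pathway_functional: pathway SNPs overlapping the functional element
    """
    max_k = min(pathway_snps, total_functional)
    denom = comb(total_snps, pathway_snps)
    # Sum the probability of every outcome at least as extreme as observed.
    tail = sum(
        comb(total_functional, k)
        * comb(total_snps - total_functional, pathway_snps - k)
        for k in range(pathway_functional, max_k + 1)
    )
    return tail / denom


if __name__ == "__main__":
    # Hypothetical counts: 10 background SNPs, 5 of them functional;
    # a pathway with 2 SNPs, both functional.
    print(hypergeom_enrichment_p(10, 5, 2, 2))
```

A small P-value here would suggest that the pathway's SNPs are over-represented in the element type; with many pathways and tracks, the resulting P-values would still need multiple-testing correction.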
Figure 1

The analytical framework of i-GSEA4GWAS v2

I-GSEA4GWAS v2 is freely available at http://gsea4gwas-v2.psych.ac.cn/ and supports all major browsers. Its main options are listed in Table 1. There are no restrictions on its use by academics or non-academics. It is written in Java and JSP and served by Apache and Tomcat web servers. To improve computing speed and workload capacity, i-GSEA4GWAS v2 runs on a more powerful server than i-GSEA4GWAS (a DELL R910 with 256 GB of memory, 40 CPU cores and almost 10 TB of storage). It is platform independent and easy for genetic and biological researchers to use; a web browser is the only requirement.
Table 1

The main options of i-GSEA4GWAS v2

| Option | Parameter name (separated by “,”) | Default value | Description |
|---|---|---|---|
| Select mapping rules of SNPs→genes | 20 kb, 5 kb, 100 kb, 500 kb, within gene, customized (0–500 kb) | 20 kb | The threshold of SNP mapping to its nearest gene. Radio box |
| Gene set database | KEGG, BioCarta, GO biological process, GO molecular function, GO cellular component, customized | KEGG, BioCarta, GO biological process, GO molecular function, GO cellular component | The pathway/gene set database used for PBA search. Check box |
| Number of genes in gene set | [0, infinite] | [20, 200] | The size of gene set |
| Functional analysis | Yes, no | Yes | All the options below work only if this option is set to “Yes”. Radio box |
| Select LD data source | 1000 genome population, HapMap III population | 1000 genome population | LD data source (1000 Genomes or HapMap III). Radio box |
| 1000 genome population | EUR, AMR, ASN, AFR | EUR | 1000 genome population. Check box |
| HapMap III population | CEU, CHB, JPT, YRI, ASW, CHD, GIH, LWK, MEX, MKK, TSI | CEU | HapMap III population. Check box |
| Select functional data source | Ensembl putative functional variants, ENCODE regulatory feature peaks, expression quantitative trait loci (eQTLs) | Ensembl putative functional variants, ENCODE regulatory feature peaks, expression quantitative trait loci (eQTLs) | Functional annotation data source for Ensembl. Check box |
| Data type | DNase-seq peaks, FAIRE peaks, TFBS peaks (PeakSeq), TFBS peaks (SPP), histone peaks | DNase-seq peaks, FAIRE peaks, TFBS peaks (PeakSeq), TFBS peaks (SPP), histone peaks | Data type of ENCODE regulatory feature peaks. Check box |
| Tissue | All, blastula, blood, bone, brain, breast, cerebellar, cervix, colon, connective, embryonic, epithelium, eye, fetal, foreskin, gingiva, gingival, heart, induced, kidney, liver, luminal, lung, mammary, monocytes, muscle, myometrium, pancreas, pancreatic, prostate, skin, spinal, testis, urothelium, uterus | All | Tissue of ENCODE regulatory feature peaks. Check box |
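
The SNP→gene mapping rule in Table 1 (assign a SNP to its nearest gene within a distance threshold, 20 kb by default) can be sketched roughly as follows. This is a minimal illustration and not the server's implementation: the `Gene` type, the function names, the gene names and coordinates, and the tie-breaking behaviour (first gene wins) are all assumptions made for the example. Setting `window=0` approximates the "within gene" option.

```python
from typing import NamedTuple, Optional, Sequence


class Gene(NamedTuple):
    name: str
    chrom: str
    start: int  # first base of the gene (inclusive)
    end: int    # last base of the gene (inclusive)


def distance_to_gene(pos: int, gene: Gene) -> int:
    """Distance in bp from a SNP position to a gene; 0 if inside the gene."""
    if gene.start <= pos <= gene.end:
        return 0
    return gene.start - pos if pos < gene.start else pos - gene.end


def map_snp_to_gene(chrom: str, pos: int, genes: Sequence[Gene],
                    window: int = 20_000) -> Optional[str]:
    """Return the nearest gene within `window` bp, or None if there is none."""
    best_name, best_dist = None, None
    for gene in genes:
        if gene.chrom != chrom:
            continue
        dist = distance_to_gene(pos, gene)
        if dist <= window and (best_dist is None or dist < best_dist):
            best_name, best_dist = gene.name, dist
    return best_name


if __name__ == "__main__":
    # Hypothetical genes and SNP positions, for illustration only.
    genes = [
        Gene("GENE_A", "chr1", 100_000, 150_000),
        Gene("GENE_B", "chr1", 180_000, 200_000),
    ]
    print(map_snp_to_gene("chr1", 160_000, genes))            # nearest is GENE_A
    print(map_snp_to_gene("chr1", 160_000, genes, window=0))  # not inside any gene
```
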
We applied i-GSEA4GWAS v2 to 312,565 SNP P-values from the discovery stage of a schizophrenia GWAS with 871 schizophrenia cases and 863 healthy controls, all of European origin (Need et al., 2009). The program mapped the SNPs to the nearest genes within 20 kb upstream/downstream, searched KEGG (Kanehisa et al., 2014), BioCarta (http://www.biocarta.com/), and GO (Ashburner et al., 2000) for trait-associated pathways, and extracted LD proxies for functional analysis from 1000 Genomes EUR. The analysis identified 19 pathways with FDR < 0.05 (Table S4). Among these 19 pathways, one (‘antigen processing and presentation’) was significantly enriched in Ensembl other putative functional sites, 8 were significantly enriched in at least one track of ENCODE regulatory elements, and 14 were significantly enriched in eQTLs, indicating that most SNPs in these pathways lie in non-coding regions that may regulate gene expression (Bernstein et al., 2012).

For ‘antigen processing and presentation’, it has been suggested that the cellular mechanisms involved in antigen processing and presentation could be less efficient in schizophrenia (Craddock et al., 2007). Functional analysis of the significant SNPs and their LD proxies in this pathway indicates that LD proxies (rs9260107 and rs9260118) of rs2860580 in HLA-A, LD proxies (rs2072895 and rs2844846) of rs1362126 in HLA-F, and an LD proxy (rs3100139) of rs2254835 in B2M are annotated as splice region; the LD proxy rs2072895 of rs1362126 in HLA-F is also annotated as deleterious. All three genes are related to major histocompatibility complex (MHC) class I molecules, which is consistent with recent findings on the contribution of the MHC region to schizophrenia (Ripke et al., 2013).
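
The proxy-based annotation step described above can be pictured as a simple lookup: expand each significant SNP to its LD proxies, then collect the annotations attached to any of them. The sketch below is only an illustration of that idea. The rs IDs, genes and annotation labels are transcribed from the paragraph above; the dictionaries and the function name `annotate_with_proxies` are hypothetical, and the real server derives proxies from 1000 Genomes or HapMap III LD data and annotations from Ensembl rather than from hard-coded tables.

```python
# Significant SNPs and their LD proxies, as reported in the text above.
LD_PROXIES = {
    "rs2860580": ["rs9260107", "rs9260118"],  # HLA-A
    "rs1362126": ["rs2072895", "rs2844846"],  # HLA-F
    "rs2254835": ["rs3100139"],               # B2M
}

# Annotations reported in the text for the proxy SNPs.
ANNOTATIONS = {
    "rs9260107": {"splice region"},
    "rs9260118": {"splice region"},
    "rs2072895": {"splice region", "deleterious"},
    "rs2844846": {"splice region"},
    "rs3100139": {"splice region"},
}


def annotate_with_proxies(index_snps, ld_proxies, annotations):
    """Return (index SNP, annotated SNP, annotation) rows.

    Each index SNP is checked itself and through each of its LD proxies,
    so a functional proxy is reported against the SNP that tagged it.
    """
    rows = []
    for snp in index_snps:
        for candidate in [snp, *ld_proxies.get(snp, [])]:
            for label in sorted(annotations.get(candidate, ())):
                rows.append((snp, candidate, label))
    return rows


if __name__ == "__main__":
    for row in annotate_with_proxies(LD_PROXIES, LD_PROXIES, ANNOTATIONS):
        print(*row, sep="\t")
```

Reporting the index SNP alongside the annotated proxy keeps the link back to the GWAS signal, which is what makes a proxy a candidate causal variant for that signal.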
Four schizophrenia-associated pathways, ‘potassium ion transport’, ‘regulation of heart contraction’, ‘voltage gated potassium channel complex’, and ‘potassium channel activity’, were enriched in the ENCODE FAIRE track “wgEncodeOpenChromFaireMedulloPk” (a brain-derived cell line, as shown in Table S5), indicating that many SNPs in these pathways may regulate gene expression in the brain. The functional SNPs and genes identified here may play important roles in schizophrenia and deserve further validation.

In summary, i-GSEA4GWAS v2 adds an important feature towards objective and comprehensive interpretation of GWAS data, namely functional analysis of SNPs in trait-associated pathways. The functional analysis implemented in our web server considers LD information, three categories of features (coding, non-coding and eQTLs) and enrichment analysis, which makes it more comprehensive than other available web-based tools for functional analysis (Table S6). The results help users understand the main features (coding, non-coding and eQTLs) of pathway-related SNPs and select candidate causal SNPs, which further contributes to the interpretation of GWAS data. To our knowledge, this is the first time that SNP functional analysis has been implemented in a PBA tool for GWAS. In future work, we will continue to update i-GSEA4GWAS v2 with the latest pathways and genomic feature annotation data.

Electronic supplementary material: Supplementary material 1 (PDF 98 kb)
  13 in total

1.  Gene ontology: tool for the unification of biology. The Gene Ontology Consortium.

Authors:  M Ashburner; C A Ball; J A Blake; D Botstein; H Butler; J M Cherry; A P Davis; K Dolinski; S S Dwight; J T Eppig; M A Harris; D P Hill; L Issel-Tarver; A Kasarskis; S Lewis; J C Matese; J E Richardson; M Ringwald; G M Rubin; G Sherlock
Journal:  Nat Genet       Date:  2000-05       Impact factor: 38.330

2.  GenomeRunner: automating genome exploration.

Authors:  Mikhail G Dozmorov; Lukas R Cara; Cory B Giles; Jonathan D Wren
Journal:  Bioinformatics       Date:  2011-12-06       Impact factor: 6.937

Review 3.  Analysing biological pathways in genome-wide association studies.

Authors:  Kai Wang; Mingyao Li; Hakon Hakonarson
Journal:  Nat Rev Genet       Date:  2010-12       Impact factor: 53.242

4.  Pathway-based approaches for analysis of genomewide association studies.

Authors:  Kai Wang; Mingyao Li; Maja Bucan
Journal:  Am J Hum Genet       Date:  2007-12       Impact factor: 11.025

5.  i-GSEA4GWAS: a web server for identification of pathways/gene sets associated with traits by applying an improved gene set enrichment analysis to genome-wide association study.

Authors:  Kunlin Zhang; Sijia Cui; Suhua Chang; Liuyan Zhang; Jing Wang
Journal:  Nucleic Acids Res       Date:  2010-04-30       Impact factor: 16.971

Review 6.  Mapping complex disease traits with global gene expression.

Authors:  William Cookson; Liming Liang; Gonçalo Abecasis; Miriam Moffatt; Mark Lathrop
Journal:  Nat Rev Genet       Date:  2009-03       Impact factor: 53.242

7.  Ensembl 2014.

Authors:  Paul Flicek; M Ridwan Amode; Daniel Barrell; Kathryn Beal; Konstantinos Billis; Simon Brent; Denise Carvalho-Silva; Peter Clapham; Guy Coates; Stephen Fitzgerald; Laurent Gil; Carlos García Girón; Leo Gordon; Thibaut Hourlier; Sarah Hunt; Nathan Johnson; Thomas Juettemann; Andreas K Kähäri; Stephen Keenan; Eugene Kulesha; Fergal J Martin; Thomas Maurel; William M McLaren; Daniel N Murphy; Rishi Nag; Bert Overduin; Miguel Pignatelli; Bethan Pritchard; Emily Pritchard; Harpreet S Riat; Magali Ruffier; Daniel Sheppard; Kieron Taylor; Anja Thormann; Stephen J Trevanion; Alessandro Vullo; Steven P Wilder; Mark Wilson; Amonida Zadissa; Bronwen L Aken; Ewan Birney; Fiona Cunningham; Jennifer Harrow; Javier Herrero; Tim J P Hubbard; Rhoda Kinsella; Matthieu Muffato; Anne Parker; Giulietta Spudich; Andy Yates; Daniel R Zerbino; Stephen M J Searle
Journal:  Nucleic Acids Res       Date:  2013-12-06       Impact factor: 16.971

8.  A genome-wide investigation of SNPs and CNVs in schizophrenia.

Authors:  Anna C Need; Dongliang Ge; Michael E Weale; Jessica Maia; Sheng Feng; Erin L Heinzen; Kevin V Shianna; Woohyun Yoon; Dalia Kasperaviciūte; Massimo Gennarelli; Warren J Strittmatter; Cristian Bonvicini; Giuseppe Rossi; Karu Jayathilake; Philip A Cola; Joseph P McEvoy; Richard S E Keefe; Elizabeth M C Fisher; Pamela L St Jean; Ina Giegling; Annette M Hartmann; Hans-Jürgen Möller; Andreas Ruppert; Gillian Fraser; Caroline Crombie; Lefkos T Middleton; David St Clair; Allen D Roses; Pierandrea Muglia; Clyde Francks; Dan Rujescu; Herbert Y Meltzer; David B Goldstein
Journal:  PLoS Genet       Date:  2009-02-06       Impact factor: 5.917

9.  An integrated encyclopedia of DNA elements in the human genome.

Authors: 
Journal:  Nature       Date:  2012-09-06       Impact factor: 49.962

10.  Altered T-cell function in schizophrenia: a cellular model to investigate molecular disease mechanisms.

Authors:  Rachel M Craddock; Helen E Lockstone; David A Rider; Matthew T Wayland; Laura J W Harris; Peter J McKenna; Sabine Bahn
Journal:  PLoS One       Date:  2007-08-01       Impact factor: 3.240
