
Pathway-based analysis of genome-wide association study of circadian phenotypes.

Di-di Zhu1, Jia-Min Yuan2, Rui Zhu1, Yao Wang1, Zhi-Yong Qian1, Jian-Gang Zou1.   

Abstract

Sleepiness disrupts normal social life and is attracting growing attention. Circadian phenotypes contribute to marked individual differences in susceptibility to sleepiness. We aimed to identify candidate single nucleotide polymorphisms (SNPs) that may underlie circadian phenotypes, elucidate the potential mechanisms, and generate corresponding SNP-gene-pathway hypotheses. A genome-wide association study (GWAS) dataset of circadian phenotypes was used. The Identify Candidate Causal SNPs and Pathways (ICSNPathway) analysis was applied to the GWAS dataset after quality control filtering. Furthermore, genotype-phenotype association analysis was performed with the HapMap database. Four SNPs in three genes were found to correlate with usual weekday bedtime, providing seven hypothetical mechanisms in total. Eleven SNPs in six genes were found to correlate with usual weekday sleep duration, providing six hypothetical pathways. Our results indicate that fifteen candidate SNPs in eight genes play roles in six hypothetical pathways implicated in usual weekday bedtime and six potential pathways involved in usual weekday sleep duration.

Year:  2018        PMID: 29784899      PMCID: PMC6163116          DOI: 10.7555/JBR.32.20170102

Source DB:  PubMed          Journal:  J Biomed Res        ISSN: 1674-8301


Introduction

Sleepiness impairs social function, reduces quality of life, and causes occupational and motor vehicle accidents. Although behavioral factors, circadian factors (time of day), duration of wakefulness, and sleep disorders are closely linked to daytime sleepiness, susceptibility to sleepiness differs greatly between individuals. Accumulating evidence shows that excessive sleepiness is heritable. In modern society, nearly one-fifth of employees work long-term night shifts. As a result, work performance and scheduling have a significant impact on individual variability in diurnal preference. Studies also indicate that diurnal preference (namely usual weekday bedtime) is heritable. In addition, usual weekday sleep duration plays a critical role in daytime sleepiness. Short or long sleep duration has been investigated in relation to coronary heart disease, diabetes mellitus, hypertension, and mortality. Likewise, usual sleep duration is heritable.

To date, three genome-wide association studies (GWASs) have detected several single nucleotide polymorphisms (SNPs) associated with circadian phenotypes in a number of genes, but the functions of these SNPs remain undefined, which is a general challenge in interpreting GWAS results. Pathway-based approaches have therefore been gradually refined, and the Identify Candidate Causal SNPs and Pathways (ICSNPathway) tool was created to identify potential SNPs and hypothetical mechanisms from GWAS data, using linkage disequilibrium (LD) analysis, functional SNP annotation, and pathway-based analysis (PBA). Herein, we combined ICSNPathway analysis with the HapMap database to identify candidate SNPs and relevant pathways, aiming to develop SNP-gene-pathway hypotheses regarding circadian phenotypes.

Materials and methods

Study population and data extraction

We searched publicly available databases to identify eligible GWASs on circadian phenotypes, namely the National Human Genome Research Institute GWAS catalog (http://www.genome.gov/26525384), the National Center for Biotechnology Information (NCBI) dbGap (http://www.ncbi.nlm.nih.gov/gap/), and GWAS Central (http://www.gwascentral.org/). In addition, both the EMBASE and PubMed databases were searched with the following key words: “GWAS” or “genome-wide association study” and “circadian”. All searches covered records up to April 20th, 2016, without language limitation. To reduce the effect of genotyping errors, two independent authors (DZ and JYuan) filtered the primary GWAS data set on call rate (<95%), minor allele frequency (<0.01), and deviation from Hardy-Weinberg equilibrium (HWE; P<0.001). During data extraction, discrepancies were resolved by discussion with a third author (YW) until consensus was reached on each item. After extracting data from the original papers and contacting the corresponding authors, we excluded studies that lacked the required details.
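The three quality control thresholds above can be illustrated with a minimal sketch. It assumes the filters are applied per SNP to genotype counts and that HWE is checked with a 1-degree-of-freedom chi-square test; the function names `hwe_chi2_p` and `passes_qc` are hypothetical, and the original study may have used different software and a different (for example, exact) HWE test.

```python
import math

def hwe_chi2_p(n_aa, n_ab, n_bb):
    """P value of a 1-df chi-square test for Hardy-Weinberg equilibrium."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)      # frequency of the first allele
    q = 1 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of the chi-square distribution with 1 degree of freedom.
    return math.erfc(math.sqrt(chi2 / 2))

def passes_qc(n_aa, n_ab, n_bb, n_missing,
              min_call_rate=0.95, min_maf=0.01, hwe_alpha=0.001):
    """Apply the call rate, MAF and HWE thresholds to one SNP (assumed per-SNP use)."""
    called = n_aa + n_ab + n_bb
    total = called + n_missing
    if called == 0 or called / total < min_call_rate:
        return False
    p = (2 * n_aa + n_ab) / (2 * called)
    if min(p, 1 - p) < min_maf:          # minor allele frequency too low
        return False
    return hwe_chi2_p(n_aa, n_ab, n_bb) >= hwe_alpha
```

A SNP is retained only if it clears all three checks; the default thresholds mirror the values stated in the text.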

Identification of candidate causal SNPs and pathways

ICSNPathway analysis was conducted in two consecutive stages. In the first stage, candidate SNPs with P values <0.05 were pre-selected by LD analysis and functional SNP annotation. During the LD analysis, we queried the GWAS data to capture SNPs in LD (r²>0.8) and located in the flanking region (up to 500 kb upstream and downstream). The extended dataset including HapMap data (http://hapmap.ncbi.nlm.nih.gov) was used to obtain more possible candidate SNPs. Additionally, LD structures were obtained from the SNAP dataset (http://www.broadinstitute.org/mpg/snap/). Functional annotation of the SNPs was performed by searching international SNP function annotation databases, including PolyPhen-2 (http://genetics.bwh.harvard.edu/pph2/), the Ensembl database (http://www.ensembl.org), SNPs3D (http://www.snps3d.org), and SIFT (http://sift.jcvi.org). Genotypic frequencies of candidate SNPs were extracted from the International HapMap Project (phase II, release 23), consisting of 3.96 million SNP genotypes from 270 subjects. The corresponding mRNA expression data, derived from lymphoblastic cell lines of the same 270 individuals, were extracted from SNPexp (http://app3.titan.uio.no/biotools/help.php?app=snpexp/).

During the second stage, the PBA algorithm was employed to annotate biological pathways of the selected SNPs by integrating data from four databases: BioCarta (http://www.biocarta.com), MSigDB (http://www.broadinstitute.org/gsea/msigdb), Kyoto Encyclopedia of Genes and Genomes (KEGG, http://www.genome.jp/kegg), and gene ontology (GO, http://www.geneontology.org). Furthermore, SNP label normalization and permutation were adopted to correct for gene variation and to generate the distribution of the significant proportion based enrichment score (SPES). From the distributions of SPESs, a nominal P value and a false discovery rate (FDR; cutoff value: 0.05) were calculated.
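To make the LD criterion concrete, the following sketch computes r² and D' for two biallelic loci from counts of the four phased haplotypes. It is an illustration of the standard definitions only, not the SNAP or ICSNPathway implementation; the function name `ld_stats` and the example counts are hypothetical.

```python
def ld_stats(n_AB, n_Ab, n_aB, n_ab):
    """Return (r2, d_prime) from counts of the four two-locus haplotypes.

    A/a are the alleles at locus 1 and B/b the alleles at locus 2.
    """
    n = n_AB + n_Ab + n_aB + n_ab
    p_AB = n_AB / n
    p_A = (n_AB + n_Ab) / n              # allele A frequency at locus 1
    p_B = (n_AB + n_aB) / n              # allele B frequency at locus 2
    d = p_AB - p_A * p_B                 # raw disequilibrium coefficient D
    denom = p_A * (1 - p_A) * p_B * (1 - p_B)
    if denom == 0:
        raise ValueError("both loci must be polymorphic")
    r2 = d * d / denom
    # D' scales |D| by its maximum attainable value given the allele frequencies.
    if d >= 0:
        d_max = min(p_A * (1 - p_B), (1 - p_A) * p_B)
    else:
        d_max = min(p_A * p_B, (1 - p_A) * (1 - p_B))
    return r2, abs(d) / d_max

# Hypothetical counts: one haplotype is absent, so D' is 1 while r2 stays below 1.
r2, d_prime = ld_stats(n_AB=40, n_Ab=0, n_aB=20, n_ab=40)
keep_as_proxy = r2 > 0.8                 # threshold used in the LD analysis
```

The example shows why a pair of SNPs can have D' equal to 1 and still fall short of the r²>0.8 cutoff when their allele frequencies differ.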

Statistical analysis

Expression levels are shown as mean±SEM, and the difference between two genotypes was evaluated by a two-sided Student's t test. One-way ANOVA was used to assess differences in transcript expression levels among more than two genotypes. Statistical analysis was performed with SPSS version 21.0. P values <0.05 were considered statistically significant.
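Because expression is reported as mean±SEM with group sizes, a pooled-variance Student's t statistic can be recovered from summary values alone. The sketch below shows that calculation under the assumption of equal variances; the helper name `pooled_t_from_summary` is hypothetical, and the example inputs are the rs3775728 TT and CT rows as read from Tab.2, so rounding in the published means limits the precision of any result.

```python
import math

def pooled_t_from_summary(n1, mean1, sem1, n2, mean2, sem2):
    """Student's t statistic and degrees of freedom from n, mean and SEM."""
    var1 = sem1 ** 2 * n1                # variance = (SEM * sqrt(n)) ** 2
    var2 = sem2 ** 2 * n2
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / df
    se_diff = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / se_diff, df

# rs3775728: TT (n=55, 10.56±0.04378) versus CT (n=3, 10.84±0.11230), from Tab.2.
t_stat, df = pooled_t_from_summary(55, 10.56, 0.04378, 3, 10.84, 0.11230)
# A two-sided P value would follow from the t distribution with df degrees of
# freedom, for example 2 * scipy.stats.t.sf(abs(t_stat), df) if SciPy is available.
```

The statistic is computed without the raw expression values, which is useful for checking a reported comparison when only the table is at hand.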

Results

Characteristics of the study population

After a thorough search, one GWAS with publicly available summary data, drawn from NCBI dbGap (study accession: phs000007), was adopted in our study. In this GWAS on circadian phenotypes (usual weekday bedtime and usual weekday sleep duration), a total of 749 subjects were drawn from the Framingham Offspring Study, in which 2848 participants completed sleep habit questionnaires between 1995 and 1998 (Offspring Examination Cycle 6) for the Sleep Heart Health Study. For usual weekday bedtime, 65,514 candidate causative SNPs were originally generated with an Affymetrix 100K SNP GeneChip, of which 47,285 SNPs passed the quality control filters and were used for the final bioinformatics analysis. For usual weekday sleep duration, 65,514 SNPs were generated with the same chip, of which 47,301 SNPs met the quality control criteria and were used for subsequent analysis.

Candidate SNPs and pathways

As presented in Tab.1, a total of four SNPs in three genes were found to correlate with usual weekday bedtime, namely MT-ND5 rs10517616, GRSF1 rs3775728, and ENAM rs7671281 and rs3796704. Moreover, eleven SNPs in six genes were found to correlate with usual weekday sleep duration, namely HSPD1 rs8539, APOBEC2 rs2076472, GRSF1 rs3775728, TTN rs9808377, rs1001238, rs2042995, rs3829746 and rs2042996, CENPE rs2243682 and rs2615542, and SLC17A1 rs13213957. Of note, GRSF1 rs3775728 was linked with both usual weekday bedtime and usual weekday sleep duration. SNP rs3775728 was in LD with rs2278134 (r²=1.0); rs7671281 and rs3796704 were in LD with rs2553319 (r²=1.0 and 1.0, respectively); rs9808377, rs1001238 and rs2042995 were in LD with rs3829746 (r²=0.945, 0.946, and 0.945, respectively); rs2243682 and rs2615542 were in LD with rs2290943 (r²=1.0 and 1.0, respectively); and rs13213957 was in LD with rs3734523 (r²=0.828). Excluding the repeated SNP, fourteen regional LD plots were produced (figure: Detailed LD plots for the polymorphisms).

We then examined the effect of genotype on mRNA expression levels using the publicly available HapMap cDNA expression database. No significant association between any SNP and the mRNA expression of the corresponding gene was found in Caucasians (Tab.2). However, the SLC17A1 rs13213957 polymorphism tended to affect the mRNA expression level of SLC17A1 (marginal P=0.0785), which is consistent with the functional class indicated in Tab.1.

In addition, the residue changes in the corresponding proteins were examined; all SNPs examined caused a residue change except HSPD1 rs8539 (Tab.3). MT-ND5 rs10517616 was not assessed because no data were publicly available.

In the ICSNPathway analysis of usual weekday bedtime, six pathways were detected (Tab.4). The first mechanism involved the MT-ND5 rs10517616 polymorphism (nonsynonymous coding) in NADH dehydrogenase activity (nominal P<0.001, FDR=0.011), respiratory electron transport chain (nominal P=0.001, FDR=0.011), oxidoreductase activity (nominal P=0.002, FDR=0.017), and oxidative phosphorylation (nominal P=0.004, FDR=0.047). The second involved the GRSF1 rs3775728 polymorphism (nonsynonymous coding) in the mRNA binding pathway (nominal P<0.001, FDR=0.014). The third involved the ENAM rs7671281 and rs3796704 polymorphisms (nonsynonymous coding) in the biomineral formation pathway (nominal P<0.001, FDR=0.021).

In the ICSNPathway analysis of usual weekday sleep duration, six pathways were likewise found (Tab.4). The first mechanism involved the HSPD1 rs8539 polymorphism (nonsynonymous coding) in the unfolded protein binding pathway (nominal P=0.001, FDR=0.03). The second involved the APOBEC2 rs2076472 polymorphism (nonsynonymous coding) in the mRNA processing pathway (nominal P<0.001, FDR=0.031). The third involved the GRSF1 rs3775728 polymorphism (nonsynonymous coding) in mRNA processing (nominal P<0.001, FDR=0.031), RNA processing (nominal P=0.002, FDR=0.039), and mRNA binding (nominal P<0.001, FDR=0.042). The fourth involved the TTN rs9808377, rs1001238, rs2042995, rs3829746 and rs2042996 polymorphisms and the CENPE rs2243682 and rs2615542 polymorphisms (nonsynonymous coding) in cell cycle (nominal P<0.001, FDR=0.036). The last involved the SLC17A1 rs13213957 polymorphism (regulatory region) in the anion transport pathway (nominal P<0.001, FDR=0.042).

Discussion

A complex molecular network comprising several cellular pathways may contribute substantially to the development of circadian phenotypes. GWASs are limited to detecting single-SNP associations and identifying new loci, so we applied a pathway-based approach to take the biological interplay between multiple genes into account and to propose new views on how genes might contribute to the development of circadian phenotypes. In this study, we applied ICSNPathway analysis and identified six potential regulatory pathways each for usual weekday bedtime and usual weekday sleep duration. The most significant SNP-to-gene-to-effect hypothesis was that rs10517616 alters the role of MT-ND5 in NADH dehydrogenase activity. It has been reported that NADH promotes transcription of the lactate dehydrogenase (LDH) gene under redox conditions. This is based on E-box activation through binding of the Bmal1/NPAS2 heterodimer, which acts in the master brain clock to regulate circadian rhythmicity. The second candidate gene, GRSF1, found in this study and in previous studies, has been implicated in the mRNA binding pathway through SNP rs3775728. The third biological mechanism involves the modulation of ENAM by rs7671281 and rs3796704, affecting its role in mineral formation. The fourth involves the influence of rs8539 on HSPD1 in unfolded protein binding. The fifth involves the modulation of APOBEC2 by rs2076472, affecting mRNA processing. The sixth involves the modulation of TTN by rs9808377, rs1001238, rs2042995, rs3829746, and rs2042996, as well as of CENPE by rs2243682 and rs2615542, influencing their roles in the cell cycle. The seventh involves the modulation of SLC17A1 by rs13213957, affecting anion transport, which could influence the mRNA expression of SLC17A1. As far as we know, these mechanisms of circadian phenotypes, involving MT-ND5, GRSF1, ENAM, HSPD1, APOBEC2, TTN, CENPE and SLC17A1, are identified for the first time in our study.

ICSNPathway analysis has previously been used to identify candidate causal genes relevant to disease-related phenotypes such as rheumatoid arthritis. Thus, the results of our study might help in developing novel hypotheses for further investigation. Although the abovementioned biological mechanisms may affect circadian phenotypes, several limitations should be acknowledged. Firstly, the data were obtained from only 749 subjects, which may limit generalization to the whole population and weaken the power to identify candidate SNPs. Secondly, because no other study provides strong support for these results, the candidate SNP-gene-pathways should be verified in further studies. In short, our results identified fifteen candidate SNPs in eight genes (MT-ND5 rs10517616, GRSF1 rs3775728, ENAM rs7671281, rs3796704, HSPD1 rs8539, APOBEC2 rs2076472, GRSF1 rs3775728, TTN rs9808377, rs1001238, rs2042995, rs3829746, rs2042996, CENPE rs2243682, rs2615542 and SLC17A1 rs13213957 polymorphisms), which participate in six hypothetical pathways involved in usual weekday bedtime and six potential pathways implicated in usual weekday sleep duration. However, further investigations are warranted to validate the identified genetic variations in the biological pathways related to circadian phenotypes.
Tab.1

Candidate single nucleotide polymorphisms identified by ICSNPathway analysis

SNP ID | Functional class | Gene | Chromosome | Candidate pathway(a) | −log10(P)(b) | In LD with | r² | D' | −log10(P)(c)
Usual weekday bedtime
rs10517616 | nonsynonymous coding | MT-ND5 | 4 | 1,2,4,6 | 1.49 | rs10517616 | NA | NA | 1.49
rs3775728 | nonsynonymous coding | GRSF1 | 4q13 | 3 | NA | rs2278134 | 1 | 1 | 1.664
rs7671281 | nonsynonymous coding | ENAM | 4q13.3 | 5 | NA | rs2553319 | 1 | 1 | 1.342
rs3796704 | nonsynonymous coding (deleterious) | ENAM | 4q13.3 | 5 | NA | rs2553319 | 1 | 1 | 1.342
Usual weekday sleep duration
rs8539 | nonsynonymous coding | HSPD1 | 2q33.1 | 1 | 1.62 | rs8539 | NA | NA | 1.62
rs2076472 | nonsynonymous coding | APOBEC2 | 6p21 | 2 | 1.533 | rs2076472 | NA | NA | 1.533
rs3775728 | nonsynonymous coding | GRSF1 | 4q13 | 2,4,6 | NA | rs2278134 | 1 | 1 | 1.881
rs9808377 | nonsynonymous coding | TTN | 2q31 | 3 | NA | rs3829746 | 0.945 | 1 | 1.567
rs1001238 | nonsynonymous coding | TTN | 2q31 | 3 | NA | rs3829746 | 0.946 | 1 | 1.567
rs2042995 | nonsynonymous coding | TTN | 2q31 | 3 | NA | rs3829746 | 0.945 | 1 | 1.567
rs3829746 | nonsynonymous coding | TTN | 2q31 | 3 | 1.567 | rs3829746 | NA | NA | 1.567
rs2042996 | nonsynonymous coding | TTN | 2q31 | 3 | 1.423 | rs2042996 | NA | NA | 1.423
rs2243682 | nonsynonymous coding (deleterious) | CENPE | 4q24-q25 | 3 | NA | rs2290943 | 1 | 1 | 1.418
rs2615542 | nonsynonymous coding | CENPE | 4q24-q25 | 3 | NA | rs2290943 | 1 | 1 | 1.418
rs13213957 | regulatory region | SLC17A1 | 6p22.2 | 5 | NA | rs3734523 | 0.828 | 1 | 1.605

SNP: single nucleotide polymorphism; LD: linkage disequilibrium; NA: not available. (a) The number indicates the index of pathways ranked by their statistical significance (false discovery rate). (b) −log10(P) for the candidate SNP in the original genome-wide association study (GWAS). (c) −log10(P) for the SNP in the original GWAS that was in LD with the candidate SNP.
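The −log10(P) columns in Tab.1 can be converted back to P values to see how they sit relative to the P<0.05 pre-selection threshold. The short sketch below does this for three values taken from the table; the helper name `p_from_neg_log10` is hypothetical.

```python
import math

def p_from_neg_log10(value):
    """Convert a -log10(P) score back to the P value."""
    return 10 ** (-value)

threshold = -math.log10(0.05)            # scores above this correspond to P < 0.05

# (SNP, -log10(P)) pairs as listed in Tab.1.
scores = [("rs10517616", 1.49), ("rs8539", 1.62), ("rs2042996", 1.423)]
for snp, score in scores:
    p = p_from_neg_log10(score)
    print(f"{snp}: P = {p:.4f}, below 0.05: {score > threshold}")
```

On this scale a larger score means a smaller P value, so comparing scores against `threshold` is equivalent to comparing P against 0.05.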

Tab.2

mRNA expression by the genotypes of SNPs with the data from HapMap

Category | No. | Mean±SEM | P(a) | P(b)
rs3775728
TT | 55 | 10.56±0.04378 |  | NA
CT | 3 | 10.84±0.11230 | 0.1536 |
rs7671281
TT | 52 | 6.067±0.01329 |  | NA
CT | 4 | 6.046±0.03729 | 0.6837 |
rs3796704
GG | 53 | 6.065±0.01309 |  | NA
AG | 3 | 6.062±0.04807 | 0.9500 |
rs8539
CC | 31 | 6.198±0.01135 |  | 0.9718
CT | 19 | 6.190±0.01628 | 0.6832 |
TT | 5 | 6.194±0.02530 | 0.8887 |
CT+TT | 24 | 6.191±0.01370 | 0.6868 |
rs2076472
TT | 37 | 6.311±0.01794 |  | NA
CT | 19 | 6.291±0.01970 | 0.4927 |
rs9808377
AA | 37 | 6.426±0.01331 |  | 0.7668
AG | 16 | 6.409±0.01801 | 0.4692 |
GG | 3 | 6.398±0.05742 | 0.5726 |
AG+GG | 19 | 6.408±0.01692 | 0.3997 |
rs1001238
TT | 35 | 6.427±0.01354 |  | 0.7663
CT | 18 | 6.410±0.01765 | 0.4561 |
CC | 3 | 6.398±0.05742 | 0.5616 |
CT+CC | 21 | 6.408±0.01658 | 0.3918 |
rs2042995
TT | 37 | 6.426±0.01331 |  | 0.7668
CT | 16 | 6.409±0.01801 | 0.4692 |
CC | 3 | 6.398±0.05742 | 0.5726 |
CT+CC | 19 | 6.408±0.01692 | 0.3997 |
rs3829746
TT | 36 | 6.429±0.01335 |  | 0.5472
CT | 17 | 6.404±0.01768 | 0.2779 |
CC | 3 | 6.398±0.05742 | 0.5305 |
CT+CC | 20 | 6.403±0.01661 | 0.2376 |
rs2042996
GG | 36 | 6.426±0.01368 |  | 0.787
AG | 17 | 6.410±0.01695 | 0.4921 |
AA | 3 | 6.398±0.05742 | 0.5787 |
AG+AA | 20 | 6.409±0.01608 | 0.4213 |
rs2243682
GG | 38 | 8.850±0.06324 |  | 0.5172
AG | 16 | 8.949±0.11200 | 0.4196 |
AA | 1 | 9.41940207 | NA |
AG+AA | 17 | 8.976±0.10870 | 0.2934 |
rs2615542
AA | 38 | 8.850±0.06324 |  | 0.522
AG | 17 | 8.945±0.10520 | 0.4215 |
GG | 1 | 9.41940207 | NA |
AG+GG | 18 | 8.972±0.10260 | 0.2979 |
rs13213957
TT | 72 | 6.067±0.01000 |  | 0.1051
CT | 16 | 6.033±0.01920 | 0.1453 |
CC | 1 | 5.926081 | NA |
TT+CT | 17 | 6.027±0.01910 | 0.0785(c) |

SNP: single nucleotide polymorphism; NA: not available. (a) Two-sided Student's t test within the stratum. (b) P values for one-way ANOVA of mRNA expression among different genotypes for each SNP. (c) Marginal P value (in bold).

Tab.3

Residue changes by the genotypes of SNPs with the data from dbSNP

SNP | Gene | Protein position | Residue change
rs3775728 | GRSF1 | 194 | Val-to-Ile
rs7671281 | ENAM | 648 | Ile-to-Thr
rs3796704 | ENAM | 763 | Arg-to-Gln
rs8539 | HSPD1 | 91 | Lys-to-Lys
rs2076472 | APOBEC2 | 136 | Ile-to-Thr
rs9808377 | TTN | 20346 | Ile-to-Thr
rs1001238 | TTN | 9651 | Asn-to-Asp
rs2042995 | TTN | 10221 | Ile-to-Val
rs3829746 | TTN | 18725 | Ile-to-Val
rs2042996 | TTN | 12353 | Thr-to-Ile
rs2243682 | CENPE | 1942 | Thr-to-Met
rs2615542 | CENPE | 1535 | Phe-to-Leu

SNP: single nucleotide polymorphism.

Tab.4

Candidate pathways for circadian phenotypes

Index | Candidate pathway | Description | PN | FDR
Usual weekday bedtime
1 | NADH dehydrogenase activity | GO:0003954. Catalysis of the reaction: NADH + H+ + acceptor = NAD+ + reduced acceptor. | <0.001 | 0.011
2 | Respiratory electron transport chain | GO:0022904. A process whereby a series of electron carriers operate together to transfer electrons from donors such as NADH and FADH2 to any of several different terminal electron acceptors to generate a transmembrane electrochemical gradient. | 0.001 | 0.011
3 | mRNA binding | GO:0003729. Interacting selectively with pre-messenger RNA (pre-mRNA) or messenger RNA (mRNA). | <0.001 | 0.014
4 | Oxidoreductase activity | GO:0016655. Catalysis of an oxidation-reduction (redox) reaction in which NADH or NADPH acts as a hydrogen or electron donor and reduces a quinone or a similar acceptor molecule. | 0.002 | 0.017
5 | Biomineral formation | GO:0031214. Formation of hard tissues that consist mainly of inorganic compounds, and also contain small amounts of organic matrices that are believed to play important roles in their formation. | <0.001 | 0.021
6 | Oxidative phosphorylation | GO:0006119. The phosphorylation of ADP to ATP that accompanies the oxidation of a metabolite through the operation of the respiratory chain. Oxidation of compounds establishes a proton gradient across the membrane, providing the energy for ATP synthesis. | 0.004 | 0.047
Usual weekday sleep duration
1 | Unfolded protein binding | GO:0051082. Interacting selectively with an unfolded protein. | 0.001 | 0.03
2 | mRNA processing | GO:0006397. Any process involved in the conversion of a primary mRNA transcript into one or more mature mRNA(s) prior to translation into polypeptide. | <0.001 | 0.031
3 | Cell cycle | GO:0007049. The progression of biochemical and morphological phases and events that occur in a cell during successive cell replication or nuclear replication events. Canonically, the cell cycle comprises the replication and segregation of genetic material followed by the division of the cell, but in endocycles or syncytial cells nuclear replication or nuclear division may not be followed by cell division. | <0.001 | 0.036
4 | RNA processing | GO:0006396. Any process involved in the conversion of one or more primary RNA transcripts into one or more mature RNA molecules. | 0.002 | 0.039
5 | Anion transport | GO:0006820. The directed movement of anions, atoms or small molecules with a net negative charge, into, out of, within or between cells. | <0.001 | 0.042
6 | mRNA binding | GO:0003729. Interacting selectively with pre-messenger RNA (pre-mRNA) or messenger RNA (mRNA). | <0.001 | 0.042

PN: nominal P value; FDR: false discovery rate; GO: gene ontology.
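The nominal P values (PN) above are permutation-based. As a rough illustration of the idea only, the sketch below scores a pathway by the proportion of its SNPs with P below a cutoff and compares that score with scores obtained after shuffling SNP labels. This is a simplified toy, not the ICSNPathway algorithm (which additionally normalizes enrichment scores and derives an FDR); the function names and the example data are hypothetical.

```python
import random

def significant_proportion(p_values, members, alpha=0.05):
    """Share of pathway SNPs whose GWAS P value is below alpha."""
    hits = sum(1 for snp in members if p_values[snp] < alpha)
    return hits / len(members)

def permutation_p(p_values, members, n_perm=1000, alpha=0.05, seed=0):
    """Empirical P value of the pathway score under SNP-label permutation."""
    rng = random.Random(seed)
    observed = significant_proportion(p_values, members, alpha)
    snps = list(p_values)
    values = [p_values[s] for s in snps]
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(values)                      # reassign P values to SNP labels
        shuffled = dict(zip(snps, values))
        if significant_proportion(shuffled, members, alpha) >= observed:
            exceed += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (exceed + 1) / (n_perm + 1)

# Hypothetical data: 100 SNPs, of which the 5 pathway members are the only hits.
toy_p = {f"snp{i}": (0.001 if i < 5 else 0.5) for i in range(100)}
toy_members = [f"snp{i}" for i in range(5)]
toy_nominal_p = permutation_p(toy_p, toy_members)
```

With the add-one correction, the smallest attainable value is 1/(n_perm + 1), which is why permutation-based results are often reported as bounds such as "<0.001".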
