Literature DB >> 27980618

Examination of previously identified associations within the Genetic Analysis Workshop 19 data.

Richard A J Howey1, Jakris Eu-Ahsunthornwattana2, Rebecca Darlay1, Heather J Cordell1.   

Abstract

We investigate the possible replication of "known" associated single-nucleotide polymorphisms (SNPs) with blood pressure and expression phenotypes. Previous studies have provided a list of 95 SNPs thought to be associated with blood pressure phenotypes, of which 44 were present in the Genetic Analysis Workshop 19 (GAW19) family-imputed genome-wide association studies (GWAS) data and 4 in the GAW19 unrelateds sequence data. Using only the real (not simulated) GAW19 data, we show through the use of statistical tests that account for family relatedness, using FaST-LMM (Factored Spectrally Transformed Linear Mixed Model), that none of our candidate SNPs yields a significant p value. Furthermore, a study of epistasis, aiming to detect statistical interactions between loci with respect to their association with transcription levels has provided a list of 30 associated interacting SNP pairs, of which 13 are present in the GAW19 family GWAS and expression data. We show for this set of results, using the program GEMMA (genome-wide efficient mixed-model analysis) to account for family relatedness, that there is evidence of replication within the real GAW19 data. Two individual SNP pairs reach significance, and the set of remaining results give a combined p value of 0.017 that at least 1 of these remaining SNP pairs interacts to influence an expression phenotype.

Entities:  

Year:  2016        PMID: 27980618      PMCID: PMC5133475          DOI: 10.1186/s12919-016-0012-2

Source DB:  PubMed          Journal:  BMC Proc        ISSN: 1753-6561


Background

Previous studies using very large data sets have provided a list of single-nucleotide polymorphisms (SNPs) believed to be associated with blood pressure and expression phenotypes. We attempt to replicate these SNPs in the Genetic Analysis Workshop 19 (GAW19) family genome-wide association studies (GWAS) data set and GAW19 sequence data, which may indicate the feasibility of finding novel SNPs in the GAW19 data sets.

Methods

Family genome-wide association studies data

The GAW19 family GWAS data [1] consisted of 959 individuals in 20 families with SNP data for odd chromosomes, including both real and imputed SNP data, and phenotype data for systolic blood pressure (SBP), diastolic blood pressure (DBP), and hypertension (HTN). Quality control was performed identically to that by Eu-ahsunthornwattana et al. [2] on the Genetic Analysis Workshop 18 data. This resulted in 954 individuals in 20 families. The phenotype data consists of longitudinal data measured over 4 years with covariates for smoking, HTN medication, and age. Covariates and measurements over multiple time points were accounted for by transformation to a single “average” quantitative trait for each phenotype, as described by Eu-ahsunthornwattana et al. [2]. A meta-analysis study conducted by Tragante et al. [3] of 87,736 individuals provided 95 candidate SNPs associated with blood pressure–related phenotypes; of these, 44 candidate SNPs were present in the GAW19 family data. Two extra phenotypes were created using the GAW19 phenotype data as defined by Tragante et al. [3]: (a) median arterial pressure (MAP) = 2/3 DBP + 1/3 SBP; and (b) pulse pressure (PP) = SBP − DBP. The 44 candidate SNPs were tested individually using FaST-LMM (Factored Spectrally Transformed Linear Mixed Model) with the realized relationship matrix (RRM) option to adjust for relatedness between individuals. To examine the overall association of a set of SNPs we defined a statistic inspired by Dudbridge and Koeleman’s rank truncated product statistic [4]:where p i is the p value of the i SNP tested from n candidate SNPs. Candidate SNPs for each phenotype were considered together giving overall p values for DBP, MAP, PP, SBP, and HTN. The null hypothesis is formed by assuming that none of the SNPs are associated with the phenotype in question. A set of p values can be generated by sampling U[0, 1], where the correlations (r 2, calculated using PLINK) of nearby SNPs (<2 Mb) are accounted for [5]. The overall p value is then given by the proportion of simulated test statistics greater than the observed test statistic, from 500,000 replicates generated under the null hypothesis. An alternative method to control the false discovery rate for correlated test statistics is given by Yekutieli and Benjamini [6]. Estimates of the power to detect association, at significance level 0.05, for each of the tested SNPs with the appropriate phenotypes were calculated using the program Quanto (http://biostats.usc.edu/Quanto.html), assuming that the individuals are unrelated, thus providing upper limits for the power. Parameter estimates and minor allele frequencies were taken from Tragante et al. [3], and sample sizes were assumed to equal those of the GAW19 family GWAS data.

Unrelated sequence data

From the 95 previously associated SNPs, only 5 were found in the GAW19 sequence data. Data consisted of 1943 unrelated individuals (of which 92 had missing phenotype data), together with 1 covariate on HTN medication. PLINK was used to calculate p values using linear regression.

Expression data

A recent study by Hemani et al. [7], motivated by a desire to investigate the extent to which epistasis (the phenomenon whereby one polymorphism’s effect on a trait depends on other polymorphisms present in the genome) might influence complex traits, detected 30 gene–gene (SNP–SNP) interactions associated with transcription. We attempted to replicate these associations using the GAW19 family GWAS and expression data. From the 30 candidate SNP pairs, 11 were not considered because SNPs were on even chromosomes and 6 because of missing gene-probe data. The gene probes in the GAW19 data were different from those used by Hemani et al. [7], and were adjusted to account for covariates using the same method as was described for the blood pressure phenotypes. One gene (CTSC) had 2 gene probes. To test for SNP–SNP interactions while allowing for family relatedness, GEMMA (genome-wide efficient mixed-model analysis) was used with an estimated kinship matrix. GEMMA does not have an interaction option but it does allow covariates, which were used to encode SNP data through use of 2 linear mixed models. The first model encoded 3 variables: the number of minor alleles for each SNP and the intercept. The second model encoded an extra variable given by the product of the number of minor alleles of the 2 SNPs, thus imposing an additive × additive interaction model. The maximum likelihood estimates for each model were used to evaluate p values using the likelihood ratio test. An overall p value for all 14 interaction tests was calculated using the method previously described for single SNP association analyses in the GAW19 family GWAS data, using 10 million replicates.

Results

Table 1 lists the results for the SNP analyses using the imputed family GWAS data. It can be seen that no SNPs are below the Bonferroni corrected p value of 0.05/75 = 0.00067 for a family-wise error rate (FWER) of 0.05. Figure 1 shows the overall permutation distributions for the test statistics for each phenotype resulting in p values of 0.34, 0.17, 0.052, 0.25, and 0.53 for DBP, MAP, PP, SBP, and HTN respectively. This appears to provide some very weak evidence of association for PP; that is, at least 1 of the tested SNPs is significantly associated with PP.
Table 1

GWAS family data

SNPChrPositionDBPMAPPPSBPHTN
rs8803151107968660.750.00902
rs48460491118503650.0971
rs173675041118627780.07290.06950.2050.323
rs133065601118661830.685
rs50681119059740.646
rs1703061311131908070.967
rs293253811132165430.872
rs216913712044979130.276
rs200477612308487020.5990.771
rs1112258712308671000.635
rs3475913112901220.294
rs130827113275379090.6740.527
rs37743723418774140.0539
rs98153543419126510.0539
rs3196903479274840.795
rs41907631691008860.7080.5490.519
rs77264755325759140.2240.907
rs14218115327142700.577
rs11737665328045280.992
rs11737715328150280.9130.2660.360.857
rs1195363051578454020.2730.4450.86
rs22829787922644100.618
rs1747717771064118580.00845
rs391822671506901760.502
rs1022400271514150410.913
rs6613481119052920.158
rs2177271120169080.02550.0144
rs712922011103505380.0894
rs201440811163652820.604
rs38181511169022680.9450.7940.643
rs75708111173516830.2990.03220.0537
rs207431111174218600.238
rs374137811654089370.5990.352
rs633185111005935380.04190.05640.2180.289
rs11222084111302732300.504
rs103647715489149260.951
rs137894215750773670.6290.8750.7740.126
rs649512215751256450.1450.0616
rs207141015914209400.517
rs252150115914373880.5230.460.455
rs1294645417432081210.919
rs1760876617450132710.9430.936
rs1294088717474028070.4510.443
rs1694804817474404660.338

Replication p values in the GAW19 imputed family GWAS data for SNPs and phenotypes as indicated. Only previously associated SNP–phenotype combinations were investigated

Fig. 1

GWAS family data and expression data, permutation distributions. Permutation distributions for the GWAS and expression data p values with observed test statistics shown by the dashed lines

GWAS family data Replication p values in the GAW19 imputed family GWAS data for SNPs and phenotypes as indicated. Only previously associated SNP–phenotype combinations were investigated GWAS family data and expression data, permutation distributions. Permutation distributions for the GWAS and expression data p values with observed test statistics shown by the dashed lines Upper limits of the power to detect the tested SNP associations, at significance level 0.05, gave estimates ranging from approximately 0.080 to 0.30.

Unrelateds sequence data

Table 2 lists the results for SNP analyses using sequence data. No SNPs are below the Bonferroni corrected p value (0.05/12 = 0.0042) for a FWER of 0.05.
Table 2

Unrelateds sequence SNP data

SNPChromosomePositionDBPMAPPPSBP
rs37743723418774140.974
rs6613481119052920.497
rs2177271120169080.7890.490
rs75708111173516830.3750.3750.432
rs374137811654089370.9420.772
rs2472304a 15750442380.2460.2750.472

Replication p values in the GAW19 sequence data for SNPs and phenotypes as indicated. Only previously associated SNP-phenotype combinations were investigated

aUsing the proxy SNP rs1378942

Unrelateds sequence SNP data Replication p values in the GAW19 sequence data for SNPs and phenotypes as indicated. Only previously associated SNP-phenotype combinations were investigated aUsing the proxy SNP rs1378942 Table 3 lists the results for the SNP pair analyses using the GWAS and expression data. Two SNP pairs gave p values below the Bonferroni corrected p value of 0.0036 for a FWER of 0.05: rs4284750 and rs873870 on chromosome 19 interacted to influence gene expression at ATP13A1; and rs9979356 and rs3761385 on chromosome 21 interacted to influence gene expression at CSTB. The p values are generally lower than expected with a median of 0.223. Furthermore, the overall p value for the 14 tests, as shown in Fig. 1, is 5.6 × 10−6, and with the 2 significant SNP pairs removed the remaining 12 tests give an overall p value of 0.017, which provides reasonable evidence that at least 1 of these SNP pairs also interacts to influence the corresponding gene expression measurement.
Table 3

Gene expression data and SNP–SNP interactions

GeneProbeSNP 1Chr 1SNP 2Chr 2 p value
ATP13A1 GI_9966896-Srs428475019rs873870190.000101
CSTB GI_20357564-Srs997935621rs3761385210.000747
CTSC GI_22538439-Irs793023711rs556895110.020446
CTSC GI_22538438-Irs793023711rs556895110.636009
FN3KRP GI_20149679-Srs89809517rs9892064170.015972
GAA GI_11496988-Srs1115084717rs12602462170.240101
LAX1 GI_8923315-Srs18914321rs1090052010.205055
MBNL1 GI_41281590-Srs168643673rs1307920830.01683
MBNL1 GI_41281590-Srs77107385rs1306955930.938258
MBNL1 GI_41281590-Srs21867117rs1306955930.178989
MBNL1 GI_41281590-Srs119815137rs1306955930.307821
PRMT2 GI_4504494-Srs283937221rs11701058210.4209
TRA2A GI_33620726-Srs77765727rs1177019270.51894
VASP GI_4507868-Srs126422619rs2276470190.645204

Replication p values in the GAW19 GWAS data for interacting SNP pairs and gene probes as indicated

Gene expression data and SNP–SNP interactions Replication p values in the GAW19 GWAS data for interacting SNP pairs and gene probes as indicated

Discussion

The candidate SNPs and blood pressure phenotypes investigated here were previously detected in large meta-analyses or other replicated studies, giving considerable confidence that these SNPs are in fact genuinely associated. However, the sample size in the GAW19 family GWAS data consists of only 954 related individuals, giving power (for nominal p value 0.05) expected to be less than 0.080 to 0.30; perhaps it is not too unexpected that no associations were replicated. The sample size of the unrelateds sequence data, 1943 individuals, was greater, but nonetheless did not replicate any previously observed associations. Although the low sample sizes are the most obvious reason for the nonreplication, there may also be more subtle reasons for the nonreplication, such as the relatedness and ethnicity of the samples used. The quality and accuracy of the measured phenotypes may also be relevant, in particular whether individuals took HTN medicine or not. The SNP–SNP interactions previously shown to be associated with transcription did, however, show some evidence of replication, with 2 SNP pairs showing significant evidence of association and the remaining SNP pairs giving an overall p value of 0.017, indicating that at least 1 additional SNP pair is associated. It is argued that the power to detect such associations may be greater because of the more direct link between SNPs and transcription. We note, however, that the interpretation of such findings as representing genuine interactions (as opposed to haplotype effects, possibly marking an untyped causal variant) can be flawed when the SNPs are close to one another [8].

Conclusions

There was no evidence of replication using the GAW19 data for previously found SNP associations with blood pressure phenotypes, possibly because of the low sample size. However, there was some evidence of replication for SNP–SNP interactions associated with transcription. This may be the result of a greater power to detect associations with transcription than with more distantly related phenotypes.
  5 in total

1.  Rank truncated product of P-values, with application to genomewide association scans.

Authors:  Frank Dudbridge; Bobby P C Koeleman
Journal:  Genet Epidemiol       Date:  2003-12       Impact factor: 2.135

2.  Another explanation for apparent epistasis.

Authors:  Andrew R Wood; Marcus A Tuke; Mike A Nalls; Dena G Hernandez; Stefania Bandinelli; Andrew B Singleton; David Melzer; Luigi Ferrucci; Timothy M Frayling; Michael N Weedon
Journal:  Nature       Date:  2014-10-02       Impact factor: 49.962

3.  Gene-centric meta-analysis in 87,736 individuals of European ancestry identifies multiple blood-pressure-related loci.

Authors:  Vinicius Tragante; Michael R Barnes; Santhi K Ganesh; Matthew B Lanktree; Wei Guo; Nora Franceschini; Erin N Smith; Toby Johnson; Michael V Holmes; Sandosh Padmanabhan; Konrad J Karczewski; Berta Almoguera; John Barnard; Jens Baumert; Yen-Pei Christy Chang; Clara C Elbers; Martin Farrall; Mary E Fischer; Tom R Gaunt; Johannes M I H Gho; Christian Gieger; Anuj Goel; Yan Gong; Aaron Isaacs; Marcus E Kleber; Irene Mateo Leach; Caitrin W McDonough; Matthijs F L Meijs; Olle Melander; Christopher P Nelson; Ilja M Nolte; Nathan Pankratz; Tom S Price; Jonathan Shaffer; Sonia Shah; Maciej Tomaszewski; Peter J van der Most; Erik P A Van Iperen; Judith M Vonk; Kate Witkowska; Caroline O L Wong; Li Zhang; Amber L Beitelshees; Gerald S Berenson; Deepak L Bhatt; Morris Brown; Amber Burt; Rhonda M Cooper-DeHoff; John M Connell; Karen J Cruickshanks; Sean P Curtis; George Davey-Smith; Christian Delles; Ron T Gansevoort; Xiuqing Guo; Shen Haiqing; Claire E Hastie; Marten H Hofker; G Kees Hovingh; Daniel S Kim; Susan A Kirkland; Barbara E Klein; Ronald Klein; Yun R Li; Steffi Maiwald; Christopher Newton-Cheh; Eoin T O'Brien; N Charlotte Onland-Moret; Walter Palmas; Afshin Parsa; Brenda W Penninx; Mary Pettinger; Ramachandran S Vasan; Jane E Ranchalis; Paul M Ridker; Lynda M Rose; Peter Sever; Daichi Shimbo; Laura Steele; Ronald P Stolk; Barbara Thorand; Mieke D Trip; Cornelia M van Duijn; W Monique Verschuren; Cisca Wijmenga; Sharon Wyatt; J Hunter Young; Aeilko H Zwinderman; Connie R Bezzina; Eric Boerwinkle; Juan P Casas; Mark J Caulfield; Aravinda Chakravarti; Daniel I Chasman; Karina W Davidson; Pieter A Doevendans; Anna F Dominiczak; Garret A FitzGerald; John G Gums; Myriam Fornage; Hakon Hakonarson; Indrani Halder; Hans L Hillege; Thomas Illig; Gail P Jarvik; Julie A Johnson; John J P Kastelein; Wolfgang Koenig; Meena Kumari; Winfried März; Sarah S Murray; Jeffery R O'Connell; Albertine J Oldehinkel; James S Pankow; Daniel J Rader; Susan Redline; Muredach P Reilly; Eric E Schadt; Kandice Kottke-Marchant; Harold Snieder; Michael Snyder; Alice V Stanton; Martin D Tobin; André G Uitterlinden; Pim van der Harst; Yvonne T van der Schouw; Nilesh J Samani; Hugh Watkins; Andrew D Johnson; Alex P Reiner; Xiaofeng Zhu; Paul I W de Bakker; Daniel Levy; Folkert W Asselbergs; Patricia B Munroe; Brendan J Keating
Journal:  Am J Hum Genet       Date:  2014-02-20       Impact factor: 11.025

4.  Accounting for relatedness in family-based association studies: application to Genetic Analysis Workshop 18 data.

Authors:  Jakris Eu-Ahsunthornwattana; Richard Aj Howey; Heather J Cordell
Journal:  BMC Proc       Date:  2014-06-17

5.  Detection and replication of epistasis influencing transcription in humans.

Authors:  Gibran Hemani; Konstantin Shakhbazov; Harm-Jan Westra; Tonu Esko; Anjali K Henders; Allan F McRae; Jian Yang; Greg Gibson; Nicholas G Martin; Andres Metspalu; Lude Franke; Grant W Montgomery; Peter M Visscher; Joseph E Powell
Journal:  Nature       Date:  2014-02-26       Impact factor: 49.962

  5 in total

北京卡尤迪生物科技股份有限公司 © 2022-2023.