Young Lee1, Suyeon Park2, Sanghoon Moon3, Juyoung Lee4, Robert C Elston5, Woojoo Lee6, Sungho Won7. 1. The Center for Genome Science, Korea National Institute of Health, KCDC, Osong 361-951, Korea. lyou7688@gmail.com. 2. The Center for Genome Science, Korea National Institute of Health, KCDC, Osong 361-951, Korea. sooyeon1002@gmail.com. 3. The Center for Genome Science, Korea National Institute of Health, KCDC, Osong 361-951, Korea. jiap72@hanmail.net. 4. The Center for Genome Science, Korea National Institute of Health, KCDC, Osong 361-951, Korea. jylee_monte@naver.com. 5. Department of Epidemiology and Biostatistics, Case Western Reserve University, Cleveland, OH 44106, USA. robert.elston@cwru.edu. 6. Department of Statistics, Inha University, Incheon 402-751, Korea. 7. Department of Public Health Science, Seoul National University, Seoul 151-742, Korea. won1@snu.ac.kr.
Abstract
Longitudinal data enables detecting the effect of aging/time, and as a repeated measures design is statistically more efficient compared to cross-sectional data if the correlations between repeated measurements are not large. In particular, when genotyping cost is more expensive than phenotyping cost, the collection of longitudinal data can be an efficient strategy for genetic association analysis. However, in spite of these advantages, genome-wide association studies (GWAS) with longitudinal data have rarely been analyzed taking this into account. In this report, we calculate the required sample size to achieve 80% power at the genome-wide significance level for both longitudinal and cross-sectional data, and compare their statistical efficiency. Furthermore, we analyzed the GWAS of eight phenotypes with three observations on each individual in the Korean Association Resource (KARE). A linear mixed model allowing for the correlations between observations for each individual was applied to analyze the longitudinal data, and linear regression was used to analyze the first observation on each individual as cross-sectional data. We found 12 novel genome-wide significant disease susceptibility loci that were then confirmed in the Health Examination cohort, as well as some significant interactions between age/sex and SNPs.
Longitudinal data enables detecting the effect of aging/time, and as a repeated measures design is statistically more efficient compared to cross-sectional data if the correlations between repeated measurements are not large. In particular, when genotyping cost is more expensive than phenotyping cost, the collection of longitudinal data can be an efficient strategy for genetic association analysis. However, in spite of these advantages, genome-wide association studies (GWAS) with longitudinal data have rarely been analyzed taking this into account. In this report, we calculate the required sample size to achieve 80% power at the genome-wide significance level for both longitudinal and cross-sectional data, and compare their statistical efficiency. Furthermore, we analyzed the GWAS of eight phenotypes with three observations on each individual in the Korean Association Resource (KARE). A linear mixed model allowing for the correlations between observations for each individual was applied to analyze the longitudinal data, and linear regression was used to analyze the first observation on each individual as cross-sectional data. We found 12 novel genome-wide significant disease susceptibility loci that were then confirmed in the Health Examination cohort, as well as some significant interactions between age/sex and SNPs.
Disease prognosis and personalized medicine require the identification of genetic and non-genetic risk factors and, with the rapid improvement of genotyping technology, more than ten thousand genome-wide association studies (GWAS) have been conducted to discover disease susceptibility loci. Since the first such successful study in 2005 [1], more than ten thousand disease susceptibility loci have been successfully identified and these findings have improved our understanding of the genetic background of human diseases. However, in spite of these successes in GWAS, causal genetic variants identified by GWAS explain only a small proportion of the heritability [2,3]. Various reasons, including the common disease/rare variant hypothesis, have been put forward to explain this so-called missing heritability [4]. However, the missing heritability is partially attributable to a large number of false negative findings induced by insufficient sample sizes when controlling for multiple testing [5], and various strategies, such as GWAS using multiple phenotypes or longitudinal data [6,7], have been considered to overcome these problems. The analysis of multiple phenotypes can suffer from their inherent heterogeneity, but the analysis of the multiple measures of the same phenotype provided by longitudinal data may avoid this issue and, if measurement errors are substantial, GWAS with longitudinal data can be expected to mitigate the sample size problem.Even though there are few GWAS using longitudinal data [8,9,10], compared to cross-sectional data longitudinal data have various useful features. First, although phenotyping is sometimes more expensive than the cost of genotyping, in those situations where the cost of genotyping is more expensive than that of phenotyping, repeated measurements at different time points have the virtual effect of enlarging the sample size. Second, with longitudinal data, the total phenotypic variance can be decomposed into among-subject and within-subject components. Third, phenotypes at different time points can be compared with baseline phenotypes and any confounding effect due to age can be prevented. Fourth, the onset of some diseases is sometimes affected by genetic variants, and gene × age interaction can be estimated with better accuracy. In this report, we conducted GWAS with longitudinal data in the Korean Association Resource (KARE) cohort. Phenotypes in the KARE cohort were measured every two years from 2001 to 2005, and we performed GWAS for eight phenotypes with three repeated measurements: systolic blood pressure (SBP), diastolic blood pressure (DBP), fasting plasma glucose (GLU0), 2-h OGTT glucose (GLU120), height, body mass index (BMI), high-density lipoprotein (HDL) and aspartate aminotransferase (AST). Results from the longitudinal GWAS were compared with those from GWAS using cross-sectional data, and our results showed that GWAS using longitudinal data provided more significant results. We identified 12 novel variants associated with phenotypes: rs11067763 (near MED13L) for DBP; rs12991703 (near MARCO) and rs7197218 (in XYLT1) for GLU0; rs17178527 (in AK097143) for BMI; rs12292858 (in SIK3), rs11066280 (in HECTD4) and rs183786 (near ALDH1A2) for HDL; and rs10849915 (in CCDC63), rs3782889 (in MYL2), rs12229654 (near MYL2-CUX2), rs11066280 (in HECTD4) and rs2072134 (in OAS3) for log-transformed AST. These variant associations were found to replicate in the Health Examinee (HEXA) cohort and thus illustrate the practical value of a longitudinal data analysis.
2. Materials and Methods
2.1. The Korean Association Resource (KARE) Cohort
The KARE cohort consists of a total of 10,038 individuals (5018 and 5020 individuals from Ansung and Ansan, respectively). Participants ranged from 40 to 69 years old, and their phenotypes were consecutively measured with two-year intervals from 2001 to 2005. Among the 10,038 participants, 10,004 individuals were genotyped for 500,568 SNPs with the Affymetrix Genome-Wide Human SNP array 5.0. Individuals and SNPs with call rates less than 95% were excluded from the analysis. SNPs with p-values for Hardy-Weinberg equilibrium (HWE) less than 10−6, or with minor allele frequencies (MAF) less than 0.01, were eliminated. Furthermore, individuals with tumors, gender inconsistencies, or whose heterozygosity rates were more than 30%, or identity in state (IBS) more than 0.8, were excluded from the analysis [11]. In total, 8842 individuals with 352,228 SNPs were available at the baseline time-point. At the second and third time-points, there were some missing phenotypes, and phenotypes for 7568 and 6675 individuals, respectively, were available.
2.2. The Health Examinee (HEXA) Cohort
Independent individuals in the HEXA cohort were from a second population based cohort sample provided by the Health study. This study combines subjects from the Wonju, Pyeong Chang, Gangneung, Geumsan, and Naju regional cohorts in Korea. There are 120,000 participants in the HEXA cohort, 4302 of whom, between 40 and 69 years old, were randomly selected for genotyping with the Affymetrix Genome-Wide Human SNP array 6.0. Individuals in the HEXA cohort were used to replicate the significant findings found in the KARE cohort, and individuals with tumors, large heterozygosity rates, gender inconsistencies, evidence of non-Asian ancestry, or whose IBS was more than 0.8 or call rates were less than 95%, were excluded from analysis [12]. SNPs with p-values for HWE less than 10−6, genotype call rates less than 95%, or MAF less than 0.01, were excluded, and the remaining SNPs for 3703 individuals were used for the analysis. In particular, the HEXA cohort is a cross-sectional study, and if there are some SNPs which have a progressing effect on phenotypes, results from HEXA and KARE cohort could be heterogeneous.
2.3. Sample Size Calculation for a Longitudinal Study
Sample sizes required to achieve 0.8 power at the 10−8 significance level were calculated in both the presence and absence of population substructure. We denote the number of individuals and measurements for each individual by n and t, respectively. We assume that t is same for all individuals. The additively coded value of the genotype for individual i is denoted by X. We assume Hardy-Weinberg equilibrium and each individual’s genotype is assumed to follow a trinomial distribution. The effect of the disease allele is assumed to be β. We assume a matrix of environmental effects that does not contain time-varying covariates for individual i, denoted by Z, and its coefficient is assumed to be the column vector α. The columns of Z denote each covariate. The phenotype for individual i at time-point j is denoted by Y, and the corresponding t-dimensional vector by Y. Letting 1 be the t-dimensional column vector with elements 1, Y was assumed to be:
where indicates the random effect which explains the similarity of repeated measurements for each individual attributable to a non-polygenic effect. We denote σ2 = σ2 + σ2 + σ2 and ρ = σ2/σ2. Thus ρ measures the proportion of the variance of c relative to the phenotypic variance. In particular, we included in the model to provide a polygenic effect for individual i, and assumed as an approximation follows the multivariate normal distribution with mean vector 0 and variance-covariance matrix where Φ corresponds to the genetic relationship matrix. If
is 0, the correlation matrix R of Y becomes compound symmetric. The null hypothesis H0 is β = 0, and β is assumed to be β under the alternative hypothesis.For sample size calculation, we assumed that there is no covariate effect other than the genotypes, and then we could assume that the environmental variable is 1. Then if we let π = P(X = δ, Z = 1) = P(X = δ) for the coded genotype δ where δ1 = 0, δ2 = 1 and δ3 = 2, and K = π1(π2 + 2π3)2 + π2(π3 − π1)2 + π3(1 + π1 − π3)2, the required sample size for 1 − ϕ power at the significance level α can be derived to be:
(see the Appendix for the detailed derivation). Liu and Liang [13] derived the required sample sizes when X is binary and our results are based on their derivations. We extend their result to X with arbitrary many levels. For n, we assume that σ2 + σ 2 = 1 and:In the presence of population substructure, σ2 is assumed to be larger than 0. The required sample size cannot be directly calculated, and we calculated n by simulation studies based on a Monte Carlo method. In our simulations, we assumed that σ2 + σ2 + σ 2 = 1 and the effect of disease the allele, β, was calculated with the following assumptions:
where these equations indicate that the proportion of genetic variance explained by the causal genotype is 0.005 and heritability is 0.3. For convenience, Φ was assumed to be a compound symmetric matrix with off-diagonal elements 0.1. The off-diagonal elements are asymptotically equivalent to twice the kinship coefficient between two individuals, and 0.1 means that the individuals are genetically remote relatives.
2.4. Genome-Wide Association Studies (GWAS) Using Longitudinal and Cross-Sectional Data
In the Korean Association Resource (KARE) data, eight phenotypes (SBP, DBP, GLU0, GLU120, height, BMI, HDL, and AST) were observed every two years from 2001 to 2005, so that three observations were available for each phenotype. Genotypes and phenotypes for 8842 individuals were initially available. However, there were some missing phenotypes for follow-up observations, and only 7568 and 6675 individuals were observed for the second and third time-points, respectively. Reasons for dropout were not known, but may include death, immigration, and non-response.GWAS using longitudinal data were performed by generalized least squares using the nlme package in the R software. The phenotype for individual i is denoted by Y which is a three dimensional vector. The matrix Z indicates a covariate vector for environmental effects, including the intercept as the first column, and sex, age and age2 at the first time-point were included as covariates for all eight phenotypes. In particular, weight is known to be related to glucose levels, and thus it was included as an additional covariate for the GWAS of GLU0 and GLU120 [14,15]. The coefficient vector of Z is denoted by a vector α. The effect of the time interval can be understood as the effect of aging, and it was denoted by the vector w. Here, w is a three-dimensional row vector and its coefficient is η. The population substructure between individuals was adjusted for with the EIGENSTRAT approach [16] and the remaining potential bias unadjusted by EIGENSTRAT was further adjusted for by the genomic control method [17]. In particular, the IBS matrix is often better than the identity-by-descent matrix for capturing the long-distance relationships that result from variations at the population level [18] and we used the IBS matrix for EIGENSTRAT. The first five principal component (PC) scores accounted for 75% of the variation in the IBS matrix, and they were used as covariates to adjust for any population substructure. The PC score vectors for individual i and its coefficient vector are PC and γ, respectively. The additively coded value of the genotype for individual i is denoted by X. The effect of the disease allele is assumed to be β. The variance-covariance matrix for ε is denoted Σ and assumed to be an unstructured symmetric matrix. For longitudinal analysis, our final model is:
where , is distributed as for i = 1,2, …, 8842.Furthermore, we conducted GWAS using cross-sectional data for comparison with the GWAS using longitudinal data. For this we took the phenotypes at the first-time point in the KARE cohort, and the GWAS were conducted with linear regression. For the cross-sectional analysis, we used the same covariates except for time interval, with the linear model:
where is distributed as for i = 1,2, …, 8842. The results from longitudinal and cross-sectional data were compared.We tested whether there exist any interaction effects between our significant findings and environmental variables by adding interaction terms as covariates. We considered age, sex and time interval as environmental variables, and their statistical interaction with the SNPs were tested. Significant results were further tested for replication in the HEXA cohort (other than time interval, because the HEXA cohort only has cross-sectional data). Finally, we tested all the significant findings from the KARE cohort, as a discovery dataset, in the HEXA cohort, as a replication dataset.
3. Results
3.1. Sample Size for a Longitudinal Cohort Design
We calculated the sample size required to achieve 0.8 power at the genome-wide significance level α = 10−8 in both the absence and presence of population substructure. Figure 1 and Figure 2 respectively show the required sample size n, in the absence and presence of population substructure, as a function of the number of time points t and the common correlation ρ between phenotype measures on the same person. We found the required sample size n is proportionally related to t and inversely related to ρ. Sample size is minimized for small ρ and large t, and the effect of t on sample size is maximal when ρ = 0. These results illustrate the practical efficiency of GWAS with longitudinal data. For instance, if ρ = 0.4 and t = 3, then 5158 individuals are sufficient to achieve 0.8 power at the genome-wide significance level and, compared to cross-sectional data, genotyping costs for 3438 individuals can be saved vs. the cost of obtaining 6878 phenotypes.
Figure 1
Required sample size in the absence of population substructure. The sample size is indicated by n. The required sample size to achieve 0.8 power at the significance level α = 10−8 has been calculated as a function of t, the number of time points, and ρ, the correlation between measurements at two different time-points.
Figure 2
Required sample size in the presence of population substructure. The sample size is indicated by n. The required sample size to achieve 0.8 power at the significance level α = 10−8 has been calculated as a function of t, the number of time points, and ρ, the correlation between measurements at two different time-points.
Required sample size in the absence of population substructure. The sample size is indicated by n. The required sample size to achieve 0.8 power at the significance level α = 10−8 has been calculated as a function of t, the number of time points, and ρ, the correlation between measurements at two different time-points.
3.2. Genome-Wide Association Studies (GWAS) with Longitudinal Data for Eight Phenotypes
Table 1 and Table 2 provide descriptive statistics for sex, age and other available phenotypes from the KARE and HEXA cohorts. These show that the distributions of phenotypes are similar in the HEXA cohort the KARE cohort at each time point. We checked the normality of the eight phenotypes with histograms. In particular, AST was not normally distributed and so was log-transformed. Figure 3 shows that log-transformed AST and the other seven phenotypes on the original scale are about normally distributed, so these were used for the GWAS.
Table 1
Sample sizes for Korean Association Resource (KARE) cohort.
Time Point
KARE
1
2
3
N(Ansan/Ansung)
Age(s.d)
N(Ansan/Ansung)
Age(s.d)
N(Ansan/Ansung)
Age(s.d)
Male
2374/1809
51.78(8.79)
1967/1642
53.71(8.82)
1758/1424
55.58(8.71)
Female
2263/2396
52.61(9.02)
1764/2213
54.60(8.99)
1543/1950
56.48(8.90)
Total
4637/4205
52.22(8.92)
3731/3855
54.18(8.92)
3301/3374
56.05(8.82)
Table 2
Descriptive statistics for eight quantitative phenotypes examined in the Korean Association Resource (KARE) and Health Examinee (HEXA) cohorts.
Time Point
KARE Cohort
HEXA Cohort
1
2
3
Phenotype
Mean(s.d)
N
Mean(s.d)
N
Mean(s.d)
N
Mean(s.d)
N
SBP
121.65(18.61)
8842
118.6(17.3)
7504
116.6(16.62)
6646
121.69(14.36)
3703
DBP
80.26(11.46)
8842
78.49(10.96)
7504
77.69(10.25)
6646
77.05(9.84)
3703
GLU0
87.66(21.88)
8581
92.74(15.14)
6688
92.31(15.15)
5985
94.10(24.56)
3703
GLU120
126.76(51.03)
8387
125.77(41.59)
4865
134.07(50.56)
5985
Not available
height
160(8.67)
8842
159.93(8.74)
7461
159.95(8.76)
6596
161.49(8.10)
3703
BMI
24.6(3.12)
8838
24.59(3.09)
7456
24.52(3.05)
6596
23.96(2.88)
3703
HDL
44.65(10.09)
8841
46.27(9.90)
7495
44.04(10.25)
6640
54.60(13.27)
3703
AST
29.81(18.41)
8841
24.67(14.95)
7495
25.87(19.02)
6640
24.51(12.94)
3703
Figure 3
Histograms for SBP, DBP, GLU0, GLU120, HEIGHT, BMI, HDL and log AST in the KARE cohort.
Required sample size in the presence of population substructure. The sample size is indicated by n. The required sample size to achieve 0.8 power at the significance level α = 10−8 has been calculated as a function of t, the number of time points, and ρ, the correlation between measurements at two different time-points.Sample sizes for Korean Association Resource (KARE) cohort.Descriptive statistics for eight quantitative phenotypes examined in the Korean Association Resource (KARE) and Health Examinee (HEXA) cohorts.The results of the GWAS with longitudinal data in the KARE cohort were compared with the results from GWAS using cross-sectional data. For the cross-sectional data we used the phenotypes at the first time-point, applying linear regression. Population substructure was adjusted for with the EIGENSTRAT method, and five principal component (PC) scores were included as covariates in both the longitudinal and cross-sectional data analyses. We found that five PC scores explain roughly 75% of the kinship matrix, and Table 3 shows the estimated variance inflation factors, λ, obtained by genomic control [17]. The estimated variance inflation factors from the longitudinal data analyses were always slightly larger than those from the cross-sectional data analyses, which suggests that longitudinal data analysis tends to be more sensitive to population substructure. Even though more detailed sensitivity analyses are necessary to confirm whether the model assumption for longitudinal data analyses are satisfied, our findings are probably not affected by population substructure because the quantile-quantile (QQ) and Manhattan plots for the eight phenotypes in Supplementary Figures 1–4 consistently show the validity of our analysis.
Table 3
Inflation factors by genomic control.
Phenotype
Cross-Sectional Data
Longitudinal Data
SBP
1.040
1.051
DBP
1.028
1.053
GLU0
1.022
1.026
GLU120
1.026
1.037
height
1.069
1.071
BMI
1.046
1.052
HDL
1.034
1.039
log AST
1.023
1.030
Histograms for SBP, DBP, GLU0, GLU120, HEIGHT, BMI, HDL and log AST in the KARE cohort.Inflation factors by genomic control.We calculated the correlations between the different time-points for each phenotype and they are presented in Table 4. The correlations for height and BMI are usually very large and those for log AST are the smallest. Therefore, the improvement in power on using longitudinal data is expected to be the most substantial for log AST, and it seems to be almost negligible for height and BMI. Table 5 and Table 6 show the results from GWAS using longitudinal data and cross-sectional data in the KARE cohort, and the significant results were further tested in the HEXA cohort. The cross-sectional data for the KARE cohort are the first measurements for each individual in the longitudinal data. Cross-sectional and longitudinal data in the KARE cohort were analyzed with linear regression and a linear mixed model, respectively, and SNPs with p-values from either the cross-sectional or longitudinal data analysis less than 10−6 were selected for the replication studies. For the discovery analyses, the genome-wide significance level by Bonferroni correction is 1.4E − 07. For replication, we calculated the one-sided p-value for the direction from the longitudinal analysis using the KARE data, and used 0.05 as the significance level. Whenever the results from the two cohorts were in different directions, the p-values from the HEXA cohort were larger than 0.5. In Table 5 and Table 6, we added results from previous studies. If a SNP has not been significantly reported but SNPs in genes in linkage disequilibrium with it have been significantly reported, those SNPs are denoted by “*”. Table 5 and Table 6 show that GWAS using the longitudinal data in the KARE cohort identified 29 significant SNPs, 20 of which have been reported in previous GWAS, while the cross-sectional data identified only 19 genome-wide significant SNPs. Therefore we can conclude that the longitudinal data lead to substantial power improvement. In our GWAS using the longitudinal data, nine SNPs were newly detected, six of which were significantly replicated in the HEXA cohort.
Table 4
Correlations between different time-points for each phenotype.
Time point Phenotype
Correlation between time-points
1–2
2–3
1–3
SBP
0.608
0.604
0.552
DBP
0.550
0.596
0.517
GLU0
0.700
0.795
0.822
GLU120
0.675
0.748
0.715
height
0.984
0.985
0.984
BMI
0.942
0.941
0.916
HDL
0.690
0.684
0.667
log AST
0.444
0.432
0.468
Table 5
Results for SBP, DBP, GLU0 and GLU120. SNPs with p-values less than 10−6 from cross-sectional or longitudinal data are listed.
SNP
Chr
Position
Nearby Gene
Minor Allele
MAF
Discovery
Replication
Previously Published
Cross-Sectional
Longitudinal
Cross-Sectional
beta ± s.e
P
beta ± s.e
P
beta ± s.e
one-side P
SBP
rs17249754
12
88584717
ATP2B1
A
0.3732
−1.63 ± 0.27
9.73E − 10
−1.27 ± 0.22
1.11E − 08
−0.86 ± 0.49
5.30E − 03
Cho et al. NG 2009 [11]
rs11066280
12
111302166
in HECTD4
T
0.1717
−1.45 ± 0.34
2.52E − 05
−1.59 ± 0.29
2.95E − 08
−1.65 ± 0.43
6.35E − 05
Kato et al. NG 2011 [27]
rs2401887
14
89952963
CALM1
C
0.02125
−3.19 ± 0.93
5.89E − 04
−3.84 ± 0.78
8.58E − 07
DBP
rs10030362
4
102841866
in BANK1
C
0.2081
−0.72 ± 0.21
4.34E − 04
−0.87 ± 0.17
3.21E − 07
−0.22 ± 0.27
2.11E-01
* Zhang et al. Hypertension Res 2012 [43]
rs3025047
6
43854388
in VEGFA
A
0.01024
−2.65 ± 0.2
1.75E − 03
−3.64 ± 0.71
3.07E − 07
rs7100467
10
108153198
SORCS1
T
0.02356
−2.43 ± 0.53
1.81E − 04
−2.82 ± 0.55
3.43E − 07
rs17249754
12
88584717
ATP2B1
A
0.3732
−0.94 ± 0.17
4.33E − 08
−0.8 ± 0.14
2.01E − 08
−0.56 ± 0.23
8.50E − 03
Cho et al. NG 2009 [11]
rs11066280
12
111302166
in HECTD4
T
0.1717
−0.94 ± 0.22
1.97E − 05
−0.98 ± 0.18
9.25E − 08
−0.76 ± 0.38
5.35E − 03
Kato et al. NG 2011 [27]
rs11067763
12
114682724
MED13L
G
0.3297
−0.78 ± 0.18
1.04E − 05
−0.79 ± 0.15
8.75E − 08
0 ± 0.35
5.04E − 01
GLU0
rs12991703
2
119536716
MARCO
A
0.05655
3.62 ± 0.68
1.18E − 07
2.51 ± 0.58
1.55E − 05
−1.14 ± 0.7
9.11E − 01
rs7754840
6
20769229
in CDKAL1
C
0.4761
1.8 ± 0.32
1.72E − 08
1.78 ± 0.27
5.16E − 11
0.98 ± 0.74
3.99E − 02
Kwak et al. Diabetes 2012 [44]
rs9460546
6
20771611
in CDKAL1
G
0.4808
1.75 ± 0.32
3.38E − 08
1.76 ± 0.27
3.76E − 11
rs2191346
7
15020403
DGKB
C
0.2891
−1.72 ± 0.36
1.33E − 06
−1.53 ± 0.3
3.64E − 07
−0.62 ± 0.62
1.55E − 01
rs6494306
15
VPS13C
A
0.3435
−1.43 ± 0.33
1.71E − 05
−1.46 ± 0.28
1.92E − 07
−1.21 ± 0.59
2.08E − 02
* Manning et al. NG 2012 [45]
rs7197218
16
17319136
in XYLT1
G
0.01456
7.23 ± 0.68
1.23E − 07
4.29 ± 1.17
2.48E − 04
GLU120
rs7754840
6
20769229
in CDKAL1
C
0.4761
4.73 ± 0.78
1.51E − 09
4.73 ± 0.73
1.10E − 10
Kwak et al. Diabetes 2012 [44]
rs12229654
12
109898844
MYL2-CUX2
G
0.1426
−4.84 ± 1.11
1.21E − 05
−5.16 ± 1.03
5.84E − 07
Go et al. J Hum Genet 2013 [46]
rs2074356
12
111129784
in HECTD4
T
0.1467
−5.19 ± 1.09
2.02E − 06
−5.2 ± 1.02
3.44E − 07
Go et al. J Hum Genet 2013 [46]
rs6031492
20
42330963
in GDAPL1L
G
0.4949
3.84 ± 0.77
6.91E − 07
3.03 ± 0.72
2.81E − 05
rs2868088
20
42347066
GDAPL1L
A
0.4377
−3.99 ± 0.78
2.68E − 07
−3.54 ± 0.72
1.04E − 06
Table 6
Results for Height, BMI, HDL and log AST. SNPs with p-values less than 10−6 from cross-sectional or longitudinal data are listed.
SNP
Chr
Position
Nearby gene
Minor Allele
MAF
Discovery
Replication
Previously Published
Cross-sectional
Longitudinal
Cross-Sectional
beta ± s.e
P
beta ± s.e
P
beta ± s.e
one-side P
Height
rs17038182
1
118669928
SPAG17
G
0.4188
−0.45 ± 0.08
4.08E − 08
−0.45 ± 0.08
5.58E − 08
−0.13 ± 0.13
1.53E − 01
Cho et al. Nat Genet 2009 [11]
rs10513137
3
142626120
in ZBTB38
A
0.2605
0.49 ± 0.09
8.14E − 08
0.49 ± 0.09
5.85E − 08
0.43 ± 0.14
7.40E − 04
Kim et al. J Hum Genet 2009 [47]
rs6918981
6
34346492
RPL35P2-NUDT3
G
0.2092
0.55 ± 0.1
2.98E − 08
0.55 ± 0.1
1.72E − 08
0.1 ± 0.15
2.51E − 01
Kim et al. J Hum Genet 2009 [47]
BMI
rs17178527
6
141947773
in AK097143
A
0.2486
−0.32 ± 0.05
2.96E − 09
−0.31 ± 0.05
6.35E-09
0.05 ± 0.08
7.47E − 01
rs11000212
10
73625658
in ANAPC16 in ASCC1
G
0.2057
0.27 ± 0.06
1.85E − 06
0.28 ± 0.06
5.14E − 07
0.05 ± 0.08
2.90E − 01
rs9939609
16
52378028
in FTO
T
0.1262
0.34 ± 0.07
1.29E − 06
0.34 ± 0.07
7.36E − 07
0.23 ± 0.1
1.29E − 02
Cho et al. Nat Genet 2009 [11]
HDL
rs271
8
19857982
in LPL
T
0.2064
1.15 ± 0.19
4.84E − 10
1.12 ± 0.17
2.15E − 11
1.81 ± 0.38
7.70E − 07
rs17482753
8
19876926
LPL
T
0.1243
1.95 ± 0.23
8.83E − 18
1.91 ± 0.21
1.42E − 20
3.48 ± 0.46
2.71E − 14
Heid et al. Circ Cardiovasc Genet 2008 [48]
rs17410962
8
19892360
LPL
A
0.1244
1.95 ± 0.23
8.25E − 18
1.91 ± 0.21
1.76E − 20
3.48 ± 0.46
2.35E − 14
rs12686004
9
106693247
in ABCA1
T
0.2136
−1.26 ± 0.2
7.01E − 12
−1.37 ± 0.17
1.62E − 16
−1.19 ± 0.32
6.10E − 04
Kim et al. Nat Genet 2011 [12]
rs11216126
11
116122450
BUD13
C
0.2027
1.43 ± 0.19
2.69E − 14
1.36 ± 0.17
1.54E − 15
1.44 ± 0.5
7.45E − 05
Kim et al. Nat Genet 2011 [12]
rs6589566
11
116157633
in ZNF259
C
0.2176
−1.25 ± 0.18
1.10E − 11
−1.15 ± 0.17
4.47E − 12
−1.89 ± 0.32
8.15E − 08
* Waterworth et al. Arteriosclear Thromb Vasc Biol 2010 [49]
rs12292858
11
116319189
in SIK3
C
0.1759
1.05 ± 0.2
7.73E − 08
0.88 ± 0.18
7.68E − 07
0.9 ± 0.35
1.11E − 02
rs12229654
12
109898844
MYL2-CUX2
G
0.1426
−1.25 ± 0.24
6.42E − 09
−1.21 ± 0.2
7.35E − 10
−1.66 ± 0.46
1.25E − 04
Kim et al. Nat Genet 2011 [12]
rs2238153
12
110423930
in ATXN2
A
0.4579
−0.68 ± 0.16
8.76E − 06
−0.71±0.14
3.29E − 07
rs11066280
12
111302166
in HECTD4
T
0.1717
−1.4 ± 0.15
3.10E − 12
−1.35±0.18
1.17E − 13
−1.92 ± 0.4
8.95E − 07
rs2072134
12
111893559
in OAS3
A
0.1143
−1.39 ± 0.19
4.53E − 09
−1.31 ± 0.22
1.25E − 09
−1.36 ± 0.4
2.33E − 03
Kim et al. Nat Genet 2011 [12]
rs183786
15
56455402
ALDH1A2
T
0.305
−0.8 ± 0.16
8.53E − 07
−0.83 ± 0.15
2.33E − 08
−0.61 ± 0.33
3.13E-02
rs16940212
15
56481312
LIPC
T
0.3405
1.27 ± 0.16
1.05E − 15
1.3 ± 0.14
1.95E − 19
1.11 ± 0.47
2.04E − 04
Kim et al. Nat Genet 2011 [12]
rs6494005
15
56511816
in LIPC
G
0.2678
−0.89 ± 0.2
1.34E − 07
−0.89 ± 0.15
5.82E − 09
−0.79 ± 0.39
1.05E − 02
rs12708980
16
55569880
in CETP
C
0.0984
−1.67 ± 0.25
3.63E − 11
−1.65 ± 0.23
5.61E − 13
−1.88 ± 0.5
7.40E − 05
Kim et al. Nat Genet 2011 [12]
rs2156552
18
45435666
LIPG
A
0.164
−0.89 ± 0.21
1.53E − 05
−0.93 ± 0.19
5.90E − 07
−1.15 ± 0.4
2.29E − 03
Waterworth et al. Arteriosclear Thromb Vasc Biol 2010 [49]
rs4420638
19
50114786
APOC1
C
0.1121
−1.3 ± 0.16
4.21E − 08
−1.14 ± 0.21
1.23E − 07
−2.01 ± 0.47
1.88E − 05
Willer et al. Nat Genet 2013 [50]
AST
rs9837421
3
15322297
in SH3BP5
G
0.193
−0.02 ± 0.01
2.14E − 04
−0.03 ± 0.01
5.79E − 07
0 ± 0
5.32E − 01
rs10849915
12
109818005
in CCDC63
G
0.1758
−0.03 ± 0.01
2.00E − 06
−0.03 ± 0.01
1.81E − 08
−0.02 ± 0
3.68E − 02
rs3782889
12
109835038
in MYL2
C
0.1726
−0.03 ± 0.01
3.79E − 06
−0.03 ± 0.01
7.26E − 09
−0.02 ± 0
2.02E − 02
rs12229654
12
109898844
MYL2-CUX2
G
0.1426
−0.04 ± 0.01
7.34E − 08
−0.04 ± 0.01
4.74E − 11
−0.02 ± 0
2.80E − 02
rs11066280
12
111302166
in HECTD4
T
0.1717
−0.05 ± 0.01
8.17E − 13
−0.05 ± 0.01
1.70E − 18
−0.03 ± 0
1.94E − 04
rs11066453
12
111850004
in OAS1
G
0.1265
−0.03 ± 0.01
5.26E − 06
−0.03±0.01
8.72E − 07
−0.02 ± 0
7.75E − 02
rs2072134
12
111893559
in OAS3
A
0.1143
−0.04 ± 0.01
7.31E − 08
−0.04 ± 0.01
8.42E − 09
−0.02 ± 0
6.75E − 02
rs12483959
22
42657329
in PNPLA3
A
0.4157
0.03 ± 0
1.79E − 09
0.03 ± 0
1.02E − 12
0.03 ± 0
2.14E − 06
* Kamatani et al. Nat Genet 2010 [38]
rs2143571
22
42723019
in SAMM50
T
0.4136
0.02 ± 0.01
6.45E − 07
0.03 ± 0
7.73E − 10
0.03 ± 0
4.22E − 05
* Kawaguchi et al. PLoS One 2012 [39]
Correlations between different time-points for each phenotype.Table 5 shows that rs2401887 located in CALM1 is more significantly associated with SBP in the longitudinal data analysis. GWAS of DBP identified three significant SNPs; rs3025047 in the VEGFA gene, rs7100467 near SORCS1 and rs11067763 near MED13L. It has been reported that VEGFA is related to type-2 diabetes, coronary artery disease, age-related macular degeneration and body fat [19,20,21]. For GLU0, rs12991703 located near the MARCO gene was genome-wide significant using the cross-sectional data, and rs2191346 and rs6494306, which are respectively in linkage disequilibrium with DGKB and VPS13C, were more significant using the longitudinal data.rs7197218 in the XYLT1 gene, which is related to corneal astigmatism [22], was genome-wide significant using the cross-sectional data. rs6031492, located in GDAPL1L, is more significantly associated with GLU120 in the cross-sectional data analysis. Table 6 shows that we detected rs17178527 in AK097143 and rs11000212 in ANAPC16 as associated with BMI. rs12292858 in SIK3 was more significant using the cross-sectional data, and rs2238153 in ATXN2, rs11066280 in HECTD4 and rs183786 near ALDH1A2 were more significantly associated with HDL by longitudinal data analysis. ATXN2, HECTD4 and ALDH1A2 have been reported to have significant associations for phenotypes related to HDL [12,23,24,25,26,27,28,29,30,31,32,33,34], and the significant associations for HECTD4 and ALDH1A2 were successfully replicated in the HEXA cohort. We also performed GWAS of log-transformed AST, and Table 6 shows nine significant SNPs, rs9837421 in SH3BP5, rs10849915 in CCDC63, rs3782889 in MYL2, rs12229654 near MYL2-CUX2, rs11066280 in HECTD4, rs11066453 in OAS1, rs2072134 in OAS3, rs12483959 in PNPLA3 and rs2143571 in SAMM50. Previous studies have reported that SH3BP5, CCDC63, MYL2 and OAS3 are related to alcohol dependence phenotypes [35,36,37], and PNPLA3 and SAMM50 are related to nonalcoholic fatty liver disease [38,39], and so our results strengthen their importance in liver disease.We also performed association analysis to detect gene×environment interaction, and SNPs that interact with aging, sex and time interval were identified by using the longitudinal data in the KARE cohort. Table 7 and Table 8 list SNPs with p-values for gene×environment interaction less than 10−6. Table 7 shows that rs7197218 seems to be a promising candidate SNP for interaction with aging for GLU0, and Figure 4 shows that the age effects are substantially different for this SNP. However, the MAF of rs7197218 is 0.01456, and neither it nor any other SNPs that are in linkage disequilibrium with it, were found in the HEXA cohort. Thus the significant association of this SNP could not be confirmed and it will need to be further investigated in follow-up studies. Table 8 shows that rs2074356 and rs11066280 interact significantly with sex for HDL, and rs2074356, rs11066280 and rs12229654 do so for log-transformed AST. Interestingly, rs2074356 and rs11066280 have significant interaction effects with sex for both HDL and AST. We further confirmed these significant gene×environment interactions in the HEXA cohort. Based on the direction of the coefficients for these interactions, we calculated one-sided p-values, and the combined p-values by Fisher’s and Liptak’s methods [40,41]. It has been shown that the most efficient method is achieved by Liptak’s methods if the effect sizes are expected to be the same [42]. Table 9 shows that these significant interactions were further replicated in the HEXA cohort, and the combined p-values become smaller. Figure 4 shows that the effects of these SNPs are substantially different for males and females and, therefore, we can conclude that the effects of these SNPs are significantly different for males and females.
Table 7
Gene × environment interaction effect in the KARE cohort. Interactions of time interval with SNP were tested, and p-values for SNPs with genome-wide significant interaction are listed.
Effect
rs7197218 for GLU0
beta
Std.Error
p-value
SNP
5.92
1.21
1.08E − 06
time×SNP
−1.27
0.25
2.65E − 07
Table 8
Gene × environment interaction effect in the KARE cohort. Interactions of sex with SNPs were tested, and p-values for SNPs with genome-wide significant interaction are listed.
Effect
rs2074356 for HDL
rs11066280 for HDL
rs2074356 for Log(AST)
rs11066280 for Log(AST)
rs12229654 for Log(AST)
beta
Std.Error
p-value
beta
Std.Error
p-value
beta
Std.Error
p-value
beta
Std.Error
p-value
Beta
Std.Error
p-value
SNP
−4.80
0.62
7.28E − 15
−4.69
0.58
4.60E − 16
−0.13
0.02
5.52E − 13
−0.17
0.02
2.81E − 20
−0.15
0.02
1.32E − 19
sex×SNP
2.26
0.39
4.46E − 09
2.21
0.36
1.06E − 09
0.06
0.01
5.82E − 08
0.08
0.01
8.25E − 12
0.07
0.01
3.24E − 11
Figure 4
Interaction effects of SNPs with sex or time interval. (a) Mean of GLU0 at each time-point for each of two rs7197218 genotypes (circles indicate homozygous genotypes with no minor alleles, triangles indicate heterozygous genotypes); (b-f) phenotypic mean of each genotype for males and females (blue and red lines indicate male and female, respectively).
Table 9
Combined p-values for gene × environment interaction. For replication, interactions of sex with SNPs were tested in the HEXA cohort and a combined p-value was calculated using both Fisher’s and Liptak’s methods.
Sex × SNP
HEXA Cohort p-value
Combined p-value Using Fisher’s Method
Combined p-value Using Liptak’s Method
rs2074356 for HDL
3.58E − 01
3.39E − 08
2.52E − 07
rs11066280 for HDL
2.184E − 01
5.37E − 09
2.52E − 08
rs2074356 for Log(AST)
4.86E − 02
5.86E − 08
4.41E − 08
rs11066280 for Log(AST)
1.23E − 02
3.13E − 12
3.10E − 12
rs12229654 for Log(AST)
7.42E − 03
7.22E − 12
4.96E − 12
In summary, we can conclude that GWAS with longitudinal data provide an efficient strategy, and our overall results show that the improvement in power is substantial, its effect being inversely proportional to ρ.Results for SBP, DBP, GLU0 and GLU120. SNPs with p-values less than 10−6 from cross-sectional or longitudinal data are listed.Results for Height, BMI, HDL and log AST. SNPs with p-values less than 10−6 from cross-sectional or longitudinal data are listed.Gene × environment interaction effect in the KARE cohort. Interactions of time interval with SNP were tested, and p-values for SNPs with genome-wide significant interaction are listed.Gene × environment interaction effect in the KARE cohort. Interactions of sex with SNPs were tested, and p-values for SNPs with genome-wide significant interaction are listed.Interaction effects of SNPs with sex or time interval. (a) Mean of GLU0 at each time-point for each of two rs7197218 genotypes (circles indicate homozygous genotypes with no minor alleles, triangles indicate heterozygous genotypes); (b-f) phenotypic mean of each genotype for males and females (blue and red lines indicate male and female, respectively).
4. Discussion
It is well known that longitudinal analysis is useful to detect aging effects, and statistically efficient for detecting significant associations. In this report, we numerically calculated the sample sizes required to achieve statistical power at the genome-wide significance level, and our results showed that the power is proportionally related to the number of observations on each individual and inversely related to the correlation between the pairs of observations on an individual. In a large-scale genetic analysis, genotyping cost may be larger than the phenotyping cost, and then we can conclude that analyzing longitudinal data is an efficient strategy to improve the rate of false negative findings. However if the proportion of missing data is large, statistical power loss can be substantial; and if the missingness is not at random, even a small proportion of missing phenotypes can generate a serious bias [51]. In spite of the statistical efficiency of longitudinal data analysis, any possibility of potential bias from the missingness pattern should be carefully investigated; and it should be noted that a little carelessness can lead to a substantial bias.Combined p-values for gene × environment interaction. For replication, interactions of sex with SNPs were tested in the HEXA cohort and a combined p-value was calculated using both Fisher’s and Liptak’s methods.Furthermore, we performed GWAS with both longitudinal and cross-sectional data, and significant results from a longitudinal data analysis in the KARE cohort were further tested in the HEXA cohort. 12 SNPs that have not been reported elsewhere were identified, and the significant p-values from replication studies strengthened the possibility that they are causal. In particular, GWAS with longitudinal data showed that rs3025047 is significantly associated with DBP even though it is not significantly associated in GWAS with cross-sectional data. The MAF of rs3025047 is 0.01, so it is a variant with relatively low frequency. In the HEXA cohort, rs3025047 was not available, nor were any SNPs in linkage disequilibrium with it. Even though further studies are necessary to confirm whether rs3025047 is a true causal variant, our analysis results illustrate that GWAS using longitudinal data can be an efficient strategy for rare variant association analysis.During the last decade, more than ten thousand GWAS successfully identified disease susceptibility loci, and these findings increase our understanding of diseases. However, the so-called missing heritability [4] reveals that efficient analysis algorithms should be investigated, and GWAS of longitudinal data seem to provide a useful strategy that may bridge the gap.
5. Conclusions
Analyzed as a repeated measure design, the power of longitudinal data is proportionally related to the number of observations on each individual and inversely related to the correlation between the multiple observations on an individual. This facilitates finding causal SNPs and their interactions with environmental variables, as well as with age and sex. In two Korean cohorts it enabled us to find 12 novel genome-wide significant SNPs associated with eight phenotypes, and significant gene × environment interaction. Therefore, we can conclude that longitudinal data seem to provide efficient strategies for GWAS.
Authors: Robert J Klein; Caroline Zeiss; Emily Y Chew; Jen-Yue Tsai; Richard S Sackler; Chad Haynes; Alice K Henning; John Paul SanGiovanni; Shrikant M Mane; Susan T Mayne; Michael B Bracken; Frederick L Ferris; Jurg Ott; Colin Barnstable; Josephine Hoh Journal: Science Date: 2005-03-10 Impact factor: 47.728
Authors: Alkes L Price; Nick J Patterson; Robert M Plenge; Michael E Weinblatt; Nancy A Shadick; David Reich Journal: Nat Genet Date: 2006-07-23 Impact factor: 38.330
Authors: Iris M Heid; Eva Boes; Martina Müller; Barbara Kollerits; Claudia Lamina; Stefan Coassin; Christian Gieger; Angela Döring; Norman Klopp; Ruth Frikke-Schmidt; Anne Tybjaerg-Hansen; Anita Brandstätter; Andreas Luchner; Thomas Meitinger; H-Erich Wichmann; Florian Kronenberg Journal: Circ Cardiovasc Genet Date: 2008-10
Authors: Robert W Davies; George A Wells; Alexandre F R Stewart; Jeanette Erdmann; Svati H Shah; Jane F Ferguson; Alistair S Hall; Sonia S Anand; Mary S Burnett; Stephen E Epstein; Sonny Dandona; Li Chen; Janja Nahrstaedt; Christina Loley; Inke R König; William E Kraus; Christopher B Granger; James C Engert; Christian Hengstenberg; H-Erich Wichmann; Stefan Schreiber; W H Wilson Tang; Stephen G Ellis; Daniel J Rader; Stanley L Hazen; Muredach P Reilly; Nilesh J Samani; Heribert Schunkert; Robert Roberts; Ruth McPherson Journal: Circ Cardiovasc Genet Date: 2012-02-07
Authors: Yi Yu; Tushar R Bhangale; Jesen Fagerness; Stephan Ripke; Gudmar Thorleifsson; Perciliz L Tan; Eric H Souied; Andrea J Richardson; Joanna E Merriam; Gabriëlle H S Buitendijk; Robyn Reynolds; Soumya Raychaudhuri; Kimberly A Chin; Lucia Sobrin; Evangelos Evangelou; Phil H Lee; Aaron Y Lee; Nicolas Leveziel; Donald J Zack; Betsy Campochiaro; Peter Campochiaro; R Theodore Smith; Gaetano R Barile; Robyn H Guymer; Ruth Hogg; Usha Chakravarthy; Luba D Robman; Omar Gustafsson; Haraldur Sigurdsson; Ward Ortmann; Timothy W Behrens; Kari Stefansson; André G Uitterlinden; Cornelia M van Duijn; Johannes R Vingerling; Caroline C W Klaver; Rando Allikmets; Milam A Brantley; Paul N Baird; Nicholas Katsanis; Unnur Thorsteinsdottir; John P A Ioannidis; Mark J Daly; Robert R Graham; Johanna M Seddon Journal: Hum Mol Genet Date: 2011-06-10 Impact factor: 6.150
Authors: Cristen J Willer; Ellen M Schmidt; Sebanti Sengupta; Michael Boehnke; Panos Deloukas; Sekar Kathiresan; Karen L Mohlke; Erik Ingelsson; Gonçalo R Abecasis; Gina M Peloso; Stefan Gustafsson; Stavroula Kanoni; Andrea Ganna; Jin Chen; Martin L Buchkovich; Samia Mora; Jacques S Beckmann; Jennifer L Bragg-Gresham; Hsing-Yi Chang; Ayşe Demirkan; Heleen M Den Hertog; Ron Do; Louise A Donnelly; Georg B Ehret; Tõnu Esko; Mary F Feitosa; Teresa Ferreira; Krista Fischer; Pierre Fontanillas; Ross M Fraser; Daniel F Freitag; Deepti Gurdasani; Kauko Heikkilä; Elina Hyppönen; Aaron Isaacs; Anne U Jackson; Åsa Johansson; Toby Johnson; Marika Kaakinen; Johannes Kettunen; Marcus E Kleber; Xiaohui Li; Jian'an Luan; Leo-Pekka Lyytikäinen; Patrik K E Magnusson; Massimo Mangino; Evelin Mihailov; May E Montasser; Martina Müller-Nurasyid; Ilja M Nolte; Jeffrey R O'Connell; Cameron D Palmer; Markus Perola; Ann-Kristin Petersen; Serena Sanna; Richa Saxena; Susan K Service; Sonia Shah; Dmitry Shungin; Carlo Sidore; Ci Song; Rona J Strawbridge; Ida Surakka; Toshiko Tanaka; Tanya M Teslovich; Gudmar Thorleifsson; Evita G Van den Herik; Benjamin F Voight; Kelly A Volcik; Lindsay L Waite; Andrew Wong; Ying Wu; Weihua Zhang; Devin Absher; Gershim Asiki; Inês Barroso; Latonya F Been; Jennifer L Bolton; Lori L Bonnycastle; Paolo Brambilla; Mary S Burnett; Giancarlo Cesana; Maria Dimitriou; Alex S F Doney; Angela Döring; Paul Elliott; Stephen E Epstein; Gudmundur Ingi Eyjolfsson; Bruna Gigante; Mark O Goodarzi; Harald Grallert; Martha L Gravito; Christopher J Groves; Göran Hallmans; Anna-Liisa Hartikainen; Caroline Hayward; Dena Hernandez; Andrew A Hicks; Hilma Holm; Yi-Jen Hung; Thomas Illig; Michelle R Jones; Pontiano Kaleebu; John J P Kastelein; Kay-Tee Khaw; Eric Kim; Norman Klopp; Pirjo Komulainen; Meena Kumari; Claudia Langenberg; Terho Lehtimäki; Shih-Yi Lin; Jaana Lindström; Ruth J F Loos; François Mach; Wendy L McArdle; Christa Meisinger; Braxton D Mitchell; Gabrielle Müller; Ramaiah Nagaraja; Narisu Narisu; Tuomo V M Nieminen; Rebecca N Nsubuga; Isleifur Olafsson; Ken K Ong; Aarno Palotie; Theodore Papamarkou; Cristina Pomilla; Anneli Pouta; Daniel J Rader; Muredach P Reilly; Paul M Ridker; Fernando Rivadeneira; Igor Rudan; Aimo Ruokonen; Nilesh Samani; Hubert Scharnagl; Janet Seeley; Kaisa Silander; Alena Stančáková; Kathleen Stirrups; Amy J Swift; Laurence Tiret; Andre G Uitterlinden; L Joost van Pelt; Sailaja Vedantam; Nicholas Wainwright; Cisca Wijmenga; Sarah H Wild; Gonneke Willemsen; Tom Wilsgaard; James F Wilson; Elizabeth H Young; Jing Hua Zhao; Linda S Adair; Dominique Arveiler; Themistocles L Assimes; Stefania Bandinelli; Franklyn Bennett; Murielle Bochud; Bernhard O Boehm; Dorret I Boomsma; Ingrid B Borecki; Stefan R Bornstein; Pascal Bovet; Michel Burnier; Harry Campbell; Aravinda Chakravarti; John C Chambers; Yii-Der Ida Chen; Francis S Collins; Richard S Cooper; John Danesh; George Dedoussis; Ulf de Faire; Alan B Feranil; Jean Ferrières; Luigi Ferrucci; Nelson B Freimer; Christian Gieger; Leif C Groop; Vilmundur Gudnason; Ulf Gyllensten; Anders Hamsten; Tamara B Harris; Aroon Hingorani; Joel N Hirschhorn; Albert Hofman; G Kees Hovingh; Chao Agnes Hsiung; Steve E Humphries; Steven C Hunt; Kristian Hveem; Carlos Iribarren; Marjo-Riitta Järvelin; Antti Jula; Mika Kähönen; Jaakko Kaprio; Antero Kesäniemi; Mika Kivimaki; Jaspal S Kooner; Peter J Koudstaal; Ronald M Krauss; Diana Kuh; Johanna Kuusisto; Kirsten O Kyvik; Markku Laakso; Timo A Lakka; Lars Lind; Cecilia M Lindgren; Nicholas G Martin; Winfried März; Mark I McCarthy; Colin A McKenzie; Pierre Meneton; Andres Metspalu; Leena Moilanen; Andrew D Morris; Patricia B Munroe; Inger Njølstad; Nancy L Pedersen; Chris Power; Peter P Pramstaller; Jackie F Price; Bruce M Psaty; Thomas Quertermous; Rainer Rauramaa; Danish Saleheen; Veikko Salomaa; Dharambir K Sanghera; Jouko Saramies; Peter E H Schwarz; Wayne H-H Sheu; Alan R Shuldiner; Agneta Siegbahn; Tim D Spector; Kari Stefansson; David P Strachan; Bamidele O Tayo; Elena Tremoli; Jaakko Tuomilehto; Matti Uusitupa; Cornelia M van Duijn; Peter Vollenweider; Lars Wallentin; Nicholas J Wareham; John B Whitfield; Bruce H R Wolffenbuttel; Jose M Ordovas; Eric Boerwinkle; Colin N A Palmer; Unnur Thorsteinsdottir; Daniel I Chasman; Jerome I Rotter; Paul W Franks; Samuli Ripatti; L Adrienne Cupples; Manjinder S Sandhu; Stephen S Rich Journal: Nat Genet Date: 2013-10-06 Impact factor: 38.330
Authors: Paul F O'Reilly; Clive J Hoggart; Yotsawat Pomyen; Federico C F Calboli; Paul Elliott; Marjo-Riitta Jarvelin; Lachlan J M Coin Journal: PLoS One Date: 2012-05-02 Impact factor: 3.240
Authors: Bernadette Wendel; Sergi Papiol; Till F M Andlauer; Jörg Zimmermann; Jens Wiltfang; Carsten Spitzer; Fanny Senner; Eva C Schulte; Max Schmauß; Sabrina K Schaupp; Jonathan Repple; Eva Reininghaus; Jens Reimer; Daniela Reich-Erkelenz; Nils Opel; Igor Nenadić; Susanne Meinert; Carsten Konrad; Farahnaz Klöhn-Saghatolislam; Tilo Kircher; Janos L Kalman; Georg Juckel; Andreas Jansen; Markus Jäger; Maria Heilbronner; Martin von Hagen; Katrin Gade; Christian Figge; Andreas J Fallgatter; Detlef E Dietrich; Udo Dannlowski; Ashley L Comes; Monika Budde; Bernhard T Baune; Volker Arolt; Ion-George Anghelescu; Heike Anderson-Schmidt; Kristina Adorjan; Peter Falkai; Thomas G Schulze; Heike Bickeböller; Urs Heilbronner Journal: Transl Psychiatry Date: 2021-07-10 Impact factor: 6.222