Literature DB >> 31738745

A meta-analysis of genome-wide association studies of epigenetic age acceleration.

Jude Gibson1, Tom C Russ1,2,3,4, Toni-Kim Clarke1, David M Howard1, Robert F Hillary5, Kathryn L Evans4,5, Rosie M Walker4,5, Mairead L Bermingham5, Stewart W Morris5, Archie Campbell5,6, Caroline Hayward7, Alison D Murray8, David J Porteous4,5, Steve Horvath9,10, Ake T Lu9, Andrew M McIntosh1,4, Heather C Whalley1, Riccardo E Marioni4,5.   

Abstract

'Epigenetic age acceleration' is a valuable biomarker of ageing, predictive of morbidity and mortality, but for which the underlying biological mechanisms are not well established. Two commonly used measures, derived from DNA methylation, are Horvath-based (Horvath-EAA) and Hannum-based (Hannum-EAA) epigenetic age acceleration. We conducted genome-wide association studies of Horvath-EAA and Hannum-EAA in 13,493 unrelated individuals of European ancestry, to elucidate genetic determinants of differential epigenetic ageing. We identified ten independent SNPs associated with Horvath-EAA, five of which are novel. We also report 21 Horvath-EAA-associated genes including several involved in metabolism (NHLRC, TPMT) and immune system pathways (TRIM59, EDARADD). GWAS of Hannum-EAA identified one associated variant (rs1005277), and implicated 12 genes including several involved in innate immune system pathways (UBE2D3, MANBA, TRIM46), with metabolic functions (UBE2D3, MANBA), or linked to lifespan regulation (CISD2). Both measures had nominal inverse genetic correlations with father's age at death, a rough proxy for lifespan. Nominally significant genetic correlations between Hannum-EAA and lifestyle factors including smoking behaviours and education support the hypothesis that Hannum-based epigenetic ageing is sensitive to variations in environment, whereas Horvath-EAA is a more stable cellular ageing process. We identified novel SNPs and genes associated with epigenetic age acceleration, and highlighted differences in the genetic architecture of Horvath-based and Hannum-based epigenetic ageing measures. Understanding the biological mechanisms underlying individual differences in the rate of epigenetic ageing could help explain different trajectories of age-related decline.

Entities:  

Mesh:

Year:  2019        PMID: 31738745      PMCID: PMC6886870          DOI: 10.1371/journal.pgen.1008104

Source DB:  PubMed          Journal:  PLoS Genet        ISSN: 1553-7390            Impact factor:   5.917


Introduction

Ageing is associated with a decline in physical and cognitive health, and is the main risk factor for many debilitating and life-threatening conditions including cardiovascular disease, cancer, and neurodegeneration [1]. Ageing is a multi-dimensional construct, incorporating physical, psychosocial, and biological changes. Everyone experiences the same rate of chronological ageing, but the rate of ‘biological ageing’, age-related decline in physiological functions and tissues, differs between individuals. Various phenotypic and molecular biomarkers have been used to study biological ageing, including a number of 'biological clocks', the best known of which is telomere length. Telomeres shorten with increasing age, and telomere length has been found to predict morbidity and mortality [2]. More recently, research into epigenetics–chemical modifications to DNA without altering the genetic sequence–has yielded another method for measuring biological age. DNA methylation is an epigenetic modification, typically characterised by the addition of a methyl group to a cytosine-guanine dinucleotide (CpG) [3], that can influence gene expression and is associated with variation in complex phenotypes. This process is essential for normal development and is associated with a number of key processes including ageing. DNA methylation levels are dynamic, varying with age across the life course [4,5] and are influenced by both genetic and environmental factors [6]. Weighted averages of methylation at multiple CpG sites can be integrated into estimates of chronological age referred to as ‘epigenetic age’. Two influential studies have used this method to create ‘epigenetic clocks’, which accurately predict chronological age in humans. Hannum et al. used DNA methylation profiles from whole blood from two cohorts to identify 71 CpG sites that could be used to generate an estimate of age [7], while Horvath used data from 51 different tissue types from multiple studies to identify 353 CpG sites whose methylation levels can be combined to form an age predictor [8]. Hannum et al.’s clock is specific to blood samples, although it can be adjusted for different tissue types using linear models. The Horvath clock is widely applicable, with the same CpG set and the same algorithm being used irrespective of the DNA source. Although similar penalised regression models were used to select the CpG sites to be included in each of these epigenetic clocks, there is limited overlap in the CpGs included. The two measures are clearly related, but are thought to capture slightly different aspects of the biology of ageing [9]. The Hannum age estimator correlates with proportions of certain blood cells, reflecting its construction based on blood methylation data [9,10], and it is considered to track aspects of immunosenescence. The pan-tissue Horvath clock, constructed across a broad spectrum of tissue and cell types, is relatively uncorrelated with blood cell proportions [11], and is thought to capture cell-intrinsic changes in DNA methylation which might reflect an innate ageing process. Both the Hannum and Horvath epigenetic clocks are strongly correlated (r>0.95) with chronological age [7,8]. However, despite these high overall correlations, there can be substantial differences between epigenetic and chronological age at the individual level, and it is unclear what drives these differences. A greater epigenetic age relative to chronological age is commonly described as ‘epigenetic age acceleration’ (EAA), and implies that a person is biologically older than their years. EAA has been shown to be informative for both current and future health trajectories [9]. Recently, a growing number of studies have used EAA to investigate age-related disorders, and the epigenetic clock is increasingly being recognised as a valuable marker of biological ageing [10,12]. The simplest definition of epigenetic age acceleration is the residual that results from regressing epigenetic age on chronological age. However, it is well known that the abundance of different cell types in the blood changes with age [13,14], and hence two broad categories of EAA measures have been distinguished: those that are independent of age-related changes in blood cell composition, and those that incorporate and are enhanced by blood cell count information [10]. The former group, considered to reflect ‘pure’ epigenetic ageing effects that are not influenced by differences in blood cell counts, are often referred to as ‘intrinsic’ epigenetic age measures. The latter group up-weights the contributions of blood cell counts, thus leveraging known age-related changes to blood cell proportions to capture aspects of immunosenescence; these measures are referred to as ‘extrinsic’ epigenetic age measures. In keeping with previous work, this study focuses on two different epigenetic age measures, based on the Horvath and Hannum epigenetic clocks [7,8], and uses these to derive variations of EAA that are either independent of blood cell counts, or enhanced by changes in blood cell composition. Horvath-based epigenetic age follows the approach by Horvath (2013), and is defined as the predicted value of age based on the DNA methylation levels of the 353 CpG sites identified in his study [8]. Horvath-based epigenetic age acceleration (Horvath-EAA) is the residual term of a multivariate model regressing the Horvath-based epigenetic age estimate on chronological age and estimates of blood cell counts. It is by definition independent of both chronological age and age-related changes in the cellular composition of blood. Hannum-based epigenetic age is based on DNA methylation levels at the 71 CpGs identified by Hannum et al. (2013) [7]. Hannum-based epigenetic age acceleration (Hannum-EAA) is an enhanced version of the Hannum estimate which up-weights the contributions of age-associated blood cells. A weighted average of Hannum-based epigenetic age with blood cells whose abundance is known to change with age is calculated, and Hannum-EAA is then defined to be the residual variation from a univariate model regressing the weighted DNA methylation age estimate on chronological age. Hannum-EAA is independent of chronological age but in addition to cell-intrinsic epigenetic changes it also tracks age-related changes in blood cells. Full details of the calculation of Horvath-EAA and Hannum-EAA are given in . Horvath-EAA, described in previous publications as ‘intrinsic’ epigenetic age acceleration (IEAA), can be interpreted as a measure of cell-intrinsic ageing that exhibits preservation across multiple tissues, appears unrelated to lifestyle factors, and probably indicates a fundamental cell ageing process that is largely conserved across cell types [8,10]. In contrast, Hannum-EAA, referred to in previous studies as ‘extrinsic’ epigenetic age acceleration (EEAA), can be considered a biomarker of immune system ageing, explicitly incorporating aspects of immune system decline such as age-related changes in blood cell counts, correlating with lifestyle and health-span related characteristics, and thus yielding a stronger predictor of all-cause mortality [10,15]. It should be noted that as both the Horvath and Hannum epigenetic clocks correlate well with age, in a population with a wide age range they are guaranteed to correlate with each other. However, Horvath-based and Hannum-based epigenetic age acceleration estimates, i.e. the degree of divergence of epigenetic age from chronological age, are not guaranteed to be correlated. Previous studies have identified relationships between epigenetic ageing and numerous traits, including several age-related health outcomes, for example Alzheimer’s disease pathology [16], cognitive impairment [16], and age at menopause [17]. Higher EAA has been associated with poorer measures of physical and cognitive fitness [9] and higher risk of all-cause mortality [12]. Many associations are specific to either Horvath-EAA or Hannum-EAA, a discordance that may reflect the differences in the two estimates and supports the theory that they represent different aspects of ageing [15,18,19]. While EAA has been associated with various markers of physical and mental fitness, the mechanisms underlying epigenetic ageing remain largely unknown. There has been little research conducted thus far on genetic contributions to epigenetic age acceleration. However, Lu et al. (2018) recently published results of the first genome-wide association analysis of blood EAA in a sample of 9,907 individuals, identifying five genetic loci associated with Horvath-EAA and three Hannum-EAA-associated loci [20]. This current study, with a sample size of 13,493 individuals, constitutes the largest study of the genetic determinants of DNA methylation-based ageing to date. Single nucleotide polymorphism (SNP)-based and gene-based approaches were used to identify genes and loci associated with Hannum-based and Horvath-based estimates of EAA. Functional mapping and annotation of genetic associations were performed, alongside gene-based and gene-set analyses, in an attempt to elucidate the genes and pathways implicated in differential rates of epigenetic ageing between individuals and shed light on the underlying biological mechanisms. We report novel SNPs and genes associated with epigenetic age acceleration, and highlight differences in the genetic architectures of the Horvath-based and Hannum-based EAA measures.

Results

Estimation of epigenetic age and epigenetic age acceleration in the Generation Scotland sample

A summary of the estimated epigenetic age variables in Generation Scotland (GS) is given in . Both the Horvath- and Hannum-based estimates of biological age were highly correlated with chronological age (r = 0.94, SE = 0.005 and r = 0.93, SE = 0.005 respectively). The two DNA methylation age estimates were also highly correlated with each other (r = 0.93, SE = 0.005); however, the two estimates of epigenetic age acceleration, Horvath-EAA and Hannum-EAA, were only weakly correlated (r = 0.30, SE = 0.013) (). Scatter plots of A) Horvath-based epigenetic age versus Hannum-based epigenetic age, and B) Horvath-EAA vs Hannum-EAA, for the Generation Scotland sample.

GWAS of Horvath-EAA and Hannum-EAA in GS and replication of previously identified loci

The genome-wide association study (GWAS) for the GS cohort yielded two significant (P<5x10-8) variants for Horvath-EAA, but no SNPs achieved genome-wide significance for association with Hannum-EAA (minimum P-value 7.85x10-8) ( full output available online at https://doi.org/10.7488/ds/2631). There was a moderate genetic correlation between the two traits in the GS sample (rG = 0.597, SE = 0.279), and both measures had high genetic correlations with the previously reported findings of Lu et al. (rG = 0.724, SE = 0.312 and rG = 1.021, SE = 0.356 for Horvath-EAA and Hannum-EAA respectively). All the significant SNPs from the Lu et al. analysis of Horvath-EAA had the same direction of effect in GS (), with one attaining genome-wide significance (rs143093668, P-value = 3.53x10-8; remaining SNPs had P-values between 5.76x10-2 and 1.34x10-4). Two of the three significant SNPs from Lu et al.'s GWAS of Hannum-EAA had the same direction of effect in GS, although not at genome-wide significance levels in this smaller sample (P-values 1.76x10-3 and 1.75x10-4). Miami plots demonstrating a comparison between the EAA SNP association profiles in the GS and Lu et al. samples are shown in (Horvath-EAA) and (Hannum-EAA). Quantile-quantile plots (QQ plots) for the GWAS of Horvath-EAA and Hannum-EAA in GS are shown in .

Miami plot for GWAS of Horvath-EAA in the GS and Lu et al. cohorts.

SNP-based Miami plot comparing the results of genome-wide association analyses of Horvath-based epigenetic age acceleration in GS (top, n = 5,100) and Lu et al. (bottom, n = 8393), with—log10 transformed P-values for each SNP plotted against chromosomal location. The red line indicates the threshold for genome-wide significance (P<5×10−8) and the blue line for suggestive associations (P<1×10−5). Independent significant SNPs are annotated.

Miami plot for GWAS of Hannum-EAA in the GS and Lu et al. cohorts.

SNP-based Miami plot comparing the results of genome-wide association analyses of Hannum-based epigenetic age acceleration in GS (top, n = 5,100) and Lu et al. (bottom, n = 8393), with—log10 transformed P-values for each SNP plotted against chromosomal location. The red line indicates the threshold for genome-wide significance (P<5×10−8) and the blue line for suggestive associations (P<1×10−5). Independent significant SNPs are annotated.

GWAS meta-analysis

We conducted genome-wide association meta-analyses of Horvath-EAA and Hannum-EAA using 13,493 European-ancestry individuals aged between ten and 98 years from 12 cohorts, adjusting for sex. Manhattan plots for Horvath-EAA and Hannum-EAA are shown in , with QQ plots of the observed P-values versus those expected shown in . We did not find apparent evidence for genomic inflation in either the GS study (Horvath-EAA: genomic inflation factor λGC = 1.017, Linkage Disequilibrium (LD) score regression intercept (SE) = 1.002 (0.007); Hannum-EAA: λGC = 1.023, intercept (SE) = 0.998 (0.006), ) or the meta-analysis (Horvath-EAA: λGC = 1.035, intercept (SE) = 1.006 (0.008), Hannum-EAA: λGC = 1.044, intercept (SE) = 1.002 (0.007)); Lu et al. previously reported no evidence for genomic inflation for any of the individual studies making up their meta-analysis [20].

Manhattan plots for genome-wide meta-analyses (n = 13,493) of Horvath-based and Hannum-based epigenetic age acceleration.

SNP-based Manhattan plots for Horvath-EAA and Hannum-EAA, with—log10 transformed P-values for each SNP plotted against chromosomal location. The red line indicates the threshold for genome-wide significance (P<5×10−8) and the blue line for suggestive associations (P<1×10−5). Independent significant variants are annotated.

QQ plots for the meta-analyses of Horvath-based and Hannum-based epigenetic age acceleration.

Quantile-quantile plots for the genome-wide meta-analyses of Horvath-EAA and Hannum-EAA, showing the expected distribution of GWAS test statistics, -log10(p), versus the observed distribution. We identified 439 variants with a genome-wide significant association (P<5×10−8) with Horvath-EAA, of which ten were independent (r2<0.1 within a 250kb window). The significantly associated variants mapped to nine genomic loci on six chromosomes (, full details in ). Of the ten independent significant variants identified here, five were novel, that is, not within ± 500 Kb of a significant variant (P<5×10−8) reported by Lu et al. [20]. The novel findings were a SNP on chromosome 1q24.2 in the C1orf112 gene, three SNPs on chromosome three, at 3q21.3 (nearest gene: GATA2-AS1), 3q22.3 in the PIK3CB gene, and 3q25.1 in the LINC01214 gene, and a SNP on chromosome 12q23.3 (nearest genes: RP11-412D9.4 and TMEM263). The risk alleles at these loci conferred between 0.33 (SE = 0.054) and 1.34 (SE = 0.127) years higher Horvath-EAA (). These ten independent lead SNPs showed complete sign concordance for association with Horvath-EAA across GS and the Lu study (). Comparing the genomic loci identified in the current study with the five reported by Lu et al., only one locus that was previously reported was not identified at genome-wide significance here (rs11706810 at 3q25.33, meta-analysis P-value 8.68x10-8). shows the regional association plots for the independent signals, visualised in LocusZoom [21]. Of the ten independent SNPs achieving genome-wide significance, none associated with any other phenotype in currently published GWAS available via the NHGRI-EBI catalog. Genome-wide significance defined as having a P-value of P<5x10-8. A1 and A2 refer to the reference allele and non-reference allele for the index SNP, respectively. Freq (allele frequency), Beta (effect size), and SE (standard error of effect size) columns pertain to the reference allele, A1. Chromosome and position (in Mb) denote the location of the index SNP, and are given with regards to the GRCh38 assembly. a Genes are listed if located within +/- 10 kb of a listed SNP. The Hannum-EAA GWAS meta-analysis identified 324 genome-wide significant (P<5×10−8) associated variants mapping to a single genomic locus at 10p11.21 with one index SNP (, , full details of index SNP in ). ZNF25, a transcription factor associated with osteoblast differentiation of human skeletal stem cells [22], is the closest gene to this variant, at a distance of 20 Kb. At this Hannum-EAA-related locus, the risk allele conferred 0.53 (SE = 0.070) years higher Hannum-EAA. We replicated two of the three variants significantly associated with Hannum-EAA in the Lu et al. study; however, based on our clumping criteria with r2<0.1, we report only one as an independent significant SNP. Conditional analysis revealed no secondary signal at this locus. The third locus reported in the previous study was not associated at genome wide significance in this larger sample (P = 3.74x10-3). A regional association plot for 10p11.21 is shown in . Of the ten independent variants associated with Horvath-EAA, nine exhibited sign-consistent associations with Hannum-EAA, of which five attained at least nominal significance with association P-values less than 0.05 (most significant P = 6.9x10-5) (). The single independent SNP associated with Hannum-EAA also exhibited a nominal and sign-consistent association with Horvath-EAA (P = 0.011).

Methylation quantitative trait loci

Multiple studies have found that individual genotypes at specific loci can influence patterns of DNA methylation (e.g. [23,24]). These loci, referred to as methylation quantitative trait loci (mQTL) can influence methylation across extended genomic regions [23,24], and may underlie some SNP-phenotype associations. To evaluate whether mQTL are driving the observed associations between SNPs and epigenetic age acceleration in our analysis, we assessed whether any of the independent significant SNPs from the Horvath-EAA and Hannum-EAA GWAS meta-analysis are mQTL for any CpGs included in the Horvath or Hannum epigenetic clocks, using the methylation quantitative trait loci database (mQTLdb, [25]). The single Hannum-EAA genome-wide significant SNP, rs1005277, is an mQTL for 38 different CpGs across the five assessed time points (birth, childhood, adolescence, middle age, pregnancy). For 11 of these CpGs the mQTL is cis-acting (where the genetic variation occurs close to the methylation site), while it acts in trans (where variation occurs elsewhere in the genome) for the other 27 CpGs. None of these CpGs, however, are included in either the Horvath or Hannum epigenetic clocks. Nine of the ten Horvath-EAA independent significant SNPs are mQTL, for a total of 74 different CpGs in the mQTL database. Two of these CpGs, cg26297688 and cg01459453, are included in the Horvath clock only, while one, cg22736354, intersects with both the Hannum and Horvath clocks. Four of the Horvath-EAA SNPs are cis-acting mQTL for these clock CpGs at multiple time points, with two SNPs acting as mQTL for the same CpG (). These results suggest a potential mechanism of action whereby these SNPs influence biological ageing through their effect on methylation levels. A summary of the CpGs linked to each mQTL is shown in

Heritability

In order to characterise the genetic contribution to accelerated epigenetic ageing, SNP-based heritability was estimated using univariate LD score regression [26], which requires only GWAS summary statistics rather than full genotype data. The SNP-based heritabilities of Horvath-EAA and Hannum-EAA were estimated to be 0.154 (SE = 0.042) and 0.194 (SE = 0.040) respectively (), providing evidence for a genetic component to differential epigenetic ageing rates. These figures are comparable to previous SNP-based heritability estimates but lower than estimates based on pedigree relationships [20].

SNP functional annotation

We used FUMA [27] to functionally annotate SNPs in LD (r2≥0.6) with the independent significant SNPs for each of the epigenetic age acceleration measures. For Horvath-EAA, this resulted in functional annotation of 825 SNPs (). The vast majority of the SNPs were intergenic (44.85%) or intronic (47.88%), with only five (0.61%) exonic SNPs. 25 SNPs had CADD (Combined Annotation Dependent Depletion) scores greater than 12.37, surpassing the suggested threshold to be considered deleterious and thus providing evidence of pathogenicity [28]. The highest CADD scores were found in three exonic SNPs: rs1800460 and rs1142345 of TPMT and rs10949483 of NHLRC1 (CADD scores 28.40, 28.30 and 18.92 respectively), indicating potentially deleterious protein effects. Six SNPs (rs413147, rs12631035, rs9851887, rs12189658, rs6915893, rs12199316) had RegulomeDB scores below 2, suggesting that variation at these SNPs is likely to affect gene expression [29]. Almost all SNPs (98.18%) were in open chromatin regions. For Hannum-EAA, functional annotation of 1,382 candidate SNPs indicated a high proportion of intergenic SNPs (60.49%), while 11.79% were intronic and only three SNPs were located in exons (). 14 SNPs had CADD scores above 12.37, indicating that variation at these SNPs is potentially deleterious. Although 42.04% of the SNPs were located in open chromatin regions, there is little evidence that the Hannum-EAA-associated locus contains regulatory regions, as analysis using RegulomeDB, which integrates a larger collection of regulatory information encompassing protein binding, motifs, expression quantitative trait loci (eQTL), and histone modifications as well as chromatin structure, revealed only one SNP (rs2474568) with a score below 2.

eQTL and colocalisation analysis

For each independent SNP associated with Horvath-EAA or Hannum-EAA, evidence of eQTL was explored using the Genotype Tissue Expression (GTEx) v7 database [30]. Seven of the ten independent significantly associated SNPs for Horvath-EAA were identified as potential eQTL (). Notably, rs388649 is associated with expression of ESYT3, which has a role in lipid transport and metabolism pathways [31,32], expression of FAIM, which is associated with apoptosis and autophagy [33], in a number of skin and brain tissues, and PIK3CB, which regulates vital cell functions including proliferation and survival [34,35]. rs76244256, the variant most strongly associated with Horvath-EAA, shows eQTL evidence for NHLRC1 expression, which is associated with glycogen metabolism [36], across multiple tissues. We found no evidence for the Hannum-EAA-associated SNP, rs1005277, regulating gene expression. To further investigate the possibility that these SNPs act via regulating the expression of genes, we carried out colocalisation analysis using a Bayesian statistical method implemented in the 'coloc' package in R [37], which uses an approximate Bayes factor to estimate the posterior probability (PP) that a given variant is causal in both the GWAS and eQTL studies. We integrated our GWAS data with cis-eQTL data from the eQTLGen Consortium (https://www.eqtlgen.org/) [38] and analysed pairwise colocalisation within a +/- 200 kb window of each significant SNP. These analyses provide no evidence that the effect of these SNPs on accelerated epigenetic ageing is mediated through cis gene expression. There was no evidence for colocalisation of any Horvath-EAA or Hannum-EAA-associated SNP with cis-eQTL (PP for shared causal variant 8.31x10-15–0.030, ). Rather, in all but one case, the results support the hypothesis that there are two distinct causal variants affecting epigenetic age acceleration and transcript levels in the region (PP>0.95). In the +/-200 kb region surrounding variant rs2736099, there is strong evidence (PP>0.95) for a causal variant affecting gene expression, but not EAA.

Gene-based analysis

MAGMA (Multi-marker Analysis of GenoMic Annotation) v1.6 was used to identify gene-level associations with each EAA measure [39]. SNPs were mapped to 17,798 protein coding genes, with genome-wide significance defined at P = 0.05/17,798 = 2.809x10-6. A total of 21 genes attained genome-wide significance for association with Horvath-EAA (, full details in ). As expected, many of these genes were located in the same regions as the lead SNPs. Three genes at 6p22.3, NHLRC1, TPMT, and KDM1B, had the lowest P-values of 1.251x10-23, 4.639x10-23, and 7.68x10-11 respectively; all these genes are involved in metabolism-related pathways [36,40,41]. Although containing no genome-wide significant SNPs, 3q25.33 appears to be an important genomic region for Horvath-EAA, with four significantly associated genes including TRIM59 and KPNA4, which play roles in the immune system [42,43]. Two further significant genes are FAIM and TERT, whose functions include apoptosis and autophagy [33], and telomere length-associated ageing and apoptosis [44,45] respectively. Twelve genes were significantly associated with Hannum-EAA (, ). Genes of interest include MTRNR2L7, a neuroprotective and anti-apoptotic factor [46,47], and TRIM46 and MUC1, both located at 1q22, and which are involved with innate immune system pathways [42,48]. The 4q24 cytogenetic band houses several genes significantly associated with Hannum-EAA: MANBA and UBE2D3 have metabolic and innate immune system functions [32,49] while CISD2 regulates autophagy and is involved in life span control [50,51]. Comparing the results of the gene-based association analyses of Horvath-based and Hannum-based EAA, there was no overlap in significantly associated genes. Manhattan plots and QQ plots for the gene-based analysis of both epigenetic age acceleration measures are shown in . Genome-wide significant results after Bonferroni correction for multiple testing (P<2.809x10-6) are reported. N_SNPs is the number of SNPs in the gene.

Gene-set and pathway analysis

Using a competitive test of enrichment implemented in MAGMA v1.6, we did not identify any gene sets that were significantly associated with either Horvath-EAA or Hannum-EAA after Bonferroni correction for multiple testing. show the top 100 gene-sets for Horvath-EAA and Hannum-EAA respectively.

Genetic correlations

Several large-scale cohort studies have previously reported phenotypic associations between epigenetic age acceleration and a number of traits or health outcomes. To investigate whether these observed associations may be partly due to shared genetic variants influencing the traits, we conducted cross-trait LD score regression analysis of summary-level data [52], implemented in the online software LD Hub [53], to determine genetic correlations between Horvath-EAA/Hannum-EAA and a number of health and behavioural variables. The SNP-based genetic correlation between Horvath-EAA and Hannum-EAA was 0.571 (SE = 0.132, P = 1.605x10-5), suggesting a moderate overlap in the genetic factors influencing these two measures of epigenetic age acceleration. Of the 218 other health and behavioural traits investigated, none had a statistically significant genetic correlation (P<0.05) with either Horvath-EAA or Hannum-EAA after applying false discovery rate correction (most significant correlation with Horvath-EAA: father's age at death, P = 0.160; with Hannum-EAA: waist-to-hip ratio, P = 0.065). This correction, however, may be overly conservative, as not all the tested traits are independent, with several being highly correlated. Nominally significant correlations (P<0.05) were found with a number of traits (). Genetic correlations were determined using bivariate Linkage Disequilibrium score regression implemented in the online software LD Hub. SE is the standard error of the genetic correlation estimate; P-value is the association P-value for the genetic correlation estimate; ICV–intracranial volume; LDL–low density lipoprotein; IDL–intermediate density lipoprotein; VLDL–very low density lipoprotein. Both epigenetic age acceleration measures had nominally significant positive genetic correlations with a range of traits pertaining to adiposity, and negative correlations with father’s age at death and childhood IQ. Nominally significant genetic correlations were observed between Hannum-EAA, but not Horvath-EAA, and a wide range of traits including measures relating to education, smoking behaviour, various lipid- and cholesterol-related measures, diabetes and related glycemic measures, and parent’s age at death. Some of these results have previously been reported [19,20], but many are novel. The current study did, however, fail to replicate a number of previously reported correlations, including with age at menopause [20]. Details of the genetic correlations of all the tested traits with Horvath-EAA and Hannum-EAA are given in , respectively.

Discussion

This study investigated genetic markers of epigenetic ageing in a sample of 13,493 individuals of European ancestry. We examined genetic determinants of both Horvath-based (adjusted for the composition of age-related blood cells) and Hannum-based (immune system-associated) epigenetic age acceleration, sometimes referred to as ‘intrinsic’ and ‘extrinsic’ epigenetic age acceleration, to gain insight into the regulation of epigenetic ageing. We report several novel findings in addition to replicating a sub-set of previous results. The meta-analysis of Horvath-EAA identified ten independent associated SNPs, doubling the number reported to date, and highlighted 21 genes involved in Horvath-based epigenetic ageing. A single genome-wide significant variant was identified for Hannum-EAA, along with 12 implicated genes. We uncovered limited evidence of functionality within some associated genomic loci, with many SNPs located in regions of open chromatin and a smaller number in regulatory regions. Some loci also contained regions where genetic variation is predicted to be deleterious. It has been hypothesised that in some cases DNA methylation could be a candidate mechanism for mediating genetic effects on ageing-related phenotypes [54]. Intriguingly, four of the ten Horvath-EAA-associated SNPs are mQTL for CpGs used in the Horvath/Hannum epigenetic clocks. A possible interpretation of this is that the functional mechanism by which these SNPs influence the rate of biological ageing is via altering methylation levels. A number of the genes significantly associated with Horvath-EAA are related to metabolism (NHLRC1, TPMT, KDM1B, and ESYT3), consistent with several studies reporting phenotypic associations between Horvath-based EAA and metabolic syndrome characteristics and supporting the suggestion of a role in tracking metabolic ageing [15,19]. Others are involved in immune system pathways (TRIM59, KPNA4, EDARADD), while several have roles in cellular processes linked to ageing: apoptosis and autophagy (FAIM), ageing and autophagy (TERT), and coordinating vital cell functions (PIK3CB). PIK3CB plays a role in the signal transduction of insulin and insulin-like pathways [55], and genetic variants at this locus have been related to insulin-like growth factor levels in plasma, and human longevity [56]. Genes associated with Hannum-based EAA, often referred to as immune system ageing, include several involved in innate immune system pathways (e.g. TRIM46 and MUC1) or with metabolic and immune system functions (MANBA, UBE2D3). Other associated genes of interest include those with roles relating to ageing and longevity: MTRNR2L7 is a neuroprotective and anti-apoptotic factor, and CISD2 regulates autophagy and is a fundamentally important regulator of lifespan. Mouse studies indicate that CISD2 ameliorates age-associated degeneration of skin, skeletal muscle, and neurons, protects mitochondria from age-related damage and functional decline, and attenuates age-associated reduction in energy metabolism [57], while CISD2 deficiency leads to a number of phenotypic features suggestive of premature ageing [58]. Our LD score regression analysis replicated the positive genetic correlations with central adiposity reported by Lu et al. (2018) at nominal significance levels, supporting the suggestion that observed phenotypic associations [15,19] may result in part from a shared genetic aetiology. We did not, however, replicate previously reported correlations between Horvath-EAA and metabolic disease-related traits or diabetes, and found these traits to be correlated with Hannum-EAA at only nominal significance levels in our larger sample [20]. We also found no correlation between epigenetic age acceleration and age at menopause. Nominally significant genetic correlations between Hannum-based, but not Horvath-based, epigenetic age acceleration, and lifestyle factors such as smoking behaviour and education level, provide some evidence for a genetic basis underlying the phenotypic results we reported previously [19], and provide tentative support for the hypothesis that Hannum-based epigenetic ageing is relatively sensitive to changes in environment and lifestyle. Father’s age at death, a rough proxy for lifespan [59], was nominally significantly correlated with both EAA measures, and parents’ age at death was additionally correlated with Hannum-EAA, consistent with a body of work demonstrating robustly that EAA predicts life span [10,12]. Aside from these, genetic correlations with age-related traits were surprisingly few: it is possible that this could reflect an overly conservative correction for the multiple tests carried out, or low statistical power, rather than a genuine lack of correlations (). While the mean χ2 values (1.059 and 1.054 for Horvath-EAA and Hannum-EAA respectively) indicate a sufficient level of polygenicity within the dataset for use with LD score regression, the heritability Z-scores for Horvath-EAA and Hannum-EAA are 3.69 and 4.91 respectively. The recommendation is that genetic correlation analysis should be restricted to GWAS with a heritability Z-score of 4 or more, on the grounds of interpretability and power [53], so the Horvath-based results particularly should be interpreted with caution. This study of epigenetic age acceleration benefits from having a large sample size. Increasing GWAS sample size increases the power to detect associated loci, and is often achieved, as in this case, by combining smaller studies in a meta-analysis. Meta-analytic GWAS are, however, sometimes hampered by differences in how a trait is measured between individual studies. In this instance, use of the online calculator to calculate the EAA measures and using the same algorithm and output columns for each study, mitigates this. The current study comprises only individuals of European ancestry, which confers a further advantage as epigenetic ageing rates have been shown to differ between ethnicities [60]. Despite the large sample overlap, some results of this study differ from those reported by Lu et al. (2018). One reason for this could be that only European-ancestry individuals were included in this analysis whereas the Lu study reports results from a mixed ancestry sample. Another likely contributing factor is the age ranges involved: the GS cohort, not included in Lu’s analysis but which makes up 38% of the total sample in the current study, has a mean age of 48.5 years, 14.4 years younger than the mean age of the remaining cohorts. Given that epigenetic age changes over the life course, although not necessarily in parallel with chronological age, this could help explain the discrepancies between the studies. There are a number of limitations which should be considered when interpreting the results of this study. This is the largest meta-analysis of genetic determinants of epigenetic age acceleration to date, however, while large for these phenotypes, the size of the sample studies here is still small in terms of genome-wide analysis of polygenic traits. As only European-ancestry individuals were included, the results are not generalisable to other ethnicities. The MAGMA gene-based analysis identified a number of biologically plausible associated genes for both EAA measures; however, while many of these genes are located in the same genomic regions as the significantly associated SNPs, this should not be taken as evidence that the SNP association is effected through the gene. Identifying effector transcripts for GWAS variants is a difficult and as yet unresolved problem, and our knowledge of how these genes may affect the activity of the SNPs is limited. In addition, MAGMA does not take into account information from methylation QTL to help identify relevant genes; future work should place more emphasis on the role of mQTL. The lack of significant genetic correlations between EAA and age-related traits may reflect low statistical power (the heritability Z-score of 3.69 for Horvath-EAA falls below the recommended lower threshold of 4 for genetic correlation analysis) or overly stringent correction for multiple comparisons (FDR correction was applied over the 218 tested traits, however not all of these were independent) rather than a true absence of shared genetic aetiology. Finally, while we have identified a number of SNPs and genes significantly associated with EAA, including genes already known to be related to ageing, the analyses presented here fall short of providing a mechanistic explanation for how these variants and genes act to influence biological age. This study should be considered as 'discovery' research, with a comprehensive investigation of the functional and biological mechanisms behind the SNP and gene associations being a direction for future work. Horvath-based and Hannum-based epigenetic age acceleration are thought to represent different aspects of ageing. Hannum-EAA has been described as a biomarker of immune system ageing, and has been found to be associated with a wide range of traits [15,19], indicating a sensitivity to variations in environment and lifestyle. By contrast, Horvath-EAA is considered to be a fundamental, intrinsic cellular ageing process, largely unrelated to lifestyle factors, although associations with a range of metabolic syndrome characteristics suggest a role in tracking metabolic ageing processes. Our results reflect this to a large degree, with more nominally significant genetic correlations found with Hannum-EAA than Horvath-EAA, including items relating to education, smoking, intelligence, and various cholesterol measures. Meanwhile the greater number of significant variants, genomic loci, and genes associated with Horvath-EAA are consistent with the hypothesis that this measure of 'cell-intrinsic' ageing is less related to lifestyle and more under genetic control, and thus more likely to remain relatively stable. Despite these differences, however, our results indicate some common features. The significant genetic correlation of 0.57 between the two measures suggests a moderate overlap in the genetic factors influencing the two phenotypes despite the biomarkers being based on almost entirely distinct CpG sets. Both also appear to be influenced by genes associated with metabolic and immune system pathways, although the specific genes involved are different.

Conclusions

This study provided insight into the genetic determinants of differential biological ageing through the identification of genes and genetic variants associated with epigenetic age acceleration. We doubled the number of SNPs associated with Horvath-EAA reported to date, and report 21 genes significantly associated with this phenotype, including PIK3CB, linked to human longevity. We identified 12 Hannum-EAA-associated genes, one of which, CISD2, has a fundamental role in lifespan control. Our results also highlighted differences in the genetic architecture of the Horvath-based and Hannum-based EAA measures, with no genome-wide significant SNPs or genes common to the two, providing substantial support for the hypothesis that they represent different aspects of ageing. While the genetic information coded by our DNA sequence remains largely fixed throughout the lifetime, the expression of our genes is primarily regulated by epigenetic factors, which change over time. Epigenetic age increases with, but not in parallel with, chronological age; individual differences in the rate of epigenetic ageing potentially explain why trajectories of ageing differ between individuals. Understanding what causes these differences could potentially inform therapeutic interventions to delay the onset of age-related decline and improve ageing outcomes.

Methods

Ethics statement

Generation Scotland received ethical approval from the NHS Tayside Committee on Medical Research Ethics (REC Reference Number: 05/S1401/89). GS has also been granted Research Tissue Bank status by the Tayside Committee on Medical Research Ethics (REC Reference Number: 10/S1402/20), providing generic ethical approval for a wide range of uses within medical research. All participants provided written informed consent. Details of ethics approval and consent to participate for the cohorts included in the Lu et al. (2018) study can be found in their publication.

Generation Scotland cohort

We carried out genome-wide association analyses of Horvath-EAA and Hannum-EAA in a subset of individuals (n = 5,100) from the Generation Scotland: Scottish Family Health Study (GS) for whom both genetic and DNA methylation data were available. GS is a family- and population-based cohort recruited via general medical practices across Scotland; the recruitment protocol and sample characteristics are described in detail elsewhere [61,62]. In brief, the full cohort comprises 23,960 individuals aged between 18 and 98 years. Pedigree information was available for all participants, detailed socio-demographic and clinical data were collected, and biological samples were taken for genotyping.

DNA methylation and derivation of epigenetic age acceleration variables in GS

DNA methylation data were obtained from peripheral blood (n = 5,091) or saliva (n = 10) samples for 5,101 individuals from GS, with quality control checks carried out using standard methods outlined in , and described in full elsewhere [19]. After quality control (QC), the dataset comprised beta-values for 860,928 methylation loci. Methylation-based age estimates (DNAm age) and epigenetic age acceleration variables (Horvath-EAA and Hannum-EAA, described in ) were obtained from the online DNA Methylation Age Calculator (https://dnamage.genetics.ucla.edu/) developed by Horvath [8]. Normalised DNA methylation beta-values were submitted to the calculator, using the 'Advanced Analysis for Blood Data' option, and undergoing further normalisation within the calculator algorithm to make the data comparable to the training data of the epigenetic clock. One individual was flagged by the calculator as having a gender mismatch, and was therefore omitted from downstream analysis, leaving a total of 5,100 individuals for the GWAS of Horvath-EAA and Hannum-EAA in GS. Blood cell abundance measures were also estimated by the online calculator, based on DNA methylation levels, as described previously [63].

Genotyping, imputation, and quality control in GS

An overview of biological sample collection, DNA extraction, genotyping, imputation using the Haplotype Research Consortium reference panel (v1.1), and quality control for GS is included in ; full details have been described previously [64]. A total of 20,032 individuals passed all quality control thresholds. Following the removal of monomorphic or multiallelic variants and SNPs with a low imputation quality or a minor allele frequency below 1%, an imputed dataset with 8,633,288 hard called variants remained to be used in the genome-wide association analysis.

GWAS of Horvath-EAA and Hannum-EAA in GS

GWAS of Horvath-EAA and Hannum-EAA in GS were conducted using mixed linear model based association (MLMA) analysis [65], implemented in GCTA (Genome-wide Complex Trait Analysis) (v1.25) [66], and adjusting for sex to account for the higher epigenetic age acceleration in men than in women [7,12,60]. In order to account for population stratification, it is common to conduct ancestry-informative principal components analysis on the population in question, and use a number of the top-ranking principal components (PCs) from this analysis as covariates in the GWAS. However, as GS is a family-based sample, we employed a different approach to capture population structure. In place of PCs, two genomic relationship matrices (GRMs) were included in the GWAS, as this method has been shown to account for potential upward biases due to excessive relationships, and thus allows the inclusion of closely and distantly related individuals in genetic analyses [67]. The first GRM included pairwise relationship coefficients for all individuals, while the second had off-diagonal elements <0.05 set to 0; full details of the methods involved and construction of the GRMs is given elsewhere [68]. The results of univariate LD score regression analysis [26] () indicate that the two GRMs adequately accounted for population stratification, so it was not necessary to include ancestry-informative PCs in the GWAS.

GWAS meta-analysis of Horvath-EAA and Hannum-EAA

We obtained summary statistics from the largest European-ancestry analysis of epigenetic age acceleration to date (n = 8,393, Lu et al., 2018, summary information in ), and meta-analysed these with GS (details above). We chose not to include available data from non-European samples, despite the advantages of increased sample size, as different ethnicities have been shown to have different epigenetic ageing rates [60]. Association summary statistics from the GWAS of the two EAA phenotypes in GS and the Lu et al. study were meta-analysed using the inverse variance-weighted approach, which weights effect sizes by sampling distribution. This analysis was implemented in METAL [69], conditional on each variant being available in both samples. As SNPs which co-located with CpGs from the Hannum- or Horvath-based DNAm age predictors had already been excluded from Lu et al.'s analysis, it was not necessary to repeat this step. This resulted in 5,932,107 genetic variants for Horvath-EAA and 5,931,171 variants for Hannum-EAA, in a meta-analysis dataset containing 13,493 participants. The meta-analytic summary statistics produced by METAL were uploaded to FUMA (fuma.ctglab.nl) [27], which identified index SNPs and genomic risk loci related to epigenetic age acceleration. FUMA selects independent significant SNPs based on their having a genome-wide significant P-value (P<5x10-8) and being independent from each other (r2<0.6 by default) within a 250kb window. The European subset of the 1000 Genomes phase 3 reference panel [70] was used to map LD. SNPs in LD with these independent significant SNPs (r2≥0.6) within a 250kb window, and which have a minor allele frequency (MAF)>1% within the 1000 Genomes reference panel, were included for further annotation and used for gene prioritization. A subset of the independent significant SNPs, those in LD with each other at r2<0.1 within a 250kb window, were identified as lead SNPs. Genomic risk loci, including all independent signals that were physically close or overlapping in a single locus, were identified by merging any lead SNPs that were closer than 250kb apart (meaning that a genomic risk locus could contain multiple lead SNPs, with each locus represented by the lead SNP with the lowest P-value in that locus). Conditional analysis was implemented using GCTA software [66] to ascertain whether associated genetic loci harboured more than one independent causal variant, conditioning on the lead SNP at the locus and using GS as the reference panel for inferring the LD pattern. SNPs which remained significantly associated (P<5x10-8) with the phenotype after conditioning on the lead SNP were considered to be further independent associated variants. Manhattan plots and quantile-quantile plots were generated in R version 3.2.3 using the 'qqman' package, and regional SNP association results were visualised with LocusZoom [21]. SNPs which surpassed the threshold for genome-wide significance in our meta-analyses were checked against the NHGRI-EBI catalog of published GWAS [71,72] (www.ebi.ac.uk/gwas/) to determine whether they had previously been observed in association analysis. To ascertain whether the genome-wide significant associations from the Horvath-EAA and Hannum-EAA GWAS are confounded by methylation quantitative trait loci, we checked for SNP-CpG pairings in the mQTL database, a catalogue of the genetic influences on DNA methylation (mQTLdb, [25]). The independent significant SNPs from both GWAS were input to the database, using the MatrixEQTL database setting, which contains all associations below 1x10-7, and assessing all five time points (birth, adolescence, childhood, middle age, and pregnancy). A distance greater than or equal to 1 Mb was considered to be trans.

Heritability analysis

To estimate the SNP-based heritability for Horvath-EAA and Hannum-EAA, univariate Linkage Disequilibrium score regression [26] was applied to the GWAS summary statistics for both measures. This method also provides metrics to evaluate the proportion of inflation in the test statistics caused by confounding biases such as residual population stratification, relative to genuine polygenicity. We used pre-computed LD scores, estimated from the European-ancestry samples in the 1000 Genomes Project [73]. Functional annotation, using all SNPs located within the genomic risk loci which were nominally significant (P<0.05), had a MAF≥1%, and were in LD of r2≥0.6, was carried out in FUMA v1.3.0 [27]. In order to investigate the functional consequences of variation at these SNPs, they were first matched (based on chromosome, base pair position, reference and non-reference alleles) to a database containing functional annotations from a number of repositories: ANNOVAR (Annotate Variation) categories [74], used to identify a SNP's function and determine its position within the genome. Combined Annotation Dependent Depletion (CADD) scores [28], a measure of the deleteriousness of genetic variation at a SNP to protein structure and function, with higher scores indicating more deleterious variants. RegulomeDB (RDB) scores [29], based on data from eQTL as well as chromatin marks, with lower scores given to variants with the greatest evidence for having regulatory function. Chromatin states [75-77], indicating the level of accessibility of genomic regions, described on a 15 point scale, where lower chromatin scores indicate a greater level of accessibility to the genome at that site; generally, between 1 and 7 is considered an open chromatin state. Gene-based analysis was performed for each phenotype using the results of our association analysis, using default settings in MAGMA v1.6 [39], integrated within the FUMA web application. Summary statistics of SNPs located within protein-coding genes were aggregated to assess the simultaneous effect of all SNPs in the gene on the phenotype. The European panel of the 1000 Genomes phase 3 data was used as a reference panel to account for LD [70]. Genetic variants were assigned to protein-coding genes obtained from Ensembl build 85, resulting in 17,798 genes being analysed. After Bonferroni correction (α = 0.05/17,798), a threshold for genome-wide significant genes was defined at P<2.809×10−6. The independent genome-wide significant SNPs identified in the meta-analyses of Horvath-EAA and Hannum-EAA were assessed to determine whether they were potential eQTL, by mapping SNPs to genes if allelic variation at the SNP is associated with expression levels of the gene. This analysis was carried out using data from the Genotype Tissue Expression portal (GTEx) v7 [30], integrated within the FUMA web application. GTEx uses gene expression data from 48 different types of human tissue, linked to genotype data to provide information on eQTL. Since Horvath-EAA is derived from the pan-tissue Horvath epigenetic clock, eQTL analysis of the ten Horvath-EAA-associated SNPs used all the available tissue types in GTEx. Analysis for the Hannum-EAA SNP, however, was restricted to only the blood tissue types, as the Hannum epigenetic clock is specific to blood samples. eQTL mapping carried out within FUMA maps SNPs to genes which likely affect expression of those genes within 1Mb, i.e. cis-eQTL. Although FUMA contains all SNP-gene pairs of cis-eQTL, including non-significant associations, we limited our analysis to significant SNP-gene pairs, with a false discovery rate (FDR) ≤ 0.05 used as the cut-off to define significant eQTL associations. To further investigate the potential regulatory functions of the identified SNPs, we carried out colocalisation analysis to determine whether the SNPs are mediated through gene expression. We integrated our GWAS results with cis-eQTL data from the eQTLGen Consortium (https://www.eqtlgen.org/) [38], using a Bayesian method, 'coloc' [37], which evaluates whether the GWAS and eQTL associations best fit a model in which the same SNP is associated with both EAA and cis gene expression. This method, implemented in the 'coloc' package in R, tests pairwise colocalisation of SNPs in significant genomic regions in the GWAS with eQTLs, and generates posterior probabilities for each locus by weighing the evidence for competing hypotheses of no causal variants for either trait, causal variants for one trait only, independent causal variants influencing the two traits, or a shared causal SNP. We extracted summary statistics from the Horvath-EAA/Hannum-EAA meta-analytic GWAS results for all SNPs in a +/- 200 kb region around each genome-wide significant SNP, and extracted equivalent summary data for the same region in the eQTL analysis. Using the default prior probabilities in ‘coloc’, pairwise colocalisation was then tested between each GWAS-eQTL pair, with a posterior probability of ≥0.95 considered to be strong evidence in favour of a given hypothesis.

Gene-set analysis

To assess whether the Horvath-EAA and Hannum-EAA GWAS meta-analysis results are enriched for various gene-sets and provide insight into the involvement of specific biological pathways in the genetic aetiology of the phenotype, the gene-based analysis results were used to perform competitive gene-set and pathway analysis using default parameters in MAGMA v1.6, integrated within FUMA. The reference genome was 1000 genomes phase 3. This analysis used gene annotation files from the Molecular Signatures Database v5.2 for "Curated gene sets", covering chemical and genetic perturbations, and Canonical pathways, and "GO terms", covering three ontologies: biological process, cellular components, and molecular function. A total of 10,894 gene-sets were examined for enrichment in Horvath-EAA and Hannum-EAA, with a Bonferroni correction applied to control for multiple testing. Thus genome-wide significance was defined at P = 0.05/10,894 = 4.59x10-6. Cross trait LD score regression [52] was used to calculate genetic correlations between Horvath-based and Hannum-based EAA in our meta-analysis, and then between Horvath-EAA/Hannum-EAA and 218 other behavioural and disease-related traits for which GWAS summary data were available through LD Hub [53]; traits derived from non-Caucasian or mixed ethnicity samples were removed prior to analysis. This method exploits the correlational structure of SNPs across the genome and uses test statistics provided from GWAS summary estimates to calculate the genetic correlations between traits [52]. We checked whether our meta-analysis datasets had sufficient evidence of a polygenic signal, indicated by a heritability Z-score of >4 and a mean χ2 statistic of >1.02 [52]. By default, a MAF filter of >1% was applied, and indels and strand ambiguous SNPs were removed. We filtered to HapMap3 SNPs, and SNPs whose alleles did not match those in the 1000 Genomes European reference sample were removed. LD scores and weights for use with European populations were downloaded from (https://github.com/bulik/ldsc). We did not constrain the intercepts in our analysis, as we could not quantify the exact amount of sample overlap between cohorts. False discovery rate correction was applied across the 218 traits to correct for multiple testing [78].

Supplementary information.

S1 Text contains further information on the Hannum and Horvath epigenetic clocks, measures of epigenetic age and epigenetic age acceleration, DNA methylation in GS, derivation of epigenetic age and epigenetic age acceleration variables in GS, genotyping, imputation, and quality control in GS. (DOCX) Click here for additional data file.

Summary of age and estimated epigenetic age variables in Generation Scotland.

(XLSX) Click here for additional data file.

Independent variants with a genome-wide significant association (P<5x10-8) with epigenetic age acceleration in the Generation Scotland cohort.

(XLSX) Click here for additional data file.

Independent variants with a P-value <5x10-8 for association with Horvath-EAA/Hannum-EAA in the Lu et al. sample, and their corresponding effect size and significance in the Generation Scotland cohort.

(XLSX) Click here for additional data file.

Estimated polygenicity and SNP-based heritability using LD score regression.

(XLSX) Click here for additional data file.

Full details of independent variants with a genome-wide significant association (P<5x10-8) with Horvath-based or Hannum-based epigenetic age acceleration, ordered by chromosomal location.

(XLSX) Click here for additional data file.

Independent variants with a P-value <5x10-8 for association with Horvath-EAA/Hannum-EAA in the meta-analysis, and their corresponding effect size and significance in the Generation Scotland and Lu samples.

(XLSX) Click here for additional data file.

Summary of the independent variants significantly associated with either Horvath-EAA or Hannum-EAA, and their association with both epigenetic age acceleration measures.

(XLSX) Click here for additional data file.

Horvath-EAA-associated SNPs which act as methylation quantitative trait loci for CpGs included in the Horvath or Hannum epigenetic clocks.

(XLSX) Click here for additional data file.

Summary of Horvath-EAA and Hannum-EAA associated SNPs which act as methylation QTL for CpGs in the mQTL database.

(XLSX) Click here for additional data file.

Functional annotation of all SNPs in LD (r2≤0.6) with FUMA-identified independent significant SNPs for Horvath-EAA.

(XLSX) Click here for additional data file.

Functional annotation of all SNPs in LD (r2≤0.6) with FUMA-identified independent significant SNPs for Hannum-EAA.

(XLSX) Click here for additional data file.

Expression quantitative trait loci identified by analysis of independent significant variants for Horvath-EAA and the significance of their expression in the specified tissues.

(XLSX) Click here for additional data file.

Results of colocalisation analysis in the regions surrounding genome-wide significant SNPs for Horvath-EAA and Hannum-EAA.

(XLSX) Click here for additional data file.

Genome-wide significant gene-based results (P<2.809x10-6) obtained by MAGMA gene-based association analyses of Horvath-EAA and Hannum-EAA.

(XLSX) Click here for additional data file.

Most associated gene sets for the GWAS meta-analysis of Horvath-EAA.

(XLSX) Click here for additional data file.

Most associated gene sets for the GWAS meta-analysis of Hannum-EAA.

(XLSX) Click here for additional data file.

Genetic correlations between Horvath-EAA and 218 other health and behavioural traits, using bivariate Linkage Disequilibrium score regression.

(XLSX) Click here for additional data file.

Genetic correlations between Hannum-EAA and 218 other health and behavioural traits, using bivariate Linkage Disequilibrium score regression.

(XLSX) Click here for additional data file.

Overview of study datasets.

(XLSX) Click here for additional data file.

QQ plots for the GWAS of Horvath-EAA and Hannum-EAA in GS, showing the expected distribution of GWAS test statistics, -log10(p), versus the observed distribution.

(DOCX) Click here for additional data file.

Regional association plots for Horvath-EAA and Hannum-EAA associated SNPs.

Regional association plots for nine independent significantly associated SNPs for Horvath-EAA (A-I) and the single independent significantly associated SNP for Hannum-EAA (J), showing LD with SNPs in the surrounding region. Plots were produced in LocusZoom. The SNP association P-value is given on the y-axis, and SNP position, with gene annotation, on the x-axis. LD calculations are taken from hg19/1000 Genomes European build. Individual SNPs are coloured according to the strength of LD (r2) with the lead SNP. The highest association signal in each panel, highlighted in violet, are as follows: A: rs1011267, an intronic SNP in C1orf112 on chromosome 1; B: rs79070372, a non-coding transcript variant on chromosome 3 (closest genes GATA2/AS1); C: rs388649, an intronic SNP in PIK3CB on chromosome 3; D: rs6440667, an intronic SNP in LINC01214 on chromosome 3; E: rs2736099, an intronic SNP in TERT on chromosome 5; F: rs76244256, an intron variant in TPMT on chromosome 6 and the top ranking SNP for association with Horvath-EAA *This genomic locus contains a second independent associated SNP, intergenic variant rs7744541 (nearest gene NHLRC1), which remained significantly associated (P<5x10-8) with Horvath-EAA after conditioning on the lead SNP; G: rs4712953, an intronic SNP in SCGN on chromosome 6; H: rs10778517, a SNP of unknown function on chromosome 12 (nearest genes RP11-412D9.4 and TMEM263); I: rs62078811, an intron variant in STXBP4 on chromosome 17; J: rs1005277, the single independent Hannum-EAA significant associated SNP, a SNP of unknown function on chromosome 10 (nearest gene ZNF25). (DOCX) Click here for additional data file.

Manhattan plots for the MAGMA gene-based association analysis for the GWAS meta-analysis (n = 13,493) of Horvath-based epigenetic age acceleration and Hannum-based epigenetic age acceleration, with—log10 transformed P-values for each gene plotted against chromosomal location.

The dotted line denotes genome-wide significance, defined at P = 0.05/17798 = 2.809x10-6. Genes whose P-value reached genome-wide significance are labelled on the plots. (DOCX) Click here for additional data file.

QQ plots for the gene-based association analyses of Horvath-EAA and Hannum-EAA, showing the expected distribution of test statistics, -log10(p), versus the observed distribution.

(DOCX) Click here for additional data file.
Table 1

Independent variants with a meta-analysis genome-wide significant association with Horvath-based or Hannum-based epigenetic age acceleration.

PhenotypeIndex SNPChromosomePositionA1/A2FreqBetaSEP-valueGeneafunctionPreviously reported
Horvath-EAArs10112671q24.2169677720A/G0.503-0.3270.0541.579E-09C1orf112intron variantnovel
rs790703723q21.3128510481A/G0.1110.5050.0876.074E-09GATA2-AS1non coding transcript variantnovel
rs3886493q22.3138777967A/T0.495-0.3380.0556.054E-10PIK3CBintron variantnovel
rs64406673q25.1150287063C/G0.1610.4400.0754.28E-09LINC01214intron variantnovel
rs27360995p15.331287225A/G0.3670.3730.0618.58E-10TERTintron variantyes
rs77445416p22.318104469A/T0.4180.4390.0551.93E-15intergenic variantyes
rs762442566p22.318140332T/C0.046-1.3410.1276.231E-26TPMTintron variantyes
rs47129536p22.225671618A/T0.7250.3460.0593.604E-09SCGNintron variantyes
rs1077851712q23.3106947886T/G0.5650.3350.0544.46E-10RP11-412D9.4/TMEM263unknownnovel
rs6207881117q2255031815A/G0.218-0.3690.0651.158E-08STXBP4intron variantyes
Hannum-EAArs100527710p11.2137929331A/C0.3010.5330.0702.173E-14unknownyes

Genome-wide significance defined as having a P-value of P<5x10-8. A1 and A2 refer to the reference allele and non-reference allele for the index SNP, respectively. Freq (allele frequency), Beta (effect size), and SE (standard error of effect size) columns pertain to the reference allele, A1. Chromosome and position (in Mb) denote the location of the index SNP, and are given with regards to the GRCh38 assembly.

a Genes are listed if located within +/- 10 kb of a listed SNP.

Table 2

Results of MAGMA gene-based association analysis for Horvath-based and Hannum-based epigenetic age acceleration.

PhenotypeGeneChrN_SNPsP-valueFunction/related pathways
Horvath-EAASELP11481.6883E-07Immunoglobulin E responsiveness
EDARADD15164.2627E-07Innate immune system, cytokine signalling in immune system
GATA23721.0663E-07Stem cell maintenance and hematopoietic development
ESYT33936.5973E-07Metabolism, lipid transport
CEP7031641.0891E-06Organelle biogenesis and maintenance
FAIM3492.4177E-08Apoptosis and autophagy; regulates B-cell signalling and differentiation
PIK3CB31752.52E-08Coordinates cell functions e.g. proliferation, survival, migration
IFT8031501.2622E-07Organelle biogenesis and maintenance; intraflagellar transport
SMC43779.8909E-07Changes in chromosome structure during mitotic segregation
TRIM5931051.7737E-07Multifunctional regulator for innate immune signalling pathways
KPNA431042.9437E-07Cytokine signalling in immune system; protein transporter activity
TERT5904.0455E-08Roles in ageing and apoptosis; regulation of telomerase.
NHLRC16761.2512E-23Clearance of toxic polyglucosan and protein aggregates; metabolism pathways
TPMT61514.6385E-23Drug metabolism—cytochrome P450; thiopurine S methyltransferase activity
KDM1B62487.6758E-11Metabolism of proteins, regulates histone lysing methylation
SCGN62023.8379E-10Calcium binding protein; neuroscience, Ca, cAMP and lipid signalling pathways
TMEM72101231.1154E-06Transmembrane protein
RFX4122999.4786E-07Transcriptional regulatory network in embryonic stem cell
RIC8B121551.0988E-06Can activate some G-alpha proteins; odorant signal transduction.
C12orf23/TMEM26312986.321E-09Transmembrane protein
ZNF7022792.7934E-06Transcriptional regulation; gene expression pathways
Hannum-EAATRIM461272.66E-06Innate immune system; cytokine signalling in immune system
MUC11157.33E-07Cytokine signalling in immune system; bacterial infections in CF airways
MANBA41901.31E-06Glycosaminoglycan metabolism; innate immune system
UBE2D341541.19E-06Metabolism of proteins; innate immune system
CISD24541.18E-06Regulator of autophagy; life span control; glucose/energy metabolism pathways
SLC9B142172.56E-06Sperm motility and fertility, ion channel transport
MTRNR2L710525.20E-07Neuroprotective and antiapoptotic factor
ZNF248102042.22E-07Transcriptional regulation; gene expression pathways
ZNF2510991.23E-08Transcriptional regulation; gene expression pathways
ZNF33A101598.29E-09Transcriptional regulation; gene expression pathways
ZNF37A101463.79E-10Transcriptional regulation; gene expression pathways
DNTT101312.51E-07DNA double-strand break repair; hematopoietic cell lineage

Genome-wide significant results after Bonferroni correction for multiple testing (P<2.809x10-6) are reported. N_SNPs is the number of SNPs in the gene.

Table 3

Nominally significant genetic correlations (P<0.05) between Horvath-EAA/Hannum-EAA and other health and behavioural traits.

PhenotypeTraitGenetic CorrelationSEP-value
Horvath-EAAFathers age at death-0.4720.1440.001
Urate0.2780.0890.002
Waist-to-hip ratio0.1940.0640.002
Waist circumference0.1780.0640.005
ICV-0.4030.1630.013
Forced expiratory volume in 1 second (FEV1)/Forced Vital capacity(FVC)-0.1700.0780.030
Extreme waist-to-hip ratio0.3060.1460.036
Child birth weight-0.2740.1310.037
Childhood IQ-0.3170.1520.037
Leucine0.4500.2220.043
Glycoprotein acetyls; mainly a1-acid glycoprotein0.3600.1830.049
Hannum-EAAWaist-to-hip ratio0.2250.0620.0003
Waist circumference0.2100.0670.002
Parents age at death-0.4550.1480.002
Type 2 Diabetes0.3310.1140.004
Years of schooling 2013-0.2310.0830.005
Years of schooling 2016-0.1620.0580.006
Birth weight0.2110.0790.007
HDL cholesterol-0.2100.0820.010
Former vs Current smoker-0.3300.1300.011
Forced expiratory volume in 1 second (FEV1)-0.2390.0950.012
Forced expiratory volume in 1 second (FEV1)-0.3710.1490.013
Hip circumference0.1630.0660.013
Intelligence-0.1690.0700.016
Cigarettes smoked per day0.3260.1380.018
Ever vs never smoked0.2020.0880.022
College completion-0.1950.0860.023
Age of first birth-0.1560.0700.026
HOMA-B0.3050.1370.026
Fasting insulin main effect0.2370.1070.027
HbA1C-0.2770.1260.028
Childhood IQ-0.2860.1300.028
Fathers age at death-0.2870.1310.028
Triglycerides0.1360.0650.036
Phospholipids in medium LDL-0.4100.2000.040
Free cholesterol in large LDL-0.4850.2370.041
Anorexia Nervosa-0.1520.0740.041
Amyotrophic lateral sclerosis0.3630.1780.042
Obesity class 10.1350.0670.042
Years of schooling (proxy cognitive performance)-0.1710.0840.042
Phospholipids in large LDL-0.4710.2320.043
Phospholipids in very small VLDL-0.3730.1840.043
Free cholesterol in IDL-0.4390.2200.046
Celiac disease-0.2650.1340.047
Extreme waist-to-hip ratio0.2850.1450.048
Total cholesterol in large LDL-0.4300.2190.049

Genetic correlations were determined using bivariate Linkage Disequilibrium score regression implemented in the online software LD Hub. SE is the standard error of the genetic correlation estimate; P-value is the association P-value for the genetic correlation estimate; ICV–intracranial volume; LDL–low density lipoprotein; IDL–intermediate density lipoprotein; VLDL–very low density lipoprotein.

  73 in total

Review 1.  Molecular mechanisms involved in endothelial cell aging: role of telomerase reverse transcriptase.

Authors:  S Jakob; J Haendeler
Journal:  Z Gerontol Geriatr       Date:  2007-10       Impact factor: 1.281

2.  Peripheral blood leukocyte telomere length and mortality among 64,637 individuals from the general population.

Authors:  Line Rode; Børge G Nordestgaard; Stig E Bojesen
Journal:  J Natl Cancer Inst       Date:  2015-04-10       Impact factor: 13.506

3.  Control of cell cycle exit and entry by protein kinase B-regulated forkhead transcription factors.

Authors:  Geert J P L Kops; Rene H Medema; Janet Glassford; Marieke A G Essers; Pascale F Dijkers; Paul J Coffer; Eric W-F Lam; Boudewijn M T Burgering
Journal:  Mol Cell Biol       Date:  2002-04       Impact factor: 4.272

4.  Integration of summary data from GWAS and eQTL studies predicts complex trait gene targets.

Authors:  Zhihong Zhu; Futao Zhang; Han Hu; Andrew Bakshi; Matthew R Robinson; Joseph E Powell; Grant W Montgomery; Michael E Goddard; Naomi R Wray; Peter M Visscher; Jian Yang
Journal:  Nat Genet       Date:  2016-03-28       Impact factor: 38.330

5.  Evidence for in vivo production of Humanin peptide, a neuroprotective factor against Alzheimer's disease-related insults.

Authors:  Hirohisa Tajima; Takako Niikura; Yuichi Hashimoto; Yuko Ito; Yoshiko Kita; Kenzo Terashita; Kazuto Yamazaki; Atsuo Koto; Sadakazu Aiso; Ikuo Nishimoto
Journal:  Neurosci Lett       Date:  2002-05-24       Impact factor: 3.046

Review 6.  Cisd2 mediates lifespan: is there an interconnection among Ca²⁺ homeostasis, autophagy, and lifespan?

Authors:  C-H Wang; C-H Kao; Y-F Chen; Y-H Wei; T-F Tsai
Journal:  Free Radic Res       Date:  2014-07-29

7.  Generation Scotland: the Scottish Family Health Study; a new resource for researching genes and heritability.

Authors:  Blair H Smith; Harry Campbell; Douglas Blackwood; John Connell; Mike Connor; Ian J Deary; Anna F Dominiczak; Bridie Fitzpatrick; Ian Ford; Cathy Jackson; Gillian Haddow; Shona Kerr; Robert Lindsay; Mark McGilchrist; Robin Morton; Graeme Murray; Colin N A Palmer; Jill P Pell; Stuart H Ralston; David St Clair; Frank Sullivan; Graham Watt; Roland Wolf; Alan Wright; David Porteous; Andrew D Morris
Journal:  BMC Med Genet       Date:  2006-10-02       Impact factor: 2.103

8.  MAGMA: generalized gene-set analysis of GWAS data.

Authors:  Christiaan A de Leeuw; Joris M Mooij; Tom Heskes; Danielle Posthuma
Journal:  PLoS Comput Biol       Date:  2015-04-17       Impact factor: 4.475

9.  An integrated map of genetic variation from 1,092 human genomes.

Authors:  Goncalo R Abecasis; Adam Auton; Lisa D Brooks; Mark A DePristo; Richard M Durbin; Robert E Handsaker; Hyun Min Kang; Gabor T Marth; Gil A McVean
Journal:  Nature       Date:  2012-11-01       Impact factor: 49.962

10.  DNA methylation arrays as surrogate measures of cell mixture distribution.

Authors:  Eugene Andres Houseman; William P Accomando; Devin C Koestler; Brock C Christensen; Carmen J Marsit; Heather H Nelson; John K Wiencke; Karl T Kelsey
Journal:  BMC Bioinformatics       Date:  2012-05-08       Impact factor: 3.169

View more
  26 in total

Review 1.  DNA methylation-based predictors of health: applications and statistical considerations.

Authors:  Paul D Yousefi; Matthew Suderman; Ryan Langdon; Oliver Whitehurst; George Davey Smith; Caroline L Relton
Journal:  Nat Rev Genet       Date:  2022-03-18       Impact factor: 53.242

2.  Genetic Association Between Epigenetic Aging-Acceleration and the Progression of Mild Cognitive Impairment to Alzheimer's Disease.

Authors:  Hongliang Liu; Michael Lutz; Sheng Luo
Journal:  J Gerontol A Biol Sci Med Sci       Date:  2022-09-01       Impact factor: 6.591

3.  Greater effect of polygenic risk score for Alzheimer's disease among younger cases who are apolipoprotein E-ε4 carriers.

Authors:  Brian Fulton-Howard; Alison M Goate; Robert P Adelson; Jeremy Koppel; Marc L Gordon; Nir Barzilai; Gil Atzmon; Peter Davies; Yun Freudenberg-Hua
Journal:  Neurobiol Aging       Date:  2020-09-30       Impact factor: 4.673

4.  A transdisciplinary approach to understand the epigenetic basis of race/ethnicity health disparities.

Authors:  Lucas A Salas; Lauren C Peres; Zaneta M Thayer; Rick Wa Smith; Yichen Guo; Wonil Chung; Jiahui Si; Liming Liang
Journal:  Epigenomics       Date:  2021-03-10       Impact factor: 4.778

5.  The Socioeconomic Gradient in Epigenetic Ageing Clocks: Evidence from the Multi-Ethnic Study of Atherosclerosis and the Health and Retirement Study.

Authors:  Lauren L Schmitz; Wei Zhao; Scott M Ratliff; Julia Goodwin; Jiacheng Miao; Qiongshi Lu; Xiuqing Guo; Kent D Taylor; Jingzhong Ding; Yongmei Liu; Morgan Levine; Jennifer A Smith
Journal:  Epigenetics       Date:  2021-07-06       Impact factor: 4.861

6.  Genetic loci and metabolic states associated with murine epigenetic aging.

Authors:  Khyobeni Mozhui; Ake T Lu; Caesar Z Li; Amin Haghani; Jose Vladimir Sandoval-Sierra; Yibo Wu; Robert W Williams; Steve Horvath
Journal:  Elife       Date:  2022-04-07       Impact factor: 8.713

7.  Characteristics of epigenetic aging across gestational and perinatal tissues.

Authors:  Linda Dieckmann; Marius Lahti-Pulkkinen; Tuomas Kvist; Jari Lahti; Peter E DeWitt; Cristiana Cruceanu; Hannele Laivuori; Sara Sammallahti; Pia M Villa; Sanna Suomalainen-König; Johan G Eriksson; Eero Kajantie; Katri Raikkönen; Elisabeth B Binder; Darina Czamara
Journal:  Clin Epigenetics       Date:  2021-04-29       Impact factor: 6.551

8.  Genetic underpinnings of affective temperaments: a pilot GWAS investigation identifies a new genome-wide significant SNP for anxious temperament in ADGRB3 gene.

Authors:  Xenia Gonda; Nora Eszlari; Dora Torok; Zsofia Gal; Janos Bokor; Andras Millinghoffer; Daniel Baksa; Peter Petschner; Peter Antal; Gerome Breen; Gabriella Juhasz; Gyorgy Bagdy
Journal:  Transl Psychiatry       Date:  2021-06-01       Impact factor: 6.222

9.  DNAm-based signatures of accelerated aging and mortality in blood are associated with low renal function.

Authors:  Nora Franceschini; Melanie Waldenberger; Pamela R Matías-García; Cavin K Ward-Caviness; Laura M Raffield; Xu Gao; Yan Zhang; Rory Wilson; Xīn Gào; Jana Nano; Andrew Bostom; Elena Colicino; Adolfo Correa; Brent Coull; Charles Eaton; Lifang Hou; Allan C Just; Sonja Kunze; Leslie Lange; Ethan Lange; Xihong Lin; Simin Liu; Jamaji C Nwanaji-Enwerem; Alex Reiner; Jincheng Shen; Ben Schöttker; Pantel Vokonas; Yinan Zheng; Bessie Young; Joel Schwartz; Steve Horvath; Ake Lu; Eric A Whitsel; Wolfgang Koenig; Jerzy Adamski; Juliane Winkelmann; Hermann Brenner; Andrea A Baccarelli; Christian Gieger; Annette Peters
Journal:  Clin Epigenetics       Date:  2021-06-02       Impact factor: 7.259

10.  Genetic and environmental causes of variation in epigenetic aging across the lifespan.

Authors:  Shuai Li; Tuong L Nguyen; Ee Ming Wong; Pierre-Antoine Dugué; Gillian S Dite; Nicola J Armstrong; Jeffrey M Craig; Karen A Mather; Perminder S Sachdev; Richard Saffery; Joohon Sung; Qihua Tan; Anbupalam Thalamuthu; Roger L Milne; Graham G Giles; Melissa C Southey; John L Hopper
Journal:  Clin Epigenetics       Date:  2020-10-22       Impact factor: 6.551

View more

北京卡尤迪生物科技股份有限公司 © 2022-2023.