Literature DB >> 35546478

Integrative analysis of clinical and epigenetic biomarkers of mortality.

Tianxiao Huan1,2,3, Steve Nguyen4, Elena Colicino5, Carolina Ochoa-Rosales6,7, W David Hill8, Jennifer A Brody9, Mette Soerensen10,11,12, Yan Zhang13, Antoine Baldassari14, Mohamed Ahmed Elhadad15,16,17, Tanaka Toshiko18, Yinan Zheng19, Arce Domingo-Relloso20,21,22, Dong Heon Lee1,2, Jiantao Ma1,2,23, Chen Yao1,2, Chunyu Liu24, Shih-Jen Hwang1,2, Roby Joehanes1,2, Myriam Fornage25, Jan Bressler26, Joyce B J van Meurs26, Birgit Debrabant10, Jonas Mengel-From10,12, Jacob Hjelmborg10, Kaare Christensen10,12, Pantel Vokonas27,28,29, Joel Schwartz30, Sina A Gahrib9,31, Nona Sotoodehnia9, Colleen M Sitlani9, Sonja Kunze15,16, Christian Gieger15,16,17, Annette Peters16,17,32,33, Melanie Waldenberger15,16,17, Ian J Deary34, Luigi Ferrucci18, Yishu Qu19, Philip Greenland19, Donald M Lloyd-Jones19, Lifang Hou19, Stefania Bandinelli35, Trudy Voortman6, Brenner Hermann13,36, Andrea Baccarelli37, Eric Whitsel14,38, James S Pankow4, Daniel Levy1,2.   

Abstract

DNA methylation (DNAm) has been reported to be associated with many diseases and with mortality. We hypothesized that the integration of DNAm with clinical risk factors would improve mortality prediction. We performed an epigenome-wide association study of whole blood DNAm in relation to mortality in 15 cohorts (n = 15,013). During a mean follow-up of 10 years, there were 4314 deaths from all causes including 1235 cardiovascular disease (CVD) deaths and 868 cancer deaths. Ancestry-stratified meta-analysis of all-cause mortality identified 163 CpGs in European ancestry (EA) and 17 in African ancestry (AA) participants at p < 1 × 10-7 , of which 41 (EA) and 16 (AA) were also associated with CVD death, and 15 (EA) and 9 (AA) with cancer death. We built DNAm-based prediction models for all-cause mortality that predicted mortality risk after adjusting for clinical risk factors. The mortality prediction model trained by integrating DNAm with clinical risk factors showed an improvement in prediction of cancer death with 5% increase in the C-index in a replication cohort, compared with the model including clinical risk factors alone. Mendelian randomization identified 15 putatively causal CpGs in relation to longevity, CVD, or cancer risk. For example, cg06885782 (in KCNQ4) was positively associated with risk for prostate cancer (Beta = 1.2, PMR  = 4.1 × 10-4 ) and negatively associated with longevity (Beta = -1.9, PMR  = 0.02). Pathway analysis revealed that genes associated with mortality-related CpGs are enriched for immune- and cancer-related pathways. We identified replicable DNAm signatures of mortality and demonstrated the potential utility of CpGs as informative biomarkers for prediction of mortality risk.
© 2022 The Authors. Aging Cell published by Anatomical Society and John Wiley & Sons Ltd. This article has been contributed to by U.S. Government employees and their work is in the public domain in the USA.

Entities:  

Keywords:  DNA methylation; cancer; cardiovascular disease; machine learning; mortality

Mesh:

Substances:

Year:  2022        PMID: 35546478      PMCID: PMC9197414          DOI: 10.1111/acel.13608

Source DB:  PubMed          Journal:  Aging Cell        ISSN: 1474-9718            Impact factor:   11.005


African ancestry the Atherosclerosis Risk in Communities study coronary heart disease cardiovascular disease the Cardiovascular Health Study DNA methylation the Danish Twin Register sample European ancestry epigenome‐wide association studies the Epidemiologische Studie zu Chancen der Verhütung, Früherkennung und optimierten Therapie chronischer Erkrankungen in der älteren Bevölkerung the Framingham Heart Study genome‐wide association studies the Invecchiare in Chianti Study the Cooperative Health Research in the Region of Augsburg the Lothian Birth Cohorts Mendelian randomization the Normative Aging Study the Rotterdam Study the Women’s Health Initiative

INTRODUCTION

Despite substantial evidence of heritability of human longevity (h 2 = 10–30%), genome‐wide association studies (GWAS) have reported few loci associated with human longevity (Deelen et al., 2019; Pilling et al., 2017; Timmers et al., 2019; van den Berg et al., 2017). DNA methylation (DNAm), the covalent binding of a methyl group to the 5′ carbon of cytosine‐ phosphate‐guanine (CpG) dinucleotide sequences, reflects a wide range of environmental exposures and genetic influences at the molecular level and altered DNAm has been shown to regulate gene expression (Jones & Takai, 2001). Recent studies have reported DNAm patterns associated with age in humans (Hannum et al., 2013; Horvath, 2013; Levine et al., 2018; Lu et al., 2019). Estimates of biological age based on DNAm referred to as "epigenetic age" or "DNAm age" have been validated in numerous studies, although the functions of these age‐associated CpGs are largely unknown (Horvath et al., 2015; Lu et al., 2019; Marioni et al., 2015; Marioni et al., 2015). DNAm age also has been shown to be predictive of many age‐related diseases and of all‐cause mortality (Chen et al., 2016; Dugué et al., 2018; Levine et al., 2018; Lu et al., 2019; Marioni et al., 2015). Despite the association of DNAm age with a variety of age‐associated outcomes, age‐related CpGs are different from those that are most strongly associated with mortality. Relatively few DNAm studies have focused on mortality as the primary outcome (Colicino et al., 2020; Svane et al., 2018; Zhang et al., 2017). Moreover, due to sample size limitations, most DNAm mortality studies have not typically investigated cause‐specific mortality such as death due to cardiovascular disease (CVD) and cancer. Additionally, little is known about the prediction performance of DNAm‐based mortality models and whether or not such approaches improve mortality prediction above and beyond established clinical risk factors. We hypothesized that inter‐individual variation in DNAm is associated with all‐cause mortality risk and with cause‐specific mortality, and that we could build models incorporating CpGs that would improve mortality prediction beyond established clinical risk factors. In this study, we report the results of a meta‐analysis of epigenome‐wide association studies (EWAS) of all‐cause mortality and cause‐specific mortality including death from CVD and cancer in up to 15,013 individuals from 15 prospective cohort studies in which DNAm was measured in whole blood. We built all‐cause mortality risk prediction models using penalized regression and machine learning methods and integrated DNAm and established mortality clinical risk factors and validated the models’ performance. Additionally, using Mendelian randomization, we identified putatively causal CpGs for mortality. Last, we investigated the downstream gene expression and pathway changes of the mortality‐related CpGs by testing associations between DNAm and gene expression. Figure 1 summarizes the multi‐step study design.
FIGURE 1

Overall analytic workflow

Overall analytic workflow

RESULTS

Study population

Table 1 presents the major clinical characteristics of the 15,013 study participants including 11,684 European ancestry (EA, mean age 65, 55% women) and 3329 African ancestry (AA, mean age 59, 70% women) participants from 15 cohorts (Table S1 summarizes additional clinical characteristics). Most studies had fewer than 15 years of mean follow‐up (mean values ranged from 6.4 to 13.7 years), except ARIC (mean follow‐up of 20.0 years in ARIC EA and 18.6 in ARIC AA participants, respectively). During follow‐up of EA participants, 2907 died of any cause, 688 of CVD, and 546 of cancer; among AA participants, 1407 died of any cause, 547 of CVD, and 322 of cancer.
TABLE 1

Clinical characteristics the 15,013 study participants

Prevalent diseases
CohortTotal N No. of all‐cause deathNo. of CVD deathNo. of cancer deathTime to death/last follow‐up years, mean (SD)Age, mean (SD)Sex (F, %)BMI, mean (SD)Type 2 Diabetes (n)Coronary Heart Disease (n)Heart Failure (n)Stroke (n)Hypertension (n)Cancer (n)
European ancestry
ARIC969331959420.0 (5.2)59.8 (5.5)5926.2 (4.5)86442916233102
CHS41937313212.7 (6.1)75.0 (4.9)6026.8 (4.9)721611522478
DTR87029874409.3 (3.4)69.4 (7.9)5225.9 (3.9)46 a 37269129
ESTHER1000265949013.7 (3.5)62.1 (6.5)5027.8 (4.3)1541441102857277
FHS2427403911559.1 (2.2)66.3 (9.0)5528.3 (5.3)27922653116107389
InCHIANTi48810410.0 (1.6)62.4 (15.8)5227.0 (3.9)4231910232
KORA F417278931356.4 (0.9)61.0 (8.9)5128.1 (4.8)1581054147789154
LBC 19214183669.8 (4.7)79.1 (0.6)6028.2 (4.0)197033170
LBC 193690019210.2 (2.4)69.6 (0.8)5027.7(4.4)7222146364
NAS6402211237210.5 (3.3)72.8 (6.8)028.1 (4.0)11718142447316
RS731736.8 (1.5)59.9 (8.2)5427.4 (4.5)74453038576
WHI1095192486011.5 (3.5)62 (6.9)10028.8 (5.9)602051146914
African ancestry
ARIC2446106942432218.6 (6.6)56.5 (5.8)6430.1 (6.2)64312016375137387
CHS3252649612.9 (6.6)73.1 (5.5)6228.6 (5.2)6820223536
WHI558742710.6 (3.7)61 (6.8)10031.5 (6.1)761811123692

The clinical risk factors were ascertained at the time of blood draw for DNAm measurements. BMI was calculated as weight (kg) divided by height squared (m2). Diabetes was defined as a measured fasting blood glucose level of >125 mg/dl or current use of glucose‐lowering prescription medication. Hypertension was defined as a measured systolic blood pressure (BP) ≥140 mm Hg or diastolic BP ≥90 mm Hg or use of antihypertensive prescription medication. Cancer was defined as the occurrence of any type of cancer excluding non‐melanoma skin cancer.

The diabetes cases in DTR included both type I and type II diabetes.

Clinical characteristics the 15,013 study participants The clinical risk factors were ascertained at the time of blood draw for DNAm measurements. BMI was calculated as weight (kg) divided by height squared (m2). Diabetes was defined as a measured fasting blood glucose level of >125 mg/dl or current use of glucose‐lowering prescription medication. Hypertension was defined as a measured systolic blood pressure (BP) ≥140 mm Hg or diastolic BP ≥90 mm Hg or use of antihypertensive prescription medication. Cancer was defined as the occurrence of any type of cancer excluding non‐melanoma skin cancer. The diabetes cases in DTR included both type I and type II diabetes.

Ancestry‐stratified epigenome‐wide meta‐analysis of all‐cause mortality

At Bonferroni‐corrected p < 1 × 10−7 (~0.05/400,000), we identified 163 CpGs whose differential methylation in whole blood was associated with all‐cause mortality in EA participants, and 17 CpGs in AA participants, after adjustment of age, sex, lifestyle factors, clinical risk factors, white blood cell types, and technical covariates (e.g., batch). Tables S2–S3 present the results for all CpGs at p < 1 × 10−5. Overall genomic inflation in meta‐analysis (λ) was estimated at 1.15 or less, indicating low inflation and low risk of false‐positive findings. Even though cohort‐specific analysis showed slightly higher genomic inflation in some cohorts (λ > 1.5 in two cohorts, Table S4), forest plots show that the results were not driven by results from one or several cohorts (Fig. S1). Sensitivity analysis results including meta‐analysis after correcting for λ in each cohort, meta‐analysis after excluding results from two cohorts with λ > 1.5, and meta‐analysis after excluding RS cohort are included in Tables S5 and S6. Results of the sensitivity analysis remained consistent with the main results in terms of direction and effect estimates with Pearson's correlation r = 0.99 (in EA, corrected for λ in each cohorts), r = 1.00 (in EA, after removing two cohorts with λ > 1.5), r = 1.00 (in EA, after removing RS) and r = 1.00 (in AA, corrected for λ in each cohorts). Among the 177 all‐cause mortality‐related CpGs (union set of EA and AA results at p < 1 × 10−7), the vast majority of significant CpGs (151, 85%) were inversely associated with mortality, with hazards ratios (HRs) <1 (range 0.72 to 0.89 per standard deviation [SD]). Methylation at the remaining 26 (15%) CpGs was positively associated with mortality, with HRs >1 (range 1.13 to 1.32). The 177 CpGs are annotated to 121 genes and 43 intergenic regions.

Transethnic replication and sensitivity analysis

Of the 163 all‐cause mortality‐related CpGs in EA participants, 18 (11%) had p < 0.0003 (0.05/163) in AA participants; of the 17 CpGs in AA participants, 12 (71%) had p < 0.004 (0.05/17) in EA participants. Table 2 displays the transethnic replicated CpGs including 27 unique CpGs. The top 3 transethnic replicated CpGs in EA participants remained the top 3 in AA participants, including cg16743273 for MOBKL2A, cg18181703 for SOCS3, and cg21393163 at an intergenic region (Chr.1: 12217629).
TABLE 2

Transethnic replicated all‐cause mortality‐related CpGs

CpGChrPositionGeneMeta‐analysis EA cohortsMeta‐analysis AA cohortsTransethnic replication
HR (95% CI) p‐valueHR (95% CI) p‐valueBonferroni‐corrected P
Discovered in EA, and then replicated in AA
cg16743273 192076833 MOBKL2A 1.15 (1.1–1.21)1.57E−091.24 (1.15–1.33)1.28E−082.08E−06
cg18181703 1776354621 SOCS3 0.83 (0.8–0.87)6.15E−160.82 (0.77–0.88)3.71E−086.05E−06
cg21393163 1122176290.84 (0.8–0.88)4.15E−120.84 (0.79–0.89)7.48E−081.22E−05
cg15310871820077936 ATP6V1B2 1.18 (1.12–1.25)1.42E−081.19 (1.11–1.26)1.80E−072.94E−05
cg259531301063753550 ARID5B 0.87 (0.83–0.91)4.67E−100.86 (0.81–0.91)1.22E−061.98E−04
cg054383781567383736 SMAD3 0.88 (0.84–0.92)1.52E−080.85 (0.79–0.91)3.68E−066.00E−04
cg264705011945252955 BCL3 0.84 (0.79–0.88)8.38E−120.81 (0.74–0.89)1.48E−052.42E−03
cg061264216307200800.8 (0.75–0.86)2.48E−100.84 (0.78–0.91)1.69E−052.75E−03
cg0200318314103415882 CDC42BPB 1.19 (1.13–1.26)1.94E−111.16 (1.08–1.24)2.00E−053.26E−03
cg1095025112044664320.86 (0.82–0.91)4.05E−080.86 (0.8–0.92)2.34E−053.81E−03
cg175012106166970252 RPS6KA2 0.86 (0.81–0.9)5.84E−090.87 (0.82–0.93)2.71E−054.41E−03
cg235980891203652079 ATP2B4 1.13 (1.08–1.18)2.36E−081.14 (1.07–1.22)4.19E−056.84E−03
cg219932902233703120 GIGYF2 0.88 (0.84–0.92)6.13E−080.87 (0.81–0.93)4.94E−058.06E−03
cg0498773414103415873 CDC42BPB 1.2 (1.15–1.26)2.53E−141.15 (1.07–1.23)5.77E−059.41E−03
cg20813374635657180 FKBP5 0.84 (0.78–0.89)4.27E−080.84 (0.77–0.91)7.19E−051.17E−02
cg119272335170816542 NPM1 0.84 (0.8–0.89)2.43E−090.89 (0.84–0.95)2.41E−043.92E−02
cg248594336307202030.85 (0.81–0.9)7.15E−100.88 (0.82–0.94)2.70E−044.40E−02
cg014451001688103339 BANP 1.23 (1.15–1.32)1.88E−091.24 (1.1–1.39)2.76E−044.49E−02
Discovered in AA, and then replicated in EA
cg18181703 1776354621 SOCS3 0.83 (0.8–0.87)6.15E−160.82 (0.77–0.88)3.71E−081.04E−14
cg21393163 1122176290.84 (0.8–0.88)4.15E−120.84 (0.79–0.89)7.48E−087.05E−11
cg16743273 192076833 MOBKL2A 1.15 (1.1–1.21)1.57E−091.24 (1.15–1.33)1.28E−082.67E−08
cg25114611635696870 FKBP5 0.86 (0.81–0.91)7.50E−070.81 (0.75–0.87)1.79E−081.28E−05
cg164118571657023191 NLRC5 0.88 (0.84–0.93)4.40E−060.79 (0.74–0.85)2.40E−117.47E−05
cg169369531757915665 TMEM49 0.91 (0.87–0.95)7.05E−050.82 (0.77–0.88)1.72E−081.20E−03
cg2357081011315102 IFITM1 0.86 (0.8–0.93)9.75E−050.77 (0.72–0.83)2.35E−111.66E−03
cg120544531757915717 TMEM49 0.92 (0.88–0.96)1.57E−040.84 (0.79–0.89)2.93E−082.66E−03
cg189425791757915773 TMEM49 0.91 (0.87–0.96)3.53E−040.8 (0.74–0.86)2.58E−096.01E−03
cg010412391813222581 C18orf1 1.1 (1.04–1.16)1.29E−031.22 (1.14–1.31)1.04E−082.20E−02
cg0303826211315262 IFITM1 0.88 (0.82–0.96)1.85E−030.72 (0.66–0.79)5.14E−133.15E−02
cg24408769615506085 JARID2 1.11 (1.04–1.18)2.17E−031.27 (1.17–1.37)1.29E−083.68E−02

Abbreviations: AA, African ancestry; CI, confidence interval; EA, European ancestry; HR, hazard ratio per standard deviation.

Transethnic replicated all‐cause mortality‐related CpGs Abbreviations: AA, African ancestry; CI, confidence interval; EA, European ancestry; HR, hazard ratio per standard deviation. Because ARIC had longer follow‐up than the other cohorts, in sensitivity analysis, we truncated ARIC follow‐up at 15 years. The HRs for the significant CpGs (at p < 1 × 10−5) remained consistent with the main results in terms of direction and effect estimates with Pearson's correlation r = 1.00 and r = 0.99 in EA and AA participants, respectively (Tables S2–S3 and Fig. S2).

Associations of DNAm with CVD death and cancer death

In comparison with results for all‐cause mortality, fewer CpGs were associated with CVD death (at p < 1 × 10−7, n = 4 in EA, and n = 15 in AA) and cancer death (n = 0 in EA, and n = 1 in AA); Tables S7–S8 report the corresponding results at p < 1 × 10−5. Among the 163 all‐cause mortality‐related CpGs identified in EA participants at p < 1 × 10−7, 41 CpGs were associated with CVD death, 16 with cancer death and 5 with both (at p < 0.05/163, Table S2). Among the 17 CpGs identified in AA participants at p < 1 × 10−7, 15 were associated with CVD death, 9 with cancer death and 8 with both (at p < 0.05/17, Table S3). Figure 2 shows the effect sizes and direction of effect for the top CpGs associated with all‐cause mortality, and their consistency with the results of analyses of CVD death and cancer death. If a CpG was positively correlated with all‐cause mortality, it also was positively correlated with CVD death and cancer death, and vice versa.
FIGURE 2

Effect sizes (log hazards ratios) and 95% confidence intervals of CpGs related to mortality identified by meta‐analysis, comparing the results for all‐cause mortality, CVD death, and cancer death. (a) Results of meta‐analysis of European ancestry (EA); (b) Results of meta‐analysis of African ancestry (AA). These figures showed the CpGs associated with all‐cause mortality identified by the meta‐analysis, which were also associated with either CVD death or cancer death passing Bonferroni‐corrected threshold. Figure 1a shows 51 CpGs in EA, including 41 CpGs associated with CVD death, 16 with cancer death, and 5 with both. Figure 1b shows 16 CpGs in AA, including 15 CpGs associated with CVD death, 8 with cancer death, and 7 with both

Effect sizes (log hazards ratios) and 95% confidence intervals of CpGs related to mortality identified by meta‐analysis, comparing the results for all‐cause mortality, CVD death, and cancer death. (a) Results of meta‐analysis of European ancestry (EA); (b) Results of meta‐analysis of African ancestry (AA). These figures showed the CpGs associated with all‐cause mortality identified by the meta‐analysis, which were also associated with either CVD death or cancer death passing Bonferroni‐corrected threshold. Figure 1a shows 51 CpGs in EA, including 41 CpGs associated with CVD death, 16 with cancer death, and 5 with both. Figure 1b shows 16 CpGs in AA, including 15 CpGs associated with CVD death, 8 with cancer death, and 7 with both

Mortality prediction model

To investigate whether DNAm can be used to predict mortality risk, we constructed prediction models for all‐cause mortality and evaluated their prediction of all‐cause mortality, CVD death, and cancer death. To ensure unbiased validation, we split the EA cohorts into separate discovery and replication sets (Figure 1 shows the analysis flowchart). The discovery cohorts consisted of 8288 participants (including 2173 deaths from all causes) from 10 cohorts, excluding FHS (n = 2427) and ARIC (n = 969), which were used as replication cohorts. The meta‐analysis of the discovery set identified 74 CpGs at p < 1 × 10−7, 158 CpGs at p < 1 × 10−6, 357 CpGs at p < 1 × 10−5, 931 CpGs at p < 1 × 10−4, 2717 CpGs at p < 1 × 10−3, and 28,323 CpGs at p < 0.05. We evaluated three types of input features: (a) clinical risk factors only (i.e., clinical risk factor models); (b) CpGs identified in the meta‐analysis of the discovery set (i.e., CpG models); and (c) the input features including both CpGs and clinical risk factors (i.e., integrative models). We also compared four prediction methods including Elastic net‐Cox proportional hazards (Elastic‐coxph; Friedman et al., 2010), Random survival forest (RSF) (Ishwaran et al., 2008), Cox‐nnet (Ching et al., 2018), and DeepSurv (Katzman et al., 2018) (see Methods for details). In general, the four prediction methods did not show major differences in predicting mortality outcomes as assessed by multiple evaluation metrics (Table S9 lists the evaluation metrics across all four methods). To simplify the presentation of results, we focused on the Elastic‐coxph method (Figure 3).
FIGURE 3

Kaplan–Meier estimates of mortality risk scores with respect to mortality outcomes in ARIC study. (a) Survival curves with respect to all‐cause mortality; (b) survival curves with respect to CVD death; (c) survival curves with respect to cancer death. The results were obtained from ARIC European ancestry participants with follow‐up truncated at 15 years. For cancer death, we excluded samples who had any type of cancer before blood drawn for DNA methylation measurements. The mortality risk scores for (a) and (b) were computed by the model (Table S10), and for (c) was computed by the model (Table S11)

Kaplan–Meier estimates of mortality risk scores with respect to mortality outcomes in ARIC study. (a) Survival curves with respect to all‐cause mortality; (b) survival curves with respect to CVD death; (c) survival curves with respect to cancer death. The results were obtained from ARIC European ancestry participants with follow‐up truncated at 15 years. For cancer death, we excluded samples who had any type of cancer before blood drawn for DNA methylation measurements. The mortality risk scores for (a) and (b) were computed by the model (Table S10), and for (c) was computed by the model (Table S11)

Clinical risk factors strongly predict all‐cause mortality and CVD death

The C‐index of the clinical risk factor models (age, sex, and 12 clinical risk factors) was 0.80 for all‐cause mortality, 0.81 for CVD death and 0.77 for cancer death in FHS (reflecting the average values of 10‐fold cross‐validation). We considered 12 clinical risk factors including BMI, smoking, alcohol consumption, physical activity, educational attainment, and prevalent diseases including hypertension, CHD, heart failure, stroke, type 2 diabetes, and cancer. Among the 12 clinical risk factors, prevalent cancer status was the major contributor to predicting cancer death. After excluding individuals with prevalent cancer at the time of blood draw for DNAm measurements (i.e., the start of follow‐up), the C‐index of the clinical risk factor model was 0.57 for cancer death. Finally, two clinical risk models were built using the optimum parameters selecting by cross‐validation (see Methods). The first one was trained using all FHS participants and included 10 risk factors selected by the Elastic‐coxph method (to predict all‐cause mortality and CVD death, Table S10), and the second was trained using FHS participants excluding those with prevalent cancer cases and including 10 risk factors (to predict cancer death, Table S11). The corresponding C‐index of the clinical risk factor model was 0.75 for all‐cause mortality (HR = 2.64 per SD in the risk score, 95% CI [2.21, 3.15], p = 4.4 × 10−27), 0.81 for CVD death (HR = 3.51, 95% CI [2.58, 4.79], p = 2.1 × 10−15), and 0.71 for cancer death (excluding prevalent cancer samples, HR = 2.35, 95% CI [1.74, 3.18], p = 2.3 × 10−8) in ARIC EA participants with follow‐up truncated at 15 years (Table 3).
TABLE 3

Performance robustness comparison of mortality predictors in FHS and ARIC cohorts

ModelFHS a ARIC b
HRC‐indexIBSHR (95% CI)C‐indexIBS
All‐cause mortality
Clinical risk factor model3.370.800.072.64 (2.21–3.15)0.750.04
CpG model2.910.770.072.24 (1.89–2.66)0.720.04
Integrative model3.500.800.062.95 (2.45–3.55)0.770.04
CVD death
Clinical risk factor model3.740.810.023.51 (2.57–4.79)0.810.02
CpG model3.850.820.022.62 (1.56–3.91)0.770.02
Integrative model3.900.830.023.65 (2.63–5.05)0.800.02
Cancer Death (excluding prevalent cancer cases)
Clinical risk factor model1.250.570.012.35 (1.74–3.18)0.710.02
CpG model1.710.650.012.22 (1.64–2.89)0.730.02
Integrative model1.780.680.012.58 (1.90–3.50)0.760.02

Abbreviation: HR, hazard ratio per standard deviation; IBS: Integrated brier score.

Note: The clinical risk factor models were trained by using clinical risk factors as the sole input features. The CpG Models were trained by using CpGs selecting in the discovery meta‐analysis. The integrative model was trained by using both clinical risk factors and CpGs selecting in the discovery meta‐analysis.

The Clinical Risk Factor Model used to predict all‐cause mortality and CVD death was shown in Table S10, and to predict cancer death (trained in samples excluding prevalent cancer cases) was shown in Table S11. The CpG model used to predict all‐cause mortality and CVD death was shown in Table S12, and to predict cancer death (trained in samples excluding prevalent cancer cases) was shown in Table S13. The integrative model used to predict all‐cause mortality and CVD death was shown in Table S14, and to predict cancer death (trained in samples excluding prevalent cancer cases) was shown in Table S15.

HR, C‐index and IBS values in FHS reflect the average values of 10 times cross‐validation.

The results were obtained from ARIC European ancestry participants with follow‐up truncated at 15 years.

Performance robustness comparison of mortality predictors in FHS and ARIC cohorts Abbreviation: HR, hazard ratio per standard deviation; IBS: Integrated brier score. Note: The clinical risk factor models were trained by using clinical risk factors as the sole input features. The CpG Models were trained by using CpGs selecting in the discovery meta‐analysis. The integrative model was trained by using both clinical risk factors and CpGs selecting in the discovery meta‐analysis. The Clinical Risk Factor Model used to predict all‐cause mortality and CVD death was shown in Table S10, and to predict cancer death (trained in samples excluding prevalent cancer cases) was shown in Table S11. The CpG model used to predict all‐cause mortality and CVD death was shown in Table S12, and to predict cancer death (trained in samples excluding prevalent cancer cases) was shown in Table S13. The integrative model used to predict all‐cause mortality and CVD death was shown in Table S14, and to predict cancer death (trained in samples excluding prevalent cancer cases) was shown in Table S15. HR, C‐index and IBS values in FHS reflect the average values of 10 times cross‐validation. The results were obtained from ARIC European ancestry participants with follow‐up truncated at 15 years.

DNAm predicts mortality independently of age and clinical risk factors

The models using all‐cause mortality‐related CpGs identified in the discovery cohorts as the sole input feature (the CpG model) were predictive of all‐cause mortality, CVD death, and cancer death in the replication set. As shown in Fig. S3, when more discovery CpGs were added to the model, the prediction performance metrics did not always improve. In FHS, the models with discovery CpGs at p < 1 × 10−3 showed the best predictive performance for all‐cause mortality (C‐index = 0.77) and CVD death (C‐index = 0.82), but the model with discovery CpGs at p < 1 × 10−5 showed the best predictive performance for cancer death (excluding prevalent cancer cases, [C‐index = 0.65]). The final CpG models that were trained using all FHS participants are provided in Table S12 including 76 CpGs to predict all‐cause mortality and CVD death, and in Table S13 including 56 CpGs to predict cancer death (excluding prevalent cancer cases). The C‐index of the CpG models with the best predictive performance in ARIC were 0.72 for all‐cause mortality (HR = 2.21, 95% CI [1.86, 2.62], P = 2.0 × 10−20), 0.77 for CVD death (HR = 2.62, 95% CI [1.96, 3.51], p = 9.9 × 10−11), and 0.73 for cancer death (HR = 2.22, 95% CI [1.67, 2.95], p = 3.2 × 10−8, Table 3). The association of the mortality risk scores calculated by the CpG models with mortality outcomes remained significant after adjusting for age, sex, and clinical risk factors; for all‐cause mortality (HR = 1.68, 95% CI [1.37, 2.07], p = 9.8 × 10−7), CVD death (HR = 1.81, 95% CI [1.24, 2.64], p = 0.002), and cancer death (HR = 2.04, 95% CI [1.46, 2.86], P = 3.0 × 10−5).

The integrative model (trained by CpGs and clinical risk factors) moderately improved upon the clinical risk factor model for all‐cause mortality and CVD death, and substantially improved the prediction of cancer death

As shown in Table 3, the integrative models demonstrated robustness for predicting mortality outcomes, with a good C‐index, HR, and low brier error rate. The final integrative models trained using data from all FHS participants are provided in Table S14 including nine clinical risk factors and 36 CpGs to predict all‐cause mortality and CVD death, and in Table S15 including seven clinical risk factors and 42 CpGs to predict cancer death (excluding prevalent cancer cases). The C‐index values of the integrative models were 0.80 (FHS, reflecting the average values of 10‐fold cross‐validation) and 0.77 (ARIC) for all‐cause mortality; 0.83 (FHS) and 0.80 (ARIC) for CVD death; and 0.69 (FHS) and 0.76 (ARIC) for cancer death. Kaplan–Meier survival curves for the mortality risk scores (split into high‐, middle‐, and low‐risk groups) in the ARIC EA cohort (computed by the integrative models using clinical risk factors and CpGs at discovery p < 1 × 10−6, Tables S14–S15) illustrate the higher death rate for those with a higher mortality risk score (log‐rank p < 1 × 10−6, Figure 4). In comparison with the clinical risk factor models, the integrative models slightly improved prediction of all‐cause mortality (0.7% increase in C‐index with addition of CpGs in FHS and 2% increase in ARIC), and of CVD death (2% increase in C‐index in FHS, but no increase in ARIC). We speculate that the reason for this minor increase is because the mortality‐related CpGs capture the contributions of clinical risk factors for CVD death. For cancer death, however, the C‐index of the integrative model revealed an 11% increase in FHS above and beyond the clinical risk factor model and a corresponding 5% increase in ARIC (C‐index for the clinical risk factor model is 0.71 [0.67, 0.75], and for the integrative model is 0.76 [0.72, 0.80], one‐tailed t‐test p = 0.036).
FIGURE 4

Hazard ratios per standard deviation increment with 95% confidence intervals for mortality. (a) With respect to all‐cause mortality; (b) with respect to CVD death; and (c) with respect to cancer death. The results were obtained from ARIC European ancestry participants with follow‐up truncated at 15 years. For cancer death, samples who had any type of cancer before blood drawn for DNA methylation measurements were excluded. Cox regression models were used to relate mortality outcomes to inversely transformed mortality risk scores computed by Integrative models (Tables S12–S13) and CpG models (Tables S10–S11), and inversely transformed DNAm age including GrimAge (Lu et al., 2019), PhenoAge (Levine et al., 2018), Horvath Age (Horvath, 2013), and Hannum Age (Hannum et al., 2013). Adj age and sex indicated the association further adjusted for age and sex. Adj age, sex and risk factors indicated the association further adjusted for age, sex and the other clinical risk factors

Hazard ratios per standard deviation increment with 95% confidence intervals for mortality. (a) With respect to all‐cause mortality; (b) with respect to CVD death; and (c) with respect to cancer death. The results were obtained from ARIC European ancestry participants with follow‐up truncated at 15 years. For cancer death, samples who had any type of cancer before blood drawn for DNA methylation measurements were excluded. Cox regression models were used to relate mortality outcomes to inversely transformed mortality risk scores computed by Integrative models (Tables S12–S13) and CpG models (Tables S10–S11), and inversely transformed DNAm age including GrimAge (Lu et al., 2019), PhenoAge (Levine et al., 2018), Horvath Age (Horvath, 2013), and Hannum Age (Hannum et al., 2013). Adj age and sex indicated the association further adjusted for age and sex. Adj age, sex and risk factors indicated the association further adjusted for age, sex and the other clinical risk factors We also tested the mortality prediction models’ performance using the entire ARIC EA data (without truncation, Table S16). Due to the long follow‐up time in this older cohort (mean age 59.8 at baseline, with 20 ± 5.5 years follow‐up), the integrative model exhibits very similar performance features as the model using age and sex as the sole input features for predicting all‐cause mortality and CVD death. The integrative model improved prediction of cancer death with 2% increase in the C‐index versus the clinical risk factor model. We further tested all‐cause mortality prediction models in the CARDIA study (baseline age 45 ± 3 years). The CARDIA study has 12 years of follow‐up, during which there were 27 deaths from all causes in 905 participants with DNA methylation. As shown in Table S17, the clinical risk factor model, the CpG model, and the integrative model each predicted all‐cause mortality, and each outperformed the DNAm age models.

Comparing the mortality prediction model with DNAm age

We compared four DNAm age models (i.e., PhenoAge (Levine et al., 2018), Horvath Age (Horvath, 2013), Hannum Age (Hannum et al., 2013), and GrimAge (Lu et al., 2019)) with our mortality prediction models (CpG only models and integrative CpG plus 12 risk factor models) for all‐cause mortality, CVD death, and cancer death in ARIC participants. The associations of mortality risk scores calculated by mortality prediction models with mortality outcomes were statistically significant, and the associations remained significant after adjusting for age and sex, and after additionally adjusting for the clinical risk factors. The four DNAm age models were significantly associated with mortality outcomes. After adjusting for age, sex, and clinical risk factors, however, only GrimAge remained associated with all‐cause mortality, CVD death, and cancer death. None of the other three DNAm age predictors was associated with mortality outcomes after additionally adjusting for clinical risk factors (Figure 4). The mortality prediction models (both the CpG only model and the integrative model that included the clinical risk factors and CpGs) outperformed the GrimAge model in prediction of mortality outcomes in terms of HRs and p values. The associations of mortality risk scores with mortality outcomes remained significant after adjusting for the four DNAm age terms (Table S18).

Associations of DNAm with genetic variants and Mendelian randomization analysis

Among the 177 all‐cause mortality‐related CpGs (union of EA and AA results at p < 1 × 10−7), 123 CpGs had significant associations with genetic variants (i.e., cis‐ or trans‐meQTL variants). meQTL variants for 80 CpGs could be linked to 618 GWAS Catalog (Buniello et al., 2019) index SNPs associated with 432 complex traits or diseases (Table S19). We further performed multiple instrumental variable (IV) MR analysis for the 17 CpGs having ≥3 independent cis‐meQTL SNPs (pruned by LD r 2 < 0.01, as IVs, to model the causal relations of differential methylation at these CpGs (as the exposure) in relation to the various outcomes, including longevity (Deelen et al., 2019), CVD, CVD risk factors, and cancer (Evangelou et al., 2018; Locke et al., 2015; Michailidou et al., 2017; Phelan et al., 2017; Schumacher et al., 2018; Scott et al., 2017; Wang et al., 2014; Willer et al., 2013). At p MR < 0.05, MR supported causal effects of 15 CpGs on one or more outcome (Table S20), and 4 CpGs were statistically significant at p MR < 0.05/17, including cg06885782 (within 1500 bases upstream of transcription start site [TSS1500] of KCNQ4) and cg04907244 (TSS1500 of SNORD93) in relation to prostate cancer (Schumacher et al., 2018; Beta = 1.2 and 2.1; and p MR = 4.1 × 10−4 and 0.003, respectively), cg07094298 (in the gene body of TNIP2) in relation to lung cancer (Wang et al., 2014; Beta = 2.2, and p MR = 0.003), and cg18241337 (in the gene body of SSR3) in relation to total cholesterol (Willer et al., 2013; Beta = 0.5, and p MR = 0.003). cg06885782 (KCNQ4) also was associated with longevity (Deelen et al., 2019; Beta = −1.9, p MR = 0.02).

Associations of DNAm with gene expression and pathway analysis

For the 177 all‐cause mortality‐related CpGs at p < 1 × 10−7, we assessed associations of CpGs with nearby gene expression (i.e., cis gene expression; within ± 1 Mb) and identified 15 cis‐ DNAm‐mRNA associated pairs (13 CpGs and 15 mRNAs) at p < 3 × 10−10. The genes located at these CpGs or cis‐eQTM mRNAs were not enriched for any biological processes or pathways. For the 719 all‐cause mortality‐related CpGs at p < 1 × 10−5, genes located at CpG sites were enriched for multiple immune functions, cellular response to organic substance, and negative regulation of cell communication (Gene Ontology [GO] (Ashburner et al., 2000), FDR < 0.05), and pathways for multiple types of cancer (Kyoto Encyclopedia of Genes and Genomes [KEGG] pathway (Kanehisa & Goto, 2000), p < 0.05, Table S21). There were 79 cis‐DNAm‐mRNA pairs (63 CpGs and 67 mRNAs, Table S22).

DISCUSSION

By performing EWAS using whole blood‐derived DNA from 15,013 individuals from 15 cohorts with the accrual of 4314 deaths during a mean follow‐up of more than 10 years, we identified robust DNAm signatures of all‐cause and cause‐specific mortality. We developed replicable mortality predictors by integrating mortality‐related CpGs with traditional clinical risk factors. The integrative models that included clinical risk factors and CpGs showed small improvements in prediction of all‐cause mortality and CVD death, and a more substantial improvement in prediction of cancer death compared to the traditional risk factor model. Our study is one of the largest EWAS of mortality to date (Colicino et al., 2020; Svane et al., 2018; Zhang et al., 2017), and it revealed many replicable DNAm signatures for all‐cause mortality. Our results are consistent with those from previous EWAS of all‐cause mortality; the vast majority of CpGs (85% in our study, 84% in (Zhang et al., 2017), and 67% in (Colicino et al., 2020)) were inversely associated with mortality suggesting a greater mortality risk with lower CpG methylation. Our study identified more CpGs in EA cohorts (n = 163) than in AA cohorts (n = 17). As shown in Table 2, the effect sizes (i.e., HR) of mortality‐related CpGs in EA and AA participants were quite similar. We speculate that our study identified many more CpGs in EA participants than AA participants due the greater statistical power of the larger EA sample size. Using different DNAm data normalization methods (such as Noob (Triche Jr et al., 2013), SWAN (Maksimovic et al., 2012), BMIQ (Teschendorff et al., 2013), and Dasen (Pidsley et al., 2013), see File S1) in different cohorts may also affect the reproducibility of the results. For the 177 all‐cause mortality‐related CpGs (union of EA and AA results at p < 1 × 10−7), we examined their overlap with trait‐associated CpGs in the EWAS catalog (Table S23; Battram et al., 2021). We found that 172 CpGs (97%) have been reported to be associated with human age, 123 with smoking, 49 with alcohol consumption, 42 with sex, and 140 with the other diseases or traits. Many CpGs were associated with multiple traits. For example, the top two CpGs, cg02583484 and cg18181703, were have been reported to be associated with smoking, alcohol consumption, prenatal smoking, healthy diet, forced expiratory volume, C reactive protein, and many other traits. We speculate that many of the CpGs identified by EWAS reflect DNA methylation changes due to disease, human aging, lifestyle, and environmental influences. By linking CpGs with meQTLs and performing MR analysis, it is possible to further infer putatively causal CpGs. However, to identify definitely causal effects of CpGs on outcomes, functional studies are necessary. Among the 177 all‐cause mortality‐related CpGs, 123 CpGs had significant associations with genetic variants (i.e., cis‐ or trans‐meQTL variants identified previously; Huan et al., 2019). For the remaining 44 CpGs, however, this does not mean that their methylation levels have nothing to do with genetic variation. It is possible that the previous meQTL study lacked sufficient statistical power to identify meQTLs for those CpGs. The mortality‐related CpGs are linked to hundreds of human complex diseases/traits via their cis‐meQTL SNPs, which coincide with 618 GWAS Catalog (Buniello et al., 2019) index SNPs. This leads us to hypothesize that many disease/phenotype‐associated SNPs may contribute to disease processes via effects on mortality‐related CpGs. In this way, the mortality‐related CpGs may contribute causally to disease. To test this hypothesis, we conducted MR analyses that confirmed several putatively causal associations of mortality‐related CpGs with longevity (Deelen et al., 2019), CVD (Nikpay et al., 2015), CVD risk factors, and several types of cancer (Evangelou et al., 2018; Locke et al., 2015; Michailidou et al., 2017; Phelan et al., 2017; Schumacher et al., 2018; Scott et al., 2017; Wang et al., 2014; Willer et al., 2013; Table S20). Among the four CpGs passing a Bonferroni‐corrected threshold in MR analyses, cg06885782 in KCNQ4 was reported to be associated with risk for prostate cancer (beta = 1.2, p MR = 4.1 × 10−4) and negatively associated with longevity (beta = −1.9, p MR = 0.02). KCNQ4 (potassium voltage‐gated channel subfamily Q member 4) was previously reported to be associated with age‐related hearing impairment (Van Eyken et al., 2006), and it contains genetic variants associated with all‐cause mortality and survival free of major diseases (Walter et al., 2011). cg07094298 in the gene body of TNIP2 was previously identified as causal for lung cancer. A recent study reported TNIP2‐ALK fusion in lung adenocarcinoma (Feng et al., 2019). cg04907244 (in TSS1500 of SNORD93) was identified as causal for prostate cancer by MR. SNORD93 and its methylation was reported to be associated with several cancer types including uveal melanoma (Gong et al., 2017), breast cancer (Patterson et al., 2017), and renal clear cell carcinoma (Zhao et al., 2020). Pathway analysis further supported a role of mortality‐related CpGs in relation to cancer risk. The intragenic CpGs were enriched for genes in cancer pathways, possibly as a consequence of the expression of nearby genes (cis‐eQTMs analysis, Table S21) related to immune function. The 14 clinical risk factors for mortality were chosen based on prior knowledge. In contrast, there are far fewer established risk factors for cancer death other than age, sex, BMI, smoking, and alcohol consumption. It is not a surprise that the clinical risk factors themselves accurately predicted all‐cause mortality (C‐index = 0.80 in FHS, and 0.75 in ARIC) and CVD death (0.81 in FHS and 0.81 in ARIC), but not cancer death (0.57 in FHS and 0.71 in ARIC). Even though the clinical risk factors are important for stratifying CVD risk, clinical risk factors themselves are unable to reveal molecular mechanism and are thereby unable to highlight causal mechanisms or promising therapeutic targets. After integrating clinical risk factors with DNAm in the all‐cause mortality prediction model, the C‐index only slightly increased (<2%) compared with the clinical risk factors model with regard to all‐cause mortality and CVD death. As shown in Table S14, nine of the 14 clinical risk factors, including age, sex, physical activity, prevalent cancer, type II diabetes, hypertension, CHD, heart failure, and stroke, as well as 36 CpGs that were selected as the representative features. Compared with clinical risk factors, the individual coefficients of the CpGs are much smaller. The small increase in the C‐index and the small coefficients of the CpGs suggest that the contribution of CpGs to the prediction of death may overlap with these clinical risk factors. We also found that the mortality‐related CpGs as the sole input features were still able to predict mortality outcomes after adjusting for clinical risk factors. This suggests that mortality‐related CpGs may identify novel molecular mechanisms contributing to CVD mortality that cannot be captured by existing clinical risk factors. In contrast to CVD and CVD mortality, for which established risk factors are highly predictive of risk, the prediction of cancer and cancer mortality has proved much more challenging. Owing to the lower prediction using clinical risk factors alone (0.57 in FHS and 0.71 in ARIC), the mortality‐related CpGs improved risk prediction of cancer death over and above the clinical risk factor model with an 11% increase in the C‐index in FHS and a 5% increase in ARIC. We further tested whether the all‐cause mortality prediction model can be used to predict mortality among all participants in the FHS with prevalent cancer (n = 389). During a mean follow‐up of 9 years, there were 165 deaths in this group. The integrative mortality model predicted mortality risk among cancer cases (HR [95%CI]: 4.23 [2.63–6.80], p = 2.9 × 10−9). These results in conjunction with MR and pathway analysis show strong evidence of potential causal relations between mortality‐related CpCs and pathways in cancer. Based on these results, we hypothesize that mortality‐related CpGs can shed light on the epigenetic regulation of molecular interactions and help to identify novel therapeutic targets to reduce mortality risk for both CVD and cancer death. Recent studies have used DNAm of multiple CpG sites to predict chronological age (i.e., DNAm age) and showed that DNAm age was associated with all‐cause mortality. We explored the prediction provided by these DNAm age models and show that PhenoAge (Levine et al., 2018), Horvath Age (Horvath, 2013), Hannum Age (Hannum et al., 2013), and GrimAge (Lu et al., 2019) were associated with mortality before accounting for risk factors. Only GrimAge, however, remained associated with mortality after adjusting for clinical risk factors. In contrast, the other three DNAm age models were no longer associated with mortality (Figure 4). One possible explanation is that the three DNAm age predictors (i.e., PhenoAge, Horvath Age, and Hannum Age) identify CpGs associated with age, but are not specific for all‐cause or cause‐specific mortality risk. Of note, the CpGs that serve as DNAm mortality predictors and those that predict DNAm age in the three models do not overlap. Among the top CpGs (N = 177) associated with all‐cause mortality in our EWAS, only cg00687674 in TMEM84 is included in PhenoAge (Levine et al., 2018), and only cg19935065 in DNTT appears in Hannum Age (Hannum et al., 2013). GrimAge may have outperformed the other three DNAm age models in predicting mortality because the CpGs that it uses are associated with the levels of 80 CVD‐related blood proteins, and with lifestyle and clinical risk factors (such as smoking), and mortality (Ho et al., 2018; Shah et al., 2019; Yao et al., 2018). However, because the CpGs in the GrimAge model are not disclosed (i.e., they are proprietary), we were unable to determine whether any of the mortality‐related CpGs in our study overlap with CpGs in the GrimAge model. Of note, our mortality prediction models (both the CpG only model and the integrative model that included CpGs and the clinical risk factors) outperformed GrimAge in prediction of mortality outcomes. We tested and compared four prediction methods including Elastic‐coxph (Friedman et al., 2010), a regression‐based method, and three machine learning methods (Ching et al., 2018; Ishwaran et al., 2008; Katzman et al., 2018). The machine learning models did not outperform Elastic‐coxph (Table S9 and Fig. S2). The clinical risk factor model trained by machine learning methods did not perform well in independent external replication. For example, the C‐index of the clinical risk factor model for all‐cause mortality was 0.67 using RSF17 versus 0.75 using Elastic‐coxph in ARIC participants. Based on this metric, the machine learning methods did not outperform the regression‐based methods when there were relatively few features as inputs. The primary outcome of our study was all‐cause mortality. We did not train prediction models for CVD death or cancer death, but we tested the prediction ability of the all‐cause mortality predictor on CVD death and cancer death. The CpGs in the model were restricted to all‐cause mortality‐related CpGs. As shown in Figure 1, the top DNAm signatures for all‐cause mortality showed the same direction of effect for CVD death and cancer death. It is possible that some CpGs show opposite directions in relation to CVD death and cancer death, but we did not train separate models for these outcomes. Therefore, developing separate prediction models for CVD death and cancer death with a very large sample size would be an important next step.

CONCLUSIONS

In conclusion, the ancestry‐stratified epigenome‐wide meta‐analyses in 15 population‐based cohorts identified replicable DNAm signatures of all‐cause and cause‐specific mortality. The top mortality‐associated CpGs were linked with genes involved in immune‐ and cancer‐related pathways, and were reported to be associated with human longevity, CVD risk factors, and several types of cancer. We constructed and validated DNAm‐based prediction models that predicted mortality risk independent of established clinical risk factors. The prediction model trained by integrating DNAm with clinical risk factors showed small improvement in prediction of all‐cause mortality and CVD death, and a more substantial improvement in prediction of cancer death, compared with the model trained by clinical risk factors alone. The mortality‐related CpG sites and the DNAm‐based prediction models may serve as useful clinical tools for assessing all‐cause and cause‐specific mortality risk and for developing new therapeutic strategies.

METHODS

This study included 15,013 participants from 15 population‐based cohorts. There were 11,684 European ancestry (EA) participants from 12 cohorts, including the Atherosclerosis Risk in Communities (ARIC) Study, the Cardiovascular Health Study (CHS), the Danish Twin Register sample (DTR), the Epidemiologische Studie zu Chancen der Verhütung, Früherkennung und optimierten Therapie chronischer Erkrankungen in der älteren Bevölkerung (ESTHER), the Framingham Heart Study (FHS), the Invecchiare in Chianti (InCHIANTI) Study, the Cooperative Health Research in the Region of Augsburg (KORA F4), the Lothian Birth Cohorts of 1921 (LBC1921) and 1936 (LBC1936), the Normative Aging Study (NAS), the Rotterdam Study (RS), and Women's Health Initiative (WHI); and 3329 Africa ancestry (AA) participants from 3 cohorts, including ARIC, CHS, and WHI. For each participant, we calculated the follow‐up time between the date of the blood draw for DNAm measurements and the date at death or last follow‐up. Mean follow‐up was less than 15 years (range 6.2–13.7) for most cohorts, except for ARIC (mean 20.0 for EA and 18.6 for AA). The protocol for each study was approved by the institutional review board of each cohort. Further details for each cohort were included in File S1.

Mortality ascertainment and clinical phenotypes

Outcomes including death from all causes, deaths from CVD, and deaths from cancer were prospectively ascertained in each cohort. Survival status and details of death were ascertained using multiple strategies, including routine contact with participants for health history updates, surveillance at the local hospital, review of obituaries in the local newspaper, and National Death Index queries. Death certificates, hospital and nursing home records prior to death, and autopsy reports were requested and reviewed. Date and cause of death were determined separately for each cohort following review of all available medical records and /or were register‐based. The clinical and lifestyle risk factors (referred to as clinical risk factors for simplicity thereafter) used as covariates in this study included age, sex, body mass index (BMI), smoking, alcohol consumption, physical activity, educational attainment, and prevalent diseases including hypertension, coronary heart disease (CHD), heart failure, stroke, type II diabetes, and cancer. Fourteen clinical risk factors were chosen based on prior knowledge; most of these are key CVD risk factors. The clinical risk factors were ascertained at the time of blood draw for DNAm measurements. BMI was calculated as weight (kg) divided by height squared (m2). Educational attainment (years of educational schooling), physical activity (frequency, intensity, or the metabolic equivalent of task [MET] scores), smoking status (yes/no, or cigs/day), and alcohol consumption (drinks per day) were self‐reported or ascertained by an administered questionnaire at routine research clinic visits. Diabetes was defined as a measured fasting blood glucose level of >125 mg/dl or current use of glucose‐lowering prescription medication. Hypertension was defined as a measured systolic blood pressure (BP) ≥140 mm Hg or diastolic BP ≥90 mm Hg or use of antihypertensive prescription medication. Cancer was defined as the occurrence of any type of cancer excluding non‐melanoma skin cancer.

DNA methylation measurements and quality control

For each cohort, DNA was extracted from whole blood and bisulfite‐converted using a Zymo EZ DNA methylation kit. DNAm was measured using the Illumina Infinium HumanMethylation450 (450K) BeadChip platform (Illumina Inc., San Diego, CA). Each cohort conducted independent laboratory DNAm measurement, quality control (including sample‐wise and probe‐wise filtering, and probe intensity background correction; see File S1).

Cohort‐specific epigenome‐wide association analysis

The correction of methylation data for technical covariates was cohort specific. Each cohort performed an independent investigation to select an optimized set of technical covariates (e.g., batch, plate, chip, row, and column), using measured or imputed blood cell type fractions, surrogate variables, and/or principal components. Most cohorts had previous publications using the same dataset for EWAS of different traits, such as EWAS of alcohol drinking and smoking (Mendelson et al., 2017; Michailidou et al., 2017). In this study, those cohorts used the same strategies as they did previously for correcting for technical variables including batch (see File S1). To avoid false positives driven by single CpG extreme values, in each cohort, we first performed rank‐based inverse normal transformation (INT) of DNAm β‐values (the ratio of methylated probe intensity divided by the sum of the methylation and unmethylated probe intensity). We then conducted time‐to‐event analyses using Cox proportional hazards models to test for associations between each CpG and mortality outcomes including all‐cause mortality, CVD death, and cancer death using the coxph() function in the “survival” R library, adjusting for clinical risk factors (see Mortality ascertainment and clinical risk factors), technical confounders, and familial relatedness. Because ARIC cohorts had much longer follow‐up than the other cohorts, ARIC follow‐up was truncated at 15 years and results were compared to those before truncation to determine whether results were impacted by duration of follow‐up. In this study, we performed INT of DNAm β‐values to avoid false positives driven by extreme values of single CpGs. Using the FHS EWAS results as an example, Table S24 shows that the top CpGs associated with all‐cause mortality (without INT) were no longer significant after performing INT. This finding suggests that if we directly use DNAm β‐values, those extreme outlier values could lead to false‐positive results. Clearly, the distribution of DNA β‐values is non‐normal, and for this reason, we believe that the conservative INT approach we took protected against false‐positive results.

Meta‐analysis

The meta‐analysis was performed for all‐cause mortality, CVD death, and cancer death in EA (n = 11,684) and AA (n = 3329) participants, respectively, using inverse variance‐weighted random‐effects models implemented in metagen() function R packages (https://rdrr.io/cran/meta/man/metagen.html). We chose a random‐effects model because of the heterogeneity in follow‐up length and population demographics in the different cohorts (Table S1). We excluded the EWAS results for a study with <20 deaths. We excluded probes mapping to multiple locations on the sex chromosomes or with an underlying SNP (MAF > 5% in 1000 Genome Project data) at the CpG site or within 10bp of the single base extension. In addition, the meta‐analysis was constrained to methylation probes passing filtering criteria in five or more cohorts (see File S1), which resulted in ~400,000 CpGs that were included in the final analyses. The statistical significance threshold was p < 0.05/400,000 ≈ 1 × 10−7. Three types of sensitivity analyses were performed including (1) correcting for λ values in each cohorts (Devlin et al., 2001), (2) excluding two cohorts with λ > 1.5 from the meta‐analysis, and (3) excluding results of RS, because the cohort‐specific analysis in RS having a strange distribution of top hits. There were 157 CpGs identified at p < 1e‐7 in the RS cohort‐specific analysis. The number is much more than the number of all‐cause mortality‐associated CpGs identified in the other cohorts.

Mortality prediction models

Mortality prediction models based on clinical risk factors and with the addition of DNAm were built and tested in EA cohorts. The analysis flowchart is shown in Figure 1. To ensure unbiased validation, we split the EA cohorts into discovery and replication sets. The discovery cohorts consisted of 8288 participants from 10 cohorts, excluding FHS (n = 2427) and ARIC (n = 969). Candidate CpGs as input features were selected by meta‐analysis in the discovery cohorts. FHS was used to train prediction models. ARIC was used to validate models. To build and replicate a prediction model, the DNAm data were preprocessed utilizing the same strategy as in the EWAS analysis.

Input features

To evaluate the prediction performance of clinical risk factors and DNAm comprehensively, we tested 13 sets of features, Feature set 1 (F1) included age (years), sex (male as 1 and female as 2), and 12 other clinical risk factors including BMI (kg/m2), smoking (current smoker as 1, and former and never smoker as 0), alcohol consumption (g/day), physical activity (MET scores), educational attainment (education years), and prevalent diseases (yes as 1 and no as 0) including hypertension, CHD, heart failure, stroke, type 2 diabetes, and cancer. F2‐F7 were mortality‐related CpGs selected by meta‐analysis in the discovery cohorts by inverse variance‐weighted random‐effects models at a series of p value thresholds, including F2 CpGs at p < 1e‐7, F3 CpGs at p < 1e‐6, F4 CpGs at p < 1e‐5, F5 CpGs at p < 1e‐4, F6 CpGs at p < 1e‐3, and F7 CpGs at p < 0.05. F8‐F13 are F1 (age, sex, and 12 clinical phenotypes) plus F2‐F7, respectively. In doing so, we were able to evaluate the prediction performance based on the clinical risk factors (F1) and the DNAm (F2‐F7), and test if the combination of DNAm with clinical risk factors (F8‐F13) could be able to improve the prediction performance by using clinical risk factors (F1) only and DNAm only (F2‐F7).

Model building

We compared four methods of building prediction models, including 1) Elastic net ‐ Cox proportional hazards method (Elastic‐coxph, using glmnet, a R package) (Friedman et al., 2010); 2) Random survival forest (RSF, using randomForestSRC, a R package) (Ishwaran et al., 2008); 3) Cox‐nnet (https://github.com/lanagarmire/cox‐nnet, a Python package) (Ching et al., 2018); and 4) DeepSurv (https://github.com/jaredleekatzman/DeepSurv, a Python package) (Katzman et al., 2018). The first method is a penalized linear regression method, while the other three are non‐linear machine learning methods. Elastic‐coxph is a Cox regression model regularized with elastic net penalty (Friedman et al., 2010). Performing this method requires to identify best values of two parameters, α and λ. We tuned each model by iterating over a number of α and λ values under cross‐validation. α indicated linearly combined penalties of the lasso (α=0) and ridge (α=1) regression. λ is the shrinkage parameter, when λ =0 indicated no shrinkage, and as λ increases, the coefficients are shrunk ever more strongly. Effectively this will shrink some coefficients close to 0 for optimizing a set of features. The α value was set to 0.5, and the λ value was set to lambda.min when training models. RSF is an ensemble tree model that is based on the random forest method for survival analysis (Ishwaran et al., 2008). The optimized values of parameters in RSF models, including the number of trees (nTrees=100) and nodeSize =15, were chosen by iterating over a number of values which maximized the accuracy of RSF models tested in the replication sets under cross‐validation. RSF can compute feature importance scores for feature selection. Cox‐nnet is an artificial neural network‐based method for survival analysis (Ching et al., 2018). Cox‐nnet includes two layer neural network: one hidden layer and one output layer. The output layer was used to perform Cox regression based on the activation levels of the hidden layer. Cox‐nnet could also compute feature importance scores for feature selection. For each model training, the L2 regularization parameter is optimized using the L2CVProfile Python function by iterating over a number of values under cross‐validation. DeepSurv is a deep learning‐based survival prediction method (Katzman et al., 2018). DeepSurv uses a multi‐layer feed forward neural network, of which the hidden layers consist of a fully connected layer of nodes, followed by a dropout layer, and the output is a single node with a linear activation which estimated the log‐risk function in the Cox model, parameterized by the weight of the network. The values of hyperparameters when using DeepSurv were L2 regularization =0.8, dropout =0.4, learning rate =0.02, hidden layer size (4 layers with nodes 500, 200, 100 and 50), lr_decay =0.001, momentum =0.9 and the activation method (using Scaled Exponential Linear Units), which were optimized by iterating over a number of values each‐by‐each and under cross‐validation. DeepSurv has not been used previously for selecting features. The 2427 FHS participants were randomly split into 5 equal sets (n=485 or 486 in each set), and each set included approximately equal numbers of deaths. We then used 3 of the 5 sets (60%) for model training and the remaining 2 sets (40%) for model testing. In doing so, we obtained 10 combinations. In each training / testing combination, we constructed a model using the training data, and then used the model to generate a mortality risk score based on the testing data. We assessed associations of the predicted mortality risk score (after inversely normal transformation) with all‐cause mortality, CVD death, and cancer death in the testing data using time‐to‐event proportional hazards models. This data partitioning and cross‐validation strategy was only used to assess the robustness of prediction models when using different features and methods, and to select the optimized parameters for training models. The final models reported were built on all FHS participants using the optimized parameters. We also repeated the same analysis steps using FHS participants without cancer at baseline (n=2038; 238 deaths from all causes, 70 from CVD, and 42 from cancer).

Validation

The prediction models built using all FHS participants were tested in ARIC EA participants for the prediction of mortality outcomes. We performed tests on all‐cause mortality and CVD death on all ARIC EA participants truncated at 15 years of follow‐up, and tests on cancer death after excluding prevalent cancer. We further tested the all‐cause mortality prediction model in the CARDIA study. The CARDIA study has 12 years of follow‐up, during which there were 27 deaths from all causes in 905 participants with DNA methylation.

Evaluation of model performance

We used four evaluation metrics to assess model performance, including the concordance index (C‐index) (Harrell Jr et al., 1996), hazards ratio of predicted risk score (inversely transformed) for prediction of mortality, the integrated brier score (IBS) (Brier, 1950), and Kaplan–Meier (KM) survival curves for high‐, medium‐, and low‐risk groups (Kaplan & Meier, 1958). The C‐index reflects the percentage of individuals whose predicted survival times are correctly ordered. A C‐index of 0.50 reflects no improvement in prediction over chance. The brier score measures the mean of the difference between the observed and the estimated survival beyond a certain time. The brier score ranges between 0 and 1, and a larger score indicates higher inaccuracy. The integrated brier score is the brier score averaged over the entire time interval.

DNAm Age

We compared the prediction performance of DNAm age with our DNAm‐based mortality prediction model in relation to all‐cause mortality, CVD death, and cancer death in the ARIC EA cohort (truncating follow‐up at 15 years). Four measures of DNAm age were used in this study, including PhenoAge (Levine et al., 2018), Horvath age (Horvath, 2013), Hannum age (Hannum et al., 2013), and GrimAge (Lu et al., 2019). The Horvath Age is based on 353 CpGs, the Hannum age is based on 71 CpGs, and PhenoAge is based on 513 CpGs. DNAm age was calculated as the sum of the beta values multiplied by the reported effect size. Due to the GrimAge model was not publicly available, the GrimAge was calculated by uploading the DNAm data to the website (http://dnamage.genetics.ucla.edu/). Proportional hazards regression models were used to test the association between inversely rank transformed DNAm age (all 3 approaches) and mortality outcomes, adjusting for age, sex, and clinical covariates (see Mortality ascertainment and clinical phenotypes).

meQTLs

meQTLs (SNPs associated with DNA methylation) were identified from 4170 FHS participants as reported previously, including 4.7 million cis‐meQTLs and 630K trans‐meQTLs at p < 2 × 10−11 for cis and p < 1.5 × 10−14 for trans (Huan et al., 2019). The genotypes were measured using Affymetrix SNP 500K mapping and Affymetrix 50K gene‐focused MIP arrays. Genotypes were imputed using the 1000 Genomes Project panel phase 3 using MACH / Minimac software. SNPs with MAF > 0.01 and imputation quality ratio >0.3 were retained. cis‐meQTLs were defined as SNPs residing within 1 Mb upstream or downstream of a CpG site. The FHS meQTL data resource includes 3.5 times more cis‐, and 10 times more trans‐meQTL SNPs than the other published studies to date (https://ftp.ncbi.nlm.nih.gov/eqtl/original_submissions/FHS_meQTLs/).

Mendelian randomization

Two‐sample Mendelian randomization (MR) was used to identify putatively causal CpGs for human longevity, CVD and CVD risk factors, and cancer types using a multi‐step strategy. Estimated associations and effect sizes between SNPs and traits were based on the latest published GWAS meta‐analysis of human longevity (Deelen et al., 2019), coronary heart disease (CHD) (Nikpay et al., 2015); myocardial infarction (MI) (Nikpay et al., 2015); type II diabetes (T2D) (Scott et al., 2017); body mass index (BMI) (Locke et al., 2015); lipids traits including high‐density lipoprotein (HDL) cholesterol, low‐density lipoprotein (LDL) cholesterol, total cholesterol (TC), and triglycerides (TG) (Willer et al., 2013); systolic blood pressure (SBP) and diastolic blood pressure (DBP) (Evangelou et al., 2018), and cancer types including breast cancer (Michailidou et al., 2017), prostate cancer (Schumacher et al., 2018), lung cancer (Wang et al., 2014) and ovarian cancer (Phelan et al., 2017). We were unable to include some other popular cancer types, because their GWAS data were not be accessible by us. Instrumental variables (IVs) for each CpG site consisted of independent cis‐meQTLs pruned at linkage disequilibrium (LD) r 2 < 0.01, retaining only one cis‐meQTL variant with the lowest SNP‐CpG p value in each LD block. LD proxies were defined using 1000 genomes imputation in EA. Inverse variance‐weighted (IVW) MR tests were performed on CpGs with at least three independent cis‐meQTL variants, which is the minimum number of IVs needed to perform multiple instruments MR. The multiple instruments improved the precision of IV estimates and allowed the examination of underlying IV assumption (Palmer et al., 2012). Among 177 all‐cause mortality‐related CpGs at p < 1 × 10−7, MR tests were performed on 17 CpGs having ≥3 independent cis‐meQTL SNPs. To test the validity of IVW‐MR results, we performed heterogeneity and MR‐EGGER pleiotropy tests for all IVs. The statistical significance threshold for MR is p MR < 0.05/17, and both p heter and p pleio were required to be >0.05.

eQTMs

Association tests of DNAm and gene expression were performed in 3684 FHS participants with available DNAm and gene expression data. mRNA was extracted from whole blood (collected in PAXgene tubes) and profiled using the Affymetrix Human Exon 1.0 ST GeneChip platform. Raw gene expression data were first normalized using the RMA (robust multi‐array average) from Affymetrix Power Tools (APT, thermofisher.com/us/en/home/life‐science/microarray‐analysis/affymetrix.html#1_2) with quantile normalization. Then, output expression values of 17,318 genes were extracted by APT based on NetAffx annotation version 31. DNAm β values were adjusted for age, sex, predicted blood cell fraction, the two top PCs of DNAm, and 25 surrogate variables (SVs), with DNAm as a fixed effect, and batch as a random effect by fitting LME models. Residuals (DNAm_resid) were retained. The gene expression values were adjusted for age, sex, predicted blood cell fraction, a set of technical covariates, the two top PCs and 25 SVs, with gene expression as a fixed effect, and batch as a random effect by LME, and residuals (mRNA_resid) were retained. Then, linear regression models were used to assess pair‐wise associations between DNAm_resid and mRNA_resid. SVs were calculated using the SVA package in R. A cis‐CpG‐mRNA pair was defined as a CpG residing ± 1 Mb of the TSS of the corresponding gene encoding the mRNA (cis‐eQTM). The annotations of CpGs and transcripts were obtained from annotation files of the HumanMethylation450K BeadChip and the Affymetrix exon array S1.0 platforms. We estimated that there were 1.6 × 108 potential cis‐ CpG‐mRNA pairs. We only used cis‐eQTMs in this study because trans‐eQTMs were not replicated in independent external studies. The statistical significance threshold was p < 3 × 10−10 (0.05/1.6 × 108).

Gene ontology and pathway enrichment analysis

Gene ontology and pathway enrichment analyses were performed on the genes annotated in relation to the 177 all‐cause mortality‐related CpGs at p < 1 × 10−7or p < 1 × 10−5 as well as the cis‐eQTM genes associated with those CpGs. To improve focus in this study, we only used results of KEGG and Gene Ontology–biological process (GO‐BP) terms. Enrichment tests used gometh function in missMethy R package, which can take into account two types of bias in DNA methylation study: (1) the differing number of probes per gene present on the array, and (2) CpGs that are annotated to multiple genes (Maksimovic et al., 2021).

CONFLICT OF INTEREST

The authors declare no conflict of interest.

AUTHORS’ CONTRIBUTIONS

T. H., D.L., and J.P. designed, directed, and supervised the project. T. H. and D.L. drafted the manuscript. T.H., S.N., E.C., C. R., D.H., J.B., M.S., Y.Z., A.B., E.M., and T.T. conducted the analyses. All authors participated in revising and editing the manuscripts. All authors have read and approved the final version of the manuscript.

CONSENT TO PARTICIPATE

This study included participants from 12 population‐based cohorts studies, including the Atherosclerosis Risk in Communities (ARIC) Study, the Cardiovascular Health Study (CHS), the Danish Twin Register sample (DTR), the Epidemiologische Studie zu Chancen der Verhütung, Früherkennung und optimierten Therapie chronischer Erkrankungen in der älteren Bevölkerung (ESTHER), the Framingham Heart Study (FHS), the Invecchiare in Chianti (InCHIANTI) Study, the Cooperative Health Research in the Region of Augsburg (KORA F4), the Lothian Birth Cohorts of 1921 (LBC1921) and 1936 (LBC1936), the Normative Aging Study (NAS), the Rotterdam Study (RS), and Women's Health Initiative (WHI). All study participants provided written informed consent. Fig S1‐3 Click here for additional data file. Tables S1‐S4 Click here for additional data file.
  53 in total

1.  Gene ontology: tool for the unification of biology. The Gene Ontology Consortium.

Authors:  M Ashburner; C A Ball; J A Blake; D Botstein; H Butler; J M Cherry; A P Davis; K Dolinski; S S Dwight; J T Eppig; M A Harris; D P Hill; L Issel-Tarver; A Kasarskis; S Lewis; J C Matese; J E Richardson; M Ringwald; G M Rubin; G Sherlock
Journal:  Nat Genet       Date:  2000-05       Impact factor: 38.330

2.  The role of DNA methylation in mammalian epigenetics.

Authors:  P A Jones; D Takai
Journal:  Science       Date:  2001-08-10       Impact factor: 47.728

3.  The epigenetic clock is correlated with physical and cognitive fitness in the Lothian Birth Cohort 1936.

Authors:  Riccardo E Marioni; Sonia Shah; Allan F McRae; Stuart J Ritchie; Graciela Muniz-Terrera; Sarah E Harris; Jude Gibson; Paul Redmond; Simon R Cox; Alison Pattie; Janie Corley; Adele Taylor; Lee Murphy; John M Starr; Steve Horvath; Peter M Visscher; Naomi R Wray; Ian J Deary
Journal:  Int J Epidemiol       Date:  2015-01-22       Impact factor: 7.196

4.  Genome-wide mapping of plasma protein QTLs identifies putatively causal genes and pathways for cardiovascular disease.

Authors:  Chen Yao; George Chen; Ci Song; Joshua Keefe; Michael Mendelson; Tianxiao Huan; Benjamin B Sun; Annika Laser; Joseph C Maranville; Hongsheng Wu; Jennifer E Ho; Paul Courchesne; Asya Lyass; Martin G Larson; Christian Gieger; Johannes Graumann; Andrew D Johnson; John Danesh; Heiko Runz; Shih-Jen Hwang; Chunyu Liu; Adam S Butterworth; Karsten Suhre; Daniel Levy
Journal:  Nat Commun       Date:  2018-08-15       Impact factor: 14.919

5.  The NHGRI-EBI GWAS Catalog of published genome-wide association studies, targeted arrays and summary statistics 2019.

Authors:  Annalisa Buniello; Jacqueline A L MacArthur; Maria Cerezo; Laura W Harris; James Hayhurst; Cinzia Malangone; Aoife McMahon; Joannella Morales; Edward Mountjoy; Elliot Sollis; Daniel Suveges; Olga Vrousgou; Patricia L Whetzel; Ridwan Amode; Jose A Guillen; Harpreet S Riat; Stephen J Trevanion; Peggy Hall; Heather Junkins; Paul Flicek; Tony Burdett; Lucia A Hindorff; Fiona Cunningham; Helen Parkinson
Journal:  Nucleic Acids Res       Date:  2019-01-08       Impact factor: 16.971

6.  A meta-analysis of genome-wide association studies identifies multiple longevity genes.

Authors:  Joris Deelen; Daniel S Evans; Dan E Arking; Niccolò Tesi; Marianne Nygaard; Xiaomin Liu; Mary K Wojczynski; Mary L Biggs; Ashley van der Spek; Gil Atzmon; Erin B Ware; Chloé Sarnowski; Albert V Smith; Ilkka Seppälä; Heather J Cordell; Janina Dose; Najaf Amin; Alice M Arnold; Kristin L Ayers; Nir Barzilai; Elizabeth J Becker; Marian Beekman; Hélène Blanché; Kaare Christensen; Lene Christiansen; Joanna C Collerton; Sarah Cubaynes; Steven R Cummings; Karen Davies; Birgit Debrabant; Jean-François Deleuze; Rachel Duncan; Jessica D Faul; Claudio Franceschi; Pilar Galan; Vilmundur Gudnason; Tamara B Harris; Martijn Huisman; Mikko A Hurme; Carol Jagger; Iris Jansen; Marja Jylhä; Mika Kähönen; David Karasik; Sharon L R Kardia; Andrew Kingston; Thomas B L Kirkwood; Lenore J Launer; Terho Lehtimäki; Wolfgang Lieb; Leo-Pekka Lyytikäinen; Carmen Martin-Ruiz; Junxia Min; Almut Nebel; Anne B Newman; Chao Nie; Ellen A Nohr; Eric S Orwoll; Thomas T Perls; Michael A Province; Bruce M Psaty; Olli T Raitakari; Marcel J T Reinders; Jean-Marie Robine; Jerome I Rotter; Paola Sebastiani; Jennifer Smith; Thorkild I A Sørensen; Kent D Taylor; André G Uitterlinden; Wiesje van der Flier; Sven J van der Lee; Cornelia M van Duijn; Diana van Heemst; James W Vaupel; David Weir; Kenny Ye; Yi Zeng; Wanlin Zheng; Henne Holstege; Douglas P Kiel; Kathryn L Lunetta; P Eline Slagboom; Joanne M Murabito
Journal:  Nat Commun       Date:  2019-08-14       Impact factor: 14.919

7.  A beta-mixture quantile normalization method for correcting probe design bias in Illumina Infinium 450 k DNA methylation data.

Authors:  Andrew E Teschendorff; Francesco Marabita; Matthias Lechner; Thomas Bartlett; Jesper Tegner; David Gomez-Cabrero; Stephan Beck
Journal:  Bioinformatics       Date:  2012-11-21       Impact factor: 6.937

8.  Genetic analysis of over 1 million people identifies 535 new loci associated with blood pressure traits.

Authors:  Evangelos Evangelou; Helen R Warren; David Mosen-Ansorena; Borbala Mifsud; Raha Pazoki; He Gao; Georgios Ntritsos; Niki Dimou; Claudia P Cabrera; Ibrahim Karaman; Fu Liang Ng; Marina Evangelou; Katarzyna Witkowska; Evan Tzanis; Jacklyn N Hellwege; Ayush Giri; Digna R Velez Edwards; Yan V Sun; Kelly Cho; J Michael Gaziano; Peter W F Wilson; Philip S Tsao; Csaba P Kovesdy; Tonu Esko; Reedik Mägi; Lili Milani; Peter Almgren; Thibaud Boutin; Stéphanie Debette; Jun Ding; Franco Giulianini; Elizabeth G Holliday; Anne U Jackson; Ruifang Li-Gao; Wei-Yu Lin; Jian'an Luan; Massimo Mangino; Christopher Oldmeadow; Bram Peter Prins; Yong Qian; Muralidharan Sargurupremraj; Nabi Shah; Praveen Surendran; Sébastien Thériault; Niek Verweij; Sara M Willems; Jing-Hua Zhao; Philippe Amouyel; John Connell; Renée de Mutsert; Alex S F Doney; Martin Farrall; Cristina Menni; Andrew D Morris; Raymond Noordam; Guillaume Paré; Neil R Poulter; Denis C Shields; Alice Stanton; Simon Thom; Gonçalo Abecasis; Najaf Amin; Dan E Arking; Kristin L Ayers; Caterina M Barbieri; Chiara Batini; Joshua C Bis; Tineka Blake; Murielle Bochud; Michael Boehnke; Eric Boerwinkle; Dorret I Boomsma; Erwin P Bottinger; Peter S Braund; Marco Brumat; Archie Campbell; Harry Campbell; Aravinda Chakravarti; John C Chambers; Ganesh Chauhan; Marina Ciullo; Massimiliano Cocca; Francis Collins; Heather J Cordell; Gail Davies; Martin H de Borst; Eco J de Geus; Ian J Deary; Joris Deelen; Fabiola Del Greco M; Cumhur Yusuf Demirkale; Marcus Dörr; Georg B Ehret; Roberto Elosua; Stefan Enroth; A Mesut Erzurumluoglu; Teresa Ferreira; Mattias Frånberg; Oscar H Franco; Ilaria Gandin; Paolo Gasparini; Vilmantas Giedraitis; Christian Gieger; Giorgia Girotto; Anuj Goel; Alan J Gow; Vilmundur Gudnason; Xiuqing Guo; Ulf Gyllensten; Anders Hamsten; Tamara B Harris; Sarah E Harris; Catharina A Hartman; Aki S Havulinna; Andrew A Hicks; Edith Hofer; Albert Hofman; Jouke-Jan Hottenga; Jennifer E Huffman; Shih-Jen Hwang; Erik Ingelsson; Alan James; Rick Jansen; Marjo-Riitta Jarvelin; Roby Joehanes; Åsa Johansson; Andrew D Johnson; Peter K Joshi; Pekka Jousilahti; J Wouter Jukema; Antti Jula; Mika Kähönen; Sekar Kathiresan; Bernard D Keavney; Kay-Tee Khaw; Paul Knekt; Joanne Knight; Ivana Kolcic; Jaspal S Kooner; Seppo Koskinen; Kati Kristiansson; Zoltan Kutalik; Maris Laan; Marty Larson; Lenore J Launer; Benjamin Lehne; Terho Lehtimäki; David C M Liewald; Li Lin; Lars Lind; Cecilia M Lindgren; YongMei Liu; Ruth J F Loos; Lorna M Lopez; Yingchang Lu; Leo-Pekka Lyytikäinen; Anubha Mahajan; Chrysovalanto Mamasoula; Jaume Marrugat; Jonathan Marten; Yuri Milaneschi; Anna Morgan; Andrew P Morris; Alanna C Morrison; Peter J Munson; Mike A Nalls; Priyanka Nandakumar; Christopher P Nelson; Teemu Niiranen; Ilja M Nolte; Teresa Nutile; Albertine J Oldehinkel; Ben A Oostra; Paul F O'Reilly; Elin Org; Sandosh Padmanabhan; Walter Palmas; Aarno Palotie; Alison Pattie; Brenda W J H Penninx; Markus Perola; Annette Peters; Ozren Polasek; Peter P Pramstaller; Quang Tri Nguyen; Olli T Raitakari; Meixia Ren; Rainer Rettig; Kenneth Rice; Paul M Ridker; Janina S Ried; Harriëtte Riese; Samuli Ripatti; Antonietta Robino; Lynda M Rose; Jerome I Rotter; Igor Rudan; Daniela Ruggiero; Yasaman Saba; Cinzia F Sala; Veikko Salomaa; Nilesh J Samani; Antti-Pekka Sarin; Reinhold Schmidt; Helena Schmidt; Nick Shrine; David Siscovick; Albert V Smith; Harold Snieder; Siim Sõber; Rossella Sorice; John M Starr; David J Stott; David P Strachan; Rona J Strawbridge; Johan Sundström; Morris A Swertz; Kent D Taylor; Alexander Teumer; Martin D Tobin; Maciej Tomaszewski; Daniela Toniolo; Michela Traglia; Stella Trompet; Jaakko Tuomilehto; Christophe Tzourio; André G Uitterlinden; Ahmad Vaez; Peter J van der Most; Cornelia M van Duijn; Anne-Claire Vergnaud; Germaine C Verwoert; Veronique Vitart; Uwe Völker; Peter Vollenweider; Dragana Vuckovic; Hugh Watkins; Sarah H Wild; Gonneke Willemsen; James F Wilson; Alan F Wright; Jie Yao; Tatijana Zemunik; Weihua Zhang; John R Attia; Adam S Butterworth; Daniel I Chasman; David Conen; Francesco Cucca; John Danesh; Caroline Hayward; Joanna M M Howson; Markku Laakso; Edward G Lakatta; Claudia Langenberg; Olle Melander; Dennis O Mook-Kanamori; Colin N A Palmer; Lorenz Risch; Robert A Scott; Rodney J Scott; Peter Sever; Tim D Spector; Pim van der Harst; Nicholas J Wareham; Eleftheria Zeggini; Daniel Levy; Patricia B Munroe; Christopher Newton-Cheh; Morris J Brown; Andres Metspalu; Adriana M Hung; Christopher J O'Donnell; Todd L Edwards; Bruce M Psaty; Ioanna Tzoulaki; Michael R Barnes; Louise V Wain; Paul Elliott; Mark J Caulfield
Journal:  Nat Genet       Date:  2018-09-17       Impact factor: 41.307

9.  DNA Methylation and All-Cause Mortality in Middle-Aged and Elderly Danish Twins.

Authors:  Anne Marie Svane; Mette Soerensen; Jesper Lund; Qihua Tan; Juulia Jylhävä; Yunzhang Wang; Nancy L Pedersen; Sara Hägg; Birgit Debrabant; Ian J Deary; Kaare Christensen; Lene Christiansen; Jacob B Hjelmborg
Journal:  Genes (Basel)       Date:  2018-02-08       Impact factor: 4.096

10.  Gene set enrichment analysis for genome-wide DNA methylation data.

Authors:  Jovana Maksimovic; Alicia Oshlack; Belinda Phipson
Journal:  Genome Biol       Date:  2021-06-08       Impact factor: 13.583

View more
  1 in total

1.  Integrative analysis of clinical and epigenetic biomarkers of mortality.

Authors:  Tianxiao Huan; Steve Nguyen; Elena Colicino; Carolina Ochoa-Rosales; W David Hill; Jennifer A Brody; Mette Soerensen; Yan Zhang; Antoine Baldassari; Mohamed Ahmed Elhadad; Tanaka Toshiko; Yinan Zheng; Arce Domingo-Relloso; Dong Heon Lee; Jiantao Ma; Chen Yao; Chunyu Liu; Shih-Jen Hwang; Roby Joehanes; Myriam Fornage; Jan Bressler; Joyce B J van Meurs; Birgit Debrabant; Jonas Mengel-From; Jacob Hjelmborg; Kaare Christensen; Pantel Vokonas; Joel Schwartz; Sina A Gahrib; Nona Sotoodehnia; Colleen M Sitlani; Sonja Kunze; Christian Gieger; Annette Peters; Melanie Waldenberger; Ian J Deary; Luigi Ferrucci; Yishu Qu; Philip Greenland; Donald M Lloyd-Jones; Lifang Hou; Stefania Bandinelli; Trudy Voortman; Brenner Hermann; Andrea Baccarelli; Eric Whitsel; James S Pankow; Daniel Levy
Journal:  Aging Cell       Date:  2022-05-12       Impact factor: 11.005

  1 in total

北京卡尤迪生物科技股份有限公司 © 2022-2023.