Andrea Cappozzo1, Cathal McCrory2, Oliver Robinson3, Anna Freni Sterrantino3,4, Carlotta Sacerdote5, Vittorio Krogh6, Salvatore Panico7, Rosario Tumino8, Licia Iacoviello9,10, Fulvio Ricceri11,12, Sabina Sieri6, Paolo Chiodini13, Gareth J McKay14, Amy Jayne McKnight14, Frank Kee14, Ian S Young14, Bernadette McGuinness14, Eileen M Crimmins15, Thalida Em Arpawong15, Rose Anne Kenny2, Aisling O'Halloran2, Silvia Polidoro16, Giuliana Solinas17, Paolo Vineis3, Francesca Ieva1,18, Giovanni Fiorito19,20,21.
Abstract
BACKGROUND: Recent evidence highlights the epidemiological value of blood DNA methylation (DNAm) as a surrogate biomarker for exposure to risk factors for non-communicable diseases (NCD). In many cases, DNAm surrogates of exposures predict disease and longevity better than self-reported or measured exposures. Consequently, disease prediction models based on blood DNAm surrogates may outperform current state-of-the-art prediction models. This study aims to develop novel DNAm surrogates for cardiovascular disease (CVD) risk factors and to combine them into a composite biomarker predictive of CVD risk. We compared the prediction performance of our newly developed risk score with state-of-the-art DNAm risk scores for cardiovascular diseases, the 'next-generation' epigenetic clock DNAmGrimAge, and SCORE2, a prediction model based on traditional risk factors.
Keywords: Cardiovascular risk; DNA methylation; Epigenetics; Molecular epidemiology; Risk scores; Surrogate biomarkers
Year: 2022 PMID: 36175966 PMCID: PMC9521011 DOI: 10.1186/s13148-022-01341-4
Source DB: PubMed Journal: Clin Epigenetics ISSN: 1868-7075 Impact factor: 7.259
Study sample description
| Study name | Description | Country | N | Age mean (min; max) | Female % | Training/Testing set |
|---|---|---|---|---|---|---|
| EPIC Italy | Italian sub-sample of the European Investigation into Cancer and Nutrition study | Italy | 1803 | 53.3 (34.7; 74.9) | 62 | Training set for DNAm surrogates and |
| EXPOsOMICS CVD | Case–control study on CVD nested in the EPIC Italy cohort | Italy | 315 | 54.9 (35.2; 69.3) | 53 | Validation set for DNAm surrogates and |
| Understanding Society | The United Kingdom Household Panel Study (UKHLS) | UK | 1174 | 58.0 (28.0; 98.0) | 59 | Validation set for DNAm surrogates |
| TILDA | The Irish Longitudinal Study on Ageing | Ireland | 490 | 62.1 (50.0; 80.0) | 50 | Validation set for DNAm surrogates |
| GSE174818 | Case–control study on COVID-19 susceptibility and progression | USA | 127 | 61.8 (21.0; 90.0) | 40 | Validation set for DNAm surrogates |
| NICOLA | The Northern Ireland Cohort for the Longitudinal Study of Ageing | UK | 1728 | 63.99 (40.0; 96.0) | 52 | Validation set for DNAmCVDscore |
| HRS | The Health and Retirement Study | USA | 2146 | 68.76 (50.0;100.0) | 60 | Validation set for DNAmCVDscore |
Fig. 1 Flow chart for development and validation of DNAmCVDscore. Step 1: We trained prediction models to develop DNAm surrogates for 13 CVD risk factors/biomarkers using data from the EPIC Italy study (n = 1803), and tested the validity of the DNAm surrogates in four independent studies (n = 2107). Nine out of 13 DNAm biomarkers were validated in the testing set. Step 2: 60 candidate DNAm surrogates (nine newly developed + 51 from the literature) were regressed against the time from study recruitment to cardiovascular event in EPIC Italy (n = 1803). The elastic net regression model selected ten DNAm surrogates as components of the DNAmCVDscore. Step 3: In the EXPOsOMICS CVD (n = 315), NICOLA (n = 1728), and HRS (n = 2146) data sets, we evaluated the prediction performance of DNAmCVDscore at different time points (right-censoring follow-up time) using logistic regression models adjusted for chronological age, sex, and recruitment centre (matching variables in EXPOsOMICS CVD) or Cox regression models (in NICOLA and HRS). DNAmCVDscore has a higher AUC for short-term cardiovascular events than for long-term CVD. Step 4: We compared the prediction performance of DNAmCVDscore with previously developed composite biomarkers: MRS, DNAmGrimAge, SCORE2 and SCORE2 + DNAmCVDscore. SCORE2 outperforms epigenetic predictors for long-term CVD risk (events occurring more than 8 years after recruitment), whereas DNAmCVDscore predicts short-term events (occurring within 7 years after recruitment) better than other biomarkers. The enriched SCORE2 + DNAmCVDscore model outperformed all competitors over the entire time horizon considered in the study.
List of newly developed DNAm surrogate biomarkers
| Risk factor/biomarker | Model type (EPIC Italy training set) | Number of CpGs | Pearson R (EPIC Italy test set) | P value (EPIC Italy test set) | Validation data sets (N) | Pearson R (validation set) | P value (validation set) | Validated DNAm surrogate |
|---|---|---|---|---|---|---|---|---|
| BMI | Mixed-effect LASSO | 405 | 0.59 | < 0.0001 | US, TILDA, EXPOsOMICS, GSE174818 (2,045) | 0.27 | < 0.0001 | Yes |
| CRP | LASSO | 265 | 0.57 | < 0.0001 | US, TILDA, EXPOsOMICS, GSE174818 (1,893) | 0.23 | < 0.0001 | Yes |
| D-dimer | LASSO | 483 | 0.72 | < 0.0001 | EXPOsOMICS, GSE174818 (248) | 0.17 | 0.56 | No |
| Diastolic blood pressure | Mixed-effect LASSO | 401 | 0.57 | < 0.0001 | EXPOsOMICS, TILDA (772) | 0.10 | 0.36 | No |
| Glucose | Mixed-effect LASSO | 354 | 0.67 | < 0.0001 | EXPOsOMICS, TILDA, US (1,810) | 0.28 | 0.007 | Yes |
| HDL cholesterol | Mixed-effect LASSO | 151 | 0.58 | < 0.0001 | EXPOsOMICS, TILDA, US (1,829) | 0.08 | 0.001 | Yes |
| Insulin | Mixed-effect LASSO | 574 | 0.66 | < 0.0001 | EXPOsOMICS (170) | 0.44 | < 0.0001 | Yes |
| LDL cholesterol | Mixed-effect LASSO | 368 | 0.62 | < 0.0001 | EXPOsOMICS, TILDA (661) | 0.15 | 0.36 | No |
| PAI-1 | LASSO | 90 | 0.43 | < 0.0001 | EXPOsOMICS (171) | 0.28 | 0.0001 | Yes |
| Systolic blood pressure | Mixed-effect LASSO | 275 | 0.64 | < 0.0001 | EXPOsOMICS, TILDA (772) | 0.28 | 0.001 | Yes |
| Tissue factor (CD142) | Mixed-effect LASSO | 197 | 0.62 | < 0.0001 | EXPOsOMICS (171) | 0.16 | 0.03 | Yes |
| Total cholesterol | Mixed-effect LASSO | 257 | 0.53 | < 0.0001 | EXPOsOMICS, TILDA, US (1,830) | 0.13 | 0.14 | No |
| Triglycerides | LASSO | 471 | 0.73 | < 0.0001 | EXPOsOMICS, TILDA (661) | 0.22 | 0.0003 | Yes |
For each candidate marker, we report: the model used to select CpGs (LASSO or mixed-effect LASSO, depending on the association with the centre of recruitment); the number of CpGs whose linear combination constitutes the best marker prediction; the Pearson correlation coefficient and P value in the primary test set (a random 25% of EPIC Italy samples); and the Pearson correlation coefficient and P value in the independent test sets (random-effect meta-analysis across studies). Nine out of 13 DNAm surrogates for CVD risk factors/markers were validated in the independent testing sets (P value for the Pearson correlation test lower than 0.05). The lists of CpGs and their weights to compute DNAm surrogates in independent data sets are provided in Additional file 1.
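As a rough illustration of how such a surrogate could be computed and checked in a new data set, the sketch below forms a weighted sum of CpG methylation values and correlates it with the measured biomarker. The CpG identifiers, weights, intercept handling, and sample values are all hypothetical placeholders, not taken from Additional file 1, and skipping missing CpGs is an assumption of this sketch rather than the authors' procedure.

```python
import math

def dnam_surrogate(betas, weights, intercept=0.0):
    """Weighted sum of CpG methylation values for one sample.

    betas:   dict mapping CpG id -> methylation beta value (0..1)
    weights: dict mapping CpG id -> model coefficient
    CpGs absent from `betas` are skipped (assumption of this sketch).
    """
    return intercept + sum(w * betas[cpg] for cpg, w in weights.items() if cpg in betas)

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical weights and samples, for illustration only.
weights = {"cg_example_1": 2.0, "cg_example_2": -1.0}
samples = [
    {"cg_example_1": 0.10, "cg_example_2": 0.80},
    {"cg_example_1": 0.30, "cg_example_2": 0.60},
    {"cg_example_1": 0.50, "cg_example_2": 0.40},
    {"cg_example_1": 0.70, "cg_example_2": 0.50},
]
measured = [21.0, 24.0, 29.0, 30.0]  # hypothetical measured biomarker values

predicted = [dnam_surrogate(s, weights) for s in samples]
r = pearson_r(predicted, measured)
print(f"Pearson R between surrogate and measured values: {r:.3f}")
```

The validation criterion described above additionally requires a P value for the correlation test, which this sketch leaves out.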
DNAm surrogates composing the DNAmCVDscore
| Study | DNAm surrogate biomarker | Coefficient | Original biomarker/risk factor |
|---|---|---|---|
| This study | DNAmGlucose | 0.0329 | Blood glucose |
| This study | DNAmHDL | −0.4473 | Blood HDL cholesterol |
| This study | DNAmSBP | 0.1420 | Systolic blood pressure |
| This study | DNAmCRP | 0.0276 | Blood C-reactive protein |
| This study | DNAmPAI1 | 0.1679 | Blood Plasminogen activator inhibitor 1 |
| Gadd et al. 2022 | DNAmSKR3 | 0.0362 | Blood Serine/threonine-protein kinase receptor R3 |
| Gadd et al. 2022 | DNAmHGF | 0.0371 | Blood Hepatocyte growth factor |
| Colicino et al. 2021 | DNAmLeadPatella | 0.0402 | Lead levels in the patella bone |
| Lu et al. 2019 | DNAmGDF15 | 0.0947 | Blood Growth Differentiation Factor 15 |
| Lu et al. 2019 | DNAmPACKYRS | 0.1192 | Smoking pack-years |
DNAmCVDscore is computed as a linear combination of standardised (mean = 0, variance = 1) DNAm surrogates with weights listed in the coefficient column. All biomarkers except DNAmHDL have positive coefficients (a higher value of the biomarker is associated with higher CVD risk).
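The linear combination described here can be sketched as follows, using the ten coefficients from the table. Standardising each surrogate with the mean and standard deviation of the cohort at hand is an assumption of this sketch (the source does not state which reference population is used), and the function names and input layout are illustrative.

```python
import statistics

# Coefficients as listed in the table above.
COEFFICIENTS = {
    "DNAmGlucose": 0.0329,
    "DNAmHDL": -0.4473,
    "DNAmSBP": 0.1420,
    "DNAmCRP": 0.0276,
    "DNAmPAI1": 0.1679,
    "DNAmSKR3": 0.0362,
    "DNAmHGF": 0.0371,
    "DNAmLeadPatella": 0.0402,
    "DNAmGDF15": 0.0947,
    "DNAmPACKYRS": 0.1192,
}

def standardise(values):
    """Scale values to mean 0 and variance 1 within the given cohort (assumption)."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    if sd == 0:
        raise ValueError("cannot standardise a constant surrogate")
    return [(v - mean) / sd for v in values]

def dnam_cvd_score(surrogates):
    """Compute one score per sample.

    surrogates: dict mapping surrogate name -> list of raw values,
                one per sample, all lists in the same sample order.
    """
    z = {name: standardise(surrogates[name]) for name in COEFFICIENTS}
    n_samples = len(next(iter(z.values())))
    return [
        sum(COEFFICIENTS[name] * z[name][i] for name in COEFFICIENTS)
        for i in range(n_samples)
    ]

# Hypothetical cohort of three samples with identical raw values per surrogate.
example = {name: [1.0, 2.0, 3.0] for name in COEFFICIENTS}
scores = dnam_cvd_score(example)
print(scores)
```

Because DNAmHDL carries the only negative weight, a sample with a higher standardised DNAmHDL value receives a lower score when the other surrogates are held fixed.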
Results from the ROC curve analyses in EXPOsOMICS CVD
| Follow-up time | # Events | DNAmCVDscore: AUC (95% CI) | DNAmCVDscore: sensitivity; specificity | DNAmGrimAge: AUC (95% CI) | DNAmGrimAge: sensitivity; specificity | SCORE2: AUC (95% CI) | SCORE2: sensitivity; specificity | MRS: AUC (95% CI) | MRS: sensitivity; specificity | SCORE2 + DNAmCVDscore: AUC (95% CI) | SCORE2 + DNAmCVDscore: sensitivity; specificity |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 18 years | 160 | 0.525 (0.461; 0.589) | 0.477; 0.580 | 0.569 (0.505; 0.632) | 0.647; 0.482 | 0.749 (0.696; 0.803) | 0.719; 0.698 | 0.484 (0.420; 0.548) | 0.523; 0.500 | 0.753 (0.700; 0.807) | 0.732; 0.685 |
| 17 years | 159 | 0.527 (0.463; 0.591) | 0.520; 0.553 | 0.562 (0.498; 0.626) | 0.669; 0.472 | 0.753 (0.700; 0.806) | 0.753; 0.671 | 0.516 (0.452; 0.580) | 0.545; 0.484 | 0.757 (0.704; 0.810) | 0.740; 0.689 |
| 16 years | 158 | 0.527 (0.463; 0.591) | 0.513; 0.554 | 0.559 (0.496; 0.623) | 0.583; 0.572 | 0.751 (0.698; 0.805) | 0.718; 0.692 | 0.527 (0.463; 0.591) | 0.449; 0.610 | 0.756 (0.704; 0.809) | 0.705; 0.717 |
| 15 years | 149 | 0.541 (0.477; 0.604) | 0.518; 0.577 | 0.560 (0.496; 0.623) | 0.681; 0.450 | 0.739 (0.684; 0.794) | 0.741; 0.671 | 0.536 (0.472; 0.600) | 0.509; 0.477 | 0.745 (0.691; 0.799) | 0.687; 0.732 |
| 14 years | 139 | 0.567 (0.504; 0.630) | 0.483; 0.662 | 0.581 (0.518; 0.644) | 0.494; 0.662 | 0.730 (0.674; 0.786) | 0.705; 0.691 | 0.573 (0.509; 0.636) | 0.682; 0.475 | 0.741 (0.686; 0.796) | 0.688; 0.683 |
| 13 years | 123 | 0.595 (0.531; 0.659) | 0.625; 0.537 | 0.592 (0.528; 0.656) | 0.526; 0.667 | 0.720 (0.663; 0.777) | 0.662; 0.699 | 0.595 (0.531; 0.659) | 0.531; 0.650 | 0.731 (0.676; 0.787) | 0.651; 0.740 |
| 12 years | 111 | 0.596 (0.530; 0.662) | 0.529; 0.649 | 0.594 (0.529; 0.659) | 0.510; 0.694 | 0.709 (0.650; 0.768) | 0.588; 0.775 | 0.592 (0.525; 0.658) | 0.588; 0.613 | 0.718 (0.661; 0.775) | 0.598; 0.757 |
| 11 years | 95 | 0.622 (0.554; 0.690) | 0.536; 0.705 | 0.618 (0.549; 0.686) | 0.568; 0.663 | 0.693 (0.630; 0.755) | 0.582; 0.758 | 0.610 (0.540; 0.680) | 0.536; 0.695 | 0.710 (0.651; 0.770) | 0.600; 0.737 |
| 10 years | 84 | 0.622 (0.551; 0.693) | 0.580; 0.607 | 0.619 (0.548; 0.691) | 0.576; 0.631 | 0.700 (0.636; 0.765) | 0.550; 0.845 | 0.602 (0.530; 0.673) | 0.541; 0.667 | 0.721 (0.661; 0.781) | 0.554; 0.821 |
| 9 years | 67 | 0.647 (0.571; 0.722) | 0.601; 0.612 | 0.645 (0.569; 0.721) | 0.649; 0.582 | 0.683 (0.611; 0.754) | 0.609; 0.746 | 0.623 (0.546; 0.700) | 0.581; 0.627 | 0.708 (0.642; 0.775) | 0.577; 0.731 |
| 8 years | 55 | 0.687 (0.611; 0.762) | 0.569; 0.709 | 0.676 (0.601; 0.752) | 0.627; 0.673 | 0.692 (0.618; 0.766) | 0.623; 0.746 | 0.638 (0.554; 0.722) | 0.635; 0.618 | 0.732 (0.664; 0.800) | 0.612; 0.782 |
| 7 years | 37 | 0.711 (0.626; 0.796) | 0.615; 0.703 | 0.707 (0.625; 0.789) | 0.554; 0.811 | 0.678 (0.587; 0.770) | 0.622; 0.703 | 0.668 (0.566; 0.770) | 0.669; 0.622 | 0.721 (0.641; 0.801) | 0.741; 0.649 |
| 6 years | 28 | 0.778 (0.702; 0.854) | 0.673; 0.786 | 0.771 (0.688; 0.853) | 0.634; 0.821 | 0.753 (0.668; 0.839) | 0.686; 0.786 | 0.730 (0.616; 0.844) | 0.669; 0.750 | 0.803 (0.740; 0.867) | 0.676; 0.893 |
| 5 years | 23 | 0.774 (0.695; 0.853) | 0.623; 0.826 | 0.773 (0.694; 0.851) | 0.634; 0.826 | 0.761 (0.665; 0.856) | 0.723; 0.783 | 0.735 (0.618; 0.852) | 0.688; 0.783 | 0.800 (0.741; 0.860) | 0.757; 0.783 |
| 4 years | 16 | 0.823 (0.746; 0.899) | 0.799; 0.688 | 0.821 (0.736; 0.906) | 0.632; 0.938 | 0.785 (0.653; 0.917) | 0.722; 0.875 | 0.783 (0.649; 0.917) | 0.689; 0.875 | 0.830 (0.755; 0.906) | 0.773; 0.750 |
| 3 years | 13 | 0.811 (0.720; 0.901) | 0.642; 0.846 | 0.806 (0.724; 0.888) | 0.652; 0.846 | 0.772 (0.620; 0.923) | 0.772; 0.846 | 0.762 (0.628; 0.896) | 0.768; 0.769 | 0.809 (0.726; 0.892) | 0.705; 0.846 |
| 2 years | 7 | 0.851 (0.767; 0.935) | 0.799; 0.857 | 0.842 (0.745; 0.939) | 0.737; 0.857 | 0.751 (0.575; 0.926) | 0.656; 0.857 | 0.717 (0.562; 0.872) | 0.779; 0.571 | 0.863 (0.766; 0.960) | 0.766; 0.857 |
For each composite biomarker, we report the AUC (95% CI), sensitivity, and specificity at the best threshold (the one minimising the distance from the top left corner of the ROC curve), derived from a logistic regression model adjusted for the matching parameters (age, sex, and centre of recruitment). Predictive performance was evaluated at different time points, right-censoring the follow-up time from 18 down to two years at one-year intervals.
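The threshold rule mentioned above (closest point to the top left corner of the ROC curve) can be illustrated with a short sketch. It takes predicted risks as given rather than fitting the adjusted logistic regression, and the scores and labels are hypothetical.

```python
import math

def best_threshold(scores, labels):
    """Return (threshold, sensitivity, specificity) for the ROC point
    closest to the top left corner (sensitivity = 1, specificity = 1).

    A sample is classified as a case when score >= threshold.
    labels: 1 for cases, 0 for controls.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    best = None
    for thr in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if s >= thr and y == 1)
        tn = sum(1 for s, y in zip(scores, labels) if s < thr and y == 0)
        sens, spec = tp / n_pos, tn / n_neg
        dist = math.hypot(1 - sens, 1 - spec)  # distance to the corner
        if best is None or dist < best[0]:
            best = (dist, thr, sens, spec)
    return best[1:]

# Hypothetical predicted risks and case/control labels.
scores = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9, 0.95]
labels = [0, 0, 0, 1, 0, 1, 1, 1, 1]
thr, sens, spec = best_threshold(scores, labels)
print(thr, sens, spec)
```

When two thresholds are equally close to the corner, this sketch keeps the lower one; other tie-breaking rules are equally defensible.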
Fig. 2 Prediction performance of DNAmCVDscore, MRS, DNAmGrimAge, SCORE2 and SCORE2 + DNAmCVDscore. Area under the ROC curve (AUC), on the y-axis, as a function of the follow-up length (x-axis) for the five composite biomarkers investigated in this study. MRS has the worst prediction performance at each time point. SCORE2 outperforms epigenetic predictors for long-term CVD risk (events occurring more than 8 years after recruitment), whereas DNAmCVDscore and DNAmGrimAge predict short-term risk (CVD events within 7 years after recruitment) better than the other biomarkers. The enriched SCORE2 + DNAmCVDscore model outperformed all competitors over the entire time horizon considered in the study.
Results from the Cox (proportional hazard) regression models in NICOLA and HRS validation data sets
| Composite biomarker | NICOLA, C-index (95% CI) | HRS, C-index (95% CI) |
|---|---|---|
| DNAmCVDscore | 0.73 (0.68; 0.78) | 0.70 (0.65; 0.73) |
| DNAmGrimAge | 0.72 (0.67; 0.77) | 0.70 (0.65; 0.74) |
| SCORE2 | 0.72 (0.67; 0.77) | 0.69 (0.65; 0.74) |
| MRS | 0.69 (0.64; 0.75) | 0.68 (0.63; 0.72) |
| SCORE2 + DNAmCVDscore | 0.75 (0.69; 0.80) | 0.71 (0.67; 0.75) |
For each composite biomarker, we report the C-index (95% CI) derived from Cox regression models adjusted for age and sex
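For readers unfamiliar with the C-index, the sketch below computes a concordance index in the style of Harrell's C from follow-up times, event indicators, and predicted risks. Whether the authors used exactly this estimator is an assumption; the sketch also ignores tied follow-up times, does not fit the Cox model itself, and uses hypothetical data.

```python
def harrell_c_index(times, events, risk):
    """Fraction of comparable pairs in which the subject with the
    shorter follow-up time has the higher predicted risk.

    times:  follow-up time per subject
    events: 1 if the event was observed, 0 if censored
    risk:   predicted risk (higher means earlier expected event)
    A pair is comparable when the subject with the shorter time had an event.
    Ties in predicted risk count as half-concordant.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored subject cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0
                elif risk[i] == risk[j]:
                    concordant += 0.5
    return concordant / comparable

# Hypothetical follow-up data for five subjects.
times = [2, 4, 6, 8, 10]
events = [1, 1, 0, 1, 0]
risk = [0.9, 0.6, 0.7, 0.3, 0.1]
c_index = harrell_c_index(times, events, risk)
print(c_index)
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering of event times by predicted risk.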