Adam Li, Amber Mueller, Brad English, Anthony Arena, Daniel Vera, Alice E Kane, David A Sinclair.
Abstract
Epigenetic clocks allow us to accurately predict the age and future health of individuals based on the methylation status of specific CpG sites in the genome, and are a powerful tool for measuring the effectiveness of longevity interventions. There is a growing need for methods to efficiently construct epigenetic clocks. The most common approach is to create clocks using elastic net regression modelling of all measured CpG sites, without first identifying specific features or CpGs of interest. The addition of feature selection approaches provides the opportunity to optimise the identification of predictive CpG sites. Here, we apply novel feature selection methods and combinatorial approaches including newly adapted neural networks, genetic algorithms, and 'chained' combinations. Human whole blood methylation data of ~470,000 CpGs were used to develop clocks that predict age with R2 scores greater than 0.73; the most predictive uses 35 CpG sites for an R2 score of 0.87. The five most frequent sites across all clocks were modelled to build a clock with an R2 score of 0.83. These two clocks were validated on two external datasets, where they maintained excellent predictive accuracy and outperformed three published epigenetic clocks (Hannum, Horvath, Weidner) applied to the same data. We identified gene regulatory regions associated with the selected CpGs as possible targets for future aging studies. Thus, our feature selection algorithms build accurate, generalizable clocks with a low number of CpG sites, providing important tools for the field.
Year: 2022 PMID: 35984867 PMCID: PMC9432708 DOI: 10.1371/journal.pcbi.1009938
Source DB: PubMed Journal: PLoS Comput Biol ISSN: 1553-734X Impact factor: 4.779
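The baseline approach the abstract describes, elastic net regression over all measured CpG sites, can be sketched with scikit-learn. Everything below (the simulated methylation matrix, the age signal, and the CV settings) is illustrative and not the authors' actual data or configuration:

```python
import numpy as np
from sklearn.linear_model import ElasticNetCV
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Illustrative stand-in for a methylation matrix: rows = samples,
# columns = CpG beta values in [0, 1]; `age` is chronological age.
n_samples, n_cpgs = 200, 1000
X = rng.uniform(0.0, 1.0, size=(n_samples, n_cpgs))
age = 20 + 60 * X[:, :5].mean(axis=1) + rng.normal(0, 2, n_samples)

X_tr, X_te, y_tr, y_te = train_test_split(X, age, random_state=0)

# ElasticNetCV tunes the L1/L2 mixing and penalty strength internally,
# shrinking most CpG coefficients to zero (a sparse clock).
clock = ElasticNetCV(l1_ratio=[0.1, 0.5, 0.9], cv=5, random_state=0)
clock.fit(X_tr, y_tr)

n_selected = int(np.sum(clock.coef_ != 0))
print(f"non-zero CpG coefficients: {n_selected}")
print(f"held-out R2: {clock.score(X_te, y_te):.2f}")
```

The sparsity induced by the L1 penalty is why elastic net clocks end up using only a subset of the ~470,000 measured CpGs even without an explicit feature selection step.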
Feature selection methods, in descending order of R2 score.
The number of features selected by each method is given in parentheses in the first column.
| Feature Selection Method | Average R2 Score (from 10CV) | STD of R2 (10CV) | Mean Absolute Error (Years) | Median Absolute Error (Years) |
|---|---|---|---|---|
| KBest 2000 de novo then Boruta (35) | 0.873 | 0.05 | 3.82 | 3.08 |
| Intersection of all methods per CV fold then Boruta (102) | 0.865 | 0.06 | 3.9 | 3 |
| KBest 25 de novo (36) | 0.862 | 0.06 | 3.96 | 3.14 |
| Boruta de novo (53) | 0.861 | 0.06 | 3.95 | 3.08 |
| %-RFE de novo to 1500 then Boruta (52) | 0.835 | 0.07 | 4.35 | 3.57 |
| ElasticNet de novo/No Feature Selection (276) | 0.827 | 0.06 | 4.64 | 3.91 |
| %-RFE de novo to 100 (161) | 0.825 | 0.07 | 4.69 | 3.83 |
| Top 10 Most Frequent (10) | 0.825 | 0.08 | 4.59 | 3.7 |
| Top 5 Most Frequent (5) | 0.82 | 0.08 | 4.6 | 3.79 |
| %-RFE de novo to 10000 then Genetic Algorithm (54) | 0.818 | 0.08 | 4.61 | 3.76 |
| SFM ElasticNet de novo then Boruta (7) | 0.813 | 0.07 | 4.7 | 3.71 |
| Genetic Algorithm de novo (85) | 0.812 | 0.08 | 4.72 | 3.68 |
| SFM ElasticNet de novo (16) | 0.81 | 0.07 | 4.74 | 3.84 |
| %-RFE de novo to 1500 then SFM (16) | 0.81 | 0.07 | 4.74 | 3.84 |
| SFM ExtraTrees de novo (5) | 0.77 | 0.08 | 5.36 | 4.27 |
| SFM ExtraTrees de novo then Boruta (5) | 0.77 | 0.08 | 5.36 | 4.27 |
| Neural Network feature selection (65) | 0.76 | 0.08 | 5.65 | 4.79 |
| Post Feature Selection Intersection of all methods (1) | 0.73 | 0.09 | 5.75 | 4.38 |
| Variance Threshold de novo (2) | 0.02 | 0.02 | 11.9 | 10.61 |
Fig 1. Comparative methods, with the number of features used in each model on the x-axis and their average R2 scores on the y-axis.
R2 scores are relatively similar across the board, despite wide variation in the number of features each model needs for prediction.
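The top-performing 'chained' recipe in the table (KBest to 2,000 CpGs, then Boruta) can be approximated in plain scikit-learn. Boruta itself lives in the third-party `boruta` package (`BorutaPy`); here a random-forest `SelectFromModel` stands in for that second stage, so this is a sketch of the chaining idea rather than the authors' exact pipeline, and the data are simulated:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.feature_selection import SelectFromModel, SelectKBest, f_regression

rng = np.random.default_rng(1)
X = rng.uniform(size=(150, 5000))          # stand-in methylation matrix
age = 25 + 50 * X[:, :10].mean(axis=1) + rng.normal(0, 2, 150)

# Stage 1: univariate filter down to the 2,000 most age-associated CpGs.
kbest = SelectKBest(f_regression, k=2000).fit(X, age)
X_k = kbest.transform(X)

# Stage 2: wrapper selection on the survivors (Boruta in the paper;
# random-forest importances via SelectFromModel as a stand-in here).
forest = RandomForestRegressor(
    n_estimators=100, max_features="sqrt", random_state=1
).fit(X_k, age)
stage2 = SelectFromModel(forest, prefit=True, threshold="mean")

# Map the stage-2 mask back to original CpG column indices.
selected = kbest.get_support(indices=True)[stage2.get_support()]
print(f"CpGs retained after chaining: {len(selected)}")
```

The point of chaining is that the cheap univariate filter shrinks the search space so the more expensive wrapper method only has to evaluate a few thousand candidates.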
The five CpG sites most frequently chosen as age predictors across all methods, with their associated gene symbols.
| Five Most Frequent CpG Sites | Associated Gene Symbol |
|---|---|
| cg16867657 | ELOVL2 |
| cg10501210 | C1orf132 |
| cg22454769 | FHL2 |
| cg04875128 | OTUD7A |
| cg19283806 | CCDC102B |
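A clock built on just these five sites is simply a linear model over five methylation columns. A minimal sketch follows; the beta values and the age signal are simulated placeholders, and the fitted coefficients are not the published weights:

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

# The five selected probes; beta values here are simulated, not real data.
cpgs = ["cg16867657", "cg10501210", "cg22454769", "cg04875128", "cg19283806"]
rng = np.random.default_rng(3)
betas = pd.DataFrame(rng.uniform(size=(120, 5)), columns=cpgs)

# Toy age signal driven by one probe, standing in for a real cohort.
age = 20 + 55 * betas["cg16867657"] + rng.normal(0, 3, 120)

clock = LinearRegression().fit(betas[cpgs], age)
print(f"in-sample R2: {clock.score(betas[cpgs], age):.2f}")
```

A five-feature clock like this is attractive in practice because it can be assayed with targeted methods (e.g. pyrosequencing) rather than a full array.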
Table showing the results of the two final models trained on the Hannum dataset (GSE40279) [6], validated on external datasets: the Horvath Down syndrome blood dataset (GSE52588) [22], the Martens exercise blood dataset (GSE85311) [24], and a buccal dataset (GSE137688) [23].
The number of CpG sites/features is given in parentheses.
| Feature Selection Methods | Data set | R2 Score | Mean Absolute Error (Years) | Median Absolute Error (Years) |
|---|---|---|---|---|
| KBest 2000 de novo then Boruta (35) | GSE85311 | 0.931 | 4.66 | 4.18 |
| | GSE52588 | 0.946 | 3.35 | 2.68 |
| | GSE137688 | 0.710 | 2.0 | 1.6 |
| Top 5 Most Frequent (5) | GSE85311 | 0.964 | 5.71 | 5.60 |
| | GSE52588 | 0.932 | 4.56 | 3.98 |
| | GSE137688 | 0.470 | 2.72 | 2.29 |
Fig 2. Predicted vs. chronological ages from our two final models on the two external validation datasets GSE85311 and GSE52588.
(A-B) KBest 2000 de novo then Boruta; (C-D) Top 5 Most Frequent.
Fig 3. Predicted vs. chronological ages from Horvath's, Weidner's, and Hannum's publicly available models/equations on the two external validation datasets GSE85311 and GSE52588.
(A-B) Horvath's model; (C-D) Weidner's model; (E-F) Hannum's model.
Results of the two models created from the Horvath Down syndrome blood dataset (GSE52588) [22] using the same CpGs selected by the two feature selection methods in the initial Hannum experiment.
These models were validated using the same 10CV scheme as the initial Hannum experiment. The number of CpG sites/features is given in parentheses.
| Feature Selection Method (CpGs Used) | Average R2 Score (from 10CV) | Mean Absolute Error (Years) | Median Absolute Error (Years) |
|---|---|---|---|
| KBest 2000 de novo then Boruta (35) | 0.928 | 3.39 | 2.92 |
| Top 5 Most Frequent (5) | 0.911 | 4.02 | 3.72 |
Fig 4. The workflow for feature selection and model evaluation.
Feature selection was performed on the training data of each iteration of 10-fold cross-validation. The features selected in each iteration were aggregated into a list per feature selection method. The unique selected features for each method were then collected into a dataframe, where post-selection steps such as intersections were performed, and the results were added to a results dataframe. Each column of the results dataframe (one per feature selection method) was evaluated using another training-testing split of the original data; this was repeated 10 times for 10-CV, with the average of all scores serving as the performance estimate for that feature selection method.
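The workflow's key safeguard, performing feature selection inside each training fold rather than on the full dataset, maps directly onto a scikit-learn `Pipeline` under 10-fold CV. The selector and regressor below are illustrative placeholders for the paper's various methods, and the data are simulated:

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, f_regression
from sklearn.linear_model import ElasticNet
from sklearn.model_selection import KFold, cross_val_score
from sklearn.pipeline import Pipeline

rng = np.random.default_rng(2)
X = rng.uniform(size=(200, 2000))          # stand-in methylation matrix
age = 30 + 40 * X[:, :8].mean(axis=1) + rng.normal(0, 2, 200)

# Bundling selector + regressor in one Pipeline guarantees the selector
# is refit on each fold's training split only, so no test-fold leakage.
pipe = Pipeline([
    ("select", SelectKBest(f_regression, k=100)),
    ("clock", ElasticNet(alpha=0.01)),
])

cv = KFold(n_splits=10, shuffle=True, random_state=2)
scores = cross_val_score(pipe, X, age, cv=cv, scoring="r2")
print(f"average R2 over 10 folds: {scores.mean():.2f} (std {scores.std():.2f})")
```

Selecting features on the full dataset before cross-validating would let information from each test fold leak into the selector and inflate the reported R2, which is exactly what the per-fold scheme in Fig 4 avoids.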