Helena Correia Dias, Licínio Manco, Francisco Corte Real, Eugénia Cunha.
Abstract
The development of age prediction models (APMs) focusing on DNA methylation (DNAm) levels has revolutionized the forensic age estimation field. Meanwhile, the predictive ability of multi-tissue models with similar high accuracy needs to be explored. This study aimed to build multi-tissue APMs combining blood, bones and tooth samples, herein named blood-bone-tooth-APM (BBT-APM), using two different methodologies. A total of 185 and 168 bisulfite-converted DNA samples previously addressed by Sanger sequencing and SNaPshot methodologies, respectively, were considered for this study. The relationship between DNAm and age was assessed using simple and multiple linear regression models. Through the Sanger sequencing methodology, we built a BBT-APM with seven CpGs in genes ELOVL2, EDARADD, PDE4C, FHL2 and C1orf132, allowing us to obtain a Mean Absolute Deviation (MAD) between chronological and predicted ages of 6.06 years, explaining 87.8% of the variation in age. Using the SNaPshot assay, we developed a BBT-APM with three CpGs at ELOVL2, KLF14 and C1orf132 genes with a MAD of 6.49 years, explaining 84.7% of the variation in age. Our results showed the usefulness of DNAm age in forensic contexts and brought new insights into the development of multi-tissue APMs applied to blood, bone and teeth.
Keywords: DNA methylation (DNAm); SNaPshot; Sanger sequencing; epigenetic age estimation; multi-tissue age prediction models (APMs)
Year: 2021 PMID: 34943227 PMCID: PMC8698317 DOI: 10.3390/biology10121312
Source DB: PubMed Journal: Biology (Basel) ISSN: 2079-7737
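The abstract describes the modelling approach as multiple linear regression of chronological age on CpG methylation levels, scored by the MAD between chronological and predicted ages. The following is only a minimal sketch of that general technique on synthetic data. The function names, the simulated methylation values and the slopes are all hypothetical; none of it reproduces the study's data, coefficients or software.

```python
import numpy as np


def fit_apm(methylation, age):
    """Ordinary least squares fit of age on methylation levels (with intercept)."""
    design = np.column_stack([np.ones(len(age)), methylation])
    coefs, *_ = np.linalg.lstsq(design, age, rcond=None)
    return coefs  # coefs[0] is the intercept, coefs[1:] are per-CpG weights


def predict_age(coefs, methylation):
    """Apply a fitted model to a (samples x CpGs) methylation matrix."""
    design = np.column_stack([np.ones(methylation.shape[0]), methylation])
    return design @ coefs


def mean_absolute_deviation(age, predicted):
    """Mean absolute difference between chronological and predicted ages."""
    return float(np.mean(np.abs(np.asarray(age) - np.asarray(predicted))))


# Synthetic stand-in data (assumption): 185 samples and 7 CpGs, chosen only
# to mirror the sample and marker counts quoted in the abstract.
rng = np.random.default_rng(0)
n_samples, n_cpgs = 185, 7
age = rng.uniform(1, 95, n_samples)
slopes = rng.uniform(-0.004, 0.004, n_cpgs)  # hypothetical age effects
methylation = 0.5 + np.outer(age, slopes) + rng.normal(0, 0.05, (n_samples, n_cpgs))
methylation = np.clip(methylation, 0.0, 1.0)  # methylation is a proportion

coefs = fit_apm(methylation, age)
predicted = predict_age(coefs, methylation)
mad = mean_absolute_deviation(age, predicted)

# Share of the variance in age explained by the model (R squared).
residual = np.sum((age - predicted) ** 2)
total = np.sum((age - age.mean()) ** 2)
r_squared = 1.0 - residual / total

print(f"MAD = {mad:.2f} years, R^2 = {r_squared:.3f}")
```

The MAD computed here is an in-sample (training set) value, so it says nothing about performance on independent samples.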
Simple and multiple linear regression statistics of the age predictors in ELOVL2, FHL2, EDARADD, PDE4C and C1orf132 genes to test for association between the DNAm levels and chronological age using Sanger sequencing methodology.
| Locus | CpG Site | Location | Multi-Tissue: Type of Samples Included | N | R | R² | Corrected R² | SE | p-Value | MAD |
|---|---|---|---|---|---|---|---|---|---|---|
| Simple linear regression | | | | | | | | | | |
| ELOVL2 | CpG6 | Chr6:11044644 | Blood * + Bones + Teeth | 185 | 0.759 | 0.576 | 0.573 | 14.70 | 6.87 × 10⁻³⁶ | 12.01 |
| FHL2 | CpG1 | Chr2:105399282 | Blood * + Bones + Teeth | 185 | 0.692 | 0.479 | 0.476 | 16.29 | 1.11 × 10⁻²⁷ | 13.16 |
| EDARADD | CpG3 | Chr1:236394382 | Blood * + Bones + Teeth | 185 | −0.682 | 0.465 | 0.462 | 16.51 | 1.21 × 10⁻²⁶ | 13.52 |
| C1orf132 | CpG1 | Chr1:207823681 | Blood * + Bones + Teeth | 185 | −0.654 | 0.428 | 0.425 | 17.07 | 5.67 × 10⁻²⁴ | 13.23 |
| PDE4C | CpG2 | Chr19:18233133 | Blood * + Bones + Teeth | 185 | 0.613 | 0.376 | 0.372 | 17.83 | 1.79 × 10⁻²⁰ | 13.58 |
| Multiple linear regression | | | | | | | | | | |
| APM (ELOVL2, FHL2, EDARADD, PDE4C and C1orf132) | | | Blood * + Bones + Teeth | 185 | 0.940 | 0.883 | 0.878 | 7.86 | 7.36 × 10⁻⁷⁹ | 6.06 |
Abbreviations: N, number of samples; R, correlation coefficient; SE, standard error; MAD, mean absolute deviation (years) between chronological and predicted ages. Genomic positions were based on the GRCh38/hg38 assembly. * Blood samples from living and deceased individuals.
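The "Corrected R2" column is not defined in the abbreviations. Assuming it is the usual adjusted coefficient of determination, 1 − (1 − R²)(n − 1)/(n − p − 1) with n samples and p predictors, the multiple regression row can be checked as sketched below. The function name `adjusted_r2` is illustrative, and the value p = 7 is taken from the seven CpGs mentioned in the abstract.

```python
def adjusted_r2(r2: float, n_samples: int, n_predictors: int) -> float:
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    dof = n_samples - n_predictors - 1
    if dof <= 0:
        raise ValueError("need more samples than predictors plus one")
    return 1.0 - (1.0 - r2) * (n_samples - 1) / dof


# Multiple regression row of the Sanger table: R^2 = 0.883, N = 185,
# and p = 7 CpGs (assumption: one predictor per CpG in the final model).
sanger_apm = adjusted_r2(0.883, 185, 7)
print(round(sanger_apm, 3))
```

Under these assumptions the result rounds to 0.878, which agrees with the tabulated value. Because the inputs are themselves rounded, the simple regression rows may differ in the last digit.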
Figure 1. Predicted age versus chronological age using the multi-locus multi-tissue APM developed for ELOVL2, FHL2, EDARADD, PDE4C and C1orf132 genes including blood samples from living individuals (1), blood samples from deceased individuals (2), bone samples (3), tooth samples from living individuals (4) and tooth samples from deceased individuals (5). The corresponding Spearman correlation coefficients (r) are depicted inside each plot.
Simple and multiple linear regression statistics at the five CpGs of the ELOVL2, FHL2, KLF14, TRIM59 and C1orf132 genes to test for association between the DNAm levels and chronological age using SNaPshot assay.
| Locus | Location | Multi-Tissue: Type of Samples Included | N | R | R² | Corrected R² | SE | p-Value | MAD |
|---|---|---|---|---|---|---|---|---|---|
| Simple linear regression | | | | | | | | | |
| ELOVL2 | Chr6:11044628 | Blood * + Bones + Teeth | 168 | 0.772 | 0.597 | 0.594 | 13.896 | 1.54 × 10⁻³⁴ | 10.95 |
| FHL2 | Chr2:105399282 | Blood * + Bones + Teeth | 168 | 0.686 | 0.471 | 0.468 | 15.885 | 1.36 × 10⁻²⁴ | 12.63 |
| KLF14 | Chr7:130734355 | Blood * + Bones + Teeth | 168 | 0.677 | 0.459 | 0.456 | 16.091 | 6.57 × 10⁻²⁴ | 12.74 |
| C1orf132 | Chr1:207823681 | Blood * + Bones + Teeth | 168 | −0.693 | 0.480 | 0.477 | 15.779 | 2.49 × 10⁻²⁵ | 12.10 |
| TRIM59 | Chr3:160450189 | Blood * + Bones + Teeth | 168 | 0.584 | 0.341 | 0.337 | 17.780 | 1.17 × 10⁻¹⁶ | 13.64 |
| Multiple linear regression | | | | | | | | | |
| APM (ELOVL2, KLF14 and C1orf132) | | Blood * + Bones + Teeth | 168 | 0.922 | 0.850 | 0.847 | 8.53 | 3.14 × 10⁻⁶⁷ | 6.49 |
Abbreviations: N, number of samples; R, correlation coefficient; SE, standard error; MAD, mean absolute deviation (years) between chronological and predicted ages. Genomic positions were based on the GRCh38/hg38 assembly. * Blood samples from living and deceased individuals.
Figure 2. Predicted age versus chronological age using the multi-tissue APM developed for ELOVL2, C1orf132 and KLF14 genes including blood samples from living individuals (1), blood samples from deceased individuals (2), bone samples (3), tooth samples from living individuals (4) and tooth samples from deceased individuals (5). The corresponding Spearman correlation coefficients (r) are depicted inside each plot.
Evaluation of mean absolute deviation (MAD) between chronological and predicted ages according to four age-range groups in the training set of blood, bone and tooth samples using both methodologies.
| Age Range | Sanger Sequencing: N | Sanger Sequencing: MAD (Years) | SNaPshot: N | SNaPshot: MAD (Years) |
|---|---|---|---|---|
| <30 years | 33 | 4.73 | 23 | 5.51 |
| 31–55 years | 58 | 6.37 | 56 | 6.23 |
| 56–79 years | 74 | 5.67 | 68 | 6.74 |
| >80 years | 20 | 8.81 | 21 | 7.37 |
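Since the four age groups together cover the whole training set (their N values sum to 185 and 168), the sample-weighted mean of the group MADs should approximately recover the overall MAD reported for each method. The sketch below performs that consistency check using only the numbers in the table above. The helper name `pooled_mad` is illustrative.

```python
def pooled_mad(groups):
    """Sample-weighted mean of per-group MADs, given (n, mad) pairs."""
    total_n = sum(n for n, _ in groups)
    return sum(n * mad for n, mad in groups) / total_n


# (N, MAD in years) per age group, copied from the table above.
sanger = [(33, 4.73), (58, 6.37), (74, 5.67), (20, 8.81)]
snapshot = [(23, 5.51), (56, 6.23), (68, 6.74), (21, 7.37)]

print(f"Sanger sequencing: {pooled_mad(sanger):.2f} years")
print(f"SNaPshot: {pooled_mad(snapshot):.2f} years")
```

The pooled values come out at roughly 6.06 and 6.48 years, in line with the reported overall MADs of 6.06 and 6.49 years. The small difference for SNaPshot is what rounding of the group values to two decimals would produce.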
Comparison between Sanger sequencing and SNaPshot methodologies.
| Method | Sanger Sequencing | SNaPshot |
|---|---|---|
| CpGs and genes included in the APM | 7 CpGs located at 5 genes | 3 CpGs located at 3 genes |
| Age correlation value | 0.940 | 0.922 |
| Variance in age explained | 87.8% | 84.7% |
| Accuracy (MAD) | 6.06 years | 6.49 years |
| Results | Using the Sanger sequencing methodology, more CpGs and genes were included in the APM, but higher age correlation, higher explained variance in age, and a better | |