Junyan Wang1, Chunyan Wang2, Yangyan Wei1, Yanhao Zhao1, Can Wang1, Chaolong Lu1, Jin Feng2, Shujin Li1, Bin Cong1.
Abstract
In forensic science, accurately estimating the age of a victim or suspect can help investigators narrow a search and solve a crime. Aging is a complex process associated with molecular regulation at the DNA and RNA levels. Recent studies have shown that circular RNAs (circRNAs) are globally upregulated during aging in multiple organisms, such as mice and C. elegans, because they resist degradation by exoribonucleases. In the current study, we investigated the potential of circRNAs for age prediction. We identified more than 40,000 circRNAs in the blood of thirteen unrelated healthy Chinese individuals aged 20-62 years from their circRNA-seq profiles. Three methods were applied to select age-related circRNA candidates: the false discovery rate, lasso regression, and support vector machine. The analysis uncovered a strong bias toward circRNA upregulation during aging in human blood. A total of 28 circRNAs were chosen for further validation by RT-qPCR in 30 healthy unrelated subjects, and 5 age-related circRNAs were finally selected to build age prediction models using 100 samples from individuals aged 19-73 years. Five algorithms, namely multivariate linear regression (MLR), regression tree, bagging regression, random forest regression (RFR), and support vector regression (SVR), were compared based on root mean square error (RMSE) and mean absolute error (MAE) values. Among the five modeling methods, regression tree and RFR performed better than the others, with MAE values of 8.767 years (S.rho = 0.6983) and 9.126 years (S.rho = 0.660), respectively. Sex effect analysis showed that the age prediction models yielded significantly smaller MAE values for males than for females (MAE = 6.133 years for males versus 10.923 years for females in the regression tree model). In the current study, circRNAs were used for the first time as additional novel age-related biomarkers for developing forensic age estimation models. We propose that using circRNAs to obtain additional clues for forensic investigations and as aging indicators for age prediction is a promising field of interest.
Keywords: age prediction; biomarkers; circular RNA; forensic genetics; machine learning
Year: 2022 PMID: 35198010 PMCID: PMC8858837 DOI: 10.3389/fgene.2022.825443
Source DB: PubMed Journal: Front Genet ISSN: 1664-8021 Impact factor: 4.599
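The abstract names the false discovery rate as one of three filters for age-related circRNA candidates but does not say which procedure was used. As a minimal sketch, assuming the common Benjamini-Hochberg adjustment and purely illustrative p-values (the function name, the inputs, and the 0.05 cutoff are assumptions, not taken from the study), the idea can be illustrated as:

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical per-circRNA p-values for association with age.
p_values = [0.001, 0.008, 0.039, 0.041, 0.60]
q_values = benjamini_hochberg(p_values)
# Keep candidates whose adjusted p-value falls below an assumed 0.05 cutoff.
selected = [i for i, q in enumerate(q_values) if q < 0.05]
print(selected)
```

With these made-up inputs only the first two candidates pass the cutoff; in practice the threshold and the underlying test would follow the study's own methods.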
FIGURE 1. Landscape of circRNA profiles in sequencing samples. (A) Heatmap of circRNA expression across all 13 samples. (B) Box plots of circRNA TPM read counts at different ages. (C) Classification of mapped reads, taking sample 1 (20 years old) as an example. (D) Genomic features of circRNAs in each sample.
Final circRNA markers for the age prediction models, with their features, gene of origin, and expression trend during aging.
| circRNA ID | Gene | Full gene name | Chr. | Length (nt) | Trend |
|---|---|---|---|---|---|
| hsa_circ_0015789 | DENND1B | DENN domain containing 1B | 1 | 477 | up |
| hsa_circ_0086306 | UHRF2 | Ubiquitin like with PHD and ring finger domains 2 | 9 | 297 | up |
| hsa_circ_0002454 | DNAJC6 | DnaJ heat shock protein family (Hsp40) member C6 | 1 | 350 | up |
| hsa_circ_0000524 | RBM23 | RNA-binding motif protein 23 | 14 | 189 | up |
| hsa_circ_0004689 | SWT1 | SWT1 RNA endoribonuclease homolog | 1 | 469 | up |
FIGURE 2. Histogram of the age distribution for 100 healthy volunteers. The x-axis represents the chronological age of the individuals (in years), and the y-axis (counts) represents the number of individuals.
FIGURE 3. Scatter plots of the ΔCt value versus age for the 5 age-related circRNAs in the 100 samples used for modeling.
Comparison of five prediction models.
| Model | Training MAE (years) | Training RMSE (years) | Training S.rho | Testing MAE (years) | Testing RMSE (years) | Testing S.rho |
|---|---|---|---|---|---|---|
| Tree | 9.343 | 11.162 | 0.7405 | 8.767 | 10.584 | 0.6983 |
| Bagging | 12.311 | 14.650 | 0.4888 | 10.175 | 12.04 | 0.5866 |
| RFR | 12.442 | 14.543 | 0.4975 | 9.126 | 11.168 | 0.660 |
| SVR | 8.367 | 11.187 | 0.7423 | 11.814 | 13.716 | 0.4511 |
| MLR | 11.925 | 13.690 | 0.5662 | 12.190 | 13.825 | 0.4683 |
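The three columns in this comparison can be computed from paired chronological and predicted ages. The following is a minimal sketch in plain Python, assuming that S.rho denotes Spearman's rank correlation coefficient; the function names and the four example ages are illustrative and do not come from the study.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error between true and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean square error between true and predicted values."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the full run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation of the rank-transformed data."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


# Hypothetical chronological and predicted ages (years).
ages = [20, 35, 50, 65]
predicted = [25, 30, 58, 60]
print(mae(ages, predicted), rmse(ages, predicted), spearman_rho(ages, predicted))
```

MAE and RMSE are in years and penalize the size of the error, whereas the rank correlation only measures whether older individuals are predicted to be older, which is why a model can rank well yet still have a large MAE.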
FIGURE 4. Predicted versus chronological ages using regression tree, bagging, random forest regression, support vector regression, and multivariate linear regression.
Age prediction performance of different models between female and male groups.
| Model | Training, female: MAE (years) | Training, female: RMSE (years) | Training, female: S.rho | Training, male: MAE (years) | Training, male: RMSE (years) | Training, male: S.rho | Testing, female: MAE (years) | Testing, female: RMSE (years) | Testing, female: S.rho | Testing, male: MAE (years) | Testing, male: RMSE (years) | Testing, male: S.rho |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Tree | 10.284 | 12.226 | 0.654 | | | | 10.923 | 12.634 | 0.562 | **6.133** | | |
| Bagging | 13.355 | 15.569 | 0.368 | | | | 10.192 | 12.742 | 0.526 | | | |
| RFR | 13.692 | 15.559 | 0.373 | | | | 9.531 | 11.919 | 0.599 | | | |
| SVR | 9.168 | 11.947 | 0.673 | | | | 11.380 | 13.649 | 0.497 | 12.345 | 13.798 | 0.409 |
| MLR | 12.385 | 13.963 | 0.502 | | | | 10.804 | 13.188 | 0.51 | 13.883 | 14.566 | 0.461 |
Bold values indicate models with a lower MAE value.