Tongtong Zhu1, Yue Gao1, Junwei Wang2, Xin Li1, Shipeng Shang1, Yanxia Wang1, Shuang Guo1, Hanxiao Zhou1, Hongjia Liu1, Dailin Sun1, Hong Chen2, Li Wang1, Shangwei Ning1.
Abstract
Many biological indicators related to chronological age have been proposed. Recent studies found that the epigenetic clock, or DNA methylation age, is highly correlated with chronological age. In particular, a significant difference between DNA methylation age (m-age) and chronological age has been observed in cancers. However, the prediction and characterization of m-age in pan-cancer remains an unexplored area. In this study, 1,631 age-related methylation sites in normal tissues were discovered and analyzed. A comprehensive computational model named CancerClock was constructed to predict the m-age of normal samples from the methylation levels of the extracted methylation sites. A LASSO linear regression model was used to screen features and train the CancerClock model in normal tissues. The accuracy of CancerClock was 81%, and the correlation between chronological age and m-age was 0.939 (P < 0.01). Next, CancerClock was used to evaluate the difference between m-age and chronological age for 33 cancer types from TCGA. Predicted m-age differed significantly from chronological age in a large number of cancer samples. These cancer samples were defined as "age-related cancer samples", and they showed differential methylation at some sites. The differences between predicted m-age and chronological age may contribute to cancer development. Some of these differential methylation sites were associated with cancer survival. CancerClock assists in estimating the m-age of normal and cancer samples. The differences between m-age and chronological age may improve the diagnosis and prognosis of cancers.
Keywords: LASSO; chronological age; methylation age; pan-cancer; survival
Year: 2019 PMID: 31867319 PMCID: PMC6905170 DOI: 10.3389/fbioe.2019.00388
Source DB: PubMed Journal: Front Bioeng Biotechnol ISSN: 2296-4185
Figure 1. Identification of chronological age-related DNA methylation sites in human normal tissues. (A) The barplot shows the sample distribution of four age segments in diverse cancer types. Lighter colors represent older samples. (B) Distribution of correlation scores between all methylation sites and chronological age. (C) Spearman correlation between age and methylation (P < 0.05, |cor| > 0.3). The pie chart shows the proportions of the 1,631 methylation sites that are positively and negatively correlated with chronological age. (D) The scatter diagram shows the correlation between methylation sites (cg16867657, cg23606718) and chronological age in KIRC and BRCA. The overall background shows the methylation sites of all samples (gray), and blue represents the correlation between chronological age and methylation age in a specific sample type such as BRCA or KIRC.
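The site filter described in panel (C) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the site names and beta values are made up, and the P-value is estimated with a permutation test because the record does not say how P was computed. Only the thresholds (P < 0.05, |cor| > 0.3) come from the caption.

```python
import math
import random

def rank(values):
    """Return 1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2.0 + 1.0
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation; returns 0.0 if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0.0 or syy == 0.0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)

def select_age_sites(betas, ages, min_abs_rho=0.3, alpha=0.05,
                     n_perm=2000, seed=0):
    """Keep sites whose Spearman rho with age passes both thresholds.

    The permutation P-value is an assumption of this sketch.
    """
    rng = random.Random(seed)
    age_ranks = rank(ages)
    selected = {}
    for site, values in betas.items():
        site_ranks = rank(values)
        rho = pearson(site_ranks, age_ranks)  # Spearman = Pearson on ranks
        shuffled = age_ranks[:]
        hits = 0
        for _ in range(n_perm):
            rng.shuffle(shuffled)
            if abs(pearson(site_ranks, shuffled)) >= abs(rho) - 1e-12:
                hits += 1
        p_value = (hits + 1) / (n_perm + 1)
        if abs(rho) > min_abs_rho and p_value < alpha:
            selected[site] = (rho, p_value)
    return selected

# Hypothetical data: eight samples, three illustrative sites.
ages = [25, 34, 41, 48, 56, 63, 70, 78]
betas = {
    "site_up": [0.21, 0.25, 0.30, 0.36, 0.41, 0.47, 0.55, 0.60],
    "site_down": [0.80, 0.74, 0.69, 0.66, 0.58, 0.51, 0.45, 0.40],
    "site_flat": [0.50, 0.53, 0.47, 0.52, 0.49, 0.54, 0.48, 0.51],
}
selected = select_age_sites(betas, ages)
for site, (rho, p_value) in selected.items():
    print(site, round(rho, 3), round(p_value, 4))
```

With real data, a library routine such as `scipy.stats.spearmanr` would normally replace the hand-written rank correlation.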
Figure 2. Feature selection and model construction of CancerClock. (A) The heatmap shows the 282 age-related methylation sites extracted by the model in all cancer samples. Red and blue represent high and low methylation levels, respectively. (B) The barplot shows the numbers of positive and negative sites among the 282 age-related methylation sites. (C) Selection of the threshold in the LASSO regression model. The line represents the coefficient values, and the minimum mean-square error is reached at log(Lambda) = −1.95197. (D) The scatter diagram shows the correlation between the correlation coefficient and the model weight in CancerClock. (E) The correlation between chronological age and predicted m-age in the training set and in the test set.
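The idea behind panels (C) and (E), choosing the LASSO penalty at the minimum mean-square error and then correlating predicted m-age with chronological age on held-out samples, can be sketched on synthetic data. Everything below is an assumption made for illustration: the simulated sites, the lambda grid, the single validation split (the record does not state the validation scheme), and the plain coordinate-descent solver standing in for whatever software the authors used.

```python
import math
import random

def soft_threshold(value, lam):
    """Shrink value toward zero by lam (the LASSO proximal step)."""
    if value > lam:
        return value - lam
    if value < -lam:
        return value + lam
    return 0.0

def lasso_fit(X, y, lam, n_sweeps=200):
    """Coordinate descent for (1/2n)*||y - b0 - Xb||^2 + lam*||b||_1."""
    n, p = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(p)]
    Xc = [[row[j] - means[j] for j in range(p)] for row in X]  # centered
    y_mean = sum(y) / n
    resid = [yi - y_mean for yi in y]
    col_var = [sum(r[j] ** 2 for r in Xc) / n for j in range(p)]
    coef = [0.0] * p
    for _ in range(n_sweeps):
        for j in range(p):
            if col_var[j] == 0.0:
                continue
            rho = sum(Xc[i][j] * resid[i] for i in range(n)) / n
            rho += col_var[j] * coef[j]
            new = soft_threshold(rho, lam) / col_var[j]
            delta = new - coef[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= Xc[i][j] * delta
                coef[j] = new
    intercept = y_mean - sum(c * m for c, m in zip(coef, means))
    return intercept, coef

def predict(model, X):
    intercept, coef = model
    return [intercept + sum(c * v for c, v in zip(coef, row)) for row in X]

def mse(y_true, y_pred):
    return sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true)

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def simulate(n, rng):
    """Synthetic samples: two age-linked sites plus four noise sites."""
    X, y = [], []
    for _ in range(n):
        age = rng.uniform(20, 80)
        row = [0.2 + 0.005 * age + rng.gauss(0, 0.02),   # gains methylation
               0.9 - 0.004 * age + rng.gauss(0, 0.02)]   # loses methylation
        row += [rng.uniform(0.3, 0.7) for _ in range(4)]  # unrelated to age
        X.append(row)
        y.append(age)
    return X, y

rng = random.Random(1)
X_train, y_train = simulate(100, rng)
X_valid, y_valid = simulate(50, rng)
X_test, y_test = simulate(50, rng)

# Pick the penalty with the lowest validation MSE (hypothetical grid).
lambdas = [1.0, 0.3, 0.1, 0.03, 0.01]
fits = {lam: lasso_fit(X_train, y_train, lam) for lam in lambdas}
best_lam = min(lambdas, key=lambda lam: mse(y_valid, predict(fits[lam], X_valid)))
best_model = fits[best_lam]
r_test = pearson(y_test, predict(best_model, X_test))
print("chosen lambda:", best_lam, "log(lambda):", round(math.log(best_lam), 3))
print("test correlation:", round(r_test, 3))
```

Sites whose coefficient is shrunk exactly to zero drop out of the model, which is how a LASSO fit can reduce a candidate set to a smaller feature set in a single training step.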
Figure 3. Biological processes and phenotypic traits of age-related methylation sites for CancerClock. (A) The barplot shows the combined score and −log(P-value) for enriched GO terms of the 282 extracted methylation sites. Pink, green, and gray represent biological process (BP), molecular function (MF), and cellular component (CC), respectively. (B) The network shows the interactions between GO terms and genes. Node color indicates enrichment strength, and the three different shapes represent the different biological types of GO terms. Circles represent genes. (C) The bar chart shows the enrichment counts and significance P-values of each trait from the EWAS Atlas analysis. (D) The correspondence between each methylation site and the gene(s) in which it is located. The relationship is usually one-to-one or one-to-many. (E) The genomic distribution of the 282 features of the CancerClock model.
Figure 4. Some methylation sites were differentially methylated in age-related cancer samples, defined by the difference between m-age and chronological age. (A) Age difference scores between chronological age and predicted m-age among 33 cancer types. (B) The relationship between chronological age and predicted m-age for all 33 disease types. Compared to the chronological age group, pink indicates that the predicted m-age group was down-regulated, blue indicates that it was up-regulated, and gray indicates that it remained unchanged.
Figure 5. The differential methylation sites in age-related samples were associated with survival. Kaplan-Meier survival analysis of patients in the high-score (blue line) and low-score (red line) groups. Survival time in days is shown along the X-axis, and the overall survival rate along the Y-axis.
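The Kaplan-Meier comparison in Figure 5 can be illustrated with a small sketch. The patient records are invented, and splitting at the median score is an assumption: the record does not say how the high-score and low-score groups were defined. No significance test (such as a log-rank test) is included.

```python
import statistics

def kaplan_meier(days, events):
    """Return [(day, survival probability)] at each observed death time.

    events[i] is 1 for a death and 0 for a censored observation.
    """
    data = sorted(zip(days, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        day = data[i][0]
        deaths = 0
        leaving = 0
        while i < len(data) and data[i][0] == day:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            survival *= 1.0 - deaths / at_risk
            curve.append((day, survival))
        at_risk -= leaving  # deaths and censored patients leave the risk set
    return curve

def split_by_median(scores):
    """Indices of patients above the median score, and of the rest."""
    cutoff = statistics.median(scores)
    high = [i for i, s in enumerate(scores) if s > cutoff]
    low = [i for i, s in enumerate(scores) if s <= cutoff]
    return high, low

# Hypothetical patients: score, follow-up in days, event flag (1 = death).
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
days = [100, 200, 300, 400, 500, 600, 700, 800]
events = [1, 1, 0, 1, 1, 0, 1, 0]

high_idx, low_idx = split_by_median(scores)
high_curve = kaplan_meier([days[i] for i in high_idx],
                          [events[i] for i in high_idx])
low_curve = kaplan_meier([days[i] for i in low_idx],
                         [events[i] for i in low_idx])
print("high-score group:", high_curve)
print("low-score group:", low_curve)
```

Each curve drops only at a death time, and censored patients reduce the number at risk without lowering the curve, which gives the characteristic step shape of the plotted lines.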