Hao Huang¹, Jinming Fu¹, Lei Zhang¹, Jing Xu¹, Dapeng Li¹, Justina Ucheojor Onwuka¹, Ding Zhang¹, Liyuan Zhao¹, Simin Sun¹, Lin Zhu¹, Ting Zheng¹, Chenyang Jia¹, Binbin Cui², Yashuang Zhao¹.
Abstract
BACKGROUND: Aberrant DNA methylation is a critical regulator of gene expression and plays a crucial role in the occurrence, progression, and prognosis of colorectal cancer (CRC). We aimed to identify methylation-driven genes by integrative epigenetic and transcriptomic analysis to predict the prognosis of CRC patients.
Keywords: colorectal cancer; integrative analysis; methylation-driven genes; overall survival; prognostic risk model
Year: 2021 PMID: 34178621 PMCID: PMC8231008 DOI: 10.3389/fonc.2021.629860
Source DB: PubMed Journal: Front Oncol ISSN: 2234-943X Impact factor: 6.244
Summary of patient demographics and clinical characteristics.
| Characteristics | Total (N = 722), No. | Total, % | Training set (N = 367), No. | Training set, % | Testing set (N = 355), No. | Testing set, % |
|---|---|---|---|---|---|---|
| Age at diagnosis | | | | | | |
| Median | 65.3 | | 64.4 | | 63.7 | |
| Range | 21.0–97.0 | | 31.0–90.0 | | 21.0–94.0 | |
| <65 years | 354 | 49.0 | 172 | 46.9 | 182 | 51.3 |
| ≥65 years | 368 | 51.0 | 195 | 53.1 | 173 | 48.7 |
| Gender | | | | | | |
| Male | 394 | 54.6 | 199 | 54.2 | 195 | 54.9 |
| Female | 328 | 45.4 | 168 | 45.8 | 160 | 45.1 |
| TNM stage | | | | | | |
| I | 86 | 11.9 | 55 | 15.0 | 31 | 8.7 |
| II | 216 | 29.9 | 141 | 38.4 | 75 | 21.1 |
| III | 208 | 28.8 | 117 | 31.9 | 91 | 25.6 |
| IV | 212 | 29.4 | 54 | 14.7 | 158 | 44.5 |
| Vital status | | | | | | |
| Living | 458 | 63.4 | 287 | 78.2 | 171 | 48.2 |
| Dead | 264 | 36.6 | 80 | 21.8 | 184 | 51.8 |
Figure 1. Identification of methylation-driven genes in CRC patients. (A) Heat map of 143 CRC-related methylation-driven genes. The color change from green to red illustrates a trend from hypomethylation to hypermethylation. |log FC|≥0, adjusted P < 0.05, and Cor <−0.5. CRC, colorectal cancer; FC, fold change. (B) Selection of driven genes in the LASSO model. (C) Tuning parameter (λ) selection in the LASSO model used cross-validation via the maximum criteria. The dotted vertical lines were drawn at the optimal values using the maximum criteria and the one standard error of the maximum criteria.
The four methylation-driven genes identified in the prognostic signature and their multivariable Cox regression associations with prognosis.
| Gene symbol | Coefficient | HR | HR (95% Low) | HR (95% High) | P value |
|---|---|---|---|---|---|
| | 0.253 | 1.288 | 1.088 | 1.526 | 0.003 |
| | 0.147 | 1.158 | 1.046 | 1.282 | 0.005 |
| | −0.183 | 0.833 | 0.691 | 1.003 | 0.053 |
| | −0.172 | 0.842 | 0.732 | 0.968 | 0.015 |
Derived from the multivariable Cox regression analysis in the training set.
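The relation between the Coefficient and HR columns, and the way a four-gene risk score is typically built from such coefficients, can be sketched as follows. This is a minimal illustration, not the authors' code: the keys `gene_1` to `gene_4` are hypothetical placeholders (the gene symbols are missing from this copy), and both the linear-combination form of the score and the caller-supplied `cutoff` are assumptions, since the record does not state the exact formula or threshold.

```python
import math

# Coefficients copied from the table above. The gene symbols did not survive
# in this copy, so the keys are hypothetical placeholders.
COEFFICIENTS = {
    "gene_1": 0.253,
    "gene_2": 0.147,
    "gene_3": -0.183,
    "gene_4": -0.172,
}


def hazard_ratio(coefficient):
    """Hazard ratio implied by a Cox regression coefficient: HR = exp(beta)."""
    return math.exp(coefficient)


def risk_score(levels, coefficients=COEFFICIENTS):
    """Assumed score: sum over genes of coefficient * measured level."""
    return sum(coefficients[gene] * levels[gene] for gene in coefficients)


def risk_group(score, cutoff):
    """Assign 'high' or 'low' risk; the cutoff is supplied by the caller."""
    return "high" if score > cutoff else "low"


if __name__ == "__main__":
    for gene, beta in COEFFICIENTS.items():
        print(gene, round(hazard_ratio(beta), 3))
```

A positive coefficient gives HR > 1 (higher level, worse prognosis) and a negative coefficient gives HR < 1, which matches the direction of the four rows in the table.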
Figure 2. Construction of the four-gene risk score model in the TCGA dataset. (A) Distribution of risk scores in the high-risk and low-risk groups. (B) Survival overview in the high-risk and low-risk groups. (C) Heatmap of the four-gene expression profiles and corresponding risk scores in the high-risk and low-risk groups in the TCGA database. (D) Comparison of OS between the high-risk and low-risk groups. OS, overall survival.
Figure 3. Mediation analysis for the methylation-driven prognostic signature through mRNA expression. (A) Diagram of a mediation model. (B) In the overall model, the risk score based on the methylation levels of the four methylation-driven genes was treated as the "exposure" (score_methylation), and the mediator was the linear combination of the expression levels of the same four genes (score_expression). The total prognostic effect, expressed as a hazard ratio (HR), was decomposed into a direct effect (HR_direct) and an indirect effect (HR_indirect), reported with the corresponding 95% CI and the proportion of effect mediated (M%). In sensitivity analyses that excluded each gene in turn, the mediation effect remained statistically significant. CI, confidence interval.
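One common way to relate the direct effect, indirect effect, and proportion mediated on the hazard-ratio scale is sketched below. It assumes the total effect factorises as the product of the direct and indirect hazard ratios and measures the proportion mediated on the log scale. The record does not say which definition the authors used, and other definitions (for example on the HR difference scale) exist, so the function names and the formula here are illustrative assumptions.

```python
import math


def total_hr(hr_direct, hr_indirect):
    """Total effect under the assumed multiplicative decomposition."""
    return hr_direct * hr_indirect


def proportion_mediated(hr_direct, hr_indirect):
    """Share of the log total effect carried by the indirect path.

    Assumed definition: log(HR_indirect) / (log(HR_direct) + log(HR_indirect)).
    """
    log_direct = math.log(hr_direct)
    log_indirect = math.log(hr_indirect)
    log_total = log_direct + log_indirect
    if log_total == 0:
        raise ValueError("total effect is null; proportion mediated is undefined")
    return log_indirect / log_total


if __name__ == "__main__":
    # Hypothetical inputs for illustration only, not values from the study.
    print(total_hr(1.5, 1.2))
    print(proportion_mediated(1.5, 1.2))
```

Under this definition, equal direct and indirect hazard ratios give a proportion mediated of 0.5, and the measure is only interpretable when both effects point in the same direction.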
Figure 4. Predictive performance of the signature for OS using time-dependent ROC analysis and the nomogram in the training and testing sets. (A) Time-dependent ROC curve analysis for the 3-, 5-, and 10-year OS prediction by the signature in the training set. (B) Time-dependent ROC curve analysis for the 3-, 5-, and 8-year OS prediction by the signature in the testing set. (C) Nomogram to predict the 1-, 5-, and 10-year OS of CRC patients in the training set. (D) Calibration curves of the 5-year OS nomogram model in the training set. (E) Nomogram to predict the 1-, 3-, and 5-year OS of CRC patients in the testing set. (F) Calibration curves of the 5-year OS nomogram model in the testing set. The gray line represents the ideal predictive model, and the red line represents the observed model.
Univariable and multivariable Cox regression analyses of the four-gene methylation-driven signature and survival of CRC patients in the training and testing sets.
| Variables | Training set (N = 367): HR | 95% CI lower | 95% CI upper | P value | Testing set (N = 355): HR | 95% CI lower | 95% CI upper | P value |
|---|---|---|---|---|---|---|---|---|
| Univariable analysis | | | | | | | | |
| Age | | | | | | | | |
| ≥65 years | 2.170 | 1.328 | 3.547 | 0.002 | 0.938 | 0.702 | 1.253 | 0.664 |
| Sex | | | | | | | | |
| Male | 1.449 | 0.923 | 2.274 | 0.107 | 0.958 | 0.717 | 1.282 | 0.774 |
| TNM stage | | | | | | | | |
| III+IV | 2.765 | 1.741 | 4.391 | <0.001 | 4.251 | 2.742 | 6.591 | <0.001 |
| Four-gene signature | | | | | | | | |
| High risk | 2.351 | 1.472 | 3.755 | <0.001 | 1.963 | 1.456 | 2.647 | <0.001 |
| Multivariable analysis | | | | | | | | |
| Age | | | | | | | | |
| ≥65 years | 2.355 | 1.421 | 3.903 | 0.001 | 1.270 | 0.942 | 1.712 | 0.117 |
| Sex | | | | | | | | |
| Male | 1.123 | 0.712 | 1.771 | 0.618 | 0.942 | 0.702 | 1.264 | 0.690 |
| TNM stage | | | | | | | | |
| III+IV | 3.291 | 2.049 | 5.286 | <0.001 | 3.967 | 2.508 | 6.274 | <0.001 |
| Four-gene signature | | | | | | | | |
| High risk | 2.221 | 1.382 | 3.571 | 0.001 | 1.436 | 1.051 | 1.962 | 0.023 |
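The hazard ratios, confidence limits, and P values in a Cox table are linked, so a reader can cross-check any row. The sketch below recovers the standard error of the log hazard ratio from a 95% CI and derives a two-sided Wald P value. It assumes the intervals are Wald intervals that are symmetric on the log scale, which is the usual default in Cox regression software but is not stated in the record. The function name is illustrative.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def wald_check(hr, lower, upper):
    """Return (standard error of log HR, z statistic, two-sided P value).

    Assumes a Wald interval: log(HR) +/- Z_95 * SE.
    """
    se = (math.log(upper) - math.log(lower)) / (2 * Z_95)
    z = math.log(hr) / se
    # Two-sided normal tail probability: P(|Z| > |z|) = erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return se, z, p_value


if __name__ == "__main__":
    # Final "High risk" row of the table: training set, then testing set.
    for row in [(2.221, 1.382, 3.571), (1.436, 1.051, 1.962)]:
        se, z, p_value = wald_check(*row)
        print(f"SE={se:.3f}  z={z:.2f}  P={p_value:.4f}")
```

Applied to the final "High risk" row, the derived P values come out close to the reported 0.001 (training set) and 0.023 (testing set), which suggests those rows are internally consistent.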