Na Sun1, Jiadong Chu1, Wei Hu1, Xuanli Chen1, Nengjun Yi2, Yueping Shen3.
Abstract
There have been few investigations of cancer prognosis models based on Bayesian hierarchical models. In this study, we used a novel Bayesian method to screen mRNAs and estimate their effects on the prognosis of patients with lung adenocarcinoma. Based on the identified mRNAs, we built a prognostic model combining mRNAs and clinical features, allowing us to explore new molecules with the potential to predict the prognosis of lung adenocarcinoma. The mRNA data (n = 594) and clinical data (n = 470) for lung adenocarcinoma were obtained from the TCGA database. Gene set enrichment analysis (GSEA), univariate Cox proportional hazards regression, and the Bayesian hierarchical Cox proportional hazards model were used to explore the mRNAs related to the prognosis of lung adenocarcinoma. Multivariate Cox proportional hazards regression was used to identify independent markers. The prediction performance of the prognostic model was evaluated by internal cross-validation and by external validation on the GEO dataset (n = 437). With the Bayesian hierarchical Cox proportional hazards model, a 14-gene signature that included CPS1, CTPS2, DARS2, IGFBP3, MCM5, MCM7, NME4, NT5E, PLK1, POLR3G, PTTG1, SERPINB5, TXNRD1, and TYMS was established to predict overall survival in lung adenocarcinoma. Multivariate analysis demonstrated that the 14-gene signature (HR 3.960, 95% CI 2.710-5.786), T classification (T1, reference; T3, HR 1.925, 95% CI 1.104-3.355) and N classification (N0, reference; N1, HR 2.212, 95% CI 1.520-3.220; N2, HR 2.260, 95% CI 1.499-3.409) were independent predictors. The C-index of the model was 0.733 after cross-validation and 0.735 after external validation. A nomogram was provided to support prediction in clinical application.
Bayesian hierarchical Cox proportional hazards models can be used to integrate high-dimensional omics information into a prediction model for lung adenocarcinoma to improve the prognostic prediction and discover potential targets. This approach may be a powerful predictive tool for clinicians treating malignant tumours.
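The abstract reports model performance as a C-index. As a rough illustration of what that statistic measures, the following is a minimal sketch of Harrell's concordance index in plain Python. The function name `concordance_index` and the toy data are hypothetical and are not taken from the study, and pairs with tied survival times are simply skipped, which is a simplification.

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Harrell's C: share of comparable pairs in which the patient with the
    higher risk score has the shorter observed survival time.

    times       -- observed follow-up times
    events      -- 1/True if death was observed, 0/False if censored
    risk_scores -- model output, higher means worse predicted prognosis
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # simplification: ignore pairs with tied times
        # a is the patient with the shorter observed time
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if not events[a]:
            continue  # shorter time is censored, so the order is unknown
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (made up for illustration only).
toy_times = [5, 10, 15, 20]
toy_events = [1, 1, 0, 1]
toy_risk = [2.0, 1.5, 0.5, 1.0]
c_perfect = concordance_index(toy_times, toy_events, toy_risk)
c_reversed = concordance_index(toy_times, toy_events, [-r for r in toy_risk])
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, which gives some context for the reported values of 0.733 and 0.735.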
Year: 2022 PMID: 34996932 PMCID: PMC8741994 DOI: 10.1038/s41598-021-03645-6
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Figure 1. The workflow of this study.
Clinical characteristics of the lung adenocarcinoma cohort in the study.
| Factor | TCGA | GEO |
|---|---|---|
| No. of patients | 470 | 437 |
| Age, years, mean (SD) | 65.2 (10.01) | 64.4 (10.10) |
| Sex, n (%) | | |
| Female | 251 (53.40) | 218 (49.89) |
| Male | 219 (46.60) | 219 (50.11) |
| Race, n (%) | | |
| White | 371 (78.94) | 289 (66.13) |
| Other | 57 (12.13) | 19 (4.35) |
| Unknown | 42 (8.94) | 129 (29.52) |
| T classification, n (%) | | |
| T1 | 160 (34.04) | 149 (34.10) |
| T2 | 251 (53.40) | 249 (56.98) |
| T3 | 42 (8.94) | 28 (6.41) |
| T4 | 17 (3.62) | 11 (2.52) |
| N classification, n (%) | | |
| N0 | 312 (66.38) | 297 (67.96) |
| N1 | 91 (19.36) | 87 (19.91) |
| N2 | 67 (14.26) | 53 (12.13) |
| M classification, n (%) | | |
| M0 | 314 (66.81) | 437 (100.00) |
| M1 | 21 (4.47) | 0 (0.00) |
| MX | 135 (28.72) | 0 (0.00) |
| Stage, n (%) | | |
| I | 253 (53.83) | – |
| II | 114 (24.26) | – |
| III | 76 (16.17) | – |
| IV | 21 (4.47) | – |
| Unknown | 6 (1.28) | – |
| Treatment, n (%) | | |
| Neither | 0 (0.00) | 316 (72.31) |
| Chemotherapy | 242 (51.49) | 43 (9.84) |
| Radiotherapy | 228 (48.51) | 20 (4.58) |
| Chemotherapy and Radiotherapy | 0 (0.00) | 44 (10.07) |
| Unknown | 0 (0.00) | 14 (3.20) |
| Unlabelled factor, n (%) | | |
| No | 63 (13.40) | 48 (10.98) |
| Yes | 389 (82.77) | 296 (67.73) |
| Unknown | 18 (3.83) | 93 (21.28) |
Figure 2. GSEA results from the c2 reference gene sets of the tumour group.
Performance measures of the optimal models for mRNAs in the TCGA lung adenocarcinoma (LUAD) dataset, estimated by 10-fold cross-validation with 10 repeats.
| Method | C-index, mean | C-index, SD | Deviance, mean | Deviance, SD |
|---|---|---|---|---|
| LASSO Cox | 0.649 | 0.007 | 1779.056 | 4.590 |
| | 0.626 | 0.013 | 1787.719 | 6.311 |
| | 0.637 | 0.006 | 1786.711 | 3.539 |
| | 0.645 | 0.006 | 1781.919 | 2.940 |
| | 0.649 | 0.006 | 1779.648 | 3.111 |
| sλ − 0.01, 0.5 | 0.651 | 0.006 | 1779.030 | 3.328 |
| | 0.650 | 0.006 | 1779.095 | 3.984 |
| | 0.649 | 0.007 | 1779.092 | 4.659 |
| | 0.648 | 0.007 | 1779.666 | 5.335 |
| | 0.646 | 0.007 | 1780.872 | 5.847 |
| | 0.645 | 0.007 | 1782.788 | 6.259 |
| | 0.643 | 0.007 | 1785.146 | 6.803 |
Significant values are in bold.
s = c (s − 0.05, s − 0.04, s − 0.03, s − 0.02, s − 0.01, s, s + 0.01, s + 0.02, s + 0.03, s + 0.04, s + 0.05), s = 0.0843.
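The footnote defines eleven candidate scale values centred on s = 0.0843 in steps of 0.01, and the table lists eleven rows after LASSO Cox. The sketch below rebuilds that grid and picks the candidate with the lowest mean deviance. It assumes that the table rows appear in the same order as the values in the footnote (the single labelled row, s − 0.01, is the fifth, which is consistent with this) and that the first pair of numeric columns is the C-index. Variable names are hypothetical.

```python
S_CENTER = 0.0843  # centre value given in the table footnote

# Offsets -0.05, -0.04, ..., +0.05 as listed in the footnote.
offsets = [k / 100 for k in range(-5, 6)]
s_grid = [round(S_CENTER + d, 4) for d in offsets]

# (mean C-index, mean deviance) copied from the eleven rows after LASSO Cox.
# ASSUMPTION: row order matches the order of s_grid.
cv_results = [
    (0.626, 1787.719),
    (0.637, 1786.711),
    (0.645, 1781.919),
    (0.649, 1779.648),
    (0.651, 1779.030),
    (0.650, 1779.095),
    (0.649, 1779.092),
    (0.648, 1779.666),
    (0.646, 1780.872),
    (0.645, 1782.788),
    (0.643, 1785.146),
]

# Lower deviance is better; higher C-index is better.
best_by_deviance = min(range(len(s_grid)), key=lambda i: cv_results[i][1])
best_by_cindex = max(range(len(s_grid)), key=lambda i: cv_results[i][0])
best_s = s_grid[best_by_deviance]

print(f"best scale by deviance: {best_s}")
print(f"same row by C-index: {best_by_deviance == best_by_cindex}")
```

Under these assumptions both criteria select the same row, and its mean deviance (1779.030) is slightly below the LASSO Cox value (1779.056).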
Figure 3. 14-gene prognostic signature. (A) Estimate of HR for 14 genes using the Bayesian hierarchical Cox proportional hazards model with a spike-and-slab prior. (B) The chord diagram of prognosis-related mRNAs. Genes are represented on the left, and pathways are represented on the right. Different pathways are differentiated by different colours.
Figure 4. (A) Kaplan–Meier curve of TCGA-LUAD survival data for high-risk and low-risk groups with P < 0.001. (B) Kaplan–Meier curve of GEO survival data for high-risk and low-risk groups with P < 0.001. (C) The ROC curve of the risk score for predicting survival in the TCGA-LUAD cohort. (D) The ROC curve of the risk score for predicting survival in the GEO cohort.
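To make the Kaplan–Meier comparison of high-risk and low-risk groups more concrete, the following is a minimal sketch of the product-limit estimator together with a risk-group split. The cutoff used here is the median risk score, which is an assumption for illustration only, since the source does not state how the groups were defined. Function names and toy data are hypothetical.

```python
from statistics import median


def split_by_median(risk_scores):
    """Label each patient 'high' or 'low' risk.

    ASSUMPTION: the cutoff is the median risk score; the source does not
    say which cutoff was used.
    """
    cutoff = median(risk_scores)
    return ["high" if r > cutoff else "low" for r in risk_scores]


def kaplan_meier(times, events):
    """Product-limit estimate: list of (t, S(t)) at each distinct event time."""
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        deaths = sum(1 for ti, ei in zip(times, events) if ti == t and ei)
        leaving = sum(1 for ti in times if ti == t)  # deaths plus censored at t
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


# Toy data (made up for illustration only).
toy_groups = split_by_median([0.1, 0.5, 0.9, 1.3])
toy_curve = kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1])
```

In practice one curve would be computed per risk group and the two compared with a log-rank test, which is outside the scope of this sketch.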
Figure 5. The CTPS2 and DARS2 expression in lung adenocarcinoma and normal groups in TCGA data. (A) The mRNA level of CTPS2 was dramatically increased in lung adenocarcinoma samples compared with normal lung samples. (B) The mRNA level of DARS2 was significantly higher in lung adenocarcinoma samples compared with normal lung samples.
Figure 6. The final prognostic model. (A) Forest diagram of the risk score and clinical variables. (B) The nomogram for predicting the survival probability of lung adenocarcinoma patients at 3, 5 and 10 years. (C–E) Calibration curve of the nomogram for predicting 3-year, 5-year, and 10-year overall survival probability.
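A forest diagram shows each hazard ratio with its 95% confidence interval. As a small worked example, the sketch below recovers the log hazard ratio and its approximate standard error from the values reported in the abstract. It assumes the intervals are Wald intervals of the form exp(β ± 1.96·SE), which the source does not state explicitly. The function name and the dictionary labels are hypothetical.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def log_hr_and_se(hr, lower, upper, z=Z_95):
    """Return (log HR, approximate SE), ASSUMING a Wald interval."""
    beta = math.log(hr)
    se = (math.log(upper) - math.log(lower)) / (2 * z)
    return beta, se


# HR and 95% CI values as reported in the abstract.
reported = {
    "14-gene signature": (3.960, 2.710, 5.786),
    "T3 vs T1": (1.925, 1.104, 3.355),
    "N1 vs N0": (2.212, 1.520, 3.220),
    "N2 vs N0": (2.260, 1.499, 3.409),
}

for name, (hr, lower, upper) in reported.items():
    beta, se = log_hr_and_se(hr, lower, upper)
    print(f"{name}: log HR = {beta:.3f}, SE = {se:.3f}, z = {beta / se:.2f}")
```

One quick consistency check on the Wald assumption is that the geometric mean of the interval limits should be close to the reported hazard ratio, which holds for the 14-gene signature.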