Hongwei Liu, Qi Yang, Yi Xiong, Zujian Xiong, Xuejun Li.
Abstract
Glioblastoma (GBM) is a common malignant tumor of the central nervous system with a poor prognosis. To identify prognostic signatures of GBM, we screened differentially expressed genes (DEGs) from a single-cell RNA sequencing (scRNA-seq) dataset. These genes characterize the intra-tumor heterogeneity of glioblastoma. We then performed univariate analysis, the log-rank test, and multivariate Cox regression analyses to identify a subset of DEGs associated with overall survival (OS). These prognosis-associated signatures (PAS) were used to construct a model for predicting OS in GBM patients. In both the training and validation sets, time-dependent receiver operating characteristic (ROC) curves indicated that the model had excellent predictive ability. Additionally, we analyzed the PAS at the single-cell level and found that the PAS score was associated with somatic mutations and clinical factors. Three factors (the PAS score, radiotherapy status, and age) were used to establish a nomogram for predicting the 6-month and 1-year survival probabilities. In conclusion, we constructed an optimal model derived from scRNA-seq data to better predict the survival probability of GBM patients. These genes might also serve as potential prognostic biomarkers, helping surgeons develop individualized therapeutic schedules and improve the prognosis of GBM patients. © The author(s).
Keywords: differentially expressed genes; glioblastoma; prognostic model; single cell; survival analysis
Year: 2020 PMID: 32328180 PMCID: PMC7171486 DOI: 10.7150/jca.44034
Source DB: PubMed Journal: J Cancer ISSN: 1837-9664 Impact factor: 4.207
Figure 1. Workflow of the analytical procedure for constructing and validating prognostic signatures in GBM patients.
Figure 2. Characteristics of single-cell RNA-seq data and DEGs. (A) JackStraw plot showing the p-value distribution for each principal component (PC). (B) Dimension reduction of the single-cell RNA-seq data with the UMAP algorithm, which clusters cells into four groups. (C) Heat map of expression profiles for the top 10 marker genes in each cluster. (D) PCA showing that tumorigenic and normal cells can be clearly separated. (E) Volcano plot of DEGs with a log2 fold-change >2 and an adjusted p-value <0.01. (F) Heat map of expression profiles for the 100 most significant genes in tumorigenic and normal cells.
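
The thresholds quoted for the volcano plot in Figure 2E can be illustrated with a minimal sketch. The gene names and statistics below are invented for illustration, `GeneStat` and `select_degs` are hypothetical helpers, and applying the fold-change cut-off to the absolute value (so that both up- and down-regulated genes are kept) is an assumption rather than something the caption states.

```python
from typing import List, NamedTuple


class GeneStat(NamedTuple):
    """Per-gene differential expression summary (hypothetical structure)."""
    gene: str
    log2_fc: float  # log2 fold-change, tumorigenic vs. normal cells
    adj_p: float    # multiple-testing-adjusted p-value


def select_degs(stats: List[GeneStat],
                min_abs_log2_fc: float = 2.0,
                max_adj_p: float = 0.01) -> List[str]:
    """Keep genes passing both thresholds quoted in the caption.

    Assumption: the fold-change threshold is applied to |log2 FC|.
    """
    return [
        s.gene
        for s in stats
        if abs(s.log2_fc) > min_abs_log2_fc and s.adj_p < max_adj_p
    ]


# Toy input, not data from the study.
toy_stats = [
    GeneStat("GENE_A", 3.1, 0.0004),   # passes both thresholds
    GeneStat("GENE_B", -2.6, 0.002),   # passes (strongly down-regulated)
    GeneStat("GENE_C", 1.2, 0.0001),   # fold-change too small
    GeneStat("GENE_D", 2.8, 0.2),      # not significant after adjustment
]

degs = select_degs(toy_stats)
print(degs)  # ['GENE_A', 'GENE_B']
```
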
Figure 3. GSEA results and KEGG analysis of differentially expressed genes. (A) The five most significant biological processes for tumor cells in the GSEA of GO. (B) The five most significant biological processes for normal cells in the GSEA of GO. (C) Diagram illustrating the top 10 pathways in the KEGG enrichment analysis of DEGs.
Figure 4. Establishing a survival model in the GBM cohort. (A) Summary of the selected DEGs with prognostic capacity; only genes within the areas marked by the indicated colors were selected. (B) The optimal number of genes corresponding to the minimum lambda was 18 in the TCGA-GBM cohort. (C) ROC curves for predicting OS by the PAS score in the TCGA-GBM cohort. (D) KM curves for the low- and high-risk groups in the TCGA-GBM cohort. (E) ROC curves for predicting OS by the PAS score in the GSE16011 cohort. (F) KM curves for the low- and high-risk groups in the GSE16011 cohort.
Statistical analysis of 18 genes associated with survival in the TCGA-GBM cohort.
| Variables | Log-rank test P value | Univariate Cox regression HR (95% CI) | Univariate Cox regression P value | LASSO coefficient |
|---|---|---|---|---|
| AGAP2-AS1 | 0.018 | 1.24(1.10-1.40) | <0.001 | 0.057478 |
| CLEC18C | 0.012 | 3.73(1.58-8.80) | 0.003 | 0.2067 |
| CNPY4 | 0.046 | 1.87(1.32-2.65) | <0.001 | 0.054552 |
| COL22A1 | 0.015 | 1.38(1.15-1.65) | <0.001 | 0.090813 |
| CRNDE | 0.010 | 1.31(1.04-1.66) | 0.025 | 0.002454 |
| HOXB2 | 0.001 | 1.19(1.05-1.34) | 0.007 | 0.014237 |
| HOXD11 | 0.049 | 1.32(1.06-1.64) | 0.012 | 0.019821 |
| LOXL1 | 0.011 | 1.43(1.21-1.69) | <0.001 | 0.034217 |
| MBLAC1 | 0.006 | 1.79(1.18-2.73) | 0.007 | 0.10079 |
| OSMR-AS1 | 0.013 | 2.92(1.69-5.05) | <0.001 | 0.44677 |
| PCDHB3 | 0.029 | 1.24(1.01-1.52) | 0.039 | 0.01033 |
| PTPRN | <0.001 | 1.42(1.21-1.67) | <0.001 | 0.19405 |
| RGS14 | 0.006 | 1.57(1.18-2.07) | 0.002 | 0.121256 |
| TCAF2 | <0.001 | 1.85(1.31-2.62) | <0.001 | 0.09673 |
| TSHZ2 | <0.001 | 1.93(1.34-2.78) | <0.001 | 0.023308 |
| TSPAN4 | 0.015 | 2.11(1.40-3.18) | <0.001 | 0.0621 |
| BEST3 | 0.037 | 0.73(0.59-0.91) | 0.006 | -0.0173 |
| FERMT1 | 0.004 | 0.81(0.70-0.94) | 0.006 | -0.02211 |
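
As a minimal sketch of how the LASSO coefficients in the table could be turned into a per-patient PAS score, the block below assumes the score is the usual linear predictor (the sum of coefficient times expression over the 18 genes) and that patients are split into low- and high-risk groups at the cohort median. Neither the formula nor the cut-off is stated in this record, so both are assumptions, and the expression values and patient IDs are invented. `pas_score` and `split_by_median` are hypothetical helper names.

```python
from statistics import median
from typing import Dict

# LASSO coefficients copied from the table above.
LASSO_COEFFICIENTS: Dict[str, float] = {
    "AGAP2-AS1": 0.057478, "CLEC18C": 0.2067, "CNPY4": 0.054552,
    "COL22A1": 0.090813, "CRNDE": 0.002454, "HOXB2": 0.014237,
    "HOXD11": 0.019821, "LOXL1": 0.034217, "MBLAC1": 0.10079,
    "OSMR-AS1": 0.44677, "PCDHB3": 0.01033, "PTPRN": 0.19405,
    "RGS14": 0.121256, "TCAF2": 0.09673, "TSHZ2": 0.023308,
    "TSPAN4": 0.0621, "BEST3": -0.0173, "FERMT1": -0.02211,
}


def pas_score(expression: Dict[str, float]) -> float:
    """Assumed score: sum over genes of coefficient * expression."""
    missing = set(LASSO_COEFFICIENTS) - set(expression)
    if missing:
        raise KeyError(f"missing expression values for: {sorted(missing)}")
    return sum(coef * expression[gene]
               for gene, coef in LASSO_COEFFICIENTS.items())


def split_by_median(scores: Dict[str, float]) -> Dict[str, str]:
    """Assumed grouping: 'high' if strictly above the cohort median."""
    cutoff = median(scores.values())
    return {pid: ("high" if s > cutoff else "low")
            for pid, s in scores.items()}


# Toy cohort: every gene has the same expression level per patient.
toy_cohort = {
    f"p{level}": {gene: float(level) for gene in LASSO_COEFFICIENTS}
    for level in (0, 1, 2)
}
scores = {pid: pas_score(expr) for pid, expr in toy_cohort.items()}
groups = split_by_median(scores)
print(groups)  # {'p0': 'low', 'p1': 'low', 'p2': 'high'}
```

Under this assumed form, the 16 genes with positive coefficients (hazard ratio above 1) raise the score, while BEST3 and FERMT1 (hazard ratio below 1) lower it, which matches the direction of the univariate results in the table.
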
Figure 5. Analysis of the 18 prognostic genes in the single-cell and bulk datasets. (A) Expression levels of each gene across the four clusters in the single-cell dataset. (B) The somatic mutation landscape of TCGA-GBM patients with a high PAS score. (C) The somatic mutation landscape of TCGA-GBM patients with a low PAS score.
Figure 6. Development of a nomogram for predicting the 6-month and 1-year survival probabilities in GBM patients. (A) Multivariate Cox regression analysis of the PAS score with several clinical factors. (B) Integration of three factors to construct a nomogram for predicting the 6-month and 1-year OS rates. (C) Calibration curves validating the predictive efficacy of the model for the 6-month and 1-year OS rates.