Minjie Fu, Jinsen Zhang, Weifeng Li, Shan He, Jingwen Zhang, Daniel Tennant, Wei Hua, Ying Mao.
Abstract
BACKGROUND: Molecular profiling of glioblastoma (GBM) based on transcriptomic analysis could inform precise treatment and prognosis. However, current subtyping (classical, mesenchymal, neural, proneural) is time-consuming and costly, which hinders its clinical application. A simple and efficient classification method is needed.
Keywords: CD276; Glioblastoma; Molecular subtype; OLIG2; Random forest
Year: 2021 PMID: 34565408 PMCID: PMC8474912 DOI: 10.1186/s12967-021-03083-y
Source DB: PubMed Journal: J Transl Med ISSN: 1479-5876 Impact factor: 5.531
Fig. 1 Flowchart of the study design
Baseline characteristics of the training cohort and test cohort
| Characteristic | Training cohort | Test cohort |
|---|---|---|
| Age, year | | |
| Median | 59.2 | 60.1 |
| Range | 10.9–89.3 | 25.2–86.6 |
| Age, no. (%) | | |
| < 60 year | 52.9 | 49 |
| ≥ 60 year | 46.2 | 49 |
| NA | 0 | 2 |
| Sex, no. (%) | | |
| Male | 59.9 | 61.2 |
| Female | 38.9 | 35.4 |
| NA | 1.2 | 3.4 |
| Primary or secondary, no. (%) | | |
| Primary | 94.4 | 93.2 |
| Secondary | 3.5 | 2.0 |
| Recurrent | 1.2 | 2.7 |
| NA | 0.8 | 2.0 |
| IDH status, no. (%) | | |
| Wild type | 68.7 | 70.7 |
| Mutant | 6.1 | 4.1 |
| NA | 25.1 | 25.2 |
| Subtype, no. (%) | | |
| Classical | 27.2 | 25.9 |
| Mesenchymal | 28.9 | 32.7 |
| Neural | 16.4 | 19 |
| Proneural | 27.5 | 32.7 |
There is no significant difference in baseline characteristics between the training cohort and the test cohort.
Fig. 2 Mutually exclusive expression of OLIG2 and CD276 in different GBM subtypes. The expression of OLIG2 and CD276 is negatively correlated in the TCGA GBM dataset (A). OLIG2 expression is high in the proneural subtype, while CD276 expression is high in the mesenchymal subtype (B–D). In GBM stratified by G-CIMP status, IDH mutation status and MGMT methylation status, OLIG2 is highly expressed and CD276 shows a mutually exclusive expression pattern (E–G). The full view of the correlation between OLIG2/CD276 expression and other phenotypes is shown (G)
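The negative correlation reported in panel A can be illustrated with a minimal sketch. The caption does not state which correlation coefficient was used, so Pearson's r is an assumption here, and the expression values are made-up illustrative numbers, not data from the study.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products and sums of squares around the means
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Hypothetical expression values for four samples (illustrative only)
olig2_expr = [1.0, 2.0, 3.0, 4.0]
cd276_expr = [8.0, 6.0, 4.0, 2.0]
r = pearson_r(olig2_expr, cd276_expr)
```

A value of r close to -1 indicates that samples with high OLIG2 expression tend to have low CD276 expression, which is the pattern the caption describes.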
Fig. 3 Gene clusters based on OLIG2 and CD276 generated by PCA. Twenty-six genes are obtained by the random forest algorithm according to the expression subtypes (A). The full view of the locations of the genes on chromatin is shown (B). PCA reveals that GBM subtypes can be identified clearly (C). Gene expression of the three modules generated by PCA (module-classic, module-mesenchymal, and module-proneural) is shown in a heatmap (D). The three gene modules are expressed differently in the four subtypes (E). The ROC curve of the random forest algorithm for subtype classification is shown; the AUC reaches 0.855 (F)
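As a rough illustration of how an AUC such as the one in panel F can be computed for a four-subtype classifier, the sketch below uses the rank-based definition of AUC and averages one-vs-rest AUCs across subtypes. The caption does not say how the multi-class AUC was aggregated, so macro one-vs-rest averaging is an assumption, and the function names and the predicted probabilities are hypothetical.

```python
def roc_auc(labels, scores):
    """Binary AUC: probability that a random positive scores above a random
    negative, counting ties as one half."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def macro_ovr_auc(true_subtypes, prob_rows, classes):
    """Average of one-vs-rest AUCs; prob_rows[i][k] is the predicted
    probability that sample i belongs to classes[k]."""
    aucs = []
    for k, cls in enumerate(classes):
        labels = [1 if t == cls else 0 for t in true_subtypes]
        scores = [row[k] for row in prob_rows]
        aucs.append(roc_auc(labels, scores))
    return sum(aucs) / len(aucs)


# Hypothetical classifier output for four samples (illustrative only)
classes = ["classical", "mesenchymal", "neural", "proneural"]
truth = ["classical", "mesenchymal", "neural", "proneural"]
probs = [
    [0.7, 0.1, 0.1, 0.1],
    [0.1, 0.6, 0.2, 0.1],
    [0.2, 0.2, 0.5, 0.1],
    [0.1, 0.1, 0.1, 0.7],
]
auc = macro_ovr_auc(truth, probs, classes)
```

In practice the probabilities would come from a trained random forest evaluated on held-out samples; the pairwise loop is quadratic and is only meant to make the definition explicit.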
Fig. 4 Gene clusters based on OLIG2 and CD276 generated by the WGCNA algorithm. The soft threshold, chosen from the scale-free topology model fit and mean connectivity, is set to 8 (A). The TOM heatmap shows good cohesion of the six modules generated by the WGCNA algorithm (B). The Sankey diagram reveals the correspondence between the two kinds of modules generated by PCA and the WGCNA algorithm (C). RAB33A and RAB34 were exclusively expressed in mesenchymal and proneural GBM (D). The protein–protein network shows interactions among the gene clusters
Fig. 5 Validation of the gene clusters in the GSE84010 and Gravendeel GBM datasets. Heatmaps show good classifying ability of the gene clusters in the two independent datasets (A, D). PCA reveals that the gene clusters distinguish the mesenchymal and proneural subtypes perfectly but separate the classic and neural subtypes less clearly (B, E). ROC curves of the random forest classification model reveal good efficacy (C, F) (AUC = 0.816 in the GSE84010 dataset, AUC = 0.820 in the Gravendeel dataset)
Fig. 6 Functional enrichment of the gene clusters. GO enrichment reveals that both the classic and mesenchymal modules are enriched in pathways related to DNA elements (A). GO biological process analysis shows that lymphocyte differentiation and T cell activation are associated with four genes in the cluster (B). KEGG pathway analysis reveals enrichment of signaling pathways regulating pluripotency of stem cells (C). In the DO database, genes in the cluster are closely connected with cancers (D)
Fig. 7 Survival prediction model based on the genes in the clusters. Multivariate Cox regression analysis reveals the association of the 26 genes with OS and PFS (A). The predictive value of the genes in the cluster for OS and PFS is shown in a two-dimensional plot (B). A signature of five genes is obtained by the LASSO regression algorithm (C, D). The coefficients of the five genes in the signature are shown (E). The full view of the risk score and survival status based on the five-gene signature is shown (F). The survival prediction model is tested in the training cohort and the test cohort (G)
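A gene signature of this kind is commonly turned into a per-patient risk score by taking the coefficient-weighted sum of the signature genes' expression values. The sketch below assumes that construction and a median cutoff for high/low risk groups; neither is stated in the caption. The gene names, coefficients, and expression values are hypothetical placeholders (two genes are used for brevity), not the five genes or coefficients from the study.

```python
from statistics import median


def risk_score(expression, coefficients):
    """Linear risk score: sum of coefficient * expression over signature genes."""
    return sum(coefficients[g] * expression[g] for g in coefficients)


def stratify(scores):
    """Label each patient 'high' if the score is above the cohort median,
    otherwise 'low' (median cutoff is an assumption)."""
    cutoff = median(scores.values())
    return {pid: ("high" if s > cutoff else "low") for pid, s in scores.items()}


# Hypothetical signature coefficients and patient expression (illustrative only)
coefficients = {"GENE_A": 0.5, "GENE_B": -0.25}
patients = {
    "p1": {"GENE_A": 2.0, "GENE_B": 4.0},
    "p2": {"GENE_A": 4.0, "GENE_B": 0.0},
    "p3": {"GENE_A": 0.0, "GENE_B": 4.0},
}
scores = {pid: risk_score(expr, coefficients) for pid, expr in patients.items()}
groups = stratify(scores)
```

The resulting groups would then be compared with a survival analysis (for example Kaplan–Meier curves) in both the training and test cohorts.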