Kelvin K Wong, Robert Rostomily, Stephen T C Wong.
Abstract
This study aims to discover genes with prognostic potential for the survival of glioblastoma (GBM) patients who have undergone standard-of-care treatment, including surgery and chemotherapy, using tumor gene expression measured at initial diagnosis, before treatment. The Cancer Genome Atlas (TCGA) GBM gene expression data are used as inputs to build a deep multilayer perceptron network that predicts patient survival risk, with the partial likelihood as the loss function. Genes that are important to the model are identified by the input permutation method. Univariate and multivariate Cox survival models are used to assess the predictive value of the deep-learned features in addition to clinical, mutation, and methylation factors. The prediction performance of the deep learning method is compared with that of other machine learning methods, including the ridge, adaptive Lasso, and elastic net Cox regression models. Twenty-seven deep-learned features are extracted to predict overall survival. The top 10 ranked genes with the highest impact on these features are related to glioblastoma stem cells, the stem cell niche environment, and treatment resistance mechanisms; they include POSTN, TNR, BCAN, GAD1, TMSB15B, SCG3, PLA2G2A, NNMT, CHI3L1, and ELAVL4.
Keywords: deep learning; discovery; glioblastoma; glioblastoma stem cells; survival prediction
Year: 2019 PMID: 30626092 PMCID: PMC6356839 DOI: 10.3390/cancers11010053
Source DB: PubMed Journal: Cancers (Basel) ISSN: 2072-6694 Impact factor: 6.639
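The abstract names the partial likelihood as the loss function of the survival network. As a minimal sketch of what such a loss can look like, the function below computes a negative Cox partial log-likelihood in plain Python. The function name, the Breslow-style treatment of tied times, and the normalization by the number of observed events are assumptions made for illustration; the source does not state these details.

```python
import math

def neg_partial_log_likelihood(scores, times, events):
    """Negative Cox partial log-likelihood, averaged over observed events.

    scores: predicted log-risk per patient (e.g. the network output)
    times:  survival or censoring time per patient
    events: 1 if death was observed, 0 if the patient was censored
    Tied times are handled Breslow-style (an assumption of this sketch).
    """
    # Visit patients from longest to shortest time, so the risk set
    # (everyone still alive at time t) grows as a running sum.
    order = sorted(range(len(times)), key=lambda i: -times[i])
    shift = max(scores)          # subtracted before exp() for numerical stability
    risk_sum = 0.0
    loss = 0.0
    n_events = 0
    pos = 0
    while pos < len(order):
        # Collect all patients sharing the current time.
        end = pos
        while end < len(order) and times[order[end]] == times[order[pos]]:
            end += 1
        group = order[pos:end]
        for i in group:
            risk_sum += math.exp(scores[i] - shift)
        log_risk = math.log(risk_sum) + shift
        for i in group:
            if events[i]:
                loss -= scores[i] - log_risk
                n_events += 1
        pos = end
    return loss / max(n_events, 1)

# Toy data (made up): patient 0 dies at t=1, patient 1 dies at t=2.
concordant = neg_partial_log_likelihood([1.0, 0.0], [1, 2], [1, 1])
discordant = neg_partial_log_likelihood([0.0, 1.0], [1, 2], [1, 1])
print(concordant < discordant)
```

The loss is lower when the patient who dies earlier receives the higher risk score, which is the ordering a survival network trained this way is pushed toward.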
Frequency analysis of important genes in the 27 deep-learned network nodes at the top hidden layer. Only the 100 most frequently occurring genes are listed for brevity.
| Gene | Frequency | Gene | Frequency | Gene | Frequency | Gene | Frequency |
|---|---|---|---|---|---|---|---|
|  | 17 |  | 10 |  | 9 |  | 8 |
|  | 16 |  | 10 |  | 9 |  | 8 |
|  | 15 |  | 10 |  | 9 |  | 8 |
|  | 15 |  | 10 |  | 9 |  | 8 |
|  | 15 |  | 10 |  | 9 |  | 8 |
|  | 14 |  | 10 |  | 9 |  | 8 |
|  | 13 |  | 10 |  | 9 |  | 8 |
|  | 13 |  | 10 |  | 9 |  | 8 |
|  | 13 |  | 10 |  | 9 |  | 8 |
|  | 13 |  | 10 |  | 9 |  | 8 |
|  | 13 |  | 10 |  | 9 |  | 8 |
|  | 12 |  | 10 |  | 8 |  | 8 |
|  | 12 |  | 10 |  | 8 |  | 8 |
|  | 12 |  | 10 |  | 8 |  | 8 |
|  | 12 |  | 9 |  | 8 |  | 8 |
|  | 12 |  | 9 |  | 8 |  | 8 |
|  | 12 |  | 9 |  | 8 |  | 8 |
|  | 12 |  | 9 |  | 8 |  | 8 |
|  | 12 |  | 9 |  | 8 |  | 8 |
|  | 11 |  | 9 |  | 8 |  | 8 |
|  | 11 |  | 9 |  | 8 |  | 8 |
|  | 11 |  | 9 |  | 8 |  | 8 |
|  | 10 |  | 9 |  | 8 |  | 8 |
|  | 10 |  | 9 |  | 8 |  | 8 |
|  | 10 |  | 9 |  | 8 |  | 8 |
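The frequency table above counts how often a gene is flagged as important across the 27 top-layer nodes, with importance obtained by the input permutation method. The sketch below illustrates one common form of that method: shuffle one input feature across samples and measure how much a node's output changes. The toy node, its weights, the random data, and the mean-absolute-change score are all hypothetical; the source does not specify how importance was scored or thresholded.

```python
import math
import random

def permutation_importance(node_fn, X, n_repeats=5, seed=0):
    """Mean absolute change in a node's output when one input is shuffled.

    node_fn: maps one sample (list of feature values) to the node output
    X:       list of samples, each a list of feature values
    Returns one importance score per input feature.
    """
    rng = random.Random(seed)
    baseline = [node_fn(row) for row in X]
    importance = []
    for j in range(len(X[0])):
        total = 0.0
        for _ in range(n_repeats):
            # Shuffle feature j across samples, leaving other features intact.
            column = [row[j] for row in X]
            rng.shuffle(column)
            for row, value, base in zip(X, column, baseline):
                permuted = list(row)
                permuted[j] = value
                total += abs(node_fn(permuted) - base)
        importance.append(total / (n_repeats * len(X)))
    return importance

# Hypothetical node that only uses "genes" 0 and 2 (weights are made up).
weights = [2.0, 0.0, -1.0, 0.0]

def toy_node(row):
    return math.tanh(sum(w * x for w, x in zip(weights, row)))

data_rng = random.Random(1)
X = [[data_rng.gauss(0, 1) for _ in range(4)] for _ in range(50)]
importance = permutation_importance(toy_node, X)
print([round(v, 3) for v in importance])
```

Repeating this for every node and counting, per gene, the number of nodes in which it ranks as important would give a frequency column like the one tabulated above.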
The prognostic value of the network node outputs at the top hidden layer is evaluated using a Cox proportional hazards model. Hazard ratios (HR) and 95% confidence intervals (95% CI) are listed with the corresponding p-values. Thirteen network nodes are statistically significant for overall survival prognosis.
| Cox Model with Deep Learning Features | HR (95% CI) | p-Value |
|---|---|---|
| Network Node 0 | 1.26 (0.98–1.62) | 0.0718 |
| Network Node 1 | 1.13 (0.94–1.36) | 0.1996 |
| Network Node 2 | 1.03 (0.81–1.32) | 0.7931 |
| Network Node 3 | 1.15 (0.94–1.41) | 0.1681 |
| Network Node 4 | 0.73 (0.59–0.89) | 0.0022 |
| Network Node 5 | 0.95 (0.77–1.16) | 0.5935 |
| Network Node 6 | 1.13 (0.88–1.44) | 0.341 |
| Network Node 7 | 1.19 (0.97–1.46) | 0.0929 |
| Network Node 8 | 1.71 (1.40–2.08) | <0.0001 |
| Network Node 9 | 1.02 (0.81–1.29) | 0.8505 |
| Network Node 10 | ||
| ≥1.6 | 0.45 (0.25–0.81) | 0.0076 |
| <1.6 | 1 | |
| Network Node 11 | 0.80 (0.66–0.96) | 0.0197 |
| Network Node 12 | 1.36 (1.11–1.68) | 0.0034 |
| Network Node 13 | 0.93 (0.77–1.14) | 0.4994 |
| Network Node 14 | 1.12 (0.88–1.42) | 0.3495 |
| Network Node 15 | 0.86 (0.67–1.10) | 0.2324 |
| Network Node 16 | 0.57 (0.45–0.72) | <0.0001 |
| Network Node 17 | 1.35 (1.09–1.67) | 0.0056 |
| Network Node 18 | 0.80 (0.64–1.00) | 0.0478 |
| Network Node 19 | 0.78 (0.64–0.95) | 0.0132 |
| Network Node 20 | 0.91 (0.73–1.15) | 0.4437 |
| Network Node 21 | 1.34 (1.10–1.62) | 0.0035 |
| Network Node 22 | 1.16 (0.95–1.42) | 0.1575 |
| Network Node 23 | 0.77 (0.62–0.97) | 0.0281 |
| Network Node 24 | 1.87 (1.54–2.27) | <0.0001 |
| Network Node 25 | 1.41 (1.10–1.80) | 0.0063 |
| Network Node 26 | 0.80 (0.65–1.00) | 0.0476 |
| Overall Model | | <0.0001 |
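The hazard ratios, confidence intervals, and p-values in the table above are linked through the Cox coefficient: HR = exp(β), and the 95% CI is exp(β ± 1.96·SE). The sketch below back-computes β, SE, and a two-sided Wald p-value from a reported HR and CI. It assumes Wald-type intervals and a Wald test, which the source does not confirm, and because the published values are rounded to two decimals the recomputed p-values agree with the table only approximately.

```python
import math

def wald_from_hr(hr, ci_low, ci_high, z_crit=1.959964):
    """Recover the Cox coefficient, its standard error, and a Wald p-value.

    Assumes the interval is exp(beta +/- z_crit * se), i.e. a Wald-type 95% CI.
    """
    beta = math.log(hr)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = beta / se
    # Two-sided p-value from the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2))
    return beta, se, z, p

# Rows taken from the table above: (label, HR, CI low, CI high).
rows = [
    ("Network Node 2", 1.03, 0.81, 1.32),
    ("Network Node 4", 0.73, 0.59, 0.89),
    ("Network Node 8", 1.71, 1.40, 2.08),
]
for label, hr, lo, hi in rows:
    beta, se, z, p = wald_from_hr(hr, lo, hi)
    print(f"{label}: beta={beta:.3f} se={se:.3f} z={z:.2f} p={p:.4g}")
```

An HR below 1 (negative β) means a higher node output is associated with lower hazard, and a CI that excludes 1 corresponds to p < 0.05 under these assumptions.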
Combined multivariate Cox proportional hazards model, including clinical covariates and deep learning network node covariates, to predict overall survival (p-value < 0.0001). Hazard ratios (HR) and 95% confidence intervals (95% CI) are listed with the corresponding p-values.
| Cox Model with Clinical Covariates and Deep Learning Features | HR (95% CI) | p-Value |
|---|---|---|
| Age | ||
| ≥54 years old | 1.50 (1.10–2.03) | 0.0098 |
| <54 years old | 1 | |
| Gender | ||
| Male | 1.25 (0.92–1.68) | 0.1542 |
| Female | 1 | |
| KPS | ||
| ≥60 | 0.35 (0.17–0.72) | 0.0042 |
| <60 | 1 | |
| Therapy | ||
| Chemoradiation | 0.27 (0.12–0.62) | 0.0018 |
| Chemotherapy | 1.06 (0.35–3.17) | 0.9193 |
| Radiation | 0.51 (0.22–1.17) | 0.1122 |
| Subtype | ||
| Proneural | 1.70 (1.01–2.87) | 0.0464 |
| Classical | 1.26 (0.79–2.01) | 0.3311 |
| Mesenchymal | 1.41 (0.89–2.25) | 0.1462 |
| MGMT Methylated | 1.18 (0.85–1.62) | 0.3181 |
| G-CIMP Methylated | 1.03 (0.35–3.06) | 0.9553 |
| R132C/R132G/R132H Mutation | 1.08 (0.35–3.31) | 0.8986 |
| Network Node 0 | 1.09 (0.80–1.48) | 0.5888 |
| Network Node 1 | 1.10 (0.87–1.39) | 0.4348 |
| Network Node 2 | 1.15 (0.86–1.55) | 0.3407 |
| Network Node 3 | 1.11 (0.86–1.44) | 0.4264 |
| Network Node 4 | 0.77 (0.60–0.99) | 0.0387 |
| Network Node 5 | 0.85 (0.64–1.12) | 0.2558 |
| Network Node 6 | 1.12 (0.80–1.55) | 0.514 |
| Network Node 7 | 1.12 (0.86–1.44) | 0.4041 |
| Network Node 8 | 1.73 (1.36–2.21) | <0.0001 |
| Network Node 9 | 1.07 (0.81–1.42) | 0.6384 |
| Network Node 10 | ||
| ≥1.6 | 0.44 (0.20–0.95) | 0.0363 |
| Network Node 11 | 0.86 (0.67–1.10) | 0.2336 |
| Network Node 12 | 1.27 (0.99–1.65) | 0.0645 |
| Network Node 13 | 1.04 (0.82–1.32) | 0.7484 |
| Network Node 14 | 1.10 (0.82–1.48) | 0.5313 |
| Network Node 15 | 0.79 (0.57–1.11) | 0.1711 |
| Network Node 16 | 0.64 (0.48–0.86) | 0.0029 |
| Network Node 17 | 1.55 (1.16–2.07) | 0.0027 |
| Network Node 18 | 0.79 (0.60–1.05) | 0.1049 |
| Network Node 19 | 0.86 (0.68–1.08) | 0.2001 |
| Network Node 20 | 0.74 (0.54–1.00) | 0.0528 |
| Network Node 21 | ||
| ≥1.6 | 0.96 (0.55–1.69) | 0.8937 |
| Network Node 22 | 1.22 (0.94–1.58) | 0.1327 |
| Network Node 23 | 0.75 (0.57–1.00) | 0.048 |
| Network Node 24 | 1.66 (1.30–2.12) | <0.0001 |
| Network Node 25 | 1.49 (1.10–2.01) | 0.0101 |
| Network Node 26 | 0.83 (0.64–1.07) | 0.1555 |
| Overall Model | | <0.0001 |
Figure 1. Probability distribution of important genes occurring at different network nodes. It is very rare for an important gene to occur in many nodes that are prognostic of survival. Using a threshold of p < 0.01, only genes that occur in at least 10 of the 27 network nodes meet the criterion and are included in the 39-gene signature.
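One way to read the p < 0.01 criterion in Figure 1 is as a cutoff on the empirical tail of the occurrence distribution: find the smallest node count k such that fewer than 1% of genes occur in k or more nodes. The sketch below implements that reading on made-up counts; both the interpretation and the toy numbers are assumptions, not the authors' stated procedure.

```python
from collections import Counter

def tail_threshold(counts, alpha=0.01):
    """Smallest k with P(count >= k) < alpha under the empirical distribution.

    counts: number of nodes in which each gene is important (one per gene)
    Returns None if even the largest observed count fails the criterion.
    """
    n = len(counts)
    hist = Counter(counts)
    tail = 0
    threshold = None
    # Walk down from the largest count, accumulating the upper tail.
    for k in range(max(counts), -1, -1):
        tail += hist.get(k, 0)
        if tail / n < alpha:
            threshold = k
        else:
            break
    return threshold

# Made-up occurrence counts for 1000 hypothetical genes.
counts = [0] * 900 + [1] * 80 + [9] * 11 + [10] * 5 + [12] * 3 + [17] * 1
k = tail_threshold(counts)
signature = [c for c in counts if c >= k]
print(k, len(signature))
```

On this toy input, 9 of 1000 genes occur in 10 or more nodes (0.9%), while including the genes at 9 nodes would raise the tail to 2%, so the cutoff lands at 10.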
Gene list of the 39-gene signature selected based on p < 0.01 occurrence at the deep-learned network nodes at the top hidden layer.
Figure 2. Kaplan–Meier survival fraction versus survival time (months). The low-risk (green) and high-risk (red) groups are well separated using the top 39 genes across nine different datasets, including data from seven glioblastoma and three low-grade glioma studies.
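Curves like those in Figure 2 come from the Kaplan–Meier estimator applied separately to each risk group. The following sketch computes the estimator in plain Python and splits patients at the median risk score. The median split and all toy values are assumptions for illustration; the source does not say how the low-risk and high-risk groups were defined.

```python
import statistics

def kaplan_meier(times, events):
    """Kaplan-Meier curve as [(time, survival_fraction)] at each death time."""
    data = sorted(zip(times, events))
    at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Process everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            surv *= 1.0 - deaths / at_risk
            curve.append((t, surv))
        at_risk -= removed   # deaths and censored patients both leave the risk set
    return curve

def split_by_median(scores):
    """Indices of the low-risk (<= median) and high-risk (> median) patients."""
    cut = statistics.median(scores)
    low = [i for i, s in enumerate(scores) if s <= cut]
    high = [i for i, s in enumerate(scores) if s > cut]
    return low, high

# Made-up cohort: risk score, survival time in months, event indicator.
scores = [0.1, 0.9, 0.3, 0.8, 0.2, 0.7]
times = [30, 6, 24, 9, 36, 12]
events = [1, 1, 0, 1, 1, 1]
low, high = split_by_median(scores)
for name, idx in (("low-risk", low), ("high-risk", high)):
    curve = kaplan_meier([times[i] for i in idx], [events[i] for i in idx])
    print(name, curve)
```

In practice the separation between the two curves is usually summarized with a log-rank test or a hazard ratio, which this sketch does not compute.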
List of glioblastoma studies used in survival prognosis validation using the gene set discovered by deep learning.
| Study Datasets | Samples | Source |
|---|---|---|
| Lee Nelson Glioblastoma GSE13041 GPL96 | 218 | Lee [ |
| Freije Nelson Glioblastoma GSE4412 GPL96 | 85 | Freije [ |
| Gravendeel French Glioblastoma GSE16011 | 284 | Gravendeel [ |
| Nutt Louis Glioblastoma BROAD | 50 | Nutt [ |
| Murat Hegi Glioblastoma GSE7696 | 84 | Murat [ |
| Joo Kim Jin Kim Seol Nam Glioblastoma GSE42669 | 58 | Joo [ |
| Phillips Aldape Astrocytoma GSE4271 GPL96 | 100 | Phillips [ |
| Brain Low Grade Glioma TCGA 2016 | 110 | TCGA |
| GBM-TCGA June 2016 | 148 | TCGA |
| LGG-TCGA-Low Grade Gliomas June 2016 | 512 | TCGA |