Yeni L Bernal Rubio, Agustin González-Reymúndez, Kuan-Han H Wu, Corinne E Griguer, Juan P Steibel, Gustavo de Los Campos, Andrea Doseff, Kathleen Gallo, Ana I Vazquez.
Abstract
Glioblastoma multiforme (GBM) is recognized as the most lethal type of malignant brain tumor. Despite the efforts of the medical and research community, patient survival remains extremely low. Multi-omic profiles (including DNA sequence, methylation and gene expression) provide rich information about the tumor and are likely to reveal processes that are predictive of patient survival. However, integrating multi-omic profiles, which are high-dimensional and heterogeneous, poses great challenges. The goal of this work was to develop models for predicting the survival of GBM patients that integrate clinical information and multi-omic profiles using multi-layered Bayesian regressions. We applied the methodology to data from GBM patients from The Cancer Genome Atlas (TCGA, n = 501) to evaluate whether integrating multi-omic profiles (SNP genotypes, methylation, copy number variants and gene expression) with clinical information (demographics as well as treatments) improves the ability to predict patient survival. The proposed Bayesian models were used to estimate the proportion of variance explained by clinical covariates and omics and to evaluate prediction accuracy in cross-validation (using the area under the Receiver Operating Characteristic curve, AUC). Among clinical and demographic covariates, age (AUC = 0.664) and the use of temozolomide (AUC = 0.606) were the most predictive of survival. Among omics, methylation (AUC = 0.623) and gene expression (AUC = 0.593) were more predictive than either SNP (AUC = 0.539) or CNV (AUC = 0.547). While there was a clear association between age and methylation, integrating age, the use of temozolomide, and either gene expression or methylation led to a substantial increase in AUC in cross-validation (AUC = 0.718). Finally, genes whose methylation was higher in aging brains were enriched for genes that are also differentially methylated in cancer.
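To make the evaluation metric concrete, the following is a minimal sketch of how an AUC and a repeated 5-fold partition could be computed. It assumes survival has already been reduced to a binary label (the abstract does not say how this was done), and the function names `auc` and `repeated_kfold` are illustrative, not taken from the authors' software.

```python
import random


def auc(scores, labels):
    """Probability that a random positive case scores above a random negative case.

    Ties count as half a win (the Mann-Whitney formulation of the AUC).
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def repeated_kfold(n, k=5, repeats=20, seed=0):
    """Yield (repeat, fold, train_idx, test_idx) for `repeats` random k-fold partitions."""
    rng = random.Random(seed)
    for r in range(repeats):
        idx = list(range(n))
        rng.shuffle(idx)
        # Every k-th shuffled index goes to the same fold, so folds are disjoint.
        folds = [idx[f::k] for f in range(k)]
        for f, test in enumerate(folds):
            held_out = set(test)
            train = [i for i in idx if i not in held_out]
            yield r, f, train, test
```

In use, a model would be fitted on `train_idx`, scored on `test_idx`, and the resulting AUCs averaged within each of the 20 repeats.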
Keywords: Gene expression; glioblastoma multiforme; methylation; prognosis; single nucleotide polymorphism
Year: 2018 PMID: 30228192 PMCID: PMC6222579 DOI: 10.1534/g3.118.200391
Source DB: PubMed Journal: G3 (Bethesda) ISSN: 2160-1836 Impact factor: 3.154
Figure 1. Regression of principal components derived from similarity matrices on demographic covariates. The first ten principal components (PC, x-axis) of the SNP, methylation, gene expression and CNV similarity matrices were regressed on the covariates A) age, B) race, and C) sex. The proportion of variance explained is reported as R-squared (y-axis).
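As an illustration of the R-squared reported in Figure 1, the sketch below computes the proportion of variance in one principal component explained by a single covariate. It assumes simple least squares for a numeric covariate such as age and a one-way group-means model for a categorical covariate such as sex or race; the authors' exact regression setup is not given here, and both function names are hypothetical.

```python
from statistics import mean


def r2_numeric(pc, x):
    """R-squared of a least-squares regression of `pc` on one numeric covariate `x`.

    With a single predictor this equals the squared Pearson correlation.
    """
    mx, my = mean(x), mean(pc)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, pc))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in pc)
    return (sxy * sxy) / (sxx * syy)


def r2_categorical(pc, groups):
    """R-squared of `pc` on a categorical covariate: between-group SS / total SS."""
    my = mean(pc)
    syy = sum((b - my) ** 2 for b in pc)
    by_group = {}
    for g, b in zip(groups, pc):
        by_group.setdefault(g, []).append(b)
    between = sum(len(v) * (mean(v) - my) ** 2 for v in by_group.values())
    return between / syy
```

Applying either function to each of the first ten principal components of an omic layer would give one bar or point per component, as in the figure.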
Figure 2. Fourth principal component derived from methylation (x-axis) vs. age at diagnosis (y-axis).
Figure 3. Overlap between previously reported hypermethylated genes (involved in GBM and brain aging) and genes differentially methylated with age in our dataset. The t-values from tests of differential methylation with age were split into quartiles. The green and red bars represent the number of genes in each quartile that overlap with previously reported hypermethylated genes involved in GBM (Lai, green bars) and in brain aging (Horvath, red bars).
Survival analysis based on clinical and demographic covariates. Hazard ratios, 95% confidence intervals and p-values for clinical and treatment covariates for GBM patients (n = 502)
| Covariate | Hazard ratio (95% CI) | Pr(>\|z\|) |
|---|---|---|
| | 1.029 (1.020, 1.038) | < 0.0001 |
| | 1.064 (0.859, 1.319) | 0.5684 |
| | 1.128 (0.709, 1.795) | 0.6119 |
| | 1.194 (0.877, 1.625) | 0.2591 |
| | 0.641 (0.496, 0.828) | 0.0007 |
| | 0.317 (0.078, 1.281) | 0.1069 |
| | 0.918 (0.612, 1.377) | 0.6793 |
| | 2.731 (0.644, 11.576) | 0.1727 |
| | 0.839 (0.483, 1.459) | 0.5360 |
| | 0.463 (0.181, 1.187) | 0.1089 |
Covariate: clinical covariates included in the model. Hazard ratio (95% CI): hazard ratio and corresponding 95% confidence interval. Pr(>|z|): p-value for significance of the covariate.
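The three quantities in this table are linked through the log-hazard coefficient and its standard error. The sketch below shows that relationship under the usual Wald (normal) approximation, which is an assumption here since the fitting details are not given; both helper names are hypothetical.

```python
import math


def hazard_ratio_summary(beta, se, z_crit=1.96):
    """Return (HR, lower, upper, two-sided p) for a log-hazard coefficient."""
    hr = math.exp(beta)
    lower = math.exp(beta - z_crit * se)
    upper = math.exp(beta + z_crit * se)
    z = beta / se
    # Two-sided p-value under a standard normal: 2 * (1 - Phi(|z|)).
    p = math.erfc(abs(z) / math.sqrt(2))
    return hr, lower, upper, p


def coef_from_interval(lower, upper, z_crit=1.96):
    """Back out (beta, se) from a reported 95% interval, assuming it is a Wald interval."""
    beta = (math.log(lower) + math.log(upper)) / 2
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    return beta, se


# Round trip on the first reported interval (1.020, 1.038).
beta, se = coef_from_interval(1.020, 1.038)
hr, lo, hi, p = hazard_ratio_summary(beta, se)
```

Under these assumptions the round trip recovers a hazard ratio of about 1.029 with a p-value well below 0.0001, consistent with the first row of the table.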
Inter-individual variation in survival, residual variance and Deviance Information Criterion (DIC) for models including age, temozolomide and whole omic profiles. Standard errors (se) in parentheses
| Model | Variance explained by SNP (se) | Variance explained by methylation (se) | Variance explained by gene expression (se) | Variance explained by CNV (se) | Residual variance (se) | DIC |
|---|---|---|---|---|---|---|
| | – | – | – | – | 1.14 (0.128) | 413.1 |
| | – | – | – | – | 0.998 (0.113) | 396.6 |
| | – | – | – | – | 0.993 (0.111) | 398.9 |
| | 0.657 (0.196) | – | – | – | 0.538 (0.188) | 386.8 |
| | – | 0.876 (0.248) | – | – | 0.468 (0.158) | 372.9 |
| | – | – | 0.726 (0.281) | – | 0.724 (0.155) | 401.0 |
| | – | – | – | 0.264 (0.074) | 1.114 (0.094) | 873.6 |
| | – | – | – | – | 0.904 (0.101) | 389.7 |
| | 0.414 (0.157) | – | – | – | 0.521 (0.159) | 370.8 |
| | – | 0.527 (0.191) | – | – | 0.498 (0.142) | 366.3 |
| | – | – | 0.371 (0.168) | – | 0.684 (0.118) | 383.6 |
| | – | – | – | 0.149 (0.05) | 0.921 (0.077) | 824.9 |
| | 0.166 (0.087) | 0.184 (0.101) | 0.141 (0.076) | 0.108 (0.051) | 0.424 (0.117) | 331.2 |
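One way to read these variance components is as shares of the total modelled variance. The sketch below is a simplification: it assumes the components are additive and ignores any variance attributable to fixed effects such as age or temozolomide, which may differ from how the authors define the proportion of variance explained. The function name is hypothetical.

```python
def proportion_explained(components, residual):
    """Share of (sum of components + residual) attributable to each component."""
    total = sum(components.values()) + residual
    return {name: value / total for name, value in components.items()}


# Row with methylation variance 0.876 and residual variance 0.468.
single = proportion_explained({"methylation": 0.876}, residual=0.468)

# Last row, with all four omic layers fitted jointly.
joint = proportion_explained(
    {"snp": 0.166, "methylation": 0.184, "gene_expression": 0.141, "cnv": 0.108},
    residual=0.424,
)
```

Under these assumptions methylation alone accounts for roughly 65% of the modelled variance, and the four layers together account for roughly 59% in the joint model.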
Prediction accuracy of single-predictor models in terms of AUC from 20 5-fold cross-validations (CV). Columns after SD give the proportion of times, over the 20 CV, that the model in the row had a higher AUC than the model in the column
| Model | Mean AUC | SD | Age | Tmz | SNP | Meth | Ge | CNV | Age + Tmz | Age + Tmz + SNP | Age + Tmz + Meth | Age + Tmz + Ge | Age + Tmz + CNV | Age + Tmz + Ge + Meth + SNP + CNV |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Age | 0.664 | 0.036 | – | 0.75 | >0.95 | 0.88 | >0.95 | >0.95 | 0.29 | 0.29 | 0.21 | 0.08 | 0.17 | 0.08 |
| Tmz | 0.606 | 0.075 | 0.27 | – | 0.67 | 0.42 | 0.63 | 0.71 | <0.05 | <0.05 | <0.05 | <0.05 | <0.05 | <0.05 |
| SNP | 0.539 | 0.014 | <0.05 | 0.33 | – | <0.05 | <0.05 | 0.50 | <0.05 | <0.05 | <0.05 | <0.05 | <0.05 | <0.05 |
| Meth | 0.623 | 0.059 | 0.13 | 0.58 | >0.95 | – | 0.67 | >0.95 | <0.05 | <0.05 | <0.05 | <0.05 | <0.05 | <0.05 |
| Ge | 0.593 | 0.045 | <0.05 | 0.38 | >0.95 | 0.33 | – | >0.95 | <0.05 | <0.05 | <0.05 | <0.05 | <0.05 | <0.05 |
| CNV | 0.547 | 0.020 | <0.05 | 0.29 | 0.50 | <0.05 | 0.04 | – | <0.05 | <0.05 | <0.05 | <0.05 | <0.05 | <0.05 |
Model: models used for analysis, described in Table 2. Mean AUC: mean prediction accuracy, measured by AUC, over 20 5-fold cross-validations for each model. SD: standard deviation of AUC.
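The pairwise proportions in this table can be read as win rates across cross-validation replicates. The following sketch shows one way to compute them, assuming each model has one AUC per replicate and that replicates are aligned across models (the same partitions used for every model); the function name and the toy numbers are illustrative only.

```python
def win_proportions(auc_by_model):
    """Map (row, column) to the share of replicates where row's AUC exceeds column's."""
    names = list(auc_by_model)
    n_rep = len(auc_by_model[names[0]])
    if any(len(v) != n_rep for v in auc_by_model.values()):
        raise ValueError("every model needs one AUC per replicate")
    table = {}
    for row in names:
        for col in names:
            if row == col:
                continue
            wins = sum(a > b for a, b in zip(auc_by_model[row], auc_by_model[col]))
            table[(row, col)] = wins / n_rep
    return table


# Toy example with four replicates (made-up values, not from the study).
toy = win_proportions({"A": [0.70, 0.60, 0.80, 0.50], "B": [0.60, 0.65, 0.70, 0.50]})
```

Because a strict inequality is used, a tied replicate counts for neither model, so the two entries for a pair can sum to less than one.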
Figure 4. Prediction accuracy of survival time for models including age, temozolomide (TMZ), SNP, methylation (METH), gene expression (GE) and CNV. Prediction accuracy was measured in terms of AUC (y-axis) vs. months (x-axis). A histogram of survival time for GBM patients is shown in both panels. Models include: a) single predictors; b) predictors incorporating omic profiles into the baseline model (age + temozolomide).
Prediction accuracy of integrative models in terms of AUC from 20 5-fold cross-validations (CV). Columns after SD give the proportion of times, over the 20 CV, that the model in the row had a higher AUC than the model in the column
| Model | Mean AUC | SD | Age | Tmz | SNP | Meth | Ge | CNV | Age + Tmz | Age + Tmz + SNP | Age + Tmz + Meth | Age + Tmz + Ge | Age + Tmz + CNV | Age + Tmz + Ge + Meth + SNP + CNV |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Age + Tmz | 0.705 | 0.063 | 0.71 | >0.95 | >0.95 | >0.95 | >0.95 | >0.95 | – | 0.42 | 0.08 | 0.25 | 0.38 | 0.33 |
| Age + Tmz + SNP | 0.706 | 0.065 | 0.71 | >0.95 | >0.95 | >0.95 | >0.95 | >0.95 | 0.58 | – | 0.08 | 0.29 | 0.38 | 0.33 |
| Age + Tmz + Meth | 0.717 | 0.073 | 0.79 | >0.95 | >0.95 | >0.95 | >0.95 | >0.95 | 0.92 | 0.92 | – | 0.50 | 0.71 | 0.46 |
| Age + Tmz + Ge | 0.718 | 0.058 | 0.92 | >0.95 | >0.95 | >0.95 | >0.95 | >0.95 | 0.75 | 0.71 | 0.50 | – | 0.71 | 0.71 |
| Age + Tmz + CNV | 0.706 | 0.064 | 0.79 | >0.95 | >0.95 | >0.95 | >0.95 | >0.95 | 0.63 | 0.63 | 0.29 | 0.29 | – | 0.29 |
| Age + Tmz + Ge + Meth + SNP + CNV | 0.715 | 0.067 | 0.82 | >0.95 | >0.95 | >0.95 | >0.95 | >0.95 | 0.43 | 0.64 | 0.37 | 0.43 | 0.71 | – |
Model: models used for analysis, described in Table 2. Mean AUC: mean prediction accuracy, measured by AUC, over 20 5-fold cross-validations for each model. SD: standard deviation of AUC.