Yang Ni, Francesco C Stingo, Veerabhadran Baladandayuthapani.
Abstract
Rapid development of genome-wide profiling technologies has made it possible to conduct integrative analysis on genomic data from multiple platforms. In this study, we develop a novel integrative Bayesian network approach to investigate the relationships between genetic and epigenetic alterations as well as how these alterations affect a patient's clinical outcome. We take a Bayesian network approach that admits a convenient decomposition of the joint distribution into local distributions. Exploiting the prior biological knowledge about regulatory mechanisms, we model each local distribution as a linear regression. This allows us to analyze multi-platform genome-wide data in a computationally efficient manner. We illustrate the performance of our approach through simulation studies. Our methods are motivated by and applied to a multi-platform glioblastoma dataset, from which we reveal several biologically relevant relationships that have been validated in the literature as well as new genes that could potentially be novel biomarkers for cancer progression.
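The decomposition described in the abstract can be illustrated with a minimal sketch: in a Bayesian network the joint density factorizes as the product of each node's density given its parents, and here each local density is taken to be a Gaussian linear regression on the parents. The three-node network, its variable names, coefficients, and variances below are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
import math

# Hypothetical network (an assumption, not from the paper):
# methylation -> expression <- copy_number.
# Each node maps to (parents, intercept, coefficients, residual variance).
NETWORK = {
    "methylation": ((), 0.0, (), 1.0),
    "copy_number": ((), 0.0, (), 1.0),
    "expression": (("methylation", "copy_number"), 0.0, (-0.4, 0.4), 0.5),
}


def normal_logpdf(x, mean, var):
    """Log-density of a univariate normal distribution."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)


def local_logpdf(node, sample, network=NETWORK):
    """Log-density of one node given its parents (a linear regression)."""
    parents, intercept, coefs, var = network[node]
    mean = intercept + sum(b * sample[p] for b, p in zip(coefs, parents))
    return normal_logpdf(sample[node], mean, var)


def joint_logpdf(sample, network=NETWORK):
    """Joint log-density as the sum of the local log-densities."""
    return sum(local_logpdf(node, sample, network) for node in network)


# One hypothetical observation of the three variables.
sample = {"methylation": 0.5, "copy_number": -1.0, "expression": -0.6}
print(joint_logpdf(sample))
```

Because the joint log-density is a sum of per-node terms, each regression can be fitted or scored separately, which is one plausible reading of why the abstract calls the approach computationally efficient.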
Keywords: Bayesian network; glioblastoma multiforme; integrative analysis; multiple platforms
Year: 2014 PMID: 25288878 PMCID: PMC4179606 DOI: 10.4137/CIN.S13786
Source DB: PubMed Journal: Cancer Inform ISSN: 1176-9351
Figure 1. Three scenarios considered in our simulation study.
Figure 2. Bayesian approach. The percentage of correctly selected models (PCM), the percentage of incorrectly selected models (PIM), and the true positive rate (TPR) are plotted against the median correlation complexity measure (MCC) for each scenario. The gradient reflects the Frobenius norm complexity measure (FNC) level.
Simulation study. Results for eight regression coefficient settings in Scenario 3.
| REGRESSION COEFFICIENTS | MCC | FNC | PCM | PIM | TPR |
|---|---|---|---|---|---|
| (−0.1, −0.1, −0.1) | 0.93 | 1 | 0.02 | 0.27 | 0.54 |
| (−0.4, −0.1, −0.1) | 0.85 | 0.75 | 0.26 | 0.12 | 0.82 |
| (0.1, 0.4, 0.1) | 0.89 | 0.7 | 0.06 | 0.28 | 0.67 |
| (0.1, 0.1, 0.4) | 0.98 | 0.74 | 0.08 | 0.3 | 0.66 |
| (0.4, 0.4, −0.1) | 0.56 | 0.32 | 0.44 | 0.12 | 0.9 |
| (−0.4, −0.1, −0.4) | 0.84 | 0.47 | 0.44 | 0.1 | 0.91 |
| (−0.1, 0.4, −0.4) | 0.96 | 0.37 | 0.19 | 0.3 | 0.8 |
| (0.4, 0.4, −0.4) | 0 | 0.02 | 0.83 | 0.01 | 1 |
Figure 3. Simulation study. 3D plot for Scenario 1 with each coordinate being one regression coefficient in Figure 3. Larger dots indicate a lower percentage of correctly selected models (PCM), and darker colors indicate a lower true positive rate (TPR).
Sensitivity analysis (Scenario 1).
| HYPERPARAMETERS | PCM | PIM | TPR | FDR |
|---|---|---|---|---|
| | 0.73 | 0.27 | 0.92 | 0.01 |
| | 0.50 | 0.09 | 0.92 | 0.01 |
| | 0.49 | 0.10 | 0.91 | 0.01 |
| | 0.46 | 0.12 | 0.90 | 0.01 |
| | 0.34 | 0.19 | 0.86 | 0.01 |
| | 0.40 | 0.03 | 0.94 | 0.02 |
| | 0.46 | 0.07 | 0.92 | 0.01 |
| | 0.48 | 0.15 | 0.89 | 0.01 |
| | 0.48 | 0.10 | 0.91 | 0.01 |
| | 0.48 | 0.10 | 0.91 | 0.01 |
| | 0.50 | 0.10 | 0.91 | 0.01 |
| | 0.49 | 0.10 | 0.91 | 0.01 |
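For readers unfamiliar with the TPR and FDR columns, the sketch below computes both from a set of true edges and a set of selected edges using the standard definitions (TPR = true positives / true edges, FDR = false positives / selected edges). This record does not spell out how the paper computes these rates, so the function name, the edge representation, and the toy edge sets are assumptions made for illustration.

```python
def edge_selection_rates(true_edges, selected_edges):
    """Return (TPR, FDR) for a selected edge set against the true edge set.

    Standard definitions, assumed here rather than taken from the paper:
    TPR = |true & selected| / |true|
    FDR = |selected - true| / |selected|
    """
    true_set = set(true_edges)
    selected_set = set(selected_edges)
    true_positives = len(true_set & selected_set)
    tpr = true_positives / len(true_set) if true_set else 0.0
    false_positives = len(selected_set) - true_positives
    fdr = false_positives / len(selected_set) if selected_set else 0.0
    return tpr, fdr


# Hypothetical toy example: directed edges as (parent, child) pairs.
true_edges = {("A", "C"), ("B", "C"), ("C", "Y")}
selected_edges = {("A", "C"), ("C", "Y"), ("A", "Y")}
tpr, fdr = edge_selection_rates(true_edges, selected_edges)
print(round(tpr, 2), round(fdr, 2))
```

In a simulation study these rates would typically be averaged over many replicated datasets before being reported.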
Figure 4. Frequentist approach. The percentage of correctly selected models (PCM), the percentage of incorrectly selected models (PIM), and the true positive rate (TPR) are plotted against the median correlation complexity measure (MCC) for each scenario. The gradient reflects the Frobenius norm complexity measure (FNC) level.
Figure 5. Glioblastoma multiforme (GBM) data analysis. Top four epigenomic, genomic, and transcriptomic networks. They are ranked by posterior probability, which is shown at the top of each network along with the corresponding gene/probe symbol.
Notes: Blue arrows are activations. Red arrows are inhibitions. Bi-directed edges can be interpreted in either direction. Next to each arrow is the posterior probability of the corresponding arrow.
Glioblastoma multiforme (GBM) data analysis. A list of genes that have epigenomic, genomic, and transcriptomic regulatory effects on clinical outcome. Bold genes positively affect clinical outcome, while the rest negatively affect clinical outcome. The gene names appear in the order of posterior probability (from high to low), with the posterior probability of the network in parentheses. Only genes whose network is significant are shown.
| | GENES/PROBES |
|---|---|
| Epigenomic | |
| | POLR3C(0.57), BBS5(0.56), RGN(0.55), KCNA2(0.55), |
| Genomic | MAP3K7(0.74), |
| | CFDP1(0.64), CCL22(0.62), DOPEY1(0.62), OGFOD1(0.60), ZCCHC14(0.59), |
| | GNS(0.56), CGA(0.56), |
| | PLCG2(0.49), CDH11(0.48), EPHA7(0.43), GNAO1(0.42), |
| Transcriptomic | MAN2A2(0.73), ZNF80(0.65), ZDHHC11(0.60), |
Figure 6. Glioblastoma multiforme (GBM) data analysis. The number of biomarkers vs the chromosome number. Colors correspond to different networks.