Melike Korucuoglu, Senol Isci, Arzucan Ozgur, Hasan H Otu.
Abstract
High Throughput Biological Data (HTBD) requires detailed analysis methods, and from a life science perspective the results of such analyses make the most sense when interpreted in the context of biological pathways. Bayesian Networks (BNs) capture both linear and nonlinear interactions and handle stochastic events in a probabilistic framework that accounts for noise, making them viable candidates for HTBD analysis. We recently proposed an approach called Bayesian Pathway Analysis (BPA) for analyzing HTBD using BNs, in which known biological pathways are modeled as BNs and the pathways that best explain the given HTBD are identified. BPA uses fold change information to obtain an input matrix for scoring each pathway modeled as a BN. Scoring is performed with the Bayesian-Dirichlet Equivalent method, and significance is assessed by randomization via bootstrapping of the columns of the input matrix. In this study, we improve the BPA system by optimizing the steps involved in "Data Preprocessing and Discretization", "Scoring", "Significance Assessment", and "Software and Web Application". We tested the improved system on synthetic data sets and achieved over 98% accuracy in identifying the active pathways. The overall approach was applied to real cancer microarray data sets to investigate the pathways that are commonly active in different cancer types. We compared our findings on the real data sets with those of a related approach, Signaling Pathway Impact Analysis (SPIA).
Year: 2014 PMID: 25036210 PMCID: PMC4103872 DOI: 10.1371/journal.pone.0102803
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
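The abstract describes significance assessment by randomization via bootstrapping of the columns of the input matrix. A minimal sketch of that idea follows. How BPA actually resamples is not specified in this record, so the scheme here (drawing column indices with replacement and counting how often a randomized matrix scores at least as high as the observed one) is an assumption. The names `bootstrap_columns`, `randomization_p_value`, and `toy_score` are hypothetical, and `toy_score` is only a stand-in for a real BN score.

```python
import random

def bootstrap_columns(matrix, rng):
    """Return a matrix whose columns are drawn with replacement from `matrix`."""
    n_cols = len(matrix[0])
    picks = [rng.randrange(n_cols) for _ in range(n_cols)]
    return [[row[c] for c in picks] for row in matrix]

def randomization_p_value(matrix, score_fn, n_rand=1000, seed=0):
    """Estimate how often a column-bootstrapped matrix scores >= the observed one."""
    rng = random.Random(seed)
    observed = score_fn(matrix)
    hits = sum(
        score_fn(bootstrap_columns(matrix, rng)) >= observed
        for _ in range(n_rand)
    )
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (n_rand + 1)

def toy_score(matrix):
    """Placeholder score (assumption): observations where the first two columns agree."""
    return sum(row[0] == row[1] for row in matrix)

# Illustrative discretized observations (rows) for four genes (columns).
obs = [
    [1, 2, 1, 1],
    [1, 1, 2, 1],
    [2, 2, 2, 2],
    [1, 2, 2, 2],
    [2, 1, 2, 1],
    [2, 2, 1, 1],
    [1, 1, 2, 1],
]

p = randomization_p_value(obs, toy_score, n_rand=200)
print(f"empirical p-value: {p:.3f}")
```

In a real analysis `toy_score` would be replaced by the pathway's network score, and the number of randomizations would be chosen to match the desired p-value resolution.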
Sample Observation Matrices.
| (a) G1 | (a) G2 | (a) G3 | (a) G4 | (b) G1 | (b) G2 | (b) G3 | (b) G4 |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | 2 | 1 | 1 | 1 | 2 | 2 | 1 |
| 1 | 1 | 2 | 1 | 1 | 2 | 2 | 1 |
| 2 | 2 | 2 | 2 | 1 | 2 | 2 | 1 |
| 1 | 2 | 2 | 2 | 1 | 2 | 2 | 1 |
| 2 | 1 | 2 | 1 | 1 | 2 | 2 | 1 |
| 2 | 2 | 1 | 1 | 1 | 2 | 2 | 1 |
| 1 | 1 | 2 | 1 | 1 | 2 | 2 | 1 |
Columns denote genes/nodes; rows denote observations.
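To make the scoring of such observation matrices concrete, here is a minimal sketch of a BDeu-style log score (a common uniform-prior variant of the Bayesian-Dirichlet Equivalent score) for a discrete network. Several things are assumptions rather than details from the source: the chain structure G1 → G2 → G3 → G4 is hypothetical, each gene is taken to have two levels, the equivalent sample size is set to 1, and matrix (a) is read as the seven rows that vary while matrix (b) is read as the row 1 2 2 1 repeated seven times.

```python
import math
from collections import Counter

def bdeu_score(data, parents, arity=2, ess=1.0):
    """Log BDeu score of a discrete Bayesian network.

    data    : list of observations, each a list of node states
    parents : dict mapping node index -> list of parent node indices
    arity   : number of states per node (assumed equal for all nodes)
    ess     : equivalent sample size of the Dirichlet prior
    """
    total = 0.0
    for node, pa in parents.items():
        q = arity ** len(pa)          # number of parent configurations
        a_j = ess / q                 # prior mass per parent configuration
        a_jk = ess / (q * arity)      # prior mass per (configuration, state)
        n_j = Counter(tuple(row[p] for p in pa) for row in data)
        n_jk = Counter((tuple(row[p] for p in pa), row[node]) for row in data)
        # Unobserved configurations and states contribute exactly zero,
        # so iterating over the observed counts is sufficient.
        for count in n_j.values():
            total += math.lgamma(a_j) - math.lgamma(a_j + count)
        for count in n_jk.values():
            total += math.lgamma(a_jk + count) - math.lgamma(a_jk)
    return total

matrix_a = [
    [1, 2, 1, 1],
    [1, 1, 2, 1],
    [2, 2, 2, 2],
    [1, 2, 2, 2],
    [2, 1, 2, 1],
    [2, 2, 1, 1],
    [1, 1, 2, 1],
]
matrix_b = [[1, 2, 2, 1] for _ in range(7)]

# Hypothetical structure: G1 -> G2 -> G3 -> G4 (column indices 0..3).
chain = {0: [], 1: [0], 2: [1], 3: [2]}

score_a = bdeu_score(matrix_a, chain)
score_b = bdeu_score(matrix_b, chain)
print(f"matrix (a): {score_a:.3f}")
print(f"matrix (b): {score_b:.3f}")
```

Under these assumptions the constant matrix (b) receives the higher log score, because a single repeated observation is easy for any structure to fit. That is one reason a raw score alone says little without a separate significance assessment.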
Prediction accuracy of different scoring methods on synthetic datasets.
| | BDe | AIC | BIC | fNML |
| --- | --- | --- | --- | --- |
| | 0.945 | 0.964 | 0.909 | 1.000 |
| | 0.982 | 1.000 | 0.927 | 0.964 |
| | 0.982 | 1.000 | 0.945 | 1.000 |
| | 0.964 | 0.982 | 0.982 | 1.000 |
| | 0.945 | 1.000 | 0.891 | 1.000 |
| | 0.945 | 0.982 | 0.982 | 0.964 |
| | 0.982 | 0.982 | 0.927 | 0.982 |
| | 1.000 | 0.982 | 0.964 | 0.982 |
| | 0.982 | 0.982 | 0.927 | 0.982 |
| | 0.891 | 0.945 | 0.945 | 0.964 |
| Mean | 0.962 | 0.982 | 0.940 | |
| SD | 0.031 | 0.017 | 0.030 | |
BDe: Bayesian Dirichlet Equivalent; AIC: Akaike Information Criterion; BIC: Bayesian Information Criterion; fNML: factorized Normalized Maximum Likelihood.
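The last two rows of the accuracy table appear to be per-method means and standard deviations. A quick check reproduces the listed means of 0.962, 0.982, and 0.940 and gives roughly 0.984 for fNML, whose summary value is missing here. The check assumes the four values in each data row follow the header order BDe, AIC, BIC, fNML, and it uses the sample standard deviation.

```python
from statistics import mean, stdev

# Accuracies per scoring method, transcribed column by column from the table.
accuracy = {
    "BDe":  [0.945, 0.982, 0.982, 0.964, 0.945, 0.945, 0.982, 1.000, 0.982, 0.891],
    "AIC":  [0.964, 1.000, 1.000, 0.982, 1.000, 0.982, 0.982, 0.982, 0.982, 0.945],
    "BIC":  [0.909, 0.927, 0.945, 0.982, 0.891, 0.982, 0.927, 0.964, 0.927, 0.945],
    "fNML": [1.000, 0.964, 1.000, 1.000, 1.000, 0.964, 0.982, 0.982, 0.982, 0.964],
}

# Mean and sample standard deviation for each method.
summary = {m: (mean(v), stdev(v)) for m, v in accuracy.items()}

for method, (avg, sd) in summary.items():
    print(f"{method:5s} mean={avg:.3f} sd={sd:.3f}")
```

Because the tabulated accuracies are rounded to three decimals, the recomputed standard deviations can differ from the listed ones in the last digit.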
Cancer data sets and numbers of active pathways identified by BPA and SPIA analyses.
| GEO# (GSE) | Cancer Type | Chip Type (HG-U133) | # of Samples | BPA | SPIA |
| --- | --- | --- | --- | --- | --- |
| | bladder | Plus2 | 12 (9C, 3N) | 57 | 40 |
| | brain | A | 25 (21C, 4N) | 81 | 23 |
| | brain | Plus2 | 35 (30C, 5N) | 46 | 32 |
| | breast | Plus2 | 22 (7C, 15N) | 16 | 25 |
| | breast | Plus2 | 18 (14C, 4N) | 66 | 36 |
| | colon | Plus2 | 20 (10C, 10N) | 36 | 39 |
| | liver | A2 | 43 (22C, 21N) | 77 | 22 |
| | liver | A2 | 66 (47C, 19N) | 59 | 17 |
| | lung | Plus2 | 19 (16C, 3N) | 58 | 43 |
| | ovarian | Plus2 | 24 (12C, 12N) | 5 | 18 |
| | thyroid | Plus2 | 14 (7C, 7N) | 4 | 27 |
| | thyroid | Plus2 | 18 (14C, 4N) | 10 | 27 |
Figure 1. Commonality of significant pathways using the BPA analysis on the same cancer types.
Figure 2. Number of pathways found significant in real microarray data sets using BPA and SPIA methods.
Significant pathways identified by the improved BPA system using the NCI-60 microarray data set on samples with and without p53 mutation (p53+: 17 samples, p53−: 33 samples).
| ID | Name |
| --- | --- |
| | Butanoate metabolism |
| | Fatty acid biosynthesis |
| | Glioma |
| | Melanogenesis |
| | Melanoma |
| | Pancreatic cancer |
| | Purine metabolism |
| | Pyrimidine metabolism |
| | T cell receptor signaling pathway |