Dokyoon Kim, Ruowang Li, Scott M Dudek, Alex T Frase, Sarah A Pendergrass, Marylyn D Ritchie.
Abstract
BACKGROUND: Effective prediction of clinical outcomes in cancer, aimed at understanding the mechanisms of various cancer types, has been pursued using molecular data such as gene expression profiles, an approach that promises better diagnostics and support for further therapies. However, clinical outcome prediction based on gene expression profiles varies between independent data sets. Furthermore, outcome prediction from single-gene expression is limited for cancer evaluation because genes do not act in isolation but interact with other genes in complex signaling or regulatory networks. In addition, since pathways are likely to cooperate with one another, it would be desirable to incorporate expert knowledge to combine pathways in a useful and informative manner.
Keywords: Clinical outcome prediction; Grammatical evolution neural network; Integrative analysis; Knowledge-driven genomic interaction; Ovarian cancer
Year: 2014 PMID: 25214892 PMCID: PMC4161273 DOI: 10.1186/1756-0381-7-20
Source DB: PubMed Journal: BioData Min ISSN: 1756-0381 Impact factor: 2.522
Figure 1. Schematic overview of ATHENA. ATHENA contains transformation and modeling components. The transformation component uses Biofilter, allowing researchers to transform gene-based input data into a knowledge-based matrix. Multi-omic data can be the input for developing meta-dimensional models associated with clinical outcomes of interest.
Figure 2. Schematic overview of the analysis pipeline. Light blue vertical bars represent each step in the pipeline: (1) transformation of a gene expression matrix into pathway-based, GO-based, and Pfam-based matrices; (2) GENN modeling; (3) GENN modeling of variables from the best models of each knowledge-based data set.
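Step (1) of the pipeline can be illustrated with a minimal sketch. The aggregation rule is an assumption: the text here does not say how Biofilter summarizes the genes belonging to a pathway, GO term, or Pfam family, so the per-sample mean is used purely for illustration. The function name `to_knowledge_matrix` and the gene-set names are hypothetical.

```python
from statistics import mean


def to_knowledge_matrix(expression, gene_sets):
    """Collapse a gene-level matrix into a knowledge-feature matrix.

    expression: {sample_id: {gene: expression_value}}
    gene_sets:  {feature_name: [member genes]}, e.g. one entry per pathway
    Returns     {sample_id: {feature_name: aggregated_value}}

    ASSUMPTION: features are summarized by the mean of their measured
    member genes. The source does not state the actual aggregation rule.
    """
    matrix = {}
    for sample_id, genes in expression.items():
        row = {}
        for feature, members in gene_sets.items():
            values = [genes[g] for g in members if g in genes]
            if values:  # skip features with no measured member genes
                row[feature] = mean(values)
        matrix[sample_id] = row
    return matrix


# Hypothetical toy input: one sample, three genes, three gene sets.
expression = {"sample_1": {"GENE_A": 1.0, "GENE_B": 3.0, "GENE_C": 5.0}}
gene_sets = {
    "pathway_1": ["GENE_A", "GENE_B"],
    "pathway_2": ["GENE_C", "GENE_D"],  # GENE_D is not measured
    "pathway_3": ["GENE_Z"],            # no measured members
}
knowledge = to_knowledge_matrix(expression, gene_sets)
```

The same function could be applied once per knowledge source (pathway, GO, Pfam) by swapping the `gene_sets` mapping, which would yield the three matrices that step (2) models separately.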
Performance comparison between the model with gene expression data alone and models identified using knowledge-based matrices
| Data set | Balanced accuracy | AUC |
| --- | --- | --- |
| Gene expression | 0.6957 | 0.7103 |
| Pathway | 0.7451 | 0.7457 |
| GO | 0.6991 | 0.7275 |
| Pfam | 0.7046 | 0.7335 |
| Integration | 0.7882 | 0.8108 |
The table compares models built from gene expression data alone, from the KEGG pathway-based, GO-based, and Pfam-based matrices, and from the integration of all three. The integration model was developed by combining variables from the KEGG pathway-based, GO-based, and Pfam-based matrices. Performance was measured by balanced accuracy and area under the ROC curve (AUC).
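For readers unfamiliar with the two metrics, the sketch below computes them from their standard definitions: balanced accuracy as the mean of sensitivity and specificity, and AUC as the fraction of positive-negative pairs that the model ranks correctly, with ties counted as half. It is not the authors' evaluation code, and the labels and scores are made-up toy values.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity and specificity for binary labels (0 or 1)."""
    pos = sum(1 for t in y_true if t == 1)
    neg = len(y_true) - pos
    if pos == 0 or neg == 0:
        raise ValueError("both classes must be present")
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return 0.5 * (tp / pos + tn / neg)


def roc_auc(y_true, scores):
    """AUC as the probability that a positive outscores a negative."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): two positive and two negative samples.
y_true = [1, 1, 0, 0]
y_pred = [1, 0, 0, 0]          # hard class predictions
scores = [0.9, 0.4, 0.5, 0.1]  # continuous model outputs

bal_acc = balanced_accuracy(y_true, y_pred)
auc = roc_auc(y_true, scores)
```

Balanced accuracy weights both classes equally regardless of their sizes, which makes it a common choice when outcome groups are unevenly represented.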
Figure 3. Best GENN models from each knowledge-based dataset. PSUB is a subtraction activation node. Constants in the white boxes are weights. Knowledge features such as pathway, GO, and Pfam are shown in the gray boxes. (a) KEGG pathway-based matrix; (b) GO-based matrix; (c) Pfam-based matrix.
Figure 4. Best GENN model from integrating knowledge-based datasets. PSUB and PDIV represent subtraction and division activation nodes, respectively. Constants in the white boxes are weights. Knowledge features such as KEGG pathway, GO, and Pfam are shown in the gray boxes.
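As a rough illustration of how a model built from such nodes might be evaluated, the sketch below treats each node as an arithmetic operation over weighted inputs. This is an assumption based only on the caption: the exact node semantics in ATHENA (for example, whether a squashing function follows the arithmetic step, or how a zero denominator is handled) are not given here. The feature values and weights are hypothetical and are not the values from the figure.

```python
from functools import reduce


def psub(weighted_inputs):
    """Subtraction node: first weighted input minus all later ones.

    weighted_inputs: list of (weight, value) pairs.
    ASSUMPTION: no squashing function is applied afterwards.
    """
    terms = [w * x for w, x in weighted_inputs]
    return reduce(lambda a, b: a - b, terms)


def pdiv(weighted_inputs):
    """Division node: first weighted input divided by all later ones.

    ASSUMPTION: a zero denominator is not protected against, so
    ZeroDivisionError propagates to the caller.
    """
    terms = [w * x for w, x in weighted_inputs]
    return reduce(lambda a, b: a / b, terms)


# Hypothetical knowledge-feature values for one sample.
features = {"kegg_feature": 3.0, "go_feature": 4.0, "pfam_feature": 8.0}

# A two-level toy network: PDIV(1.0 * PSUB(...), 0.5 * pfam_feature).
inner = psub([(2.0, features["kegg_feature"]), (1.0, features["go_feature"])])
output = pdiv([(1.0, inner), (0.5, features["pfam_feature"])])
```

Nesting one node's output as another node's input mirrors the tree layout described in the caption, where weights sit on the edges between feature boxes and activation nodes.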