So Yeon Kim, Hyun-Hwan Jeong, Jaesik Kim, Jeong-Hyeon Moon, Kyung-Ah Sohn.
Abstract
BACKGROUND: Integrating the rich information from multi-omics data has been a popular approach to survival prediction and biomarker identification in several cancer studies. To facilitate the integrative analysis of multiple genomic profiles, several studies have suggested utilizing pathway information rather than individual genomic profiles.
Keywords: Breast cancer; Integrative analysis; Multi-omics; Neuroblastoma; Pathway-based analysis; Random walk
Year: 2019 PMID: 31036036 PMCID: PMC6489180 DOI: 10.1186/s13062-019-0239-8
Source DB: PubMed Journal: Biol Direct ISSN: 1745-6150 Impact factor: 4.540
Fig. 1 Overview of the proposed pathway-based multi-omics integration method for survival prediction
Fig. 2 Survival prediction performance of the pathway profiles from four pathway-based methods on the gene expression data, compared with those from the iDRW method on the gene expression and copy number data, in breast cancer (a) and neuroblastoma data (b). In the breast cancer data, performance is measured with accuracy and F-1 score after 50 repeats of five-fold cross-validation with top-k pathways (a). In the neuroblastoma data, performance is measured using leave-one-out cross-validation due to the sample size (b). The value of k is empirically set to the optimal one for each method. The performance of the gene expression profile is shown as a dotted horizontal line
Fig. 3 Classification performances of the iDRW method and four pathway-based methods with varying values of k for breast cancer (a) and neuroblastoma data (b). Classification performances with top-k pathway features are shown for each model with varying k = 5, 10, …, 45, 50. Performance is measured using precision, recall and F-1 score after 50 repeats of five-fold cross-validation in breast cancer data (a) and leave-one-out cross-validation in neuroblastoma data (b)
Fig. 4 Classification performances of the iDRW method and four pathway-based methods with varying sample size N in breast cancer samples. Classification performances are shown for N equal to 70, 80, 90 and 100% of all samples. Performance is measured using precision, recall and F-1 score after 50 repeats of five-fold cross-validation in breast cancer data
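The evaluation protocol named in the captions above (repeated five-fold cross-validation, leave-one-out cross-validation, and precision, recall and F-1 score) can be sketched as follows. This is a minimal illustration in plain Python, not the authors' code: the function names, the random shuffling scheme and the seed are assumptions, and the record does not say whether the folds were stratified.

```python
import random


def precision_recall_f1(y_true, y_pred, positive=1):
    """Precision, recall and F-1 score for one class label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def repeated_kfold_indices(n_samples, n_folds=5, n_repeats=50, seed=0):
    """Yield (train_idx, test_idx) pairs for repeated k-fold CV.

    Hypothetical helper: samples are reshuffled before each repeat.
    Setting n_folds == n_samples and n_repeats == 1 gives
    leave-one-out cross-validation.
    """
    rng = random.Random(seed)
    for _ in range(n_repeats):
        order = list(range(n_samples))
        rng.shuffle(order)
        # Deal the shuffled indices into n_folds disjoint folds.
        folds = [order[i::n_folds] for i in range(n_folds)]
        for i, test_idx in enumerate(folds):
            train_idx = [j for k, fold in enumerate(folds)
                         if k != i for j in fold]
            yield train_idx, test_idx
```

Each yielded pair would be used to fit a classifier on the training indices and score it on the test indices, with the scores averaged over all splits.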
Top-k pathways ranked by the iDRW method in breast cancer (k = 25) and neuroblastoma data (k = 5). For each pathway, total number of genes, significant genes from gene expression (EXP) and copy number data (CNA) are shown (p-value of t-test / DESeq2 or -test < 0.05)
| Dataset | Pathway ID | Pathway name | Total genes | EXP | CNA |
|---|---|---|---|---|---|
| Breast cancer | hsa04740 | Olfactory transduction | 419 | 54 | 268 |
| | hsa04014 | Ras signaling pathway | 232 | 68 | 164 |
| | hsa04015 | Rap1 signaling pathway | 206 | 64 | 142 |
| | hsa04916 | Melanogenesis | 101 | 37 | 73 |
| | hsa04722 | Neurotrophin signaling pathway | 119 | 38 | 84 |
| | hsa05200 | Pathways in cancer | 526 | 166 | 359 |
| | hsa04933 | AGE-RAGE signaling pathway in diabetic complications | 99 | 37 | 67 |
| | hsa04530 | Tight junction | 170 | 53 | 107 |
| | hsa04510 | Focal adhesion | 199 | 76 | 125 |
| | hsa04080 | Neuroactive ligand-receptor interaction | 278 | 64 | 193 |
| | hsa05225 | Hepatocellular carcinoma | 168 | 56 | 112 |
| | hsa04020 | Calcium signaling pathway | 182 | 59 | 136 |
| | hsa04024 | cAMP signaling pathway | 198 | 58 | 139 |
| | hsa04217 | Necroptosis | 164 | 49 | 97 |
| | hsa04060 | Cytokine-cytokine receptor interaction | 270 | 70 | 192 |
| | hsa05152 | Tuberculosis | 179 | 58 | 112 |
| | hsa05165 | Human papillomavirus infection | 319 | 103 | 210 |
| | hsa04810 | Regulation of actin cytoskeleton | 208 | 64 | 132 |
| | hsa04151 | PI3K-Akt signaling pathway | 352 | 119 | 241 |
| | hsa04022 | cGMP-PKG signaling pathway | 163 | 58 | 109 |
| | hsa04630 | Jak-STAT signaling pathway | 162 | 43 | 112 |
| | hsa05167 | Kaposi’s sarcoma-associated herpesvirus infection | 186 | 61 | 114 |
| | hsa04010 | MAPK signaling pathway | 295 | 87 | 209 |
| | hsa04371 | Apelin signaling pathway | 137 | 46 | 99 |
| | hsa04390 | Hippo signaling pathway | 154 | 58 | 100 |
| Neuroblastoma | hsa04976 | Bile secretion | 71 | 13 | 5 |
| | hsa05034 | Alcoholism | 180 | 22 | 7 |
| | hsa01100 | Metabolic pathways | 1273 | 43 | 93 |
| | hsa04080 | Neuroactive ligand-receptor interaction | 278 | 21 | 24 |
| | hsa04151 | PI3K-Akt signaling pathway | 352 | 19 | 31 |
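
The "Total genes", "EXP" and "CNA" columns above are per-pathway gene counts, where a gene counts as significant if its test p-value is below 0.05. A minimal sketch of that tally is given below, assuming per-gene p-values have already been computed. The function name, pathway IDs, gene names and p-values are all hypothetical and do not come from the study.

```python
def count_significant_genes(pathway_genes, p_values, alpha=0.05):
    """Return {pathway_id: (total_genes, significant_genes)}.

    pathway_genes maps a pathway ID to its member genes.
    p_values maps a gene to its p-value in one omics profile.
    Genes without a p-value are counted in the total only.
    """
    significant = {g for g, p in p_values.items() if p < alpha}
    return {
        pid: (len(set(genes)), len(significant & set(genes)))
        for pid, genes in pathway_genes.items()
    }


# Toy inputs (made up for illustration only).
toy_pathways = {
    "pathway_A": ["G1", "G2", "G3"],
    "pathway_B": ["G3", "G4"],
}
toy_exp_p = {"G1": 0.01, "G2": 0.20, "G3": 0.04, "G4": 0.50}

toy_counts = count_significant_genes(toy_pathways, toy_exp_p)
```

Running the same function once with gene expression p-values and once with copy number p-values would give the two count columns for each pathway.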
Fig. 5 Pathway-based gene-gene interaction network between gene expression profile and copy number data in breast cancer samples. The genes in the top-25 pathways ranked by the iDRW method in the breast cancer data are shown. The hub genes whose degree is equal to or greater than three in the gene expression profile (blue ellipses) and genes in copy number data (pink diamonds) are emphasized in the network