Bilin Liang1, Haifan Gong1, Lu Lu1, Jie Xu2.
Abstract
BACKGROUND: Pathway-based analysis of transcriptomic data has shown greater stability and better performance than traditional gene-based analysis. Several pathway-based deep learning models have been developed for bioinformatic analysis, but they have not fully considered the topological features of pathways, which limits their predictive performance.
Keywords: Deep learning; Graph neural network; Interpretability; Pathway; Risk classification
Year: 2022 PMID: 36167504 PMCID: PMC9516820 DOI: 10.1186/s12859-022-04950-1
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.307
Fig. 1 The visualization of pathways and clinical features in this study. (A) The density distribution of the gene number in each pathway. (B) The density distribution of the number of each gene shared between pathways
Fig. 2 The deep learning framework implemented in PathGNN. (A) The overview of deep learning framework in PathGNN. PathGNN was designed as two components, named Subnetwork1 and Subnetwork2, respectively. (B) The schematic view of Subnetwork1 architecture
Fig. 3 The results of the ablation study. (A) “GraphSAGE”, “GAT” and “GCN” indicate the model results obtained with the corresponding graph convolution algorithm; “Removed Set2Set” indicates that GraphSAGE was used as the convolution algorithm but the Set2Set layer was removed. (B) The model performance with different numbers of blocks
Comparison of predictive performance with benchmark methods
| Cancer | PathGNN | PGDNN | DNN | RF | LR |
|---|---|---|---|---|---|
| LUAD | 0.592 ± 0.026 | 0.536 ± 0.059 | 0.566 ± 0.043 | 0.586 ± 0.057 | |
| SKCM | 0.626 ± 0.035 | 0.590 ± 0.031 | 0.601 ± 0.055 | 0.588 ± 0.056 | |
| LGG | 0.852 ± 0.082 | 0.834 ± 0.080 | 0.808 ± 0.083 | 0.755 ± 0.068 | |
| KIRC | 0.683 ± 0.045 | 0.584 ± 0.088 | 0.608 ± 0.042 | 0.604 ± 0.027 | |
The predictive performance of PathGNN was compared with that of a pathway-guided deep neural network (PGDNN), a deep neural network (DNN), random forest (RF) and logistic regression (LR). The area under the curve (AUC) is reported as mean ± standard deviation.
Bold indicates the best predictive performance in each case study.
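As an illustration of how an "AUC: mean ± standard deviation" entry can be obtained, the following is a minimal sketch and not the authors' evaluation code. It assumes the AUC is computed once per evaluation run (for example, per cross-validation fold, which the record does not specify) and that the sample standard deviation is used. The function names `roc_auc` and `summarize` and the toy labels and scores are hypothetical.

```python
from statistics import mean, stdev


def roc_auc(labels, scores):
    """AUC as the probability that a random positive case is scored
    higher than a random negative case (ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def summarize(aucs):
    """Mean and sample standard deviation over runs (assumed convention)."""
    return mean(aucs), stdev(aucs)


# Hypothetical (labels, predicted scores) for three evaluation runs.
runs = [
    ([1, 0, 1, 0], [0.9, 0.2, 0.6, 0.7]),
    ([1, 1, 0, 0], [0.8, 0.7, 0.3, 0.1]),
    ([0, 1, 0, 1], [0.5, 0.5, 0.4, 0.3]),
]
aucs = [roc_auc(y, s) for y, s in runs]
auc_mean, auc_sd = summarize(aucs)
print(f"{auc_mean:.3f} ± {auc_sd:.3f}")
```

The pairwise formulation is quadratic in the number of samples, which is fine for a sketch; a library routine would normally be used for real cohorts.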
Key pathways associated with the long-term survival in multiple cancers
| Pathway | Cancer | IG scorez |
|---|---|---|
| Aberrant regulation of mitotic cell cycle due to RB1 defects | LUAD | 2.161 |
| Aberrant regulation of mitotic G1/S transition in cancer due to RB1 defects | LUAD | 3.153 |
| E3 ubiquitin ligases ubiquitinate target proteins | LUAD | 3.100 |
| Regulation of RUNX3 expression and activity | LUAD | 2.372 |
| Signaling by ERBB4 | SKCM | 3.628 |
| Aberrant regulation of mitotic G1/S transition in cancer due to RB1 defects | KIRC | 3.523 |
| Signaling by FGFR | KIRC | 3.525 |
| Stabilization of p53 | KIRC | 1.970 |
| Aberrant regulation of mitotic G1/S transition in cancer due to RB1 defects | LGG | 4.955 |
| Regulation of TP53 Expression and Degradation | LGG | 3.927 |
IG scorez indicates the importance of the pathway in predicting clinical risk classification; the larger the value, the higher the importance.
Fig. 4 Kaplan–Meier curves for two groups dichotomized by a median split in IG scores. The shaded area represents the 95% confidence interval
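The analysis behind a figure of this kind can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code. It assumes each patient has one IG score for the pathway of interest, a follow-up time, and an event indicator (1 = death observed, 0 = censored). The names `median_split` and `kaplan_meier` and all data values are hypothetical, and the 95% confidence band is omitted.

```python
from statistics import median


def median_split(scores):
    """True for patients whose score is above the cohort median."""
    m = median(scores)
    return [s > m for s in scores]


def kaplan_meier(times, events):
    """Product-limit estimate: list of (time, survival) at each
    distinct time where at least one event was observed."""
    data = sorted(zip(times, events))
    at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Collect everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            surv *= 1 - deaths / at_risk
            curve.append((t, surv))
        at_risk -= leaving
    return curve


# Hypothetical cohort: IG score, follow-up time, event indicator.
ig = [0.1, 0.4, 0.35, 0.8, 0.05, 0.9, 0.6, 0.2]
time = [5, 12, 7, 20, 3, 25, 18, 6]
event = [1, 0, 1, 1, 1, 0, 1, 1]

high = median_split(ig)
for label, flag in (("high IG", True), ("low IG", False)):
    t = [x for x, h in zip(time, high) if h == flag]
    e = [x for x, h in zip(event, high) if h == flag]
    print(label, kaplan_meier(t, e))
```

Patients whose score equals the median fall in the lower group here; that tie-breaking rule is an arbitrary choice of the sketch.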