Long-Yi Guo1, Ai-Hua Wu2, Yong-Xia Wang2, Li-Ping Zhang1, Hua Chai3, Xue-Fang Liang2.
Abstract
BACKGROUND: Identifying molecular subtypes of ovarian cancer is important. Compared with identifying subtypes from single-omics data, multi-omics data analysis can utilize more information. Autoencoders have been widely used to construct lower-dimensional representations for multi-omics feature integration. However, it is difficult to train the deep architectures in an autoencoder so that they generalize satisfactorily. To solve this problem, we proposed a novel deep learning-based framework that uses a denoising autoencoder to robustly identify ovarian cancer subtypes.
Keywords: Deep learning; Multi-omics; Ovarian cancer
Year: 2020 PMID: 32863885 PMCID: PMC7447574 DOI: 10.1186/s13040-020-00222-x
Source DB: PubMed Journal: BioData Min ISSN: 1756-0381 Impact factor: 2.522
Fig. 1 The architecture of the proposed deep learning framework for identifying ovarian cancer subtypes
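The denoising autoencoder named in the abstract can be illustrated with a minimal NumPy sketch: the input is corrupted, the network is trained to reconstruct the clean input, and the hidden layer is then used as the low-dimensional representation. The toy data, the single tanh hidden layer of size 2, the masking noise rate, and the learning rate are all illustrative assumptions, not the authors' architecture or settings.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy data: 60 "patients" x 8 "features" forming two groups.
# Real multi-omics matrices have tens of thousands of columns.
centers = np.where(np.arange(60)[:, None] < 30, 1.0, -1.0)
X = centers + 0.3 * rng.standard_normal((60, 8))

n_features, n_hidden = X.shape[1], 2  # assumed sizes, for illustration only
W_enc = 0.1 * rng.standard_normal((n_features, n_hidden))
b_enc = np.zeros(n_hidden)
W_dec = 0.1 * rng.standard_normal((n_hidden, n_features))
b_dec = np.zeros(n_features)


def encode(x):
    """Map inputs to the low-dimensional hidden representation."""
    return np.tanh(x @ W_enc + b_enc)


def reconstruct(x):
    """Encode, then decode back to the input space with a linear layer."""
    return encode(x) @ W_dec + b_dec


def clean_loss():
    """Mean squared reconstruction error on the uncorrupted data."""
    return float(np.mean((reconstruct(X) - X) ** 2))


initial_loss = clean_loss()
learning_rate, drop_prob = 0.05, 0.2  # assumed hyperparameters
for _ in range(2000):
    # Corrupt the input with masking noise: zero out a random 20% of entries.
    X_noisy = X * (rng.random(X.shape) >= drop_prob)
    H = np.tanh(X_noisy @ W_enc + b_enc)
    Y = H @ W_dec + b_dec
    # Backpropagate the mean squared error measured against the CLEAN input.
    dY = 2.0 * (Y - X) / X.size
    dH = (dY @ W_dec.T) * (1.0 - H ** 2)
    W_dec -= learning_rate * (H.T @ dY)
    b_dec -= learning_rate * dY.sum(axis=0)
    W_enc -= learning_rate * (X_noisy.T @ dH)
    b_enc -= learning_rate * dH.sum(axis=0)

final_loss = clean_loss()
Z = encode(X)  # low-dimensional features that a clustering step could consume
print(f"reconstruction MSE: {initial_loss:.3f} -> {final_loss:.3f}, Z shape {Z.shape}")
```

In a pipeline such as the DAE-kmeans entry in the tables on this page, a matrix like `Z` would presumably be passed to k-means in place of the raw features.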
The clustering performances obtained by different methods in ovarian cancer
| Method | Silhouette score | DBI |
|---|---|---|
| | 0.165 | 1.859 |
| Hierarchical clustering | 0.310 | 1.594 |
| PCA-kmeans | 0.378 | 1.502 |
| KPCA-kmeans | 0.475 | 0.702 |
| SparseK | 0.513 | 0.681 |
| iCluster | 0.528 | 0.657 |
| AE-kmeans | 0.549 | 0.621 |
| DAE-kmeans | 0.583 | 0.562 |
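The two metrics in this table point in opposite directions: a higher silhouette score and a lower Davies-Bouldin index (DBI) both indicate more compact, better separated clusters. The sketch below implements the standard definitions of both in plain Python with Euclidean distance. The eight 2-D points and both labelings are made-up toy data, not values from the study.

```python
import math


def dist(p, q):
    """Euclidean distance between two points."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def group_by_label(points, labels):
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    return clusters


def silhouette_score(points, labels):
    """Mean of (b - a) / max(a, b) over all samples; higher is better."""
    clusters = group_by_label(points, labels)
    total = 0.0
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            continue  # a singleton cluster contributes 0 by convention
        # a: mean distance to the other members of the same cluster
        a = sum(dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(sum(dist(p, q) for q in other) / len(other)
                for other_lab, other in clusters.items() if other_lab != lab)
        total += (b - a) / max(a, b)
    return total / len(points)


def davies_bouldin_index(points, labels):
    """Mean over clusters of the worst (S_i + S_j) / M_ij; lower is better."""
    clusters = group_by_label(points, labels)
    centroids, scatter = {}, {}
    for lab, members in clusters.items():
        c = [sum(col) / len(members) for col in zip(*members)]
        centroids[lab] = c
        # S_i: mean distance of the cluster's members to its centroid
        scatter[lab] = sum(dist(p, c) for p in members) / len(members)
    worst = [max((scatter[i] + scatter[j]) / dist(centroids[i], centroids[j])
                 for j in clusters if j != i)
             for i in clusters]
    return sum(worst) / len(worst)


# Toy data: two tight squares far apart, labeled correctly and then badly.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (11, 11)]
well_separated = [0, 0, 0, 0, 1, 1, 1, 1]
interleaved = [0, 1, 0, 1, 0, 1, 0, 1]
for name, labels in (("well separated", well_separated), ("interleaved", interleaved)):
    print(name,
          round(silhouette_score(points, labels), 3),
          round(davies_bouldin_index(points, labels), 3))
```

The correct labeling yields a silhouette score near 1 and a small DBI, while the interleaved labeling yields a negative silhouette score and a large DBI, which is the same ordering logic used to rank the methods in the table.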
The clustering performance comparison using different types of omics data
| Omics data | Features | Silhouette score | DBI |
|---|---|---|---|
| mRNA | 20,502 | 0.550 | 0.607 |
| miRNA | 1,870 | 0.536 | 0.644 |
| CNV | 23,606 | 0.509 | 0.713 |
| Multi-omics | 45,978 | 0.583 | 0.562 |
Fig. 2 Survival curves drawn for the identified subtypes in four ovarian cancer datasets. Green lines represent the patient group with the higher survival probability; red lines represent the group with the lower survival probability
The performance of subtype identification for the four ovarian cancer datasets
| Dataset | Censored | Uncensored | Low risk | High risk | P-value |
|---|---|---|---|---|---|
| OV | 122 | 176 | 111 | 187 | 0.013 |
| GSE26712 | 56 | 129 | 82 | 103 | 3.2E-3 |
| GSE32062 | 139 | 121 | 152 | 108 | 0.047 |
| GSE53963 | 21 | 153 | 94 | 80 | 0.033 |
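The table gives one p-value per dataset, but this text does not say which test produced it. Assuming a two-group log-rank test, the usual companion to survival curves for comparing a low-risk and a high-risk group, the computation can be sketched in plain Python as follows. The follow-up times and event flags are invented toy values, and `logrank_test` is a hypothetical helper written for this illustration.

```python
import math


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test.

    events are 1 when the event (death) was observed and 0 when the
    patient was censored. Returns (chi-square statistic, p-value).
    """
    samples = [(t, e, 0) for t, e in zip(times_a, events_a)]
    samples += [(t, e, 1) for t, e in zip(times_b, events_b)]
    event_times = sorted({t for t, e, _ in samples if e == 1})

    observed_minus_expected = 0.0
    variance = 0.0
    for t in event_times:
        at_risk = [s for s in samples if s[0] >= t]
        n = len(at_risk)                                  # total at risk
        n_a = sum(1 for s in at_risk if s[2] == 0)        # at risk in group A
        d = sum(1 for s in at_risk if s[0] == t and s[1] == 1)
        d_a = sum(1 for s in at_risk
                  if s[0] == t and s[1] == 1 and s[2] == 0)
        # Under the null hypothesis, group A expects d * n_a / n events at t.
        observed_minus_expected += d_a - d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (1 - n_a / n) * (n - d) / (n - 1)

    if variance == 0:
        return 0.0, 1.0  # no informative event times
    chi2 = observed_minus_expected ** 2 / variance
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Toy follow-up data (months): the low-risk group survives much longer.
low_times, low_events = [20, 25, 30, 35, 40, 45], [1, 0, 1, 0, 1, 0]
high_times, high_events = [2, 4, 6, 8, 10, 12], [1, 1, 1, 1, 1, 1]

chi2, p_value = logrank_test(low_times, low_events, high_times, high_events)
print(f"chi2 = {chi2:.2f}, p = {p_value:.4f}")
```

A p-value below 0.05, as in every row of the table, would then be read as evidence that the two identified groups have different survival distributions.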
Fig. 3 Results obtained by WGCNA on the ovarian cancer dataset from TCGA: (a) the gene dendrogram and identified modules in the OV data; (b) the correlation between the clustered modules and molecular subtypes; (c) the average gene significance (GS) in each module in the OV data
The distribution of selected genes and pathways in different modules
| Module | Number of candidate genes | Number of enriched KEGG pathways |
|---|---|---|
| Blue | 22 | 15 |
| Green | 0 | 0 |
| Brown | 5 | 0 |
| Turquoise | 1 | 0 |
| Yellow | 6 | 4 |
Fig. 4 KEGG pathway enrichment analysis for the 34 identified genes. The x-axis shows the p-value of each term and the y-axis shows the KEGG pathway terms. *(Y) marks pathways enriched in the yellow functional module