Ge Zhang, Zhen Peng, Chaokun Yan, Jianlin Wang, Junwei Luo, Huimin Luo.
Abstract
Liver cancer is a leading malignancy in terms of mortality rate, and accurate diagnosis can improve its treatment outcome. A patient similarity network carries important information for cancer diagnosis, yet recent work rarely takes patient similarity into account. To address this issue, we constructed a patient similarity network from three types of liver cancer omics data and proposed a novel liver cancer diagnosis method, consisting of similarity network fusion, a denoising autoencoder, and a dense graph convolutional neural network, that capitalizes on both the patient similarity network and the multi-omics data. We compared the proposed method with other state-of-the-art methods and conventional machine learning methods on the TCGA-LIHC dataset to evaluate its performance. The results confirmed that the proposed method surpasses the comparison methods on all metrics; in particular, it attained an accuracy of up to 0.9857.
Year: 2022 PMID: 35474072 PMCID: PMC9043215 DOI: 10.1038/s41598-022-10441-3
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.996
Figure 1. The overall workflow of pDenseGCN. (A) Similarity network constructed by SNF. (B) Features extracted by DAE network. (C) DenseGCN for cancer diagnosis.
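As a rough illustration of step (A) in the workflow, the sketch below builds one Gaussian-kernel similarity matrix per omics view and then averages them. This is a simplified stand-in and not SNF itself: SNF fuses the networks iteratively, whereas this sketch takes a plain element-wise mean. The function names, the `sigma` bandwidth, and the toy values are all assumptions made for illustration.

```python
import math

def affinity(features, sigma=1.0):
    """Gaussian-kernel similarity between every pair of patients.

    `features` is a list of per-patient feature vectors for one omics view.
    `sigma` is a hypothetical bandwidth; the source does not specify one.
    """
    n = len(features)
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            d2 = sum((a - b) ** 2 for a, b in zip(features[i], features[j]))
            w[i][j] = math.exp(-d2 / (2 * sigma ** 2))
    return w

def average_networks(networks):
    """Element-wise mean of same-sized similarity matrices.

    A crude substitute for SNF's iterative cross-network diffusion.
    """
    n = len(networks[0])
    k = len(networks)
    return [[sum(net[i][j] for net in networks) / k for j in range(n)]
            for i in range(n)]

# Toy data (made up): 3 patients, two omics views.
rna = [[0.0, 1.0], [0.1, 0.9], [5.0, 5.0]]
meth = [[1.0], [1.2], [9.0]]

fused = average_networks([affinity(rna), affinity(meth)])
```

In this toy example patients 0 and 1 are close in both views, so their fused similarity is high, while patient 2 is far from both and ends up nearly disconnected.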
Figure 2. The structure of DenseGCN.
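The captions do not spell out the DenseGCN layer rule, so the following is only a minimal sketch under two assumptions: each layer applies the standard GCN propagation ReLU(Â H W) with a symmetrically normalised adjacency Â, and "dense" means DenseNet-style connectivity in which every layer receives the concatenation of the input and all earlier layer outputs. The helper names, weights, and toy graph are hypothetical.

```python
def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def normalize_adjacency(adj):
    """Symmetric normalisation D^-1/2 (A + I) D^-1/2 with added self-loops."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / (deg[i] * deg[j]) ** 0.5 for j in range(n)]
            for i in range(n)]

def gcn_layer(a_norm, h, weight):
    """One propagation step: ReLU(A_norm @ H @ W)."""
    z = matmul(matmul(a_norm, h), weight)
    return [[max(0.0, v) for v in row] for row in z]

def dense_gcn(adj, x, weights):
    """Stack GCN layers; each layer sees all earlier features (assumed design)."""
    a_norm = normalize_adjacency(adj)
    h = x
    for w in weights:
        out = gcn_layer(a_norm, h, w)
        # Dense connectivity: append the new features to each node's vector.
        h = [h_row + out_row for h_row, out_row in zip(h, out)]
    return h

# Toy graph (made up): 3 patients, 2 input features per patient.
adj = [[0.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 0.0]]
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
w1 = [[0.5, -0.5], [0.25, 0.75]]                          # 2 inputs -> 2 outputs
w2 = [[0.1, 0.2], [0.3, -0.1], [0.2, 0.2], [-0.4, 0.1]]  # 4 inputs -> 2 outputs

h = dense_gcn(adj, x, [w1, w2])
```

Because features are concatenated rather than replaced, the weight matrix of each layer must have as many rows as the accumulated feature width, which is why `w2` takes four inputs here.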
The details of three omics datasets.
| Omics type | Number of samples | Number of features |
|---|---|---|
| RNA-Seq | 424 | 20,530 |
| DNA methylation | 429 | 20,421 |
| CNV | 760 | 24,924 |
Parameter settings.
| Methods | Parameters |
|---|---|
| pDenseGCN | Lr(DAE)=0.01, epochs(DAE)=50, batch size(DAE)=8, Lr(DenseGCN)=0.01, epoch(DenseGCN)=500 |
| ASVM | m=4, n=8, q=5, numGlobal=30, numLocal=20 |
| Xgboost-AE | Lr(AE)=1.0, batch size(AE)=16, epoch(AE)=100 |
| MGRFE-GaRFE | global_bestsize=120, layer_bestsize=100, total_layer=2 |
| ET-SVM | C=0.004, kernel='linear', decision_function_shape='ovo', gama=1 |
| XOmiVAE | learning_rate=0.01, dropout=0.5, epoch=100 |
| LDA | solver='svd' |
| NB | var_smoothing=1e-09 |
| RF | n_estimators=10 |
| DT | splitter='best', min_samples_split=2, min_samples_leaf=1 |
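The parameter names listed for LDA, NB, RF, and DT coincide with scikit-learn estimator arguments, although the page does not name the library used. Assuming scikit-learn, and assuming NB refers to Gaussian naive Bayes (the variant that has a `var_smoothing` argument), the four classical baselines could be configured roughly as follows.

```python
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.ensemble import RandomForestClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

# Assumption: scikit-learn estimators; settings copied from the table above.
baselines = {
    "LDA": LinearDiscriminantAnalysis(solver="svd"),
    "NB": GaussianNB(var_smoothing=1e-09),
    "RF": RandomForestClassifier(n_estimators=10),
    "DT": DecisionTreeClassifier(
        splitter="best", min_samples_split=2, min_samples_leaf=1
    ),
}

for name, model in baselines.items():
    print(name, type(model).__name__)
```

Each estimator exposes the usual `fit(X, y)` and `predict(X)` methods, so the same training loop can be reused across all four.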
Results of comparison methods and proposed method.
| Method | Precision | Recall | F1-Score | Accuracy | AUC |
|---|---|---|---|---|---|
| pDenseGCN | | | | | |
| ASVM | 0.937 | 0.9744 | 0.9553 | 0.9208 | 0.8531 |
| XGBoost-AD | 0.9736 | 0.9729 | 0.9732 | 0.9726 | 0.9759 |
| MGRFE-GaRFE | 0.9689 | 0.9397 | 0.9183 | 0.954 | 0.8306 |
| ET-SVM | 0.96 | 0.6316 | 0.7619 | 0.7945 | 0.8015 |
| XOmiVAE | 0.946 | 0.8974 | 0.9211 | 0.8537 | 0.8718 |
| LDA | 0.7262 | 0.8133 | 0.7673 | 0.7466 | 0.7447 |
| RF | 0.9605 | 0.9125 | 0.9359 | 0.937 | 0.9848 |
| NB | 0.8977 | 0.7914 | 0.8412 | 0.8452 | 0.8492 |
| DT | 0.9254 | 0.8267 | 0.8732 | 0.8767 | 0.8781 |
Significant values are in bold.
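The F1-score is the harmonic mean of precision and recall, F1 = 2PR / (P + R). As a small sanity check, the sketch below recomputes F1 for three rows of the comparison table (ASVM, ET-SVM, LDA) from their reported precision and recall. The function and variable names are illustrative only.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# (precision, recall, reported F1) copied from the table above.
reported = {
    "ASVM": (0.937, 0.9744, 0.9553),
    "ET-SVM": (0.96, 0.6316, 0.7619),
    "LDA": (0.7262, 0.8133, 0.7673),
}

# Absolute gap between recomputed and reported F1 for each method.
gaps = {
    name: abs(f1_score(p, r) - f1)
    for name, (p, r, f1) in reported.items()
}

for name, gap in gaps.items():
    print(f"{name}: |recomputed - reported| = {gap:.4f}")
```

Running the same check over the remaining rows is a quick way to spot transcription slips in a results table.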
Figure 3. The influence of the patient similarity network.
Results of different omics data.
| Omics data | Precision | Recall | F1-score | Accuracy | AUC |
|---|---|---|---|---|---|
| RNA-Seq | 0.8197 | 0.6757 | 0.7407 | 0.75 | 0.7545 |
| DNA methylation | 0.931 | 0.7297 | 0.8182 | 0.8286 | 0.8346 |
| CNV | 0.9574 | 0.6081 | 0.7438 | 0.7785 | 0.7889 |
| RNA-Seq+DNAMethy | 0.9855 | 0.9189 | 0.951 | 0.95 | 0.9519 |
| RNASeq+CNV | 0.8375 | 0.9853 | 0.9054 | 0.9 | 0.9024 |
| DNAMethy+CNV | 0.9589 | 0.9459 | 0.9524 | 0.95 | 0.9502 |
| Multi-omics | | | | | |
Significant values are in bold.
Results of different DenseGCN layer numbers.
| DenseGCN layers | Precision | Recall | F1-Score | Accuracy | AUC |
|---|---|---|---|---|---|
| 3-layers | 1 | 0.6892 | 0.816 | 0.8357 | 0.8446 |
| 4-layers | 0.8024 | 0.8784 | 0.8387 | 0.8214 | 0.8179 |
| 5-layers | 1 | 0.9324 | 0.965 | 0.9643 | 0.9662 |
| 6-layers | 0.9125 | 0.9865 | 0.9481 | 0.9429 | 0.9402 |
| 7-layers | 1 | 0.7297 | 0.8437 | 0.8571 | 0.8649 |
| 8-layers | 0.9667 | 0.7838 | 0.8657 | 0.8714 | 0.8767 |
| 9-layers | 1 | 0.9459 | 0.9722 | 0.9714 | 0.973 |
| 10-layers | 0.9865 | | | | |
| 15-layers | 0.925 | 1 | 0.961 | 0.9571 | 0.9545 |
Significant values are in bold.
Figure 4. Results of different number of features.