Yong Cai, Qiongya Wu, Yun Chen, Yu Liu, Jiying Wang.
Abstract
Lung cancer is the leading cause of cancer death globally, killing 1.8 million people yearly. Over 85% of lung cancer cases are non-small cell lung cancer (NSCLC). Familial clustering of lung cancer indicates that some genes are linked to the disease. Genes associated with NSCLC have been identified by next-generation sequencing (NGS) and genome-wide association studies (GWAS). Many studies, however, neglected the complex information carried by interactions between gene pairs. In addition to its high cost, GWAS analysis is prone to false-positive results. To address these problems, computational techniques are used to offer researchers alternative and complementary low-cost disease-gene association findings. To help find NSCLC-related genes, we propose a new network-based machine learning method, named deepRW, to predict genes linked to NSCLC. We first constructed a gene interaction network consisting of genes that are related and unrelated to NSCLC, and used deep walk and a graph convolutional network (GCN) to learn gene-disease interactions. Finally, a deep neural network (DNN) was used as the prediction module to decide which genes are related to NSCLC. To evaluate the performance of deepRW, we ran tests with 10-fold cross-validation. The experimental results showed that our method outperformed the existing methods (AUROC 0.763 and AUPR 0.795, compared with 0.701 and 0.748 for KBMF, the strongest competing method). In addition, the effectiveness of each module in deepRW was demonstrated in comparative experiments.
Keywords: computational techniques; deep neural network; deep walk; graph convolutional network; lung cancer
Year: 2022 PMID: 36203453 PMCID: PMC9530852 DOI: 10.3389/fonc.2022.981154
Source DB: PubMed Journal: Front Oncol ISSN: 2234-943X Impact factor: 5.738
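The abstract describes learning from a gene interaction network with deep walk. DeepWalk-style methods usually begin by sampling truncated random walks from the graph, which are then fed to a skip-gram model to obtain node embeddings. The sketch below covers only the walk-sampling step, on a hypothetical four-gene network (`G1` to `G4`). The function name, walk length, and number of walks per node are illustrative assumptions, not values reported by the authors.

```python
import random


def random_walks(adjacency, walks_per_node=2, walk_length=5, seed=0):
    """Sample truncated random walks, one list of node ids per walk."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    nodes = sorted(adjacency)
    walks = []
    for _ in range(walks_per_node):
        rng.shuffle(nodes)  # visit start nodes in a fresh order on each pass
        for start in nodes:
            walk = [start]
            while len(walk) < walk_length:
                neighbors = adjacency[walk[-1]]
                if not neighbors:  # dead end: stop this walk early
                    break
                walk.append(rng.choice(neighbors))
            walks.append(walk)
    return walks


# Hypothetical toy network (assumption): ids and edges are placeholders,
# not interactions taken from the paper.
toy_network = {
    "G1": ["G2", "G3"],
    "G2": ["G1", "G3"],
    "G3": ["G1", "G2", "G4"],
    "G4": ["G3"],
}

walks = random_walks(toy_network)
for walk in walks[:3]:
    print(" -> ".join(walk))
```

Each walk plays the role of a sentence and each gene the role of a word, which is why a skip-gram model can be trained on the result.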
Figure 1. The structure of deepRW. GCN, graph convolutional network; DNN, deep neural network.
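To make the GCN component more concrete, here is a minimal sketch of one graph-convolution step using a commonly used propagation rule, ReLU(D^{-1/2} (A + I) D^{-1/2} H W). The source does not state which GCN variant deepRW uses, so this rule, the toy graph, the feature matrix, and the weights are all assumptions chosen for illustration.

```python
import math


def normalize_adjacency(adj):
    """Return D^{-1/2} (A + I) D^{-1/2} for a dense 0/1 adjacency matrix."""
    n = len(adj)
    # Add self-loops so each node keeps part of its own features.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]


def matmul(x, y):
    """Plain dense matrix product for lists of lists."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)]
            for row in x]


def gcn_layer(norm_adj, features, weights):
    """One propagation step: ReLU(norm_adj @ features @ weights)."""
    hidden = matmul(matmul(norm_adj, features), weights)
    return [[max(0.0, v) for v in row] for row in hidden]


# Hypothetical 3-node path graph 0 - 1 - 2 with 2-dimensional node features.
adj = [[0, 1, 0],
       [1, 0, 1],
       [0, 1, 0]]
features = [[1.0, 0.0],
            [0.0, 1.0],
            [1.0, 1.0]]
weights = [[1.0, -1.0],
           [0.5, 0.5]]  # arbitrary values; a real model would learn these

norm_adj = normalize_adjacency(adj)
embeddings = gcn_layer(norm_adj, features, weights)
print(embeddings)
```

Stacking several such layers lets each gene aggregate information from neighbors further away in the network.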
Figure 2. The structure of the DNN module.
The effectiveness of deep walk and GCN in deepRW.
| Number of layers | AUROC | AUPR |
|---|---|---|
| Two layers | 0.702 | 0.723 |
| Three layers | 0.763 | 0.795 |
| Four layers | 0.741 | 0.769 |
AUROC, area under the ROC curve; AUPR, area under the precision-recall curve.
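For readers unfamiliar with the two metrics, the following sketch computes them from predicted scores. AUROC is computed as the probability that a randomly chosen positive outranks a randomly chosen negative (ties count as one half). AUPR is approximated by average precision, which is one common estimator and not necessarily the one used in the paper. The labels and scores are made-up toy values, and both classes are assumed to be present.

```python
def auroc(labels, scores):
    """Fraction of positive/negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Mean of the precision values at the rank of each positive example."""
    # Sort by descending score; tied scores keep their input order.
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    hits, total = 0, 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            total += hits / rank
    return total / hits


# Toy example (assumption): 1 = disease-related gene, 0 = unrelated gene.
labels = [1, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.1]

print(f"AUROC = {auroc(labels, scores):.3f}")              # AUROC = 0.750
print(f"AUPR  = {average_precision(labels, scores):.3f}")  # AUPR  = 0.833
```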
The AUROC and AUPR scores of different methods.
| Method | AUROC | AUPR |
|---|---|---|
| DeepRW | 0.763 | 0.795 |
| KBMF | 0.701 | 0.748 |
| RF | 0.647 | 0.697 |
| RWR | 0.636 | 0.659 |
DeepRW, Deep random walk; KBMF, Kernelized Bayesian matrix factorization; RF, Random forest; RWR, Random walk with restart.
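Of the baselines above, random walk with restart (RWR) is the simplest to illustrate. A minimal sketch, assuming an unweighted network in which every node has at least one neighbor: scores are iterated as p_next = (1 - r) * W * p + r * p0 until they stop changing, where p0 places all mass on known disease genes. The network, seed set, and restart probability below are hypothetical and do not reproduce the configuration evaluated in the paper.

```python
def random_walk_with_restart(adjacency, seeds, restart=0.3,
                             tol=1e-10, max_iter=1000):
    """Return steady-state visiting probabilities of a walker restarting at seeds."""
    nodes = sorted(adjacency)
    p0 = {v: (1.0 / len(seeds) if v in seeds else 0.0) for v in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        # Restart term: jump back to the seed genes with probability `restart`.
        nxt = {v: restart * p0[v] for v in nodes}
        # Walk term: each node spreads its remaining mass evenly over neighbors.
        for u in nodes:
            share = (1.0 - restart) * p[u] / len(adjacency[u])
            for v in adjacency[u]:
                nxt[v] += share
        delta = sum(abs(nxt[v] - p[v]) for v in nodes)
        p = nxt
        if delta < tol:
            break
    return p


# Hypothetical path network G1 - G2 - G3 - G4 with G1 as the only seed gene.
toy_network = {
    "G1": ["G2"],
    "G2": ["G1", "G3"],
    "G3": ["G2", "G4"],
    "G4": ["G3"],
}

rwr_scores = random_walk_with_restart(toy_network, seeds={"G1"})
for gene in sorted(rwr_scores, key=rwr_scores.get, reverse=True):
    print(gene, round(rwr_scores[gene], 4))
```

Genes are then ranked by score, so candidates close to the seed genes in the network come out on top.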