Bing Niu (1), Chaofeng Liang (2), Yi Lu (3), Manman Zhao (3), Qin Chen (4), Yuhui Zhang (5), Linfeng Zheng (6), Kuo-Chen Chou (7).
1. School of Life Sciences, Shanghai University, Shanghai 200444, China; Gordon Life Science Institute, Boston, MA 02478, USA. Electronic address: bniu@gordonlifescience.org.
2. Department of Neurosurgery, The Third Affiliated Hospital of Sun Yat-sen University, Guangzhou, China.
3. School of Life Sciences, Shanghai University, Shanghai 200444, China.
4. School of Life Sciences, Shanghai University, Shanghai 200444, China. Electronic address: chenqincc@edu.cn.
5. Renji Hospital, Medical School, Shanghai Jiaotong University, 160 Pujian Rd, New Pudong District, Shanghai 200127, China; Changhai Hospital, Second Military Medical University, Shanghai 200433, China. Electronic address: gong_chang2008@126.com.
6. Department of Radiology, Shanghai General Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai 200080, China; Department of Radiology, Shanghai First People's Hospital, Baoshan Branch, Shanghai 200940, China. Electronic address: zhenglinfeng04@aliyun.com.
7. Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China; Gordon Life Science Institute, Boston, MA 02478, USA. Electronic address: kcchou@gordonlifescience.org.
Abstract
BACKGROUND: Glioma is the most lethal cancer of the nervous system. Despite considerable recent effort to study its occurrence and development, the underlying molecular mechanisms remain unclear. This study was designed to investigate those mechanisms by combining protein-protein interaction (PPI) networks with machine learning methods. Key differentially expressed genes (DEGs) were screened and selected using the PPI networks. RESULTS: In total, 19 genes were selected between grade I and grade II, 21 genes between grade II and grade III, and 20 genes between grade III and grade IV. Five machine learning methods were then used to predict glioma grade from the selected key genes. After comparison, the Complement Naive Bayes classifier was used to build the prediction model for grades II-III, with an accuracy of 72.8%, and random forest was used to build the prediction models for grades I-II and III-IV, with accuracies of 97.1% and 83.2%, respectively. Finally, the selected genes were analyzed using PPI networks, Gene Ontology (GO) terms, and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways, and the results improve our understanding of the biological functions of the selected DEGs involved in glioma growth. We expect the key genes to help guide research into the occurrence of gliomas or, at the very least, to be useful to tumor researchers. CONCLUSION: Machine learning combined with PPI network, GO, and KEGG analyses of the selected DEGs improves our understanding of the biological functions involved in glioma growth.
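The abstract does not say which network criterion was used to pick key DEGs from the PPI networks. As a minimal sketch only, assuming (purely for illustration) that candidates are ranked by their number of interaction partners within the DEG-induced subnetwork, the selection step might look like the following. The function name `rank_hub_genes`, the gene labels, and the edge list are hypothetical placeholders, not data or methods from the study.

```python
from collections import defaultdict


def rank_hub_genes(edges, degs, top_n):
    """Rank DEGs by degree in the PPI subnetwork induced by the DEGs.

    Assumption: degree (number of distinct interaction partners) is the
    ranking criterion; the abstract does not state the actual criterion.
    """
    deg_set = set(degs)
    neighbors = defaultdict(set)
    for a, b in edges:
        # Keep only interactions where both partners are DEGs; skip self-loops.
        if a in deg_set and b in deg_set and a != b:
            neighbors[a].add(b)
            neighbors[b].add(a)
    # Sort by degree (descending), then by name so ties are deterministic.
    ranked = sorted(deg_set, key=lambda g: (-len(neighbors[g]), g))
    return [(g, len(neighbors[g])) for g in ranked[:top_n]]


# Hypothetical toy network: G1..G5 are placeholder DEGs, X9 is not a DEG.
toy_edges = [
    ("G1", "G2"), ("G1", "G3"), ("G1", "G4"),
    ("G2", "G3"), ("G4", "G5"), ("G5", "X9"),
]
toy_degs = ["G1", "G2", "G3", "G4", "G5"]

hubs = rank_hub_genes(toy_edges, toy_degs, top_n=3)
print(hubs)
```

In this toy network the edge to `X9` is ignored because `X9` is not among the DEGs. In practice, the ranked genes for each pair of adjacent grades would then serve as the input features for the classifiers compared in the abstract.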