Classification of lung cancer using ensemble-based feature selection and machine learning methods.

Zhihua Cai, Dong Xu, Qing Zhang, Jiexia Zhang, Sai-Ming Ngai, Jianlin Shao.

Abstract

Lung cancer is one of the leading causes of death worldwide. There are three major types of lung cancer: non-small cell lung cancer (NSCLC), small cell lung cancer (SCLC) and carcinoid. NSCLC is further classified into lung adenocarcinoma (LADC), squamous cell lung cancer (SQCLC) and large cell lung cancer. Many previous studies have demonstrated that DNA methylation markers are potential lung cancer-specific biomarkers. However, whether a single set of DNA methylation markers can simultaneously distinguish these three types of lung cancer remains unclear. In the present study, ROC (receiver operating characteristic) curve analysis, RFs (Random Forests) and mRMR (Maximum Relevance and Minimum Redundancy) were used to capture unbiased, informative and compact molecular signatures, which were then passed to machine learning methods to classify LADC, SQCLC and SCLC. As a result, a panel of 16 DNA methylation markers achieved an accuracy of 86.54% and 84.6% and a recall of 84.37% and 85.5% in leave-one-out cross-validation (LOOCV) and on an independent test data set, respectively. In addition, comparisons indicate that ensemble-based feature selection methods, when combined with the incremental feature selection (IFS) strategy, yield more informative and more compact feature sets than individual methods. Taken together, the results suggest that the ensemble-based feature selection approach is effective and that a common panel of DNA methylation markers may exist across these three types of lung cancer tissue, which would facilitate clinical diagnosis and treatment.
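
The pipeline described in the abstract (rank features with several selectors, combine the rankings, then grow the feature set incrementally while scoring each prefix by LOOCV) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not say how the rankings are combined or which classifier is used, so the mean-rank aggregation, the F-ratio ranker (a stand-in for the RF and mRMR rankers), the nearest-centroid classifier, all function names and the toy data are assumptions.

```python
from statistics import mean, pvariance

def auc(values, labels):
    """Direction-agnostic AUC of one feature for binary labels (1 = positive)."""
    pos = [v for v, t in zip(values, labels) if t == 1]
    neg = [v for v, t in zip(values, labels) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    a = wins / (len(pos) * len(neg))
    return max(a, 1 - a)

def rank_by_auc(X, y):
    """Rank feature indices by mean one-vs-rest AUC, best first."""
    classes = sorted(set(y))
    score = lambda j: mean(auc([r[j] for r in X], [int(c == k) for c in y])
                           for k in classes)
    return sorted(range(len(X[0])), key=lambda j: -score(j))

def rank_by_f_ratio(X, y):
    """Stand-in second ranker (assumption): between- over within-class variance."""
    classes = sorted(set(y))
    def score(j):
        groups = [[r[j] for r, c in zip(X, y) if c == k] for k in classes]
        between = pvariance([mean(g) for g in groups])
        within = mean(pvariance(g) for g in groups)
        return between / (within + 1e-12)  # epsilon avoids division by zero
    return sorted(range(len(X[0])), key=lambda j: -score(j))

def aggregate(rankings):
    """Combine rankings by mean position (assumed ensemble rule)."""
    n = len(rankings[0])
    return sorted(range(n), key=lambda j: mean(r.index(j) for r in rankings))

def predict(X_train, y_train, x):
    """Nearest-centroid prediction (assumed classifier)."""
    centroids = {}
    for k in sorted(set(y_train)):
        rows = [r for r, c in zip(X_train, y_train) if c == k]
        centroids[k] = [mean(col) for col in zip(*rows)]
    return min(centroids,
               key=lambda k: sum((a - b) ** 2 for a, b in zip(x, centroids[k])))

def loocv_accuracy(X, y, features):
    """Leave-one-out accuracy using only the given feature indices."""
    hits = 0
    for i in range(len(X)):
        X_tr = [[r[j] for j in features] for m, r in enumerate(X) if m != i]
        y_tr = [c for m, c in enumerate(y) if m != i]
        hits += predict(X_tr, y_tr, [X[i][j] for j in features]) == y[i]
    return hits / len(X)

def incremental_feature_selection(X, y, ranking):
    """Return the shortest ranking prefix with the best LOOCV accuracy."""
    best_k, best_acc = 0, -1.0
    for k in range(1, len(ranking) + 1):
        acc = loocv_accuracy(X, y, ranking[:k])
        if acc > best_acc:
            best_k, best_acc = k, acc
    return ranking[:best_k], best_acc

# Toy methylation-like values (made up): features 0 and 1 carry signal, 2 and 3 are noise.
X = [[0.10, 0.12, 0.50, 0.3], [0.15, 0.10, 0.40, 0.7], [0.12, 0.14, 0.60, 0.5],
     [0.80, 0.11, 0.45, 0.6], [0.85, 0.15, 0.55, 0.4], [0.82, 0.13, 0.50, 0.2],
     [0.81, 0.90, 0.42, 0.5], [0.84, 0.88, 0.58, 0.3], [0.79, 0.92, 0.52, 0.6]]
y = ["LADC"] * 3 + ["SQCLC"] * 3 + ["SCLC"] * 3

consensus = aggregate([rank_by_auc(X, y), rank_by_f_ratio(X, y)])
selected, best_acc = incremental_feature_selection(X, y, consensus)
print(selected, best_acc)
```

For brevity the sketch ranks features on all samples before running LOOCV, which makes the reported accuracy optimistic; a stricter version would redo the ranking inside each fold.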

Year:  2014        PMID: 25512221     DOI: 10.1039/c4mb00659c

Source DB:  PubMed          Journal:  Mol Biosyst        ISSN: 1742-2051


  30 in total

1.  A graph-embedded deep feedforward network for disease outcome classification and feature selection using gene expression data.

Authors:  Yunchuan Kong; Tianwei Yu
Journal:  Bioinformatics       Date:  2018-11-01       Impact factor: 6.937

2.  forgeNet: a graph deep neural network model using tree-based ensemble classifiers for feature graph construction.

Authors:  Yunchuan Kong; Tianwei Yu
Journal:  Bioinformatics       Date:  2020-06-01       Impact factor: 6.937

3.  Discriminating cirRNAs from other lncRNAs using a hierarchical extreme learning machine (H-ELM) algorithm with feature selection.

Authors:  Lei Chen; Yu-Hang Zhang; Guohua Huang; Xiaoyong Pan; ShaoPeng Wang; Tao Huang; Yu-Dong Cai
Journal:  Mol Genet Genomics       Date:  2017-09-14       Impact factor: 3.291

4.  Cancer adjuvant chemotherapy prediction model for non-small cell lung cancer.

Authors:  Russul Alanni; Jingyu Hou; Hasseeb Azzawi; Yong Xiang
Journal:  IET Syst Biol       Date:  2019-06       Impact factor: 1.615

Review 5.  Targeting sphingosine-1-phosphate signaling in lung diseases.

Authors:  David L Ebenezer; Panfeng Fu; Viswanathan Natarajan
Journal:  Pharmacol Ther       Date:  2016-09-13       Impact factor: 12.310

Review 6.  Machine Learning in Epigenomics: Insights into Cancer Biology and Medicine.

Authors:  Emre Arslan; Jonathan Schulz; Kunal Rai
Journal:  Biochim Biophys Acta Rev Cancer       Date:  2021-07-07       Impact factor: 10.680

7.  A predictive model for the diagnosis of non-alcoholic fatty liver disease based on an integrated machine learning method.

Authors:  Xuefeng Ma; Chao Yang; Kun Liang; Baokai Sun; Wenwen Jin; Lizhen Chen; Mengzhen Dong; Shousheng Liu; Yongning Xin; Likun Zhuang
Journal:  Am J Transl Res       Date:  2021-11-15       Impact factor: 4.060

8.  Machine learning-based random forest predicts anastomotic leakage after anterior resection for rectal cancer.

Authors:  Rongbo Wen; Kuo Zheng; Qihang Zhang; Leqi Zhou; Qizhi Liu; Guanyu Yu; Xianhua Gao; Liqiang Hao; Zheng Lou; Wei Zhang
Journal:  J Gastrointest Oncol       Date:  2021-06

9.  Lung adenocarcinoma and lung squamous cell carcinoma cancer classification, biomarker identification, and gene expression analysis using overlapping feature selection methods.

Authors:  Joe W Chen; Joseph Dhahbi
Journal:  Sci Rep       Date:  2021-06-25       Impact factor: 4.379

10.  Enhancing the weighted voting ensemble algorithm for tuberculosis predictive diagnosis.

Authors:  Victor Chukwudi Osamor; Adaugo Fiona Okezie
Journal:  Sci Rep       Date:  2021-07-20       Impact factor: 4.379
