| Literature DB >> 35028863 |
Xiongshi Deng1,2, Min Li3,4, Shaobo Deng1,2, Lei Wang1,2.
Abstract
Microarray gene expression data are often accompanied by a large number of genes and a small number of samples. However, only a few of these genes are relevant to cancer, resulting in significant gene selection challenges. Hence, we propose a two-stage gene selection approach by combining extreme gradient boosting (XGBoost) and a multi-objective optimization genetic algorithm (XGBoost-MOGA) for cancer classification in microarray datasets. In the first stage, the genes are ranked using an ensemble-based feature selection using XGBoost. This stage can effectively remove irrelevant genes and yield a group comprising the most relevant genes related to the class. In the second stage, XGBoost-MOGA searches for an optimal gene subset based on the most relevant genes' group using a multi-objective optimization genetic algorithm. We performed comprehensive experiments to compare XGBoost-MOGA with other state-of-the-art feature selection methods using two well-known learning classifiers on 14 publicly available microarray expression datasets. The experimental results show that XGBoost-MOGA yields significantly better results than previous state-of-the-art algorithms in terms of various evaluation criteria, such as accuracy, F-score, precision, and recall.Entities:
Keywords: Classification; Feature selection; Microarray gene expression; Multi-objective genetic algorithm; XGBoost
Mesh:
Year: 2022 PMID: 35028863 DOI: 10.1007/s11517-021-02476-x
Source DB: PubMed Journal: Med Biol Eng Comput ISSN: 0140-0118 Impact factor: 2.602