Improving feature selection performance for classification of gene expression data using Harris Hawks optimizer with variable neighborhood learning.

Chiwen Qu, Lupeng Zhang, Jinlong Li, Fang Deng, Yifan Tang, Xiaomin Zeng, Xiaoning Peng.

Abstract

Gene expression profiling has played a significant role in the identification and classification of tumors at the molecular level. In gene expression data, only a few feature genes are closely related to tumors. Selecting highly discriminative feature genes is a challenging task, and existing methods fail to deal with this problem efficiently. This article proposes a novel metaheuristic approach for gene feature selection, called the variable neighborhood learning Harris Hawks optimizer (VNLHHO). First, the F-score is used for a primary selection of the genes in gene expression data to narrow down the selection range of the feature genes. Subsequently, a variable neighborhood learning strategy is constructed to balance the global exploration and local exploitation of the Harris Hawks optimization. Finally, mutation operations are employed to increase the diversity of the population and so prevent the algorithm from falling into a local optimum. In addition, a novel activation function is used to convert the continuous solution of the VNLHHO into binary values, and a naive Bayesian classifier is utilized as a fitness function to select feature genes that can help classify biological tissues of binary and multi-class cancers. An experiment is conducted on gene expression profile data of eight types of tumors. The results show that the classification accuracy of the VNLHHO is greater than 96.128% for tumors in the colon, nervous system and lungs and 100% for the rest. We compare the VNLHHO with seven other algorithms and demonstrate its superiority in terms of classification accuracy, fitness value and AUC value in feature selection for gene expression data.
© The Author(s) 2021. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
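
The first and last stages named in the abstract (F-score prefiltering, then converting a continuous solution into a binary gene mask) can be illustrated with a minimal sketch. This is not the authors' implementation. The multi-class F-score form used here is one common generalization and is an assumption, the sigmoid transfer function is a stand-in for the paper's unspecified "novel activation function", and the names f_score, prefilter and binarize are hypothetical. The variable neighborhood learning, mutation and naive Bayes fitness steps are omitted.

```python
import math
import random
from statistics import mean, variance


def f_score(values, labels):
    """F-score of one gene: spread of class means over summed within-class variance.

    Assumed form (a common generalization, not taken from the paper).
    Each class needs at least two samples for the sample variance.
    """
    overall = mean(values)
    between = 0.0
    within = 0.0
    for c in set(labels):
        group = [v for v, y in zip(values, labels) if y == c]
        between += (mean(group) - overall) ** 2
        within += variance(group)
    if within == 0:
        return 0.0 if between == 0 else float("inf")
    return between / within


def prefilter(X, labels, k):
    """Return indices of the k highest-scoring genes (X is samples x genes)."""
    n_genes = len(X[0])
    scores = [f_score([row[j] for row in X], labels) for j in range(n_genes)]
    return sorted(range(n_genes), key=lambda j: scores[j], reverse=True)[:k]


def binarize(position, rng):
    """Map a continuous position vector to a 0/1 gene mask.

    Uses a sigmoid transfer function as a placeholder assumption: gene j is
    selected with probability sigmoid(position[j]).
    """
    mask = []
    for x in position:
        x = max(-50.0, min(50.0, x))  # clamp to avoid overflow in exp
        prob = 1.0 / (1.0 + math.exp(-x))
        mask.append(1 if rng.random() < prob else 0)
    return mask


# Toy data: 4 samples, 3 genes, two classes. Gene 0 separates the classes well.
X = [
    [1.0, 5.0, 0.2],
    [1.2, 5.1, 0.9],
    [3.0, 5.0, 0.1],
    [3.1, 4.9, 0.8],
]
labels = [0, 0, 1, 1]

kept = prefilter(X, labels, 2)  # -> [0, 1]
mask = binarize([-50.0, 50.0, 0.0], random.Random(0))
```

In a full wrapper method, each candidate mask would be scored by training a classifier (a naive Bayesian classifier in the paper) on the selected genes only, and that score would drive the optimizer's search.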

Keywords:  Harris Hawks algorithm; classification algorithm; gene feature selection; variable neighborhood learning

Year: 2021; PMID: 33876181; DOI: 10.1093/bib/bbab097

Source DB: PubMed; Journal: Brief Bioinform; ISSN: 1467-5463; Impact factor: 11.622


  1 in total

1.  Opposition-based sine cosine optimizer utilizing refraction learning and variable neighborhood search for feature selection.

Authors:  Bilal H Abed-Alguni; Noor Aldeen Alawad; Mohammed Azmi Al-Betar; David Paul
Journal:  Appl Intell (Dordr)       Date:  2022-10-08       Impact factor: 5.019
