Hossein Nematzadeh1, Rasul Enayatifar2, Maqsood Mahmud3, Ebrahim Akbari1.
Abstract
Feature selection is the problem of finding the subset of features that has the greatest impact on predicting class labels. It is most valuable for high-dimensional datasets. This paper proposes a filter feature selection method for four high-dimensional binary medical datasets: Colon, Central Nervous System (CNS), GLI_85, and SMK_CAN_187. The proposed method has three stages. First, the whale algorithm is used to discard irrelevant features. Second, the remaining features are ranked using a frequency-based heuristic called Mutual Congestion. Third, majority voting is applied to the best feature subsets constructed by forward feature selection with threshold τ = 10. This work provides evidence that Mutual Congestion on its own is a strong predictor of class labels. Furthermore, applying the whale algorithm increases the overall accuracy of Mutual Congestion in most cases. The findings also show that the proposed method improves prediction while selecting fewer features than state-of-the-art methods. https://github.com/hnematzadeh
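The third stage can be illustrated with a minimal sketch, which is not the authors' implementation. It makes several assumptions that the abstract does not confirm: the threshold τ = 10 is read as a cap on subset size, "forward feature selection" is simplified to nested prefixes of a ranked feature list, the ranking score is a stand-in (the gap between class means) rather than Mutual Congestion, and the classifier is a simple nearest-centroid rule. The whale algorithm stage is omitted. All function names are hypothetical.

```python
from collections import Counter
from statistics import mean

TAU = 10  # threshold from the abstract; read here as a cap on subset size (assumption)


def rank_features(X, y):
    """Stand-in ranking (NOT Mutual Congestion): gap between the two class means."""
    n_features = len(X[0])
    scores = []
    for j in range(n_features):
        m0 = mean(row[j] for row, label in zip(X, y) if label == 0)
        m1 = mean(row[j] for row, label in zip(X, y) if label == 1)
        scores.append(abs(m0 - m1))
    # Highest-scoring feature index first.
    return sorted(range(n_features), key=lambda j: scores[j], reverse=True)


def centroid_predict(X_train, y_train, x, features):
    """Nearest-centroid prediction restricted to the given feature indices."""
    best_label, best_dist = None, float("inf")
    for label in sorted(set(y_train)):
        rows = [r for r, l in zip(X_train, y_train) if l == label]
        dist = sum((x[j] - mean(r[j] for r in rows)) ** 2 for j in features)
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def forward_subsets(ranked, tau=TAU):
    """Nested subsets: top-1, top-2, ..., up to top-tau of the ranked features."""
    return [ranked[:k] for k in range(1, min(tau, len(ranked)) + 1)]


def majority_vote_predict(X_train, y_train, x, tau=TAU):
    """Predict with each subset and return the most common label."""
    ranked = rank_features(X_train, y_train)
    votes = [centroid_predict(X_train, y_train, x, s)
             for s in forward_subsets(ranked, tau)]
    return Counter(votes).most_common(1)[0][0]


# Toy data (invented for illustration only): 4 samples, 3 features, binary labels.
X_toy = [[0.0, 5.0, 1.0], [0.2, 4.0, 0.9], [1.0, 5.1, 0.1], [0.9, 4.2, 0.0]]
y_toy = [0, 0, 1, 1]
prediction = majority_vote_predict(X_toy, y_toy, [0.1, 4.5, 1.0])
```

On a real high-dimensional dataset, the ranked list would come from the second stage of the method, and each subset would typically be evaluated with cross-validation rather than a single prediction.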
Keywords: Feature selection; Mutual congestion; Whale algorithm
Year: 2019 PMID: 30660788 DOI: 10.1016/j.ygeno.2019.01.006
Source DB: PubMed Journal: Genomics ISSN: 0888-7543 Impact factor: 5.736