Aiguo Wang, Ning An, Guilin Chen, Lian Li, Gil Alterovitz.
Abstract
Gene selection plays a crucial role in constructing efficient classifiers for microarray data, since such data are characterized by high dimensionality and small sample sizes and contain irrelevant and redundant genes. In practice, partial least squares-based gene selection approaches can obtain gene subsets of good quality, but they are considerably time-consuming. In this paper, we propose to integrate partial least squares based recursive feature elimination (PLS-RFE) with two feature elimination schemes, simulated annealing and square root, to speed up the feature selection process. Inspired by the annealing schedule, the two proposed approaches eliminate a number of features, rather than only the single least informative feature, during each iteration, and the number of removed features decreases as the iterations proceed. To verify the effectiveness and efficiency of the proposed approaches, we perform extensive experiments on six publicly available microarray datasets with three typical classifiers, Naïve Bayes, K-Nearest-Neighbor and Support Vector Machine, and compare our approaches with the ReliefF, PLS and PLS-RFE feature selectors in terms of classification accuracy and running time. Experimental results demonstrate that the two proposed approaches substantially accelerate the feature selection process without degrading classification accuracy and obtain more compact feature subsets for both two-category and multi-category problems. Further experimental comparisons of feature subset consistency show that the proposed approach with the simulated annealing scheme not only has better time performance, but also obtains slightly better feature subset consistency than the one with the square root scheme.
Keywords: Annealing schedule; Classification; Gene selection; Partial least squares; Recursive feature elimination; Sequential backward selection
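The batch-elimination idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that the square root scheme removes roughly the square root of the number of remaining features at each iteration, which the abstract does not state explicitly, and it leaves the PLS-based ranking as a pluggable placeholder. The names `sqrt_schedule`, `recursive_elimination`, `score_fn` and `n_keep` are hypothetical.

```python
import math


def sqrt_schedule(n_features, n_keep):
    """Yield how many features to drop at each iteration.

    Assumption: drop about sqrt(remaining) features per round, so the
    batch size shrinks as the surviving set shrinks.
    """
    remaining = n_features
    while remaining > n_keep:
        # Drop at least one feature, and never go below n_keep.
        drop = min(max(1, math.isqrt(remaining)), remaining - n_keep)
        yield drop
        remaining -= drop


def recursive_elimination(features, score_fn, n_keep):
    """Re-score the surviving features each round and drop the weakest batch.

    score_fn is a placeholder for a ranking criterion (PLS-based in the
    paper); it takes the surviving features and returns {feature: score}.
    """
    surviving = list(features)
    for drop in sqrt_schedule(len(surviving), n_keep):
        scores = score_fn(surviving)  # recomputed on the reduced set
        surviving.sort(key=lambda f: scores[f], reverse=True)
        surviving = surviving[:len(surviving) - drop]
    return surviving
```

Under this assumed schedule, reducing 100 features to 10 takes 14 scoring rounds, compared with 90 rounds when a single feature is removed per iteration. That reduction in the number of ranking passes is where the speed-up would come from.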
Year: 2015 PMID: 25912984 DOI: 10.1016/j.compbiomed.2015.04.011
Source DB: PubMed Journal: Comput Biol Med ISSN: 0010-4825 Impact factor: 4.589