Shun Guo, Donghui Guo, Lifei Chen, Qingshan Jiang.
Abstract
In classification problems based on microarray data, the data typically contains a large number of irrelevant and redundant features. In this paper, a new gene selection method is proposed that chooses the best subset of features for microarray data by removing the irrelevant and redundant features. We formulate the selection problem as an L1-regularized optimization problem based on a newly defined linear discriminant analysis criterion. Instead of calculating the mean of the samples, a kernel-based approach is used to estimate the class centroid, which defines both the between-class separability and the within-class compactness for the criterion. Theoretical analysis indicates that the global optimal solution of the L1-regularized criterion can be reached under a general condition, from which an efficient algorithm for the feature selection problem is derived, with time complexity linear in the number of features and the number of samples. The experimental results on ten publicly available microarray datasets demonstrate that the proposed method performs effectively and competitively compared with state-of-the-art methods.
Keywords: Class centroid; Classification; Gene selection; L1 regularization; Microarray data
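The abstract does not give the paper's formulas, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes a Gaussian kernel that reweights samples around the class mean to estimate each centroid, a per-feature score equal to between-class separability minus within-class compactness, and a non-negative soft threshold as the closed-form solution of a simple L1-penalized problem. The names `kernel_centroid`, `select_features`, `bandwidth`, and `lam` are hypothetical.

```python
import numpy as np


def kernel_centroid(X, bandwidth=1.0):
    """Gaussian-kernel-weighted centroid of the rows of X.

    Assumption: one reweighting step around the plain mean, so samples
    far from the mean contribute less. Not the paper's estimator.
    """
    mean = X.mean(axis=0)
    sq_dist = ((X - mean) ** 2).sum(axis=1)
    # Subtracting the minimum leaves the normalized weights unchanged
    # and avoids underflow when there are many features.
    weights = np.exp(-(sq_dist - sq_dist.min()) / (2.0 * bandwidth ** 2))
    return (weights[:, None] * X).sum(axis=0) / weights.sum()


def select_features(X, y, lam=0.1, bandwidth=1.0):
    """Return (indices of selected features, feature weights).

    Assumed criterion: score_j = (between_j - within_j) / n, followed by
    w_j = max(score_j - lam, 0), which minimizes
    0.5 * (w_j - score_j)**2 + lam * w_j subject to w_j >= 0.
    The cost is linear in the number of samples and features.
    """
    n, d = X.shape
    overall = kernel_centroid(X, bandwidth)
    between = np.zeros(d)
    within = np.zeros(d)
    for c in np.unique(y):
        Xc = X[y == c]
        centroid = kernel_centroid(Xc, bandwidth)
        between += len(Xc) * (centroid - overall) ** 2   # separability
        within += ((Xc - centroid) ** 2).sum(axis=0)     # compactness
    score = (between - within) / n
    weights = np.maximum(score - lam, 0.0)  # L1 penalty zeroes weak features
    return np.flatnonzero(weights), weights
```

Calling `select_features(X, y)` with `X` as a samples-by-genes array and `y` as class labels returns the indices of features whose weight survives the threshold; a larger `lam` keeps fewer features.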
Year: 2016 PMID: 27056739 DOI: 10.1016/j.jtbi.2016.03.034
Source DB: PubMed Journal: J Theor Biol ISSN: 0022-5193 Impact factor: 2.691