Minimax sparse logistic regression for very high-dimensional feature selection.

Mingkui Tan, Ivor W Tsang, Li Wang.   

Abstract

Because of its strong convexity and probabilistic underpinnings, logistic regression (LR) is widely used in many real-world applications. However, in many problems, such as those in bioinformatics, choosing a small subset of features with the most discriminative power is desirable for interpreting the prediction model, making robust predictions, or conducting deeper analysis. To achieve a solution that is sparse with respect to the input features, many sparse LR models have been proposed. However, it is still challenging for them to efficiently obtain unbiased sparse solutions to very high-dimensional problems (e.g., identifying the most discriminative subset from millions of features). In this paper, we propose a new minimax sparse LR model for very high-dimensional feature selection, which can be efficiently solved by a cutting plane algorithm. To solve the resultant nonsmooth minimax subproblems, a smoothing coordinate descent method is presented. Numerical issues and the convergence rate of this method are carefully studied. Experimental results on several synthetic and real-world datasets show that the proposed method can obtain better prediction accuracy with the same number of selected features, and has better or competitive scalability on very high-dimensional problems compared with the baseline methods, including the l1-regularized LR.
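For readers unfamiliar with the l1-regularized LR baseline named in the abstract, the following is a minimal sketch of how an l1 penalty produces a sparse weight vector. It is not the paper's minimax model, cutting plane algorithm, or smoothing coordinate descent method; it uses plain proximal gradient descent (ISTA) instead. The function names, the synthetic data generator, and all parameter values are assumptions made for illustration only.

```python
import math
import random


def soft_threshold(value, threshold):
    """Proximal operator of threshold * |value|: shrink toward zero."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def l1_logistic_regression(X, y, lam, step=0.1, iters=200):
    """Proximal gradient (ISTA) on mean logistic loss + lam * ||w||_1.

    X is a list of feature rows, y holds labels in {0, 1}.
    The intercept is omitted to keep the sketch short.
    """
    n, d = len(X), len(X[0])
    w = [0.0] * d
    for _ in range(iters):
        # Gradient of the mean logistic loss at the current w.
        grad = [0.0] * d
        for row, label in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, row))) - label
            for j in range(d):
                grad[j] += err * row[j] / n
        # Gradient step followed by soft-thresholding (the l1 prox).
        w = [soft_threshold(wj - step * gj, step * lam)
             for wj, gj in zip(w, grad)]
    return w


def make_data(n=200, d=20, informative=(0, 1), seed=0):
    """Hypothetical synthetic data: only `informative` columns drive y."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        row = [rng.gauss(0.0, 1.0) for _ in range(d)]
        score = sum(2.0 * row[j] for j in informative)
        y.append(1 if score > 0 else 0)
        X.append(row)
    return X, y


if __name__ == "__main__":
    X, y = make_data()
    w = l1_logistic_regression(X, y, lam=0.05)
    selected = [j for j, wj in enumerate(w) if wj != 0.0]
    print("selected feature indices:", selected)
```

Increasing `lam` drives more weights to exactly zero, which is how this baseline selects features. The shrinkage applied by the soft-threshold step also biases the surviving weights toward zero, which is one common reading of the difficulty in obtaining unbiased sparse solutions that the abstract mentions.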

Year: 2013    PMID: 24808598    DOI: 10.1109/TNNLS.2013.2263427

Source DB: PubMed    Journal: IEEE Trans Neural Netw Learn Syst    ISSN: 2162-237X    Impact factor: 10.451


  1 in total

1.  Classification of high dimensional biomedical data based on feature selection using redundant removal.

Authors:  Bingtao Zhang; Peng Cao
Journal:  PLoS One       Date:  2019-04-09       Impact factor: 3.240
