Efficient ℓ₀-norm feature selection based on augmented and penalized minimization.

Xiang Li¹, Shanghong Xie², Donglin Zeng³, Yuanjia Wang².

Abstract

Advances in high-throughput technologies in genomics and imaging yield unprecedentedly large numbers of prognostic biomarkers. To accommodate the scale of biomarkers and study their association with disease outcomes, penalized regression is often used to identify important biomarkers. The ideal variable selection procedure would search for the best subset of predictors, which is equivalent to imposing an ℓ0-penalty on the regression coefficients. Because this optimization is a nondeterministic polynomial-time hard (NP-hard) problem that does not scale with the number of biomarkers, alternative methods mostly place smooth penalties on the regression parameters, which leads to computationally feasible optimization problems. However, empirical studies and theoretical analyses show that convex approximations of the ℓ0-norm (eg, ℓ1) do not outperform their ℓ0 counterpart. Progress on ℓ0-norm feature selection has been relatively slow, and the main methods are greedy algorithms such as stepwise regression or orthogonal matching pursuit. Penalized regression based on regularizing the ℓ0-norm remains much less explored in the literature. In this work, inspired by the recently popular augmenting and data splitting algorithms, including the alternating direction method of multipliers, we propose a 2-stage procedure for ℓ0-penalty variable selection, referred to as augmented penalized minimization-L0 (APM-L0). APM-L0 targets the ℓ0-norm as closely as possible while keeping computation tractable, efficient, and simple, which is achieved by iterating between a convex regularized regression and a simple hard-thresholding estimation. The procedure can be viewed as arising from regularized optimization with a truncated ℓ1 norm. Thus, we propose to treat the regularization parameter and the thresholding parameter as tuning parameters and to select them by cross-validation. A 1-step coordinate descent algorithm is used in the first stage to significantly improve computational efficiency. Through extensive simulation studies and a real data application, we demonstrate superior performance of the proposed method in terms of selection accuracy and computational speed as compared to existing methods. The proposed APM-L0 procedure is implemented in the R package APML0.
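To make the idea of combining a convex regularized regression with hard thresholding concrete, the following is a minimal Python sketch for ordinary linear regression. It is not the authors' APM-L0 algorithm or the APML0 package: it performs a single pass (ℓ1 fit, keep the k largest coefficients, refit) rather than iterating, it takes the regularization parameter `lam` and the threshold size `k` as fixed inputs rather than choosing them by cross-validation, and all function names (`lasso_cd`, `hard_threshold_top_k`, `two_stage_select`) are hypothetical.

```python
import numpy as np


def soft_threshold(z, t):
    """Soft-thresholding operator used by the l1 coordinate update."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)


def lasso_cd(X, y, lam, n_iter=100):
    """Minimise (1/2n)||y - Xb||^2 + lam * ||b||_1 by cyclic coordinate descent."""
    n, p = X.shape
    beta = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n      # x_j'x_j / n for each column
    resid = y.astype(float).copy()         # residual y - X @ beta (beta = 0)
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Correlation of column j with the partial residual.
            rho = X[:, j] @ resid / n + col_sq[j] * beta[j]
            new = soft_threshold(rho, lam) / col_sq[j]
            if new != beta[j]:
                resid -= X[:, j] * (new - beta[j])
                beta[j] = new
    return beta


def hard_threshold_top_k(beta, k):
    """Keep the k coefficients largest in absolute value; zero the rest."""
    keep = np.sort(np.argsort(-np.abs(beta))[:k])
    out = np.zeros_like(beta)
    out[keep] = beta[keep]
    return out, keep


def two_stage_select(X, y, lam, k):
    """Illustrative pipeline (assumption): l1 fit, hard threshold, least-squares refit."""
    beta_l1 = lasso_cd(X, y, lam)
    _, support = hard_threshold_top_k(beta_l1, k)
    beta = np.zeros(X.shape[1])
    coef, *_ = np.linalg.lstsq(X[:, support], y, rcond=None)
    beta[support] = coef
    return beta, support
```

In practice one would wrap `two_stage_select` in a cross-validation loop over a grid of `lam` and `k` values, which is the role the abstract assigns to the two tuning parameters.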

Keywords: ADMM; biomarker signature; censored data; variable selection; ℓ0-penalty

Substances: Biomarkers

Year: 2017 PMID: 29082539 PMCID: PMC5768461 DOI: 10.1002/sim.7526

Source DB: PubMed Journal: Stat Med ISSN: 0277-6715 Impact factor: 2.373
