
A SIGNIFICANCE TEST FOR THE LASSO.

Richard Lockhart, Jonathan Taylor, Ryan J Tibshirani, Robert Tibshirani.

Abstract

In the sparse linear regression setting, we consider testing the significance of the predictor variable that enters the current lasso model, in the sequence of models visited along the lasso solution path. We propose a simple test statistic based on lasso fitted values, called the covariance test statistic, and show that when the true model is linear, this statistic has an Exp(1) asymptotic distribution under the null hypothesis (the null being that all truly active variables are contained in the current lasso model). Our proof of this result for the special case of the first predictor to enter the model (i.e., testing for a single significant predictor variable against the global null) requires only weak assumptions on the predictor matrix X. On the other hand, our proof for a general step in the lasso path places further technical assumptions on X and the generative model, but still allows for the important high-dimensional case p > n, and does not necessarily require that the current lasso model achieves perfect recovery of the truly active variables. Of course, for testing the significance of an additional variable between two nested linear models, one typically uses the chi-squared test, comparing the drop in residual sum of squares (RSS) to a χ²_1 distribution. But when this additional variable is not fixed, and has been chosen adaptively or greedily, this test is no longer appropriate: adaptivity makes the drop in RSS stochastically much larger than χ²_1 under the null hypothesis. Our analysis explicitly accounts for adaptivity, as it must, since the lasso builds an adaptive sequence of linear models as the tuning parameter λ decreases. In this analysis, shrinkage plays a key role: though additional variables are chosen adaptively, the coefficients of lasso active variables are shrunken due to the ℓ_1 penalty. Therefore, the test statistic (which is based on lasso fitted values) is in a sense balanced by these two opposing properties, adaptivity and shrinkage, and its null distribution is tractable and asymptotically Exp(1).
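The Exp(1) limit described in the abstract can be checked numerically in the simplest setting. With an orthonormal design and unit noise variance, the covariance test statistic at the first lasso step reduces to T_1 = λ_1(λ_1 - λ_2), where λ_1 ≥ λ_2 are the two largest values of |x_j^T y|; under the global null these are the top two order statistics of p iid |N(0,1)| draws. The following Monte Carlo sketch is illustrative only (the function name and simulation sizes are our own, not from the paper):

```python
# Monte Carlo check of the Exp(1) limit for the covariance test statistic
# at the FIRST lasso step, in the orthonormal-design case with sigma = 1.
# Under the global null, the inner products x_j^T y are iid N(0, 1), so the
# first two knots on the lasso path are the top two order statistics of
# |N(0, 1)|, and T_1 = lambda_1 * (lambda_1 - lambda_2).
import random

def covariance_stat_first_step(p, rng):
    # Absolute inner products |x_j^T y| under the global null, sorted.
    z = sorted(abs(rng.gauss(0.0, 1.0)) for _ in range(p))
    lam1, lam2 = z[-1], z[-2]          # first two knots on the lasso path
    return lam1 * (lam1 - lam2)        # covariance statistic, first step

rng = random.Random(0)
p, reps = 200, 3000
stats = [covariance_stat_first_step(p, rng) for _ in range(reps)]
mean_T = sum(stats) / reps                            # Exp(1) mean is 1
frac_below_1 = sum(t <= 1.0 for t in stats) / reps    # Exp(1): P(T <= 1) = 1 - 1/e ~ 0.632
```

Note the contrast with the naive chi-squared comparison: the drop in RSS at the first step is λ_1², the maximum of p chi-squared variables, which is stochastically far larger than a single χ²_1 draw; the subtraction of λ_2 is what restores a tractable null distribution.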

Keywords:  Lasso; least angle regression; p-value; significance test

Year:  2014        PMID: 25574062      PMCID: PMC4285373          DOI: 10.1214/13-AOS1175

Source DB:  PubMed          Journal:  Ann Stat        ISSN: 0090-5364            Impact factor:   4.028


References:  5 in total

1.  Variance estimation using refitted cross-validation in ultrahigh dimensional regression.

Authors:  Jianqing Fan; Shaojun Guo; Ning Hao
Journal:  J R Stat Soc Series B Stat Methodol       Date:  2012-01-01       Impact factor: 4.488

2.  Regularization Paths for Generalized Linear Models via Coordinate Descent.

Authors:  Jerome Friedman; Trevor Hastie; Rob Tibshirani
Journal:  J Stat Softw       Date:  2010       Impact factor: 6.440

3.  A Perturbation Method for Inference on Regularized Regression Estimates.

Authors:  Jessica Minnier; Lu Tian; Tianxi Cai
Journal:  J Am Stat Assoc       Date:  2012-01-24       Impact factor: 5.033

4.  HIGH DIMENSIONAL VARIABLE SELECTION.

Authors:  Larry Wasserman; Kathryn Roeder
Journal:  Ann Stat       Date:  2009-01-01       Impact factor: 4.028

5.  Human immunodeficiency virus reverse transcriptase and protease sequence database.

Authors:  Soo-Yon Rhee; Matthew J Gonzales; Rami Kantor; Bradley J Betts; Jaideep Ravela; Robert W Shafer
Journal:  Nucleic Acids Res       Date:  2003-01-01       Impact factor: 16.971

Cited by:  117 in total

1.  POWERFUL TEST BASED ON CONDITIONAL EFFECTS FOR GENOME-WIDE SCREENING.

Authors:  Yaowu Liu; Jun Xie
Journal:  Ann Appl Stat       Date:  2018-03-09       Impact factor: 2.083

2.  High Dimensional EM Algorithm: Statistical Optimization and Asymptotic Normality.

Authors:  Zhaoran Wang; Quanquan Gu; Yang Ning; Han Liu
Journal:  Adv Neural Inf Process Syst       Date:  2015

3.  Too many covariates and too few cases? - a comparative study.

Authors:  Qingxia Chen; Hui Nian; Yuwei Zhu; H Keipp Talbot; Marie R Griffin; Frank E Harrell
Journal:  Stat Med       Date:  2016-06-30       Impact factor: 2.373

4.  Cross-validation and hypothesis testing in neuroimaging: An irenic comment on the exchange between Friston and Lindquist et al.

Authors:  Philip T Reiss
Journal:  Neuroimage       Date:  2015-04-25       Impact factor: 6.556

5.  Statistical learning and selective inference.

Authors:  Jonathan Taylor; Robert J Tibshirani
Journal:  Proc Natl Acad Sci U S A       Date:  2015-06-23       Impact factor: 11.205

6.  Collaborative regression.

Authors:  Samuel M Gross; Robert Tibshirani
Journal:  Biostatistics       Date:  2014-11-17       Impact factor: 5.899

7.  (Review) Statistical learning approaches in the genetic epidemiology of complex diseases.

Authors:  Anne-Laure Boulesteix; Marvin N Wright; Sabine Hoffmann; Inke R König
Journal:  Hum Genet       Date:  2019-05-02       Impact factor: 4.132

8.  Disentangling the effects of farmland use, habitat edges, and vegetation structure on ground beetle morphological traits.

Authors:  Katherina Ng; Philip S Barton; Wade Blanchard; Maldwyn J Evans; David B Lindenmayer; Sarina Macfadyen; Sue McIntyre; Don A Driscoll
Journal:  Oecologia       Date:  2018-06-06       Impact factor: 3.225

9.  Prefrontal cortical activation during working memory task anticipation contributes to discrimination between bipolar and unipolar depression.

Authors:  Anna Manelis; Satish Iyengar; Holly A Swartz; Mary L Phillips
Journal:  Neuropsychopharmacology       Date:  2020-02-18       Impact factor: 7.853

10.  Graphical Models via Univariate Exponential Family Distributions.

Authors:  Eunho Yang; Pradeep Ravikumar; Genevera I Allen; Zhandong Liu
Journal:  J Mach Learn Res       Date:  2015-12       Impact factor: 3.654

