
Variance estimation using refitted cross-validation in ultrahigh dimensional regression.

Jianqing Fan, Shaojun Guo, Ning Hao.

Abstract

Variance estimation is a fundamental problem in statistical modelling. In ultrahigh dimensional linear regression, where the dimensionality is much larger than the sample size, traditional variance estimation techniques are not applicable. Recent advances in variable selection in ultrahigh dimensional linear regression make this problem accessible. One of the major problems in ultrahigh dimensional regression is the high spurious correlation between the unobserved realized noise and some of the predictors. As a result, the realized noise is actually predicted when extra irrelevant variables are selected, leading to a serious underestimate of the noise level. We propose a two-stage refitted procedure via a data splitting technique, called refitted cross-validation, to attenuate the influence of irrelevant variables with high spurious correlations. Our asymptotic results show that the resulting procedure performs as well as the oracle estimator, which knows in advance the mean regression function. The simulation studies lend further support to our theoretical claims. The naive two-stage estimator and the plug-in one-stage estimators using the lasso and smoothly clipped absolute deviation are also studied and compared. Their performance can be improved by the proposed refitted cross-validation method.
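
The data-splitting idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and several choices here are assumptions made only for illustration: the selector is a simple marginal-correlation screen that keeps a fixed number of predictors `k` (the abstract does not name a selector for this step), no intercept is fitted, and the toy data are pure noise with true variance 1. The function names (`screen`, `ols_variance`, `naive_two_stage`, `refitted_cv`, `simulate_null`) are hypothetical. The sketch selects variables on one half of the data, refits by least squares on the other half, swaps the roles of the halves, and averages the two variance estimates.

```python
import random


def solve(A, b):
    """Solve the square system A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for j in range(c, n + 1):
                M[r][j] -= f * M[c][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x


def screen(X, y, k):
    """Stage 1 (assumed selector): indices of the k columns most correlated with y."""
    n = len(y)
    ybar = sum(y) / n
    yc = [v - ybar for v in y]
    scores = []
    for j in range(len(X[0])):
        col = [row[j] for row in X]
        cbar = sum(col) / n
        cc = [v - cbar for v in col]
        num = sum(a * b for a, b in zip(cc, yc))
        den = (sum(a * a for a in cc) ** 0.5) or 1.0
        # The norm of y is common to all columns, so this ranks by |correlation|.
        scores.append((abs(num) / den, j))
    return sorted(j for _, j in sorted(scores, reverse=True)[:k])


def ols_variance(X, y, cols):
    """Stage 2: least-squares refit on the chosen columns (no intercept);
    returns the residual sum of squares divided by (n - number of columns)."""
    Z = [[row[j] for j in cols] for row in X]
    d = len(cols)
    ZtZ = [[sum(z[a] * z[b] for z in Z) for b in range(d)] for a in range(d)]
    Zty = [sum(z[a] * yi for z, yi in zip(Z, y)) for a in range(d)]
    beta = solve(ZtZ, Zty)
    rss = sum((yi - sum(b * v for b, v in zip(beta, z))) ** 2 for z, yi in zip(Z, y))
    return rss / (len(y) - d)


def naive_two_stage(X, y, k):
    """Select and refit on the SAME data; spurious predictors absorb part of the noise."""
    return ols_variance(X, y, screen(X, y, k))


def refitted_cv(X, y, k):
    """Select on one half, refit on the other half, swap the halves, and average."""
    half = len(y) // 2
    X1, y1, X2, y2 = X[:half], y[:half], X[half:], y[half:]
    v1 = ols_variance(X2, y2, screen(X1, y1, k))
    v2 = ols_variance(X1, y1, screen(X2, y2, k))
    return (v1 + v2) / 2


def simulate_null(n, p, seed):
    """Toy data: no predictor is relevant and the noise variance is 1."""
    rng = random.Random(seed)
    X = [[rng.gauss(0, 1) for _ in range(p)] for _ in range(n)]
    y = [rng.gauss(0, 1) for _ in range(n)]
    return X, y


if __name__ == "__main__":
    X, y = simulate_null(n=100, p=400, seed=0)
    print("naive two-stage:", naive_two_stage(X, y, k=5))
    print("refitted CV:    ", refitted_cv(X, y, k=5))
```

In this toy setting every selected predictor is irrelevant, so any reduction in residual variance is spurious. The naive estimate should therefore fall noticeably below 1, while the refitted estimate should stay close to it, because the predictors chosen on one half carry no information about the noise in the other half.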


Year:  2012        PMID: 22312234      PMCID: PMC3271712          DOI: 10.1111/j.1467-9868.2011.01005.x

Source DB:  PubMed          Journal:  J R Stat Soc Series B Stat Methodol        ISSN: 1369-7412            Impact factor:   4.488


  7 in total

1.  Principled sure independence screening for Cox models with ultra-high-dimensional covariates.

Authors:  Sihai Dave Zhao; Yi Li
Journal:  J Multivar Anal       Date:  2012-02-01       Impact factor: 1.473

2.  Non-Concave Penalized Likelihood with NP-Dimensionality.

Authors:  Jianqing Fan; Jinchi Lv
Journal:  IEEE Trans Inf Theory       Date:  2011-08       Impact factor: 2.501

3.  Discussion of "Sure Independence Screening for Ultra-High Dimensional Feature Space".

Authors:  Hao Helen Zhang
Journal:  J R Stat Soc Series B Stat Methodol       Date:  2008-11       Impact factor: 4.488

4.  A Selective Overview of Variable Selection in High Dimensional Feature Space.

Authors:  Jianqing Fan; Jinchi Lv
Journal:  Stat Sin       Date:  2010-01       Impact factor: 1.261

5.  Ultrahigh dimensional feature selection: beyond the linear model.

Authors:  Jianqing Fan; Richard Samworth; Yichao Wu
Journal:  J Mach Learn Res       Date:  2009       Impact factor: 3.654

6.  HIGH DIMENSIONAL VARIABLE SELECTION.

Authors:  Larry Wasserman; Kathryn Roeder
Journal:  Ann Stat       Date:  2009-01-01       Impact factor: 4.028

7.  One-step Sparse Estimates in Nonconcave Penalized Likelihood Models.

Authors:  Hui Zou; Runze Li
Journal:  Ann Stat       Date:  2008-08-01       Impact factor: 4.028

  33 in total

1.  Adjusting confounders in ranking biomarkers: a model-based ROC approach.

Authors:  Tao Yu; Jialiang Li; Shuangge Ma
Journal:  Brief Bioinform       Date:  2012-03-06       Impact factor: 11.622

2.  Exploiting Linkage Disequilibrium for Ultrahigh-Dimensional Genome-Wide Data with an Integrated Statistical Approach.

Authors:  Michelle Carlsen; Guifang Fu; Shaun Bushman; Christopher Corcoran
Journal:  Genetics       Date:  2015-12-12       Impact factor: 4.562

3.  Diagnosis of sepsis from a drop of blood by measurement of spontaneous neutrophil motility in a microfluidic assay.

Authors:  Felix Ellett; Julianne Jorgensen; Anika L Marand; Yuk Ming Liu; Myriam M Martinez; Vicki Sein; Kathryn L Butler; Jarone Lee; Daniel Irimia
Journal:  Nat Biomed Eng       Date:  2018-03-19       Impact factor: 25.671

4.  The use of vector bootstrapping to improve variable selection precision in Lasso models.

Authors:  Charles Laurin; Dorret Boomsma; Gitta Lubke
Journal:  Stat Appl Genet Mol Biol       Date:  2016-08-01

5.  ARE DISCOVERIES SPURIOUS? DISTRIBUTIONS OF MAXIMUM SPURIOUS CORRELATIONS AND THEIR APPLICATIONS.

Authors:  Jianqing Fan; Qi-Man Shao; Wen-Xin Zhou
Journal:  Ann Stat       Date:  2018-05-03       Impact factor: 4.028

6.  Robust semiparametric gene-environment interaction analysis using sparse boosting.

Authors:  Mengyun Wu; Shuangge Ma
Journal:  Stat Med       Date:  2019-07-29       Impact factor: 2.373

7.  Estimation of high dimensional mean regression in the absence of symmetry and light tail assumptions.

Authors:  Jianqing Fan; Quefeng Li; Yuyan Wang
Journal:  J R Stat Soc Series B Stat Methodol       Date:  2016-04-14       Impact factor: 4.488

8.  Error Variance Estimation in Ultrahigh-Dimensional Additive Models.

Authors:  Zhao Chen; Jianqing Fan; Runze Li
Journal:  J Am Stat Assoc       Date:  2017-09-26       Impact factor: 5.033

9.  DISTRIBUTED TESTING AND ESTIMATION UNDER SPARSE HIGH DIMENSIONAL MODELS.

Authors:  Heather Battey; Jianqing Fan; Han Liu; Junwei Lu; Ziwei Zhu
Journal:  Ann Stat       Date:  2018-05-03       Impact factor: 4.028

10.  Estimating False Discovery Proportion Under Arbitrary Covariance Dependence.

Authors:  Jianqing Fan; Xu Han; Weijie Gu
Journal:  J Am Stat Assoc       Date:  2012       Impact factor: 5.033
