
Designing penalty functions in high dimensional problems: The role of tuning parameters.

Ting-Huei Chen, Wei Sun, Jason P Fine.

Abstract

Various forms of penalty functions have been developed for regularized estimation and variable selection. Screening approaches are often used to reduce the number of covariates before penalized estimation. However, in certain problems, the number of covariates remains large after screening. For example, in genome-wide association (GWA) studies, the purpose is to identify single nucleotide polymorphisms (SNPs) that are associated with certain traits, and typically there are millions of SNPs and thousands of samples. Because nearby SNPs are strongly correlated, screening can only reduce the number of SNPs from millions to tens of thousands, and the variable selection problem remains very challenging. Several penalty functions have been proposed for such high dimensional data. However, it is unclear which class of penalty functions is the appropriate choice for a particular application. In this paper, we conduct a theoretical analysis that relates the ranges of the tuning parameters of various penalty functions to the dimensionality of the problem and the minimum effect size. We illustrate our theoretical results with several penalty functions. The results suggest that a class of penalty functions that bridges the L0 and L1 penalties requires less restrictive conditions on dimensionality and minimum effect size in order to attain the two fundamental goals of penalized estimation: to shrink all the noise coefficients to zero and to obtain unbiased estimates of the true signals. Penalties such as SICA and Log belong to this class, but they have not been used often in applications. Simulations and a real data analysis using GWAS data suggest that this class of penalties is promising in practice.
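To make the idea of a penalty that "bridges L0 and L1" concrete, the following is a minimal Python sketch of the two penalties named in the abstract. The abstract itself does not give formulas, so the parameterizations here are assumptions: SICA is written in its commonly cited form, lam * (a + 1)|b| / (a + |b|), and the Log penalty in one common normalized form, lam * log(gamma|b| + 1) / log(gamma + 1). The function and parameter names (`sica_penalty`, `log_penalty`, `lam`, `a`, `gamma`) are illustrative and may differ from the notation used in the paper.

```python
import math


def sica_penalty(beta: float, lam: float, a: float) -> float:
    """Assumed SICA form: lam * (a + 1) * |beta| / (a + |beta|), with a > 0.

    Small a pushes the penalty toward L0 (a constant lam for any nonzero
    beta); large a pushes it toward L1 (lam * |beta|).
    """
    t = abs(beta)
    return lam * (a + 1.0) * t / (a + t)


def log_penalty(beta: float, lam: float, gamma: float) -> float:
    """Assumed normalized log form: lam * log(gamma*|beta| + 1) / log(gamma + 1).

    Small gamma approaches L1; large gamma approaches L0, although only at
    a logarithmic rate.
    """
    t = abs(beta)
    return lam * math.log1p(gamma * t) / math.log1p(gamma)


def l0_penalty(beta: float, lam: float) -> float:
    """L0 penalty: lam for every nonzero coefficient, 0 otherwise."""
    return lam * float(beta != 0)


def l1_penalty(beta: float, lam: float) -> float:
    """L1 (lasso) penalty: lam * |beta|."""
    return lam * abs(beta)


if __name__ == "__main__":
    b, lam = 0.5, 1.0
    # Moving the shape parameter slides SICA between the L1 and L0 values.
    for a in (1e-4, 1.0, 1e4):
        print(f"SICA a={a:g}: {sica_penalty(b, lam, a):.4f}")
    print(f"L0: {l0_penalty(b, lam):.4f}  L1: {l1_penalty(b, lam):.4f}")
```

Under these assumed forms, the second tuning parameter (`a` or `gamma`) controls where the penalty sits between L0 and L1, which is the kind of tuning-parameter range the abstract relates to dimensionality and minimum effect size.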


Keywords:  Folded-concave penalties; genome-wide association studies; tuning parameter selection

Year:  2016        PMID: 28989558      PMCID: PMC5628772          DOI: 10.1214/16-EJS1169

Source DB:  PubMed          Journal:  Electron J Stat        ISSN: 1935-7524            Impact factor:   1.125


  2 in total

1.  A penalized regression framework for building polygenic risk models based on summary statistics from genome-wide association studies and incorporating external information.

Authors:  Ting-Huei Chen; Nilanjan Chatterjee; Maria Teresa Landi; Jianxin Shi
Journal:  J Am Stat Assoc       Date:  2020-10-12       Impact factor: 5.033

2.  Designing penalty functions in high dimensional problems: The role of tuning parameters.

Authors:  Ting-Huei Chen; Wei Sun; Jason P Fine
Journal:  Electron J Stat       Date:  2016-08-29       Impact factor: 1.125

