
A selective overview of feature screening for ultrahigh-dimensional data.

Liu JingYuan, Zhong Wei, Li RunZe.

Abstract

High-dimensional data are frequently collected in many scientific areas, including genome-wide association studies, biomedical imaging, tomography, tumor classification, and finance. Analysis of high-dimensional data poses many challenges for statisticians. Feature selection and variable selection are fundamental for high-dimensional data analysis. The sparsity principle, which assumes that only a small number of predictors contribute to the response, is frequently adopted and deemed useful in the analysis of high-dimensional data. Following this general principle, a large number of variable selection approaches via penalized least squares or likelihood have been developed in the recent literature to estimate a sparse model and select significant variables simultaneously. While the penalized variable selection methods have been successfully applied in many high-dimensional analyses, modern applications in areas such as genomics and proteomics push the dimensionality of data to an even larger scale, where the dimension of the data may grow exponentially with the sample size. Such data are called ultrahigh-dimensional data in the literature. This work presents a selective overview of feature screening procedures for ultrahigh-dimensional data. We focus on insights into how to construct marginal utilities for feature screening under specific models and on the motivation for model-free feature screening procedures.
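
To make the idea of a marginal utility concrete, the following is a minimal sketch of correlation-based screening in the spirit of the "correlation learning" and "sure independence screening" keywords. It is not the authors' code. The function name `sis_screen`, the simulated data, and the default cutoff d = floor(n / log n) (a choice commonly used in the screening literature) are assumptions made for illustration only.

```python
import numpy as np


def sis_screen(X, y, d=None):
    """Rank predictors by absolute marginal Pearson correlation with y.

    Hypothetical helper for illustration; returns the indices of the d
    top-ranked columns of X and the vector of marginal utilities.
    """
    n, p = X.shape
    if d is None:
        d = int(n / np.log(n))  # assumed default cutoff
    d = min(d, p)

    # Center the columns and the response so inner products give covariances.
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()

    # Constant columns have zero norm; give them utility 0 instead of NaN.
    x_norm = np.linalg.norm(Xc, axis=0)
    x_norm[x_norm == 0] = np.inf

    utility = np.abs(Xc.T @ yc) / (x_norm * np.linalg.norm(yc))
    keep = np.argsort(utility)[::-1][:d]
    return np.sort(keep), utility


# Simulated sparse linear model: only columns 0 and 5 affect the response.
rng = np.random.default_rng(0)
n, p = 100, 1000
X = rng.standard_normal((n, p))
y = 3.0 * X[:, 0] + 3.0 * X[:, 5] + rng.standard_normal(n)

selected, utility = sis_screen(X, y)
print(len(selected), "of", p, "predictors retained")
```

Each predictor is scored on its own, so the cost grows linearly in the number of predictors, which is what makes this kind of screening feasible when the dimension far exceeds the sample size. A marginal correlation can miss predictors that matter only jointly or nonlinearly, which is one motivation for the model-free utilities the abstract mentions.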


Keywords:  correlation learning; distance correlation; sure independence screening; sure joint screening; sure screening property; ultrahigh-dimensional data

Year:  2015        PMID: 26779257      PMCID: PMC4711389          DOI: 10.1007/s11425-015-5062-9

Source DB:  PubMed          Journal:  Sci China Math        ISSN: 1869-1862            Impact factor:   1.331


