
Feature Screening for Ultrahigh Dimensional Categorical Data with Applications.

Danyang Huang, Runze Li, Hansheng Wang.

Abstract

Ultrahigh dimensional data with both categorical responses and categorical covariates are frequently encountered in the analysis of big data, for which feature screening has become an indispensable statistical tool. We propose a Pearson chi-square based feature screening procedure for a categorical response with ultrahigh dimensional categorical covariates. The proposed procedure can be applied directly to detect important interaction effects. We further show that the proposed procedure possesses the screening consistency property in the terminology of Fan and Lv (2008). We investigate the finite sample performance of the proposed procedure by Monte Carlo simulation studies, and illustrate the proposed method with two empirical datasets.
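
The screening idea in the abstract can be illustrated with a minimal sketch. It assumes that each categorical covariate is scored by its Pearson chi-square statistic against the categorical response and that the highest-scoring covariates are retained. The function names `pearson_chi2` and `screen_features`, the cutoff `d`, and the toy data are hypothetical and are not taken from the paper, whose exact statistic and threshold rule may differ.

```python
from collections import Counter


def pearson_chi2(x, y):
    """Pearson chi-square statistic for two equal-length categorical sequences."""
    n = len(x)
    joint = Counter(zip(x, y))  # observed cell counts
    row = Counter(x)            # marginal counts of the covariate
    col = Counter(y)            # marginal counts of the response
    stat = 0.0
    for a, n_a in row.items():
        for b, n_b in col.items():
            expected = n_a * n_b / n  # expected count under independence
            observed = joint.get((a, b), 0)
            stat += (observed - expected) ** 2 / expected
    return stat


def screen_features(columns, y, d):
    """Score every covariate and return the indices of the d largest scores.

    `d` is a hypothetical user-chosen cutoff (assumption for illustration).
    """
    scores = [pearson_chi2(col, y) for col in columns]
    ranked = sorted(range(len(columns)), key=lambda j: scores[j], reverse=True)
    return ranked[:d], scores


# Toy data (made up for illustration): one response, three covariates.
y = [0, 0, 0, 0, 1, 1, 1, 1]
columns = [
    [0, 0, 0, 0, 1, 1, 1, 1],  # strongly associated with y
    [0, 1, 0, 1, 0, 1, 0, 1],  # unrelated to y
    [0, 0, 0, 1, 1, 1, 1, 0],  # weakly associated with y
]
selected, scores = screen_features(columns, y, d=2)
print(selected, scores)
```

Under the same assumptions, one way to score a candidate interaction is to pass the paired values `list(zip(col_a, col_b))` as a single categorical covariate, so that each combination of levels is treated as its own category.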

Keywords:  Feature Screening; Pearson’s Chi-Square Test; Screening Consistency; Search Engine Marketing; Text Classification; Ultrahigh Dimensional Data

Year:  2014        PMID: 25328278      PMCID: PMC4197855          DOI: 10.1080/07350015.2013.863158

Source DB:  PubMed          Journal:  J Bus Econ Stat        ISSN: 0735-0015            Impact factor:   6.565


  7 in total

1.  The central role of receiver operating characteristic (ROC) curves in evaluating tests for the early detection of cancer. [Review]

Authors:  Stuart G Baker
Journal:  J Natl Cancer Inst       Date:  2003-04-02       Impact factor: 13.506

2.  Nonparametric Independence Screening in Sparse Ultra-High Dimensional Additive Models.

Authors:  Jianqing Fan; Yang Feng; Rui Song
Journal:  J Am Stat Assoc       Date:  2011-06       Impact factor: 5.033

3.  Discussion of "Sure Independence Screening for Ultra-High Dimensional Feature Space".

Authors:  Hao Helen Zhang
Journal:  J R Stat Soc Series B Stat Methodol       Date:  2008-11       Impact factor: 4.488

4.  Ultrahigh dimensional feature selection: beyond the linear model.

Authors:  Jianqing Fan; Richard Samworth; Yichao Wu
Journal:  J Mach Learn Res       Date:  2009       Impact factor: 3.654

5.  Feature Selection for Varying Coefficient Models With Ultrahigh Dimensional Covariates.

Authors:  Jingyuan Liu; Runze Li; Rongling Wu
Journal:  J Am Stat Assoc       Date:  2014-01-01       Impact factor: 5.033

6.  Model-Free Feature Screening for Ultrahigh Dimensional Data.

Authors:  Liping Zhu; Lexin Li; Runze Li; Lixing Zhu
Journal:  J Am Stat Assoc       Date:  2012-01-24       Impact factor: 5.033

7.  Feature Screening via Distance Correlation Learning.

Authors:  Runze Li; Wei Zhong; Liping Zhu
Journal:  J Am Stat Assoc       Date:  2012-07-01       Impact factor: 5.033

  7 in total

1.  A Generic Sure Independence Screening Procedure.

Authors:  Wenliang Pan; Xueqin Wang; Weinan Xiao; Hongtu Zhu
Journal:  J Am Stat Assoc       Date:  2018-08-06       Impact factor: 5.033

2.  Weighted Mean Squared Deviation Feature Screening for Binary Features.

Authors:  Gaizhen Wang; Guoyu Guan
Journal:  Entropy (Basel)       Date:  2020-03-14       Impact factor: 2.524

3.  Covariate Information Number for Feature Screening in Ultrahigh-Dimensional Supervised Problems.

Authors:  Debmalya Nandy; Francesca Chiaromonte; Runze Li
Journal:  J Am Stat Assoc       Date:  2021-02-10       Impact factor: 4.369

4.  A selective overview of feature screening for ultrahigh-dimensional data.

Authors:  Jingyuan Liu; Wei Zhong; Runze Li
Journal:  Sci China Math       Date:  2015-08-22       Impact factor: 1.331

5.  Feature Screening for Network Autoregression Model.

Authors:  Danyang Huang; Xuening Zhu; Runze Li; Hansheng Wang
Journal:  Stat Sin       Date:  2021       Impact factor: 1.261

6.  An adaptive threshold determination method of feature screening for genomic selection.

Authors:  Guifang Fu; Gang Wang; Xiaotian Dai
Journal:  BMC Bioinformatics       Date:  2017-04-12       Impact factor: 3.169

7.  Detecting PCOS susceptibility loci from genome-wide association studies via iterative trend correlation based feature screening.

Authors:  Xiaotian Dai; Guifang Fu; Randall Reese
Journal:  BMC Bioinformatics       Date:  2020-05-04       Impact factor: 3.169

