Danyang Huang, Runze Li, Hansheng Wang.
Abstract
Ultrahigh dimensional data with both categorical responses and categorical covariates are frequently encountered in the analysis of big data, for which feature screening has become an indispensable statistical tool. We propose a Pearson chi-square based feature screening procedure for a categorical response with ultrahigh dimensional categorical covariates. The proposed procedure can be applied directly to detect important interaction effects. We further show that the proposed procedure possesses the screening consistency property, in the terminology of Fan and Lv (2008). We investigate the finite sample performance of the proposed procedure by Monte Carlo simulation studies, and illustrate the proposed method with two empirical datasets.
Keywords: Feature Screening; Pearson’s Chi-Square Test; Screening Consistency; Search Engine Marketing; Text Classification; Ultrahigh Dimensional Data
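The screening idea described in the abstract can be illustrated with a minimal sketch: compute a Pearson chi-square statistic between each categorical covariate and the categorical response, then keep the covariates with the largest statistics. This is an illustration only, not the authors' implementation. The function names `pearson_chi_square` and `screen_features`, the `top_d` cutoff, and the toy data are assumptions introduced here; the abstract does not state the paper's exact normalization of the statistic or its rule for choosing how many features to retain.

```python
from collections import Counter


def pearson_chi_square(x, y):
    """Standard Pearson chi-square statistic for two categorical sequences.

    Hypothetical helper for illustration; the paper's statistic may be
    scaled differently (the abstract does not say).
    """
    n = len(x)
    if n == 0 or n != len(y):
        raise ValueError("x and y must be non-empty and of equal length")
    joint = Counter(zip(x, y))   # observed counts per (covariate level, class)
    x_counts = Counter(x)        # marginal counts of the covariate
    y_counts = Counter(y)        # marginal counts of the response
    stat = 0.0
    for a, n_a in x_counts.items():
        for b, n_b in y_counts.items():
            expected = n_a * n_b / n          # count expected under independence
            observed = joint.get((a, b), 0)
            stat += (observed - expected) ** 2 / expected
    return stat


def screen_features(columns, y, top_d):
    """Rank covariates by chi-square statistic and keep the top_d indices.

    top_d is an assumed, user-chosen cutoff.
    """
    scores = [pearson_chi_square(col, y) for col in columns]
    ranked = sorted(range(len(columns)), key=lambda j: scores[j], reverse=True)
    return ranked[:top_d], scores


# Toy data (made up for illustration): a binary response and three covariates.
y = [0, 0, 1, 1, 0, 0, 1, 1]
columns = [
    [0, 0, 1, 1, 0, 0, 1, 1],  # identical to y: strongest association
    [0, 1, 0, 1, 0, 1, 0, 1],  # balanced against y: no association
    [0, 0, 1, 1, 0, 1, 1, 1],  # agrees with y in all but one observation
]
selected, scores = screen_features(columns, y, top_d=2)
print(selected, scores)
```

One way to screen an interaction under the same sketch, offered as an assumption rather than the paper's definition, is to treat a pair of covariates as a single composite categorical variable, for example `list(zip(columns[0], columns[1]))`, and pass it to `pearson_chi_square` unchanged.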
Year: 2014 PMID: 25328278 PMCID: PMC4197855 DOI: 10.1080/07350015.2013.863158
Source DB: PubMed Journal: J Bus Econ Stat ISSN: 0735-0015 Impact factor: 6.565