
Feature Screening via Distance Correlation Learning.

Runze Li, Wei Zhong, Liping Zhu.

Abstract

This paper is concerned with screening features in ultrahigh dimensional data analysis, which has become increasingly important in diverse scientific fields. We develop a sure independence screening procedure based on the distance correlation (DC-SIS, for short). The DC-SIS can be implemented as easily as the sure independence screening procedure based on the Pearson correlation (SIS, for short) proposed by Fan and Lv (2008), yet it can substantially improve on the SIS. Fan and Lv (2008) established the sure screening property for the SIS under linear models, whereas the property holds for the DC-SIS under more general settings that include linear models as a special case. Furthermore, implementing the DC-SIS does not require specifying a model (e.g., a linear model or a generalized linear model) for the responses or the predictors, which is a very appealing property in ultrahigh dimensional data analysis. Moreover, the DC-SIS can be used directly to screen grouped predictor variables and to handle multivariate response variables. We establish the sure screening property for the DC-SIS and conduct simulations to examine its finite sample performance. Numerical comparisons indicate that the DC-SIS performs much better than the SIS in various models. We also illustrate the DC-SIS through a real data example.
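
The screening idea summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the standard sample distance correlation (the V-statistic form built from double-centered pairwise distance matrices), ranks predictors by that statistic, and keeps the top `d`. The function names `distance_correlation` and `dc_sis`, the toy data, and the cutoff `d = n / log(n)` are assumptions made for illustration only.

```python
import numpy as np


def _centered_dist(a):
    """Double-centered matrix of pairwise Euclidean distances between rows of a."""
    a = np.asarray(a, dtype=float)
    if a.ndim == 1:
        a = a[:, None]  # treat a vector as n observations of one variable
    d = np.sqrt(((a[:, None, :] - a[None, :, :]) ** 2).sum(axis=-1))
    return d - d.mean(axis=0, keepdims=True) - d.mean(axis=1, keepdims=True) + d.mean()


def distance_correlation(x, y):
    """Sample distance correlation between x and y (rows are observations)."""
    A = _centered_dist(x)
    B = _centered_dist(y)
    dcov2 = (A * B).mean()                      # squared sample distance covariance
    denom = np.sqrt((A * A).mean() * (B * B).mean())
    if denom == 0.0:                            # a constant variable carries no signal
        return 0.0
    return float(np.sqrt(max(dcov2, 0.0) / denom))


def dc_sis(X, y, d):
    """Rank the columns of X by distance correlation with y and keep the top d.

    Returns (indices of retained columns, score of every column).
    """
    scores = np.array([distance_correlation(X[:, j], y) for j in range(X.shape[1])])
    keep = np.argsort(-scores)[:d]
    return keep, scores


if __name__ == "__main__":
    # Toy data (assumed, not from the paper): y depends on column 0 only,
    # and nonlinearly, so no linear model is specified anywhere.
    rng = np.random.default_rng(0)
    n, p = 200, 50
    X = rng.standard_normal((n, p))
    y = X[:, 0] ** 2 + 0.1 * rng.standard_normal(n)

    d = int(n / np.log(n))  # assumed cutoff, chosen here for illustration
    keep, scores = dc_sis(X, y, d)
    print("column 0 retained:", 0 in keep)
```

Because `_centered_dist` accepts matrices as well as vectors, the same function can score a group of predictors or a multivariate response by passing several columns at once, which mirrors the grouped and multivariate use mentioned in the abstract.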

Keywords:  Distance correlation; sure screening property; ultrahigh dimensionality; variable selection

Year:  2012    PMID:  25249709    PMCID:  PMC4170057    DOI:  10.1080/01621459.2012.695654

Source DB:  PubMed    Journal:  J Am Stat Assoc    ISSN:  0162-1459    Impact factor:  5.033


