Estimation of the false discovery proportion with unknown dependence.

Jianqing Fan, Xu Han.

Abstract

Large-scale multiple testing with correlated test statistics arises frequently in many areas of scientific research. Incorporating correlation information in approximating the false discovery proportion (FDP) has attracted increasing attention in recent years. When the covariance matrix of the test statistics is known, Fan, Han & Gu (2012) provided an accurate approximation of the FDP under an arbitrary dependence structure and a sparsity assumption. However, the covariance matrix is unknown in many applications, and the dependence information has to be estimated before approximating the FDP. The estimation accuracy can greatly affect the FDP approximation. In the current paper, we study theoretically the impact of unknown dependence on the testing procedure and establish a general framework under which the FDP can be well approximated. Unknown dependence affects the approximation of the FDP in two major ways: through the estimation of eigenvalues and eigenvectors, and through the estimation of marginal variances. To address these two challenges, we first develop general requirements on estimates of eigenvalues and eigenvectors for a good approximation of the FDP. We then give conditions on the structures of covariance matrices that satisfy such requirements. Such dependence structures include banded/sparse covariance matrices and (conditional) sparse precision matrices. Within this framework, we also consider a special example to illustrate our method, in which data are sampled from an approximate factor model, which encompasses most practical situations. We provide a good approximation of the FDP by exploiting this specific dependence structure. The results are further generalized to the situation where the multivariate normality assumption is relaxed. Our results are demonstrated by simulation studies and some real data applications.
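To make the quantity in the abstract concrete, the following is a minimal simulation sketch, not the authors' estimator. It computes the realized FDP, the number of false rejections V(t) divided by the total number of rejections R(t) at a p-value threshold t, for z-statistics drawn from a one-factor model. The function names, the equal factor loadings, the signal strength, and all default values are illustrative assumptions and do not come from the paper.

```python
import math
import numpy as np


def two_sided_pvalues(z):
    """Two-sided p-values for z-statistics: P(|N(0,1)| > |z|) = erfc(|z| / sqrt(2))."""
    return np.array([math.erfc(abs(v) / math.sqrt(2.0)) for v in z])


def false_discovery_proportion(pvalues, is_null, t):
    """Realized FDP(t) = V(t) / R(t), with the convention FDP = 0 when R(t) = 0."""
    rejected = pvalues <= t
    n_rejected = int(rejected.sum())
    if n_rejected == 0:
        return 0.0
    n_false = int((rejected & is_null).sum())
    return n_false / n_rejected


def simulate_fdp(p=1000, n_signal=50, signal=3.0, loading=0.6,
                 t=0.01, n_rep=200, seed=0):
    """Simulate realized FDP values under a hypothetical one-factor model.

    Z_i = mu_i + loading * W + K_i, where W ~ N(0, 1) is a common factor and
    K_i ~ N(0, 1 - loading^2) is idiosyncratic noise, so each Z_i has unit
    variance and any two statistics have correlation loading^2.
    """
    rng = np.random.default_rng(seed)
    is_null = np.ones(p, dtype=bool)
    is_null[:n_signal] = False              # first n_signal hypotheses are non-null
    mu = np.where(is_null, 0.0, signal)
    fdps = np.empty(n_rep)
    for r in range(n_rep):
        w = rng.standard_normal()           # common factor shared by all tests
        k = rng.standard_normal(p) * math.sqrt(1.0 - loading ** 2)
        z = mu + loading * w + k
        fdps[r] = false_discovery_proportion(two_sided_pvalues(z), is_null, t)
    return fdps


if __name__ == "__main__":
    independent = simulate_fdp(loading=0.0)
    dependent = simulate_fdp(loading=0.6)
    print("std of FDP, independent statistics:", independent.std())
    print("std of FDP, one-factor dependence: ", dependent.std())
```

In this toy setting the realized FDP varies much more from one replication to the next when the statistics share a common factor, which illustrates why an approximation of the FDP that ignores dependence can be unreliable.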

Keywords:  Large-scale multiple testing; approximate factor model; dependent test statistics; false discovery proportion; unknown covariance matrix

Year:  2016        PMID: 29056863      PMCID: PMC5648371          DOI: 10.1111/rssb.12204

Source DB:  PubMed          Journal:  J R Stat Soc Series B Stat Methodol        ISSN: 1369-7412            Impact factor:   4.488


  8 in total

1.  HIGH DIMENSIONAL COVARIANCE MATRIX ESTIMATION IN APPROXIMATE FACTOR MODELS.

Authors:  Jianqing Fan; Yuan Liao; Martina Mincheva
Journal:  Ann Stat       Date:  2011-01-01       Impact factor: 4.028

2.  A shrinkage approach to large-scale covariance matrix estimation and implications for functional genomics.

Authors:  Juliane Schäfer; Korbinian Strimmer
Journal:  Stat Appl Genet Mol Biol       Date:  2005-11-14

3.  The effect of correlation in false discovery rate estimation.

Authors:  Armin Schwartzman; Xihong Lin
Journal:  Biometrika       Date:  2011-03       Impact factor: 2.445

4.  Correlated z-values and the accuracy of large-scale statistical estimates.

Authors:  Bradley Efron
Journal:  J Am Stat Assoc       Date:  2010-09-01       Impact factor: 5.033

5.  Gene-expression profiles in hereditary breast cancer.

Authors:  I Hedenfalk; D Duggan; Y Chen; M Radmacher; M Bittner; R Simon; P Meltzer; B Gusterson; M Esteller; O P Kallioniemi; B Wilfond; A Borg; J Trent; M Raffeld; Z Yakhini; A Ben-Dor; E Dougherty; J Kononen; L Bubendorf; W Fehrle; S Pittaluga; S Gruvberger; N Loman; O Johannsson; H Olsson; G Sauter
Journal:  N Engl J Med       Date:  2001-02-22       Impact factor: 91.245

6.  The Empirical Distribution of a Large Number of Correlated Normal Variables.

Authors:  David Azriel; Armin Schwartzman
Journal:  J Am Stat Assoc       Date:  2014-09-25       Impact factor: 5.033

7.  Estimating False Discovery Proportion Under Arbitrary Covariance Dependence.

Authors:  Jianqing Fan; Xu Han; Weijie Gu
Journal:  J Am Stat Assoc       Date:  2012       Impact factor: 5.033

8.  Large Covariance Estimation by Thresholding Principal Orthogonal Complements.

Authors:  Jianqing Fan; Yuan Liao; Martina Mincheva
Journal:  J R Stat Soc Series B Stat Methodol       Date:  2013-09-01       Impact factor: 4.488

  7 in total

1.  Robust high dimensional factor models with applications to statistical machine learning.

Authors:  Jianqing Fan; Kaizheng Wang; Yiqiao Zhong; Ziwei Zhu
Journal:  Stat Sci       Date:  2021-04-19       Impact factor: 2.901

2.  CONFOUNDER ADJUSTMENT IN MULTIPLE HYPOTHESIS TESTING.

Authors:  Jingshu Wang; Qingyuan Zhao; Trevor Hastie; Art B Owen
Journal:  Ann Stat       Date:  2017-10-31       Impact factor: 4.028

3.  Estimating and accounting for unobserved covariates in high-dimensional correlated data.

Authors:  Chris McKennan; Dan Nicolae
Journal:  J Am Stat Assoc       Date:  2020-06-30       Impact factor: 4.369

4.  Large-Scale Simultaneous Testing of Cross-Covariance Matrices with Applications to PheWAS.

Authors:  Tianxi Cai; T Tony Cai; Katherine Liao; Weidong Liu
Journal:  Stat Sin       Date:  2019-04       Impact factor: 1.261

5.  Cauchy combination test: a powerful test with analytic p-value calculation under arbitrary dependency structures.

Authors:  Yaowu Liu; Jun Xie
Journal:  J Am Stat Assoc       Date:  2019-04-25       Impact factor: 5.033

6.  A NEW PERSPECTIVE ON ROBUST M-ESTIMATION: FINITE SAMPLE THEORY AND APPLICATIONS TO DEPENDENCE-ADJUSTED MULTIPLE TESTING.

Authors:  Wen-Xin Zhou; Koushiki Bose; Jianqing Fan; Han Liu
Journal:  Ann Stat       Date:  2018-08-17       Impact factor: 4.028

7.  Accounting for unobserved covariates with varying degrees of estimability in high-dimensional biological data.

Authors:  Chris McKennan; Dan Nicolae
Journal:  Biometrika       Date:  2019-09-16       Impact factor: 3.028
