
HYPOTHESIS TESTING FOR HIGH-DIMENSIONAL SPARSE BINARY REGRESSION.

Rajarshi Mukherjee, Natesh S Pillai, Xihong Lin.

Abstract

In this paper, we study the detection boundary for minimax hypothesis testing in the context of high-dimensional, sparse binary regression models. Motivated by genetic sequencing association studies for rare variant effects, we investigate the complexity of the hypothesis testing problem when the design matrix is sparse. We observe a new phenomenon in the behavior of the detection boundary which does not occur in the case of Gaussian linear regression. We derive the detection boundary as a function of two components: a design matrix sparsity index and signal strength, each of which is a function of the sparsity of the alternative. For any alternative, if the design matrix sparsity index is too high, any test is asymptotically powerless irrespective of the magnitude of the signal strength. For binary design matrices with a sparsity index that is not too high, our results are parallel to those in the Gaussian case. In this context, we derive detection boundaries for both dense and sparse regimes. For the dense regime, we show that the generalized likelihood ratio test is rate optimal; for the sparse regime, we propose an extended Higher Criticism test and show it is rate optimal and sharp. We illustrate the finite sample properties of the theoretical results using simulation studies.

Keywords:  Higher Criticism; Minimax hypothesis testing; binary regression; detection boundary; sparsity

Year:  2015        PMID: 26246645      PMCID: PMC4522432          DOI: 10.1214/14-AOS1279

Source DB:  PubMed          Journal:  Ann Stat        ISSN: 0090-5364            Impact factor:   4.028


  7 in total

1.  A large-scale screen for coding variants predisposing to psoriasis.

Authors:  Huayang Tang; Xin Jin; Yang Li; Hui Jiang; Xianfa Tang; Xu Yang; Hui Cheng; Ying Qiu; Gang Chen; Junpu Mei; Fusheng Zhou; Renhua Wu; Xianbo Zuo; Yong Zhang; Xiaodong Zheng; Qi Cai; Xianyong Yin; Cheng Quan; Haojing Shao; Yong Cui; Fangzhen Tian; Xia Zhao; Hong Liu; Fengli Xiao; Fengping Xu; Jianwen Han; Dongmei Shi; Anping Zhang; Cheng Zhou; Qibin Li; Xing Fan; Liya Lin; Hongqing Tian; Zaixing Wang; Huiling Fu; Fang Wang; Baoqi Yang; Shaowei Huang; Bo Liang; Xuefeng Xie; Yunqing Ren; Qingquan Gu; Guangdong Wen; Yulin Sun; Xueli Wu; Lin Dang; Min Xia; Junjun Shan; Tianhang Li; Lin Yang; Xiuyun Zhang; Yuzhen Li; Chundi He; Aie Xu; Liping Wei; Xiaohang Zhao; Xinghua Gao; Jinhua Xu; Furen Zhang; Jianzhong Zhang; Yingrui Li; Liangdan Sun; Jianjun Liu; Runsheng Chen; Sen Yang; Jun Wang; Xuejun Zhang
Journal:  Nat Genet       Date:  2013-11-10       Impact factor: 38.330

Review 2.  Rare-variant association analysis: study designs and statistical tests.

Authors:  Seunggeung Lee; Gonçalo R Abecasis; Michael Boehnke; Xihong Lin
Journal:  Am J Hum Genet       Date:  2014-07-03       Impact factor: 11.025

3.  An abundance of rare functional variants in 202 drug target genes sequenced in 14,002 people.

Authors:  Matthew R Nelson; Daniel Wegmann; Margaret G Ehm; Darren Kessner; Pamela St Jean; Claudio Verzilli; Judong Shen; Zhengzheng Tang; Silviu-Alin Bacanu; Dana Fraser; Liling Warren; Jennifer Aponte; Matthew Zawistowski; Xiao Liu; Hao Zhang; Yong Zhang; Jun Li; Yun Li; Li Li; Peter Woollard; Simon Topp; Matthew D Hall; Keith Nangle; Jun Wang; Gonçalo Abecasis; Lon R Cardon; Sebastian Zöllner; John C Whittaker; Stephanie L Chissoe; John Novembre; Vincent Mooser
Journal:  Science       Date:  2012-05-17       Impact factor: 47.728

4.  The Dallas Heart Study: a population-based probability sample for the multidisciplinary study of ethnic differences in cardiovascular health.

Authors:  Ronald G Victor; Robert W Haley; DuWayne L Willett; Ronald M Peshock; Patrice C Vaeth; David Leonard; Mujeeb Basit; Richard S Cooper; Vincent G Iannacchione; Wendy A Visscher; Jennifer M Staab; Helen H Hobbs
Journal:  Am J Cardiol       Date:  2004-06-15       Impact factor: 2.778

5.  HYPOTHESIS TESTING FOR HIGH-DIMENSIONAL SPARSE BINARY REGRESSION.

Authors:  Rajarshi Mukherjee; Natesh S Pillai; Xihong Lin
Journal:  Ann Stat       Date:  2015-02       Impact factor: 4.028

6.  An integrated map of genetic variation from 1,092 human genomes.

Authors:  Goncalo R Abecasis; Adam Auton; Lisa D Brooks; Mark A DePristo; Richard M Durbin; Robert E Handsaker; Hyun Min Kang; Gabor T Marth; Gil A McVean
Journal:  Nature       Date:  2012-11-01       Impact factor: 49.962

7.  Analysis of 6,515 exomes reveals the recent origin of most human protein-coding variants.

Authors:  Wenqing Fu; Timothy D O'Connor; Goo Jun; Hyun Min Kang; Goncalo Abecasis; Suzanne M Leal; Stacey Gabriel; Mark J Rieder; David Altshuler; Jay Shendure; Deborah A Nickerson; Michael J Bamshad; Joshua M Akey
Journal:  Nature       Date:  2012-11-28       Impact factor: 49.962

  8 in total

1.  The Generalized Higher Criticism for Testing SNP-Set Effects in Genetic Association Studies.

Authors:  Ian Barnett; Rajarshi Mukherjee; Xihong Lin
Journal:  J Am Stat Assoc       Date:  2017-05-03       Impact factor: 5.033

2.  Group-combined P-values with applications to genetic association studies.

Authors:  Xiaonan Hu; Wei Zhang; Sanguo Zhang; Shuangge Ma; Qizhai Li
Journal:  Bioinformatics       Date:  2016-06-03       Impact factor: 6.937

3.  Optimal detection of weak positive latent dependence between two sequences of multiple tests.

Authors:  Sihai Dave Zhao; T Tony Cai; Hongzhe Li
Journal:  J Multivar Anal       Date:  2017-07-14       Impact factor: 1.473

4.  Regularized estimation in sparse high-dimensional multivariate regression, with application to a DNA methylation study.

Authors:  Haixiang Zhang; Yinan Zheng; Grace Yoon; Zhou Zhang; Tao Gao; Brian Joyce; Wei Zhang; Joel Schwartz; Pantel Vokonas; Elena Colicino; Andrea Baccarelli; Lifang Hou; Lei Liu
Journal:  Stat Appl Genet Mol Biol       Date:  2017-07-26

5.  Sparse simultaneous signal detection for identifying genetically controlled disease genes.

Authors:  Sihai Dave Zhao; T Tony Cai; Thomas P Cappola; Kenneth B Margulies; Hongzhe Li
Journal:  J Am Stat Assoc       Date:  2017-01-05       Impact factor: 5.033

6.  HYPOTHESIS TESTING FOR HIGH-DIMENSIONAL SPARSE BINARY REGRESSION.

Authors:  Rajarshi Mukherjee; Natesh S Pillai; Xihong Lin
Journal:  Ann Stat       Date:  2015-02       Impact factor: 4.028

7.  Global and Simultaneous Hypothesis Testing for High-Dimensional Logistic Regression Models.

Authors:  Rong Ma; T Tony Cai; Hongzhe Li
Journal:  J Am Stat Assoc       Date:  2020-01-21       Impact factor: 5.033

8.  A unifying framework for rare variant association testing in family-based designs, including higher criticism approaches, SKATs, and burden tests.

Authors:  Julian Hecker; F William Townes; Priyadarshini Kachroo; Cecelia Laurie; Jessica Lasky-Su; John Ziniti; Michael H Cho; Scott T Weiss; Nan M Laird; Christoph Lange
Journal:  Bioinformatics       Date:  2020-12-26       Impact factor: 6.937

