DISTRIBUTED TESTING AND ESTIMATION UNDER SPARSE HIGH DIMENSIONAL MODELS.

Heather Battey^1,2, Jianqing Fan^1,3, Han Liu^1, Junwei Lu^1, Ziwei Zhu^1.

Abstract

This paper studies hypothesis testing and parameter estimation in the context of the divide-and-conquer algorithm. In a unified likelihood-based framework, we propose new test statistics and point estimators obtained by aggregating various statistics from k subsamples of size n/k, where n is the sample size. In both low-dimensional and sparse high-dimensional settings, we address the important question of how large k can be, as n grows large, such that the loss of efficiency due to the divide-and-conquer algorithm is negligible. In other words, the resulting estimators have the same inferential efficiencies and estimation rates as an oracle with access to the full sample. Thorough numerical results are provided to back up the theory.
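The aggregation idea in the abstract can be illustrated with a minimal sketch. It is not the paper's procedure: it assumes a low-dimensional simple linear regression, uses plain averaging of subsample least-squares slopes, and omits the debiasing and thresholding steps that the sparse high-dimensional setting requires. The function names `ols_slope` and `divide_and_conquer_slope`, the simulated data, and the values of n and k are all hypothetical choices made for illustration.

```python
import random
from statistics import fmean


def ols_slope(xs, ys):
    """Least-squares slope of y on x (model with an intercept)."""
    x_bar, y_bar = fmean(xs), fmean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return sxy / sxx


def divide_and_conquer_slope(xs, ys, k):
    """Average the OLS slopes computed on k subsamples of size n // k."""
    m = len(xs) // k  # subsample size; a remainder n % k is dropped in this sketch
    slopes = [
        ols_slope(xs[j * m:(j + 1) * m], ys[j * m:(j + 1) * m])
        for j in range(k)
    ]
    return fmean(slopes)


# Simulated data (assumption): y = 2 x + standard normal noise.
rng = random.Random(0)
n, k, true_slope = 10_000, 20, 2.0
xs = [rng.gauss(0, 1) for _ in range(n)]
ys = [true_slope * x + rng.gauss(0, 1) for x in xs]

full = ols_slope(xs, ys)                 # "oracle" with access to the full sample
dc = divide_and_conquer_slope(xs, ys, k)  # aggregated subsample estimator

print(f"full-sample slope: {full:.4f}")
print(f"divide-and-conquer slope (k={k}): {dc:.4f}")
```

With k small relative to n, the two printed estimates should be close to each other and to the true slope. Increasing k while holding n fixed shrinks each subsample, which is the trade-off the abstract refers to when it asks how large k can be.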

Keywords: Divide and conquer; debiasing; massive data; thresholding

Subject classification: Primary 62F05, 62F10; secondary 62F12

Year: 2018 PMID: 30034040 PMCID: PMC6051757 DOI: 10.1214/17-AOS1587

Source DB: PubMed Journal: Ann Stat ISSN: 0090-5364 Impact factor: 4.028

5 in total

1. Non-Concave Penalized Likelihood with NP-Dimensionality.

Authors: Jianqing Fan; Jinchi Lv
Journal: IEEE Trans Inf Theory Date: 2011-08 Impact factor: 2.501

2. Variance estimation using refitted cross-validation in ultrahigh dimensional regression.

Authors: Jianqing Fan; Shaojun Guo; Ning Hao
Journal: J R Stat Soc Series B Stat Methodol Date: 2012-01-01 Impact factor: 4.488

3. A PARTIALLY LINEAR FRAMEWORK FOR MASSIVE HETEROGENEOUS DATA.

Authors: Tianqi Zhao; Guang Cheng; Han Liu
Journal: Ann Stat Date: 2016-07-07 Impact factor: 4.028

4. OPTIMAL COMPUTATIONAL AND STATISTICAL RATES OF CONVERGENCE FOR SPARSE NONCONVEX LEARNING PROBLEMS.

Authors: Zhaoran Wang; Han Liu; Tong Zhang
Journal: Ann Stat Date: 2014 Impact factor: 4.028

5. Challenges of Big Data Analysis.

Authors: Jianqing Fan; Fang Han; Han Liu
Journal: Natl Sci Rev Date: 2014-06 Impact factor: 17.275

5 in total

1. dPQL: a lossless distributed algorithm for generalized linear mixed model with application to privacy-preserving hospital profiling.

Authors: Chongliang Luo; Md Nazmul Islam; Natalie E Sheils; John Buresh; Martijn J Schuemie; Jalpa A Doshi; Rachel M Werner; David A Asch; Yong Chen
Journal: J Am Med Inform Assoc Date: 2022-07-12 Impact factor: 7.942

2. Distributed Simultaneous Inference in Generalized Linear Models via Confidence Distribution.

Authors: Lu Tang; Ling Zhou; Peter X-K Song
Journal: J Multivar Anal Date: 2019-11-28 Impact factor: 1.473

3. Sampling-based estimation for massive survival data with additive hazards model.

Authors: Lulu Zuo; Haixiang Zhang; HaiYing Wang; Lei Liu
Journal: Stat Med Date: 2020-11-03 Impact factor: 2.373

4. ODAL: A one-shot distributed algorithm to perform logistic regressions on electronic health records data from multiple clinical sites.

Authors: Rui Duan; Mary Regina Boland; Jason H Moore; Yong Chen
Journal: Pac Symp Biocomput Date: 2019

5. Multisite learning of high-dimensional heterogeneous data with applications to opioid use disorder study of 15,000 patients across 5 clinical sites.

Authors: Xiaokang Liu; Rui Duan; Chongliang Luo; Alexis Ogdie; Jason H Moore; Henry R Kranzler; Jiang Bian; Yong Chen
Journal: Sci Rep Date: 2022-06-30 Impact factor: 4.996