A novel case-control subsampling approach for rapid model exploration of large clustered binary data.

Stephen T Wright, Louise M Ryan, Tung Pham.

Abstract

In many settings, an analysis goal is the identification of a factor, or set of factors, associated with an event or outcome. Often, these associations are then used for inference and prediction. Unfortunately, in the big data era, the model building and exploration phases of analysis can be time-consuming, especially if constrained by computing power (i.e., that of a typical corporate workstation). To speed up this model development, we propose a novel subsampling scheme to enable rapid model exploration of clustered binary data using flexible yet complex model set-ups (GLMMs with additive smoothing splines). By reframing the binary response prospective cohort study into a case-control-type design, and using our knowledge of the sampling fractions, we show that one can approximate the model estimates that a full cohort analysis would produce. This idea is extended to derive cluster-specific sampling fractions and thereby incorporate cluster variation into an analysis. Importantly, we demonstrate that previously computationally prohibitive analyses can be conducted in a timely manner on a typical workstation. The approach is applied to analysing risk factors associated with adverse reactions relating to blood donation.
Copyright © 2017 John Wiley & Sons, Ltd.
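
The core idea of the abstract, recovering full-cohort estimates from a case-control-type subsample with known sampling fractions, can be illustrated with a deliberately simplified sketch. It is not the authors' method: it assumes a plain logistic regression with no random effects or splines, a single global control sampling fraction rather than cluster-specific ones, and simulated data. All names (`fit_logistic`, `case_control_subsample`, `control_fraction`) are hypothetical. The sketch relies on the standard result that, when every case is kept and controls are sampled with probability f, adding the offset log(1/f) to the linear predictor restores the cohort intercept while leaving the slopes unchanged.

```python
import numpy as np

def log_lik(beta, X, y, offset):
    """Bernoulli log-likelihood under a logit link with a fixed offset."""
    eta = X @ beta + offset
    return np.sum(y * eta - np.logaddexp(0.0, eta))

def fit_logistic(X, y, offset=None, n_iter=50, tol=1e-10):
    """Newton-Raphson logistic regression with step-halving."""
    n, p = X.shape
    if offset is None:
        offset = np.zeros(n)
    beta = np.zeros(p)
    for _ in range(n_iter):
        eta = X @ beta + offset
        mu = np.exp(-np.logaddexp(0.0, -eta))  # numerically stable expit
        w = mu * (1.0 - mu)
        grad = X.T @ (y - mu)
        hess = X.T @ (X * w[:, None])
        step = np.linalg.solve(hess, grad)
        # Halve the step until the log-likelihood no longer decreases.
        ll_old, t = log_lik(beta, X, y, offset), 1.0
        while log_lik(beta + t * step, X, y, offset) < ll_old and t > 1e-8:
            t *= 0.5
        beta = beta + t * step
        if np.max(np.abs(t * step)) < tol:
            break
    return beta

def case_control_subsample(y, control_fraction, rng):
    """Keep every case (y == 1) and a random fraction of controls (y == 0)."""
    keep = (y == 1) | (rng.random(y.shape[0]) < control_fraction)
    return np.flatnonzero(keep)

# Simulated cohort with a rare binary outcome (assumed values, illustration only).
rng = np.random.default_rng(0)
n = 200_000
X = np.column_stack([np.ones(n), rng.normal(size=n)])
true_beta = np.array([-4.0, 0.7])
prob = np.exp(-np.logaddexp(0.0, -(X @ true_beta)))
y = (rng.random(n) < prob).astype(float)

control_fraction = 0.05
idx = case_control_subsample(y, control_fraction, rng)

# Offset = log(P(sampled | case) / P(sampled | control)) = log(1 / f).
offset = np.full(idx.size, np.log(1.0 / control_fraction))

beta_full = fit_logistic(X, y)                         # full-cohort fit
beta_naive = fit_logistic(X[idx], y[idx])              # subsample, uncorrected
beta_sub = fit_logistic(X[idx], y[idx], offset=offset) # subsample, corrected

print("rows used:", idx.size, "of", n)
print("full cohort:         ", beta_full)
print("subsample, naive:    ", beta_naive)
print("subsample, corrected:", beta_sub)
```

In this sketch the uncorrected subsample fit differs from the corrected one only in its intercept, by exactly log(1/f). A cluster-specific variant would replace the constant offset with one value per cluster, which is the direction the abstract describes but which this example does not implement.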

Keywords:  case-control; outcome dependent design; penalised quasi-likelihood; subsampling big data

Year:  2017        PMID: 29230851     DOI: 10.1002/sim.7543

Source DB:  PubMed          Journal:  Stat Med        ISSN: 0277-6715            Impact factor:   2.373
