Optimal sampling for design-based estimators of regression models.

Tong Chen, Thomas Lumley.

Abstract

Two-phase designs measure variables of interest on a subcohort where the outcome and covariates are readily available or cheap to collect on all individuals in the cohort. Given limited resources, it is of interest to find an optimal design that includes more informative individuals in the final sample. We explore the optimal designs and efficiencies for analyses by design-based estimators. Generalized raking estimators form an efficient class of design-based estimators; they improve on the inverse-probability weighted (IPW) estimator by adjusting weights based on auxiliary information. We derive a closed-form solution of the optimal design for estimating regression coefficients from generalized raking estimators. We compare it with the optimal design for analysis via the IPW estimator and with other two-phase designs in measurement-error settings. We consider general two-phase designs where the outcome variable and variables of interest can be continuous or discrete. Our results show that the optimal designs for analyses by the two classes of design-based estimators can be very different. The design that is optimal for analysis via the IPW estimator typically also gives near-optimal efficiency for generalized raking estimation, though we show there is potential improvement in some settings.
© 2022 John Wiley & Sons Ltd.
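
As a rough illustration of the kind of allocation rule the keywords point to, the sketch below implements textbook Neyman allocation: a fixed phase-two sample size is split across strata in proportion to N_h * S_h, where N_h is the stratum size and S_h a within-stratum standard deviation. This is not the closed-form design derived in the paper. The function name `neyman_allocation`, the rounding rule, and the example numbers are assumptions made for illustration. In the paper's setting S_h would plausibly be computed from influence functions (for IPW) or from their residuals given the auxiliary information (for generalized raking), but that mapping is also an assumption here.

```python
def neyman_allocation(stratum_sizes, stratum_sds, n_total):
    """Split n_total draws across strata in proportion to N_h * S_h.

    Hypothetical helper for illustration only. It does not cap n_h at N_h,
    so it assumes n_total is small relative to the stratum sizes.
    """
    weights = [N * s for N, s in zip(stratum_sizes, stratum_sds)]
    total = sum(weights)
    if total <= 0:
        raise ValueError("at least one stratum needs a positive N_h * S_h")

    # Real-valued allocation, then round down.
    raw = [n_total * w / total for w in weights]
    alloc = [int(r) for r in raw]

    # Hand the leftover draws to the strata with the largest remainders
    # so that the allocation sums exactly to n_total.
    leftover = n_total - sum(alloc)
    by_remainder = sorted(
        range(len(raw)), key=lambda h: raw[h] - alloc[h], reverse=True
    )
    for h in by_remainder[:leftover]:
        alloc[h] += 1
    return alloc


# Made-up example: three strata with increasing variability.
example = neyman_allocation([500, 300, 200], [1.0, 2.0, 4.0], 100)
print(example)
```

With equal standard deviations in every stratum the rule reduces to proportional allocation; the more variable a stratum, the larger its share of the phase-two sample.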

Keywords:  Neyman allocation; generalized raking; influence function; model-assisted sampling; optimal design; residual; two-phase sampling

Year:  2022        PMID: 34989429      PMCID: PMC8918008          DOI: 10.1002/sim.9300

Source DB:  PubMed          Journal:  Stat Med        ISSN: 0277-6715            Impact factor:   2.373


  18 in total

1.  Using the whole cohort in the analysis of countermatched samples.

Authors:  C Rivera; T Lumley
Journal:  Biometrics       Date:  2015-09-22       Impact factor: 2.571

2.  Doubly robust estimation in missing data and causal inference models.

Authors:  Heejung Bang; James M Robins
Journal:  Biometrics       Date:  2005-12       Impact factor: 2.571

3.  Connections between survey calibration estimators and semiparametric models for incomplete data.

Authors:  Thomas Lumley; Pamela A Shaw; James Y Dai
Journal:  Int Stat Rev       Date:  2011-08       Impact factor: 2.217

4.  Two-Phase Sampling Designs for Data Validation in Settings with Covariate Measurement Error and Continuous Outcome.

Authors:  Gustavo Amorim; Ran Tao; Sarah Lotspeich; Pamela A Shaw; Thomas Lumley; Bryan E Shepherd
Journal:  J R Stat Soc Ser A Stat Soc       Date:  2021-04-15       Impact factor: 2.175

5.  Constrained Maximum Likelihood Estimation for Model Calibration Using Summary-level Information from External Big Data Sources.

Authors:  Nilanjan Chatterjee; Yi-Hau Chen; Paige Maas; Raymond J Carroll
Journal:  J Am Stat Assoc       Date:  2016-05-05       Impact factor: 5.033

6.  Combining multiple imputation with raking of weights: An efficient and robust approach in the setting of nearly true models.

Authors:  Kyunghee Han; Pamela A Shaw; Thomas Lumley
Journal:  Stat Med       Date:  2021-09-28       Impact factor: 2.373

7.  Two-phase analysis and study design for survival models with error-prone exposures.

Authors:  Kyunghee Han; Thomas Lumley; Bryan E Shepherd; Pamela A Shaw
Journal:  Stat Methods Med Res       Date:  2020-12-16       Impact factor: 2.494

8.  Adaptive sampling in two-phase designs: a biomarker study for progression in arthritis.

Authors:  Michael A McIsaac; Richard J Cook
Journal:  Stat Med       Date:  2015-05-07       Impact factor: 2.373

9.  Optimal multiwave sampling for regression modeling in two-phase designs.

Authors:  Tong Chen; Thomas Lumley
Journal:  Stat Med       Date:  2020-10-05       Impact factor: 2.373
