Eric J Oh, Bryan E Shepherd, Thomas Lumley, Pamela A Shaw.
Abstract
Biomedical studies that use electronic health records (EHR) data for inference are often subject to bias due to measurement error. The measurement error present in EHR data is typically complex, consisting of errors of unknown functional form in covariates and the outcome, which can be dependent. To address the bias resulting from such errors, generalized raking has recently been proposed as a robust method that yields consistent estimates without the need to model the error structure. We provide a rationale for why these previously proposed raking estimators can be expected to be inefficient in failure-time outcome settings involving misclassification of the event indicator. To improve efficiency, we propose raking estimators that use multiple imputation to impute either the target variables or the auxiliary variables. We also consider outcome-dependent sampling designs and investigate their impact on the efficiency of the raking estimators, both with and without multiple imputation. We present an extensive numerical study to examine the performance of the proposed estimators across various measurement error settings. We then apply the proposed methods to our motivating setting, in which we seek to analyze HIV outcomes in an observational cohort with EHR data from the Vanderbilt Comprehensive Care Clinic.
Keywords: electronic health records; generalized raking; measurement error; misclassification; survival analysis
Year: 2021; PMID: 33709462; PMCID: PMC8211389; DOI: 10.1002/bimj.202000187
Source DB: PubMed; Journal: Biom J; ISSN: 0323-3847; Impact factor: 1.715