Bryan E Shepherd (1), Pamela A Shaw (2). 1. Biostatistics, Vanderbilt University, 2525 West End, Suite 11000, Nashville, Tennessee 37203, USA. 2. Biostatistics, Epidemiology, and Informatics, University of Pennsylvania, Philadelphia, Pennsylvania, USA.
Abstract
Objectives: Observational data derived from patient electronic health records (EHR) are increasingly used for human immunodeficiency virus/acquired immunodeficiency syndrome (HIV/AIDS) research. There are challenges to using these data, in particular with regard to data quality; some are recognized, some unrecognized, and some recognized but ignored. There are great opportunities for the statistical community to improve inference by incorporating validation subsampling into analyses of EHR data. Methods: Methods to address measurement error, misclassification, and missing data are relevant, as are sampling designs such as two-phase sampling. However, many existing statistical methods for measurement error, for example, address only relatively simple settings, whereas the errors seen in these datasets span multiple variables (both predictors and outcomes), are correlated, and even affect who is included in the study. Results/Conclusion: We will discuss some preliminary methods in this area, with a particular focus on time-to-event outcomes, and outline areas of future research.