
Inflation of type I error rates due to differential misclassification in EHR-derived outcomes: Empirical illustration using breast cancer recurrence.

Yong Chen, Jianqiao Wang, Jessica Chubak, Rebecca A Hubbard.

Abstract

PURPOSE: Many outcomes derived from electronic health records (EHR) not only are imperfect but also may suffer from exposure-dependent differential misclassification due to variability in the quality and availability of EHR data across exposure groups. The objective of this study was to quantify the inflation of type I error rates that can result from differential outcome misclassification.
METHODS: We used data on gold-standard and EHR-derived second breast cancers in a cohort of women with a prior breast cancer diagnosis from 1993 to 2006 enrolled in Kaiser Permanente Washington. We simulated an exposure that was independent of the true outcome status. A surrogate outcome was then simulated with varying sensitivity and specificity according to exposure status. We estimated the type I error rate for a test of association relating this exposure to the surrogate outcome, while varying outcome sensitivity and specificity in exposed individuals.
RESULTS: Type I error rates were substantially inflated above the nominal level (5%) even for modest departures from nondifferential misclassification. With sensitivity held at 85% in both the exposed and unexposed groups, a 10-percentage-point difference in specificity between the exposed and unexposed (80% vs 90%) resulted in a 36% type I error rate. Type I error was inflated more by differential specificity than by differential sensitivity.
CONCLUSIONS: Differential outcome misclassification may induce spurious findings. Researchers using EHR-derived outcomes should use misclassification-adjusted methods whenever possible or conduct sensitivity analyses to investigate the possibility of false-positive findings, especially for exposures that may be related to the accuracy of outcome ascertainment.
© 2018 John Wiley & Sons, Ltd.
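
The simulation design described in METHODS can be illustrated with a minimal sketch. It is not the authors' code: the study used gold-standard outcomes from a real cohort, whereas this sketch also simulates the true outcome. The cohort size (`n=500`), outcome prevalence (`prevalence=0.10`), exposure prevalence (`p_exposed=0.5`), the pooled two-proportion z-test, and all function names are assumptions chosen for illustration, so the estimated rates will not reproduce the 36% figure reported in RESULTS.

```python
import math
import random


def two_proportion_p_value(x1, n1, x0, n0):
    """Two-sided p-value from a pooled two-proportion z-test."""
    if n1 == 0 or n0 == 0:
        return 1.0  # one exposure group is empty, so no comparison is possible
    pooled = (x1 + x0) / (n1 + n0)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n0))
    if se == 0:
        return 1.0  # no variation in the surrogate outcome
    z = (x1 / n1 - x0 / n0) / se
    # For a standard normal Z, P(|Z| > |z|) equals erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2))


def simulate_type1_error(sens, spec, n=500, prevalence=0.10, p_exposed=0.5,
                         reps=1000, alpha=0.05, seed=0):
    """Estimate the rejection rate when exposure is independent of the true outcome.

    `sens` and `spec` map exposure status (0 = unexposed, 1 = exposed) to the
    sensitivity and specificity of the surrogate outcome in that group.
    All default values are hypothetical, not taken from the study.
    """
    rng = random.Random(seed)
    rejections = 0
    for _ in range(reps):
        positives = [0, 0]  # surrogate-positive counts by exposure status
        totals = [0, 0]     # group sizes by exposure status
        for _ in range(n):
            # Exposure is drawn independently of the true outcome (the null holds).
            exposed = 1 if rng.random() < p_exposed else 0
            truth = rng.random() < prevalence
            # Probability that the surrogate flags an event, given truth and exposure.
            p_flag = sens[exposed] if truth else 1 - spec[exposed]
            positives[exposed] += int(rng.random() < p_flag)
            totals[exposed] += 1
        p = two_proportion_p_value(positives[1], totals[1], positives[0], totals[0])
        if p < alpha:
            rejections += 1
    return rejections / reps


if __name__ == "__main__":
    nondifferential = simulate_type1_error(
        sens={0: 0.85, 1: 0.85}, spec={0: 0.90, 1: 0.90})
    differential = simulate_type1_error(
        sens={0: 0.85, 1: 0.85}, spec={0: 0.90, 1: 0.80})
    print(f"nondifferential misclassification: {nondifferential:.3f}")
    print(f"differential specificity (80% vs 90%): {differential:.3f}")
```

Under nondifferential misclassification the rejection rate should stay close to the nominal 5%, while lowering specificity only in the exposed group creates an apparent association even though exposure and true outcome are independent. How large the inflation is depends on the assumed sample size and outcome prevalence.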

Keywords:  electronic health record; misclassification; outcome; pharmacoepidemiology; phenotype; validation

Year: 2018; PMID: 30375122; PMCID: PMC6716793; DOI: 10.1002/pds.4680

Source DB: PubMed; Journal: Pharmacoepidemiol Drug Saf; ISSN: 1053-8569; Impact factor: 2.890


