Identifying and mitigating biases in EHR laboratory tests.

Rimma Pivovarov, David J Albers, Jorge L Sepulveda, Noémie Elhadad.

Abstract

Electronic health record (EHR) data show promise for deriving new ways of modeling human disease states. Although EHR researchers often use the numerical values of laboratory tests as features in disease models, a great deal of information is contained in the context in which a laboratory test is taken. For example, the same numerical value of a creatinine test has a different interpretation for a patient with chronic kidney disease than for a patient with acute kidney injury. We study whether EHR research studies are subject to biased results and interpretations if laboratory measurements taken in different contexts are not explicitly separated. We show that the context of a laboratory test measurement can often be captured by the way the test is measured through time. We perform three tasks to study the properties of these temporal measurement patterns. In the first task, we confirm that laboratory test measurement patterns provide information beyond the stand-alone numerical value. The second task identifies three measurement pattern motifs across a set of 70 laboratory tests performed for over 14,000 patients. Of these, one motif exhibits properties that can lead to biased research results. In the third task, we demonstrate the potential for biased results on a specific example: an association study of lipase test values with acute pancreatitis. We observe a diluted signal when using only a lipase value threshold, whereas the full association is recovered when properly accounting for lipase measurements in different contexts (leveraging the lipase measurement patterns to separate the contexts). Aggregating EHR data without separating distinct laboratory test measurement patterns can intermix patients with different diseases, leading to the confounding of signals in large-scale EHR analyses. This paper presents a methodology for leveraging measurement frequency to identify and reduce laboratory test biases.
Copyright © 2014 Elsevier Inc. All rights reserved.
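
The idea in the abstract of separating measurement contexts before testing an association can be illustrated with a minimal Python sketch. This is not the authors' implementation. The record layout (`days`, `values`, `diagnosed`), the median-gap rule, the `cutoff_days` default, and the function names are all assumptions introduced here for illustration. The sketch labels each patient's test series by how densely it was measured, then computes an odds ratio for "value above threshold" versus diagnosis within each label rather than over the pooled data.

```python
from statistics import median


def gaps_in_days(days):
    """Gaps between consecutive measurement days for one patient."""
    ordered = sorted(days)
    return [later - earlier for earlier, later in zip(ordered, ordered[1:])]


def measurement_context(days, cutoff_days=3.0):
    """Label a test series by measurement density.

    cutoff_days is a hypothetical threshold chosen for this sketch,
    not a value taken from the paper.
    """
    gaps = gaps_in_days(days)
    if not gaps:
        return "single"  # only one measurement, so no gap to inspect
    return "frequent" if median(gaps) <= cutoff_days else "sparse"


def odds_ratio(exposed_cases, exposed_controls,
               unexposed_cases, unexposed_controls):
    """Odds ratio of a 2x2 table, adding 0.5 to every cell so that
    empty cells do not cause a division by zero."""
    a, b, c, d = (count + 0.5 for count in (
        exposed_cases, exposed_controls,
        unexposed_cases, unexposed_controls))
    return (a * d) / (b * c)


def association_by_context(patients, value_threshold, cutoff_days=3.0):
    """Odds ratio of (max value > threshold) vs. diagnosis, per context.

    Each patient is a dict with hypothetical keys:
    'days' (list of numbers), 'values' (list of numbers),
    'diagnosed' (bool).
    """
    tables = {}
    for patient in patients:
        context = measurement_context(patient["days"], cutoff_days)
        exposed = max(patient["values"]) > value_threshold
        table = tables.setdefault(context, {})
        key = (exposed, patient["diagnosed"])
        table[key] = table.get(key, 0) + 1
    return {
        context: odds_ratio(
            table.get((True, True), 0), table.get((True, False), 0),
            table.get((False, True), 0), table.get((False, False), 0))
        for context, table in tables.items()
    }
```

Calling `association_by_context(patients, value_threshold=...)` returns one odds ratio per context label, which can be compared against the odds ratio computed on all patients together. With very small groups, the 0.5 correction dominates the estimate, so the per-context figures are only meaningful when each group holds enough patients.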

Keywords:  Bias; Confounding; Electronic health record; Information theory; Laboratory testing; Missing data

Year:  2014        PMID: 24727481      PMCID: PMC4194228          DOI: 10.1016/j.jbi.2014.03.016

Source DB:  PubMed          Journal:  J Biomed Inform        ISSN: 1532-0464            Impact factor:   6.317


  37 in total

1.  Yield and bias in defining a cohort study baseline from electronic health record data.

Authors:  Jason L Vassy; Yuk-Lam Ho; Jacqueline Honerlaw; Kelly Cho; J Michael Gaziano; Peter W F Wilson; David R Gagnon
Journal:  J Biomed Inform       Date:  2018-01-03       Impact factor: 6.317

2.  Learning Optimal Individualized Treatment Rules from Electronic Health Record Data.

Authors:  Yuanjia Wang; Peng Wu; Ying Liu; Chunhua Weng; Donglin Zeng
Journal:  IEEE Int Conf Healthc Inform       Date:  2016-12-08

Review 3.  Progress in Biomedical Knowledge Discovery: A 25-year Retrospective.

Authors:  L Sacchi; J H Holmes
Journal:  Yearb Med Inform       Date:  2016-08-02

4.  Learning probabilistic phenotypes from heterogeneous EHR data.

Authors:  Rimma Pivovarov; Adler J Perotte; Edouard Grave; John Angiolillo; Chris H Wiggins; Noémie Elhadad
Journal:  J Biomed Inform       Date:  2015-10-14       Impact factor: 6.317

Review 5.  Development and validation of early warning score system: A systematic literature review.

Authors:  Li-Heng Fu; Jessica Schwartz; Amanda Moy; Chris Knaplund; Min-Jeoung Kang; Kumiko O Schnock; Jose P Garcia; Haomiao Jia; Patricia C Dykes; Kenrick Cato; David Albers; Sarah Collins Rossetti
Journal:  J Biomed Inform       Date:  2020-04-08       Impact factor: 6.317

6.  Using Clinical Notes and Natural Language Processing for Automated HIV Risk Assessment.

Authors:  Daniel J Feller; Jason Zucker; Michael T Yin; Peter Gordon; Noémie Elhadad
Journal:  J Acquir Immune Defic Syndr       Date:  2018-02-01       Impact factor: 3.731

7.  Temporal trends of hemoglobin A1c testing.

Authors:  Rimma Pivovarov; David J Albers; George Hripcsak; Jorge L Sepulveda; Noémie Elhadad
Journal:  J Am Med Inform Assoc       Date:  2014-06-13       Impact factor: 4.497

8.  Clinical data quality: a data life cycle perspective.

Authors:  Chunhua Weng
Journal:  Biostat Epidemiol       Date:  2019-02-23

9.  Learning Personalized Treatment Rules from Electronic Health Records Using Topic Modeling Feature Extraction.

Authors:  Peng Wu; Tianchen Xu; Yuanjia Wang
Journal:  Proc Int Conf Data Sci Adv Anal       Date:  2020-01-23

Review 10.  Big Data in Nephrology.

Authors:  Navchetan Kaur; Sanchita Bhattacharya; Atul J Butte
Journal:  Nat Rev Nephrol       Date:  2021-06-30       Impact factor: 28.314
