Jeffrey S Brown, Michael Kahn, Sengwee Toh. Department of Population Medicine, Harvard Medical School and Harvard Pilgrim Health Care Institute, Boston, MA 02215, USA. jeff_brown@hphc.org
Abstract
BACKGROUND: Electronic health information routinely collected during health care delivery and reimbursement can help address the need for evidence about the real-world effectiveness, safety, and quality of medical care. Often, distributed networks that combine information from multiple sources are needed to generate this real-world evidence. OBJECTIVE: We provide a set of field-tested best practices and recommendations for data quality checking for comparative effectiveness research (CER) in distributed data networks. METHODS: We explore the requirements for data quality checking and describe data quality approaches undertaken by several existing multi-site networks. RESULTS: There are no established standards regarding how to evaluate the quality of electronic health data for CER within distributed networks. Data checks of increasing complexity are often used, ranging from consistency with syntactic rules to evaluation of semantics and consistency within and across sites. Temporal trends within and across sites are widely used, as are checks of each data refresh or update. Rates of specific events and exposures by age group, sex, and month are also common. DISCUSSION: Secondary use of electronic health data for CER holds promise but is complex, especially in distributed data networks that incorporate periodic data refreshes. The viability of a learning health system depends on a robust understanding of the quality, validity, and optimal secondary uses of routinely collected electronic health data within distributed health data networks. Robust data quality checking can strengthen confidence in findings based on distributed data networks.
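The tiered checks described in the abstract (syntactic rules, stratified event rates, and refresh-to-refresh consistency) can be illustrated with a minimal sketch. All record fields, thresholds, and function names below are hypothetical assumptions, not part of any network's actual implementation:

```python
# Hypothetical sketch of tiered data quality checks at one network site.
# Field names ("sex", "birth_year", "month") and the 50% drift tolerance
# are illustrative assumptions only.
from collections import Counter

VALID_SEX = {"F", "M", "U"}  # syntactic rule: allowed sex codes

def syntactic_check(records):
    """Tier 1: consistency with syntactic rules (valid codes, plausible ranges)."""
    errors = []
    for i, r in enumerate(records):
        if r["sex"] not in VALID_SEX:
            errors.append((i, "invalid sex code"))
        if not (1900 <= r["birth_year"] <= 2025):
            errors.append((i, "birth_year out of range"))
    return errors

def stratified_counts(records, reference_year=2024):
    """Tier 2: event counts by (age group, sex, month) for trend review.
    Age grouping is a deliberate simplification against a fixed reference year."""
    counts = Counter()
    for r in records:
        age_group = "pediatric" if reference_year - r["birth_year"] < 18 else "adult"
        counts[(age_group, r["sex"], r["month"])] += 1
    return counts

def refresh_drift(prev_counts, new_counts, tolerance=0.5):
    """Tier 3: flag strata whose counts shift by more than `tolerance`
    (relative change) between two periodic data refreshes."""
    flagged = []
    for stratum, old in prev_counts.items():
        new = new_counts.get(stratum, 0)
        if old and abs(new - old) / old > tolerance:
            flagged.append(stratum)
    return flagged
```

In a distributed network, only the aggregate stratified counts and drift flags would be shared centrally, keeping record-level data behind each site's firewall.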