Samantha Kleinberg, Noémie Elhadad.
Abstract
Electronic health records are an increasingly important source of data for research, allowing for large-scale longitudinal studies on the same population that is being treated. Unlike data from controlled studies, though, these data vary widely in quality, quantity, and structure. To know whether algorithms can accurately uncover new knowledge from these records, or whether findings can be extrapolated to new populations, both the algorithms and their findings must be validated. One approach is to conduct the same study at multiple sites and compare the results, but it is a challenge to determine whether differences are due to artifacts of the medical process, population differences, or failures of the methods used. In this paper, we describe the results of replicating a data-driven experiment to infer possible causes of congestive heart failure and their timing, using data from two medical systems and two patient populations. We focus on the difficulties faced in this type of work, lessons learned, and recommendations for future research.
Year: 2013 PMID: 24551375 PMCID: PMC3900216
Source DB: PubMed Journal: AMIA Annu Symp Proc ISSN: 1559-4076