S Garies1,2, M Cummings3, B Forst3, K McBrien1,2, B Soos1,2, M Taylor3, N Drummond1,2,3, D Manca3, K Duerksen3, H Quan2, T Williamson2.
Abstract
INTRODUCTION: Electronic medical record (EMR) databases have become increasingly popular for secondary purposes, such as health research. The Canadian Primary Care Sentinel Surveillance Network (CPCSSN) is the first and only pan-Canadian primary care EMR data repository, with de-identified health information for almost two million Canadians. Comprehensive and freely available documentation describing the data 'lifecycle' is important for assessing potential data quality issues and appropriate interpretation of research findings. Here, we describe the flow and transformation of CPCSSN data in the province of Alberta.

APPROACH: In Alberta, the data originate from 54 publicly-funded primary care settings, including one community pediatric clinic, with 318 providers contributing de-identified EMR data for 410,951 patients (as of December 2018). Data extraction methods have been developed for five different EMR systems, and include both backend and automated frontend extractions. The raw EMR data are transformed according to specific rules, including trimming implausible values, converting values and free text to standard terminologies or classification systems, and structuring the data into a common CPCSSN format. Following local data extraction and processing, the data are transferred to a central repository and made available for research and disease surveillance.
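The three transformation steps named in the abstract (trimming implausible values, converting free text to a standard classification, and structuring records into a common format) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the plausibility ranges, the free-text lookup table, the choice of ICD-9 as the target classification, and the field names are hypothetical and are not the actual CPCSSN rules or schema.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical plausibility limits for illustration only; NOT the CPCSSN rules.
PLAUSIBLE_RANGES = {
    "systolic_bp_mmHg": (40.0, 300.0),
    "height_cm": (30.0, 250.0),
}

# Hypothetical free-text lookup; ICD-9 is used here purely as an example
# of a standard classification system.
TEXT_TO_CODE = {
    "diabetes": "250",
    "type 2 diabetes": "250",
    "hypertension": "401",
    "htn": "401",
}


@dataclass
class CleanRecord:
    """Hypothetical common output format (not the real CPCSSN schema)."""
    patient_id: str
    measure: str
    value: Optional[float]
    diagnosis_code: Optional[str]


def trim_implausible(measure: str, value: float) -> Optional[float]:
    """Return the value if it lies in the assumed plausible range, else None."""
    low, high = PLAUSIBLE_RANGES[measure]
    return value if low <= value <= high else None


def code_free_text(text: str) -> Optional[str]:
    """Map normalized free text to a code; unmapped text yields None."""
    return TEXT_TO_CODE.get(text.strip().lower())


def transform(raw: dict) -> CleanRecord:
    """Apply trimming and coding, then emit the common record structure."""
    return CleanRecord(
        patient_id=raw["patient_id"],
        measure=raw["measure"],
        value=trim_implausible(raw["measure"], float(raw["value"])),
        diagnosis_code=code_free_text(raw["diagnosis_text"]),
    )
```

In this sketch, an implausible measurement is set to missing rather than dropping the whole record, so the coded diagnosis on the same row is preserved; whether CPCSSN handles such rows this way is not stated in the abstract.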
Year: 2019 PMID: 34095540 PMCID: PMC8142949 DOI: 10.23889/ijpds.v4i2.1132
Source DB: PubMed Journal: Int J Popul Data Sci ISSN: 2399-4908
Figure 1: Potential sources of bias and data quality issues in Canadian primary care EMR data. (adapted from Verheij et al. [30])