Stephanie Garies (1,2), Michael Cummings (3), Hude Quan (4), Kerry McBrien (5,4), Neil Drummond (5,4,3,6), Donna Manca (3), Tyler Williamson (4).
1. Department of Family Medicine, University of Calgary, G012 Health Sciences Centre, 3330 Hospital Drive NW, Calgary, Alberta, T2N 4N1, Canada. sgaries@ucalgary.ca.
2. Department of Community Health Sciences, University of Calgary, 3280 Hospital Drive NW, Calgary, Alberta, T2N 4Z6, Canada. sgaries@ucalgary.ca.
3. Department of Family Medicine, University of Alberta, 6-10 University Terrace, Edmonton, Alberta, T6G 2T4, Canada.
4. Department of Community Health Sciences, University of Calgary, 3280 Hospital Drive NW, Calgary, Alberta, T2N 4Z6, Canada.
5. Department of Family Medicine, University of Calgary, G012 Health Sciences Centre, 3330 Hospital Drive NW, Calgary, Alberta, T2N 4N1, Canada.
6. School of Public Health, University of Alberta, 3-300 Edmonton Clinic Health Academy, 11405-87 Ave, Edmonton, Alberta, T6G 1C9, Canada.
Abstract
BACKGROUND: Primary care electronic medical record (EMR) data are emerging as a useful source for secondary uses, such as disease surveillance, health outcomes research, and practice improvement. These data capture clinical details about patients' health status, as well as behavioural risk factors, such as smoking. While the importance of documenting smoking status in a healthcare setting is recognized, the quality of smoking data captured in EMRs is variable. This study was designed to test methods aimed at improving the quality of patient smoking information in a primary care EMR database.
METHODS: EMR data from community primary care settings, extracted by two regional practice-based research networks in Alberta, Canada, were used. Patients with at least one encounter in the previous 2 years (2016-2018) who had hypertension according to a validated definition were included (n = 48,377). Multiple imputation was tested under two different assumptions for missing data (smoking status is missing at random and missing not at random). A third method tested a novel pattern-matching algorithm developed to augment smoking information in the primary care EMR database. External validity was examined by comparing the proportions of smoking categories generated by each method with those from a general population survey.
RESULTS: Among those with hypertension, 40.8% (n = 19,743) had smoking information that was either not recorded or not interpretable, and it was therefore considered missing. Those with missing smoking data differed statistically by demographics, clinical features, and type of EMR system used in the clinic. Both multiple imputation methods produced fully complete smoking status information, with the proportion of current smokers estimated at 25.3% (data missing at random) and 12.5% (data missing not at random). The pattern-matching algorithm classified 18.2% of patients as current smokers, similar to the population-based survey (18.9%), but still left smoking information missing for 23.6% of patients. The algorithm was estimated to be 93.8% accurate overall, but accuracy varied by smoking status category.
CONCLUSION: Multiple imputation and algorithmic pattern-matching can be used to improve EMR data post-extraction, but the recommended method depends on the purpose of secondary use (e.g. practice improvement or epidemiological analyses).
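To illustrate the general idea of pattern matching on free-text smoking entries, a minimal Python sketch follows. It is not the algorithm developed in the study, whose rules are not given in the abstract. The category names (`never`, `past`, `current`), the regular expressions, and the function names `classify_smoking` and `summarize` are all assumptions made for illustration only.

```python
import re
from collections import Counter
from typing import Dict, Iterable, Optional

# Hypothetical patterns (NOT the study's rules), checked in order; first match wins.
# Negated and past-tense phrases are tested before the generic smoking terms
# so that "non-smoker" or "quit smoking" is not labelled as current.
PATTERNS = [
    ("never", re.compile(r"\b(never|non[- ]?smoker|does not smoke|denies smoking)\b", re.I)),
    ("past", re.compile(r"\b(ex[- ]?smoker|former|quit|stopped)\b", re.I)),
    ("current", re.compile(r"\b(current|smoker|smokes|smoking|cig(arette)?s?|ppd|pack)\b", re.I)),
]


def classify_smoking(text: Optional[str]) -> Optional[str]:
    """Return an assumed smoking category, or None if the entry is uninterpretable."""
    if not text or not text.strip():
        return None
    for label, pattern in PATTERNS:
        if pattern.search(text):
            return label
    return None


def summarize(entries: Iterable[Optional[str]]) -> Dict[str, float]:
    """Proportion of entries per category, with unclassified entries counted as 'missing'."""
    labels = [classify_smoking(e) or "missing" for e in entries]
    if not labels:
        return {}
    counts = Counter(labels)
    return {label: n / len(labels) for label, n in counts.items()}
```

Entries that match no pattern stay missing rather than being guessed, which mirrors the abstract's observation that a pattern-matching approach can still leave some records without a smoking status. The resulting proportions could then be compared against an external survey, as the study describes.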
Keywords: Electronic medical records; Primary health care; Public health informatics; Smoking