Lina Sulieman (1), Robert M Cronin (1,2), Robert J Carroll (1), Karthik Natarajan (3), Kayla Marginean (4), Brandy Mapes (4), Dan Roden (1,5), Paul Harris (1,4), Andrea Ramirez (5,6).
1. Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, Tennessee, USA.
2. Department of Medicine, The Ohio State University, Columbus, Ohio, USA.
3. Department of Biomedical Informatics, Columbia University, New York, New York, USA.
4. Vanderbilt Institute of Clinical and Translational Research, Vanderbilt University Medical Center, Nashville, Tennessee, USA.
5. Department of Medicine, Vanderbilt University Medical Center, Nashville, Tennessee, USA.
6. Office of Data and Analytics, All of Us Research Program, National Institutes of Health, Bethesda, Maryland, USA.
Abstract
OBJECTIVE: A participant's medical history is important in clinical research and can be captured from electronic health records (EHRs) and self-reported surveys. Both can be incomplete: EHRs because of documentation gaps or lack of interoperability, and surveys because of recall bias or limited health literacy. This analysis compares medical history collected in the All of Us Research Program through both surveys and EHRs.
MATERIALS AND METHODS: The All of Us medical history survey includes a self-report questionnaire that asks about diagnoses of over 150 medical conditions organized into 12 disease categories. In each category, we identified the 3 most and 3 least frequent self-reported diagnoses and retrieved their analogues from EHRs. We calculated agreement scores and extracted participant demographic characteristics for each comparison set.
RESULTS: The 4th All of Us dataset release includes data from 314 994 participants, 28.3% of whom completed medical history surveys and 65.5% of whom had EHR data. The Hearing and vision category within the survey had the highest number of responses but the second lowest positive agreement with the EHR (0.21). The Infectious disease category had the lowest positive agreement (0.12). Cancer conditions had the highest positive agreement (0.45) between the 2 data sources.
DISCUSSION AND CONCLUSION: Our study quantified the agreement of medical history between 2 sources: EHRs and self-reported surveys. Conditions that are usually undocumented in EHRs had low agreement scores, demonstrating that survey data can supplement EHR data. Disagreement between EHR and survey can help identify possible missing records and guide researchers to adjust for biases.
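The abstract reports positive agreement scores without stating the formula. As a rough illustration only, the sketch below uses one common definition, the proportion of specific agreement, where positive agreement is 2a / (2a + b + c) for a participants positive in both sources, b positive in the survey only, and c positive in the EHR only. This formula, the function name `agreement_scores`, and the toy data are assumptions for illustration and are not confirmed as the authors' method.

```python
import math
from typing import Iterable, Tuple


def agreement_scores(survey: Iterable[bool], ehr: Iterable[bool]) -> Tuple[float, float]:
    """Return (positive_agreement, negative_agreement) for paired yes/no reports.

    Assumed definition (proportion of specific agreement), not taken from the paper:
      positive = 2a / (2a + b + c),  negative = 2d / (2d + b + c)
    where a = both positive, b = survey only, c = EHR only, d = both negative.
    """
    a = b = c = d = 0
    for in_survey, in_ehr in zip(survey, ehr):
        if in_survey and in_ehr:
            a += 1
        elif in_survey:
            b += 1
        elif in_ehr:
            c += 1
        else:
            d += 1
    pos_den = 2 * a + b + c
    neg_den = 2 * d + b + c
    positive = 2 * a / pos_den if pos_den else math.nan
    negative = 2 * d / neg_den if neg_den else math.nan
    return positive, negative


# Hypothetical toy data: one condition, eight participants.
survey_reports = [True, True, True, False, False, False, False, False]
ehr_records = [True, False, False, True, False, False, False, False]
positive, negative = agreement_scores(survey_reports, ehr_records)
print(f"positive agreement = {positive:.2f}, negative agreement = {negative:.2f}")
```

Under this definition, a low positive agreement alongside a high negative agreement indicates that the two sources mostly agree on who does not have a condition but rarely agree on who does, which is the pattern the abstract describes for conditions that are seldom documented in EHRs.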