Young J Juhn1,2, Euijung Ryu3, Chung-Il Wi1,2, Katherine S King3, Momin Malik4, Santiago Romero-Brufau5, Chunhua Weng6, Sunghwan Sohn7, Richard R Sharp8, John D Halamka4,9. 1. Precision Population Science Lab, Mayo Clinic, Rochester, Minnesota, USA. 2. Artificial Intelligence Program of Department of Pediatric and Adolescent Medicine, Mayo Clinic, Rochester, Minnesota, USA. 3. Department of Quantitative Health Sciences, Mayo Clinic, Rochester, Minnesota, USA. 4. Center for Digital Health, Mayo Clinic, Rochester, Minnesota, USA. 5. Department of Internal Medicine, Mayo Clinic, Rochester, Minnesota, USA. 6. Department of Biomedical Informatics, Columbia University, New York, New York, USA. 7. Department of Artificial Intelligence and Informatics, Mayo Clinic, Rochester, Minnesota, USA. 8. Biomedical Ethics Program, Mayo Clinic, Rochester, Minnesota, USA. 9. Mayo Clinic Platform, Rochester, Minnesota, USA.
Abstract
OBJECTIVE: Artificial intelligence (AI) models may propagate harmful biases in performance and hence negatively affect the underserved. We aimed to assess the degree to which the data quality of electronic health records (EHRs), affected by inequities related to low socioeconomic status (SES), results in differential performance of AI models across SES levels. MATERIALS AND METHODS: This study utilized existing machine learning models for predicting asthma exacerbation in children with asthma. We compared the balanced error rate (BER) across different SES levels measured by the HOUsing-based SocioEconomic Status (HOUSES) index. As a possible mechanism for differential performance, we also compared the incompleteness of EHR information relevant to asthma care by SES. RESULTS: Children with asthma and lower SES had a larger BER than those with higher SES (eg, ratio = 1.35 for HOUSES Q1 vs Q2-Q4) and had a higher proportion of missing information relevant to asthma care (eg, 41% vs 24% for missing asthma severity and 12% vs 9.8% for undiagnosed asthma despite meeting asthma criteria). DISCUSSION: Our study suggests that lower SES is associated with worse predictive model performance. It also highlights the potential role of incomplete EHR data in this differential performance and suggests a way to mitigate this bias. CONCLUSION: The HOUSES index allows AI researchers to assess bias in predictive model performance by SES. Although our case study was based on a small sample size at a single site, the results highlight a potential strategy for identifying bias by using an innovative SES measure.
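The abstract's central metric is the balanced error rate (BER), the mean of the false-negative and false-positive rates, compared across SES strata as a ratio (eg, HOUSES Q1 vs Q2-Q4). As a minimal sketch of that computation (the labels, predictions, and group assignments below are hypothetical, not the study's data):

```python
import numpy as np

def balanced_error_rate(y_true, y_pred):
    """BER: mean of the false-negative rate and false-positive rate."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    fnr = np.mean(y_pred[y_true == 1] == 0)  # missed exacerbations among true positives
    fpr = np.mean(y_pred[y_true == 0] == 1)  # false alarms among true negatives
    return 0.5 * (fnr + fpr)

# Hypothetical labels, model predictions, and SES strata (HOUSES quartiles)
y_true = np.array([1, 1, 0, 0, 1, 0, 1, 0])
y_pred = np.array([1, 0, 0, 1, 1, 0, 0, 0])
group = np.array(["Q1", "Q1", "Q1", "Q1", "Q2-Q4", "Q2-Q4", "Q2-Q4", "Q2-Q4"])

# Stratified BER and the between-group ratio reported in the abstract
ber = {g: balanced_error_rate(y_true[group == g], y_pred[group == g])
       for g in np.unique(group)}
ratio = ber["Q1"] / ber["Q2-Q4"]
```

A ratio above 1 indicates that the model errs more often for the lower-SES group, the form of differential performance the study reports.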