Sabine N van der Veer1, Lisa Riste2,3, Sudeh Cheraghi-Sohi2,4, Denham L Phipps3, Mary P Tully3, Kyle Bozentko5, Sarah Atwood5, Alex Hubbard6, Carl Wiper6, Malcolm Oswald7,8, Niels Peek1,2. 1. Centre for Health Informatics, Division of Informatics, Imaging and Data Science, Manchester Academic Health Science Centre, The University of Manchester, Manchester, UK. 2. NIHR Greater Manchester Patient Safety Translational Research Centre, School of Health Sciences, Manchester Academic Health Science Centre, The University of Manchester, Manchester, UK. 3. Division of Pharmacy and Optometry, School of Health Sciences, The University of Manchester, Manchester, UK. 4. Division of Population Health, Health Services Research & Primary Care, School of Health Sciences, The University of Manchester, Manchester, UK. 5. Jefferson Center, Saint Paul, Minnesota, USA. 6. Information Commissioner's Office, Wilmslow, UK. 7. School of Law, Faculty of Humanities, The University of Manchester, Manchester, UK. 8. Citizens' Juries CIC, Manchester, UK.
Abstract
OBJECTIVE: To investigate how the general public trades off explainability versus accuracy of artificial intelligence (AI) systems and whether this differs between healthcare and non-healthcare scenarios. MATERIALS AND METHODS: Citizens' juries are a form of deliberative democracy eliciting informed judgment from a representative sample of the general public around policy questions. We organized two 5-day citizens' juries in the UK with 18 jurors each. Jurors considered 3 AI systems with different levels of accuracy and explainability in 2 healthcare and 2 non-healthcare scenarios. For each scenario, jurors voted for their preferred system; votes were analyzed descriptively. Qualitative data on the considerations behind their preferences included transcribed audio-recordings of plenary sessions, observational field notes, outputs from small group work, and free-text comments accompanying jurors' votes; qualitative data were analyzed thematically per scenario, both within and across AI systems. RESULTS: In healthcare scenarios, jurors favored accuracy over explainability, whereas in non-healthcare contexts they valued explainability either equally to, or more than, accuracy. Jurors' considerations in favor of accuracy concerned the impact of decisions on individuals and society, and the potential to increase the efficiency of services. Reasons for emphasizing explainability included increased opportunities for individuals and society to learn and improve future prospects, and an enhanced ability for humans to identify and resolve system biases. CONCLUSION: Citizens may value explainability of AI systems in healthcare less than in non-healthcare domains and less than often assumed by professionals, especially when weighed against system accuracy. The public should therefore be actively consulted when developing policy on AI explainability.