Alexander John Karran, Théophile Demazure, Antoine Hudon, Sylvain Senecal, Pierre-Majorique Léger.
Abstract
Explainable artificial intelligence aims to bring transparency to artificial intelligence (AI) systems by translating, simplifying, and visualizing their decisions. While society remains skeptical about AI systems, studies show that transparent and explainable AI systems can help improve the Human-AI trust relationship. This manuscript presents two studies that assess three AI decision-visualization attribution models, which manipulate morphological clarity (MC), and two information presentation-order methods, to determine each visualization's impact on the Human-AI trust relationship through increased confidence and cognitive fit (CF). The first study, N = 206 (Avg. age = 37.87 ± 10.51, Male = 123), used information presentation methods and visualizations delivered through an online experiment to explore trust in AI by asking participants to complete a visual decision-making task. The second study, N = 19 (Avg. age = 24.9 ± 8.3, Male = 10), used eye-tracking technology and the same stimulus presentation methods to investigate whether cognitive load, inferred through pupillometry measures, mediated the confidence-trust relationship. The results indicate that low MC positively impacts Human-AI trust and that the presentation order of information within an interface, in terms of adjacency, further influences user trust in AI. We conclude that while adjacency and MC significantly affect cognitive load, cognitive load alone does not mediate the confidence-trust relationship. Our findings, interpreted through a combination of CF, situation awareness, and ecological interface design, have implications for the design of future AI systems, which may facilitate better collaboration between humans and AI-based decision agents.
Keywords: HCAI; cognitive fit; confidence; decision support; explainability; trust
Year: 2022 PMID: 35812230 PMCID: PMC9263374 DOI: 10.3389/fnins.2022.883385
Source DB: PubMed Journal: Front Neurosci ISSN: 1662-453X Impact factor: 5.152
FIGURE 1. Research model.
Overall inter-rater agreement for each group (G#) at each round (R#).

| Group | Judge pair | Kappa (R1) | Kappa (R2) | Raw agreement (R1) | Raw agreement (R2) | Fleiss's K (R1) | Fleiss's K (R2) | Krippendorff's alpha (R1) | Krippendorff's alpha (R2) |
|---|---|---|---|---|---|---|---|---|---|
| G1 | J1–J2 | 0.829 | 0.969 | 75.40% | 78.80% | 0.816 | 0.854 | 0.81 | 0.855 |
| G1 | J1–J3 | 0.752 | 0.798 | | | | | | |
| G1 | J2–J3 | 0.858 | 0.798 | | | | | | |
| G2 | J1–J2 | 0.938 | 0.85 | 84.80% | 82.10% | 0.896 | 0.853 | 0.896 | 0.855 |
| G2 | J1–J3 | 0.907 | 0.848 | | | | | | |
| G2 | J2–J3 | 0.845 | 0.863 | | | | | | |
| G3 | J1–J2 | 0.845 | 0.907 | 72.70% | 89.40% | 0.803 | 0.927 | 0.804 | 0.927 |
| G3 | J1–J3 | 0.752 | 0.891 | | | | | | |
| G3 | J2–J3 | 0.814 | 0.984 | | | | | | |
| Average | | **0.838** | | **77.63%** | | **0.838** | | **0.837** | |

Values in bold represent the final average across rater groups.
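For reference, the pairwise kappa values in the table above are chance-corrected agreement scores between two judges' label sequences. A minimal pure-Python sketch of Cohen's kappa and raw agreement, using invented toy labels `j1` and `j2` (not data from the study):

```python
from collections import Counter

def raw_agreement(r1, r2):
    """Proportion of items on which two raters gave the same label."""
    matches = sum(1 for a, b in zip(r1, r2) if a == b)
    return matches / len(r1)

def cohen_kappa(r1, r2):
    """Cohen's kappa: agreement between two raters corrected for chance."""
    n = len(r1)
    p_o = raw_agreement(r1, r2)
    c1, c2 = Counter(r1), Counter(r2)
    # Expected agreement if both raters labeled independently at their base rates.
    p_e = sum((c1[label] / n) * (c2[label] / n) for label in set(r1) | set(r2))
    return (p_o - p_e) / (1 - p_e)

# Toy binary ratings from two judges over six items.
j1 = [1, 1, 0, 1, 0, 1]
j2 = [1, 1, 0, 0, 0, 1]
print(round(raw_agreement(j1, j2), 3))  # 0.833
print(round(cohen_kappa(j1, j2), 3))    # 0.667
```

Fleiss's K and Krippendorff's alpha generalize this idea to more than two raters; the group-level columns in the table report those multi-rater statistics.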
FIGURE 2. Task design. Images selected from the ImageNet dataset (Deng et al., 2009).
FIGURE 3. Task design. Images selected from the ImageNet dataset.
FIGURE 4. Example of each type of EV for the classification “Monkey”. Images selected from the ImageNet dataset (Deng et al., 2009).
Post hoc comparisons of adjacency, MC, and classification on confidence.

| Factor (1) | Factor (2) | Mean diff | DF | t | p |
|---|---|---|---|---|---|
| Adjacent | Non-adjacent | 0.46 | 205 | 15.71 | <0.001*** |
| Low MC (Cloud of points) | Medium MC (Heatmap) | 0.08 | 410 | 2.36 | 0.019* |
| Low MC (Cloud of points) | High MC (Outline) | 0.12 | 410 | 3.27 | 0.001** |
| Medium MC (Heatmap) | High MC (Outline) | 0.03 | 410 | 0.91 | 0.363 |
| Good classification | Bad classification | 0.98 | 205 | 27.72 | <0.001*** |

*p < 0.05, **p < 0.01, and ***p < 0.001.
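The comparisons above report a mean difference, degrees of freedom, a t statistic, and a p-value for each contrast. As a general illustration of how such a within-subject contrast is computed, here is a paired t statistic in pure Python; the data values are invented, and this sketch does not reproduce the paper's actual model-based contrasts:

```python
import math

def paired_t(x, y):
    """Paired t statistic for two sets of within-subject scores.
    Returns (t, df); assumes x and y are paired, equal-length lists."""
    d = [a - b for a, b in zip(x, y)]
    n = len(d)
    mean_d = sum(d) / n
    var_d = sum((di - mean_d) ** 2 for di in d) / (n - 1)  # sample variance
    t = mean_d / math.sqrt(var_d / n)
    return t, n - 1

# Hypothetical confidence ratings for the same 5 participants in two conditions.
adjacent = [5.2, 4.8, 5.5, 4.9, 5.1]
non_adjacent = [4.6, 4.5, 4.9, 4.4, 4.8]
t, df = paired_t(adjacent, non_adjacent)
print(round(t, 2), df)  # t ≈ 6.78, df = 4
```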
Post hoc comparisons of the interaction between adjacency and MC on confidence.

| Adjacency | MC level (1) | MC level (2) | Mean diff (1–2) | DF | t | p |
|---|---|---|---|---|---|---|
| Adjacent | Low (Cloud of points) | Medium (Heatmap) | –0.024 | 410 | –0.48 | 0.632 |
| Adjacent | Low (Cloud of points) | High (Outline) | 0.009 | 410 | 0.18 | 0.857 |
| Adjacent | Medium (Heatmap) | High (Outline) | 0.033 | 410 | 0.66 | 0.511 |
| Non-adjacent | Low (Cloud of points) | Medium (Heatmap) | 0.191 | 410 | 3.82 | <0.001*** |
| Non-adjacent | Low (Cloud of points) | High (Outline) | 0.223 | 410 | 4.44 | <0.001*** |
| Non-adjacent | Medium (Heatmap) | High (Outline) | 0.032 | 410 | 0.63 | 0.530 |

***p < 0.001.
FIGURE 5. Comparison of adjacent and non-adjacent visualizations based on their MC, from low to high. ***p < 0.001.
FIGURE 6. Mediation model for adjacency and MC on perceived confidence via cognitive load.
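The mediation model in Figure 6 follows the standard regression decomposition: the total effect of a condition X on confidence Y equals the indirect effect through the mediator M (cognitive load), a × b, plus the direct effect c′. A minimal pure-Python sketch with invented numbers (the variable names and data are hypothetical; the paper's actual estimation procedure is not reproduced here):

```python
def simple_slope(y, x):
    """OLS slope of y on x (single predictor with intercept)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    return (sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
            / sum((xi - mx) ** 2 for xi in x))

def two_predictor_slopes(y, x1, x2):
    """OLS slopes of y on x1 and x2 (with intercept), via centered normal equations."""
    n = len(y)
    def center(v):
        m = sum(v) / n
        return [vi - m for vi in v]
    x1c, x2c, yc = center(x1), center(x2), center(y)
    s11 = sum(v * v for v in x1c)
    s22 = sum(v * v for v in x2c)
    s12 = sum(a * b for a, b in zip(x1c, x2c))
    sy1 = sum(a * b for a, b in zip(yc, x1c))
    sy2 = sum(a * b for a, b in zip(yc, x2c))
    det = s11 * s22 - s12 ** 2
    return (s22 * sy1 - s12 * sy2) / det, (s11 * sy2 - s12 * sy1) / det

# Toy data: X = condition, M = cognitive load (mediator), Y = confidence.
X = [0, 1, 2, 3]
M = [0, 2, 3, 7]
Y = [0, 7, 11, 24]

a = simple_slope(M, X)                       # a path: X -> M
b, c_prime = two_predictor_slopes(Y, M, X)   # b path (M -> Y given X) and direct effect c'
c_total = simple_slope(Y, X)                 # total effect: X -> Y
print(round(a, 3), round(b, 3), round(c_prime, 3), round(c_total, 3))  # 2.2 3.0 1.0 7.6
# Decomposition: total = indirect + direct, i.e. c = a*b + c'.
```

With these linear models the identity c = a·b + c′ holds exactly, which is the sense in which "cognitive load mediates" an effect: part of the total effect is routed through M.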
FIGURE 7. Percentage change of pupil diameter by each type of EV.
FIGURE 8. Perceived confidence for each type of EV.