Jay Hegdé, Evgeniy Bart.
Abstract
In everyday life, we rely on human experts to make a variety of complex decisions, such as medical diagnoses. These decisions are typically made through some form of weakly guided learning, a form of learning in which decision expertise is gained through labeled examples rather than explicit instructions. Expert decisions can significantly affect people other than the decision-maker (for example, teammates, clients, or patients), but may seem cryptic and mysterious to them. It is therefore desirable for the decision-maker to explain the rationale behind these decisions to others. This, however, can be difficult to do. Often, the expert has a "gut feeling" for what the correct decision is, but may have difficulty giving an objective set of criteria for arriving at it. Explainability of human expert decisions, i.e., the extent to which experts can make their decisions understandable to others, has not been studied systematically. Here, we characterize the explainability of human decision-making, using binary categorical decisions about visual objects as an illustrative example. We trained a group of "expert" subjects to categorize novel, naturalistic 3-D objects called "digital embryos" into one of two hitherto unknown categories, using a weakly guided learning paradigm. We then asked the expert subjects to provide a written explanation for each binary decision they made. These experiments generated several intriguing findings. First, the experts' explanations modestly improved the categorization performance of naïve users (paired t-tests, p < 0.05). Second, this improvement differed significantly between explanations. In particular, explanations that pointed to a spatially localized region of the object improved the user's performance much more than explanations that referred to global features. Third, neither experts nor naïve subjects were able to reliably predict the degree of improvement for a given explanation. Finally, significant bias effects were observed: naïve subjects rated an explanation significantly higher when told it came from an expert user than when told the same explanation came from another non-expert, suggesting a variant of the Asch conformity effect. Together, our results characterize, for the first time, the various issues, both methodological and conceptual, underlying the explainability of human decisions.
Keywords: classification; machine learning; objective explainability; perceptual learning; subjective explainability; weakly guided learning
Year: 2018; PMID: 30369862; PMCID: PMC6194166; DOI: 10.3389/fnins.2018.00670
Source DB: PubMed; Journal: Front Neurosci; ISSN: 1662-453X; Impact factor: 4.677
Examples of effective explanations: selected explanations by expert servers that led to relatively high classification performance by naïve clients∗.
| Line # | Query embryo category | Server ID | Reported category | Explanation¶ | % Correct§ (servers) | SERS | % Correct§ (clients) | Mean OURNC† | Mean OERNC† |
|---|---|---|---|---|---|---|---|---|---|
| 1 | Q | 11-58 | Q | Triangle at the end of the neck vein is shadowy for Q | 100 | 52 | 77.7 | 100 | 57.8 |
| 2 | Q | 11-07 | Q | Shadow on right side of neck is dark for P and light for Q. | 100 | 98 | 58.9 | 100 | 81.2 |
| 3 | P | 00-17 | P | Shading on right side P’s neck is not very sharp nor dark. | 100 | 100 | 70.0 | 90.9 | 70.2 |
| 4 | P | 00-02 | P | Shading of P is lighter and less sharp on right side of neck than shading on Q, more similar to P | 100 | 100 | 58.2 | 69.5 | 66.0 |
| 5 | P | 11-58 | P | P has lighter shading around neck and Q had darker shading around neck | 100 | 66.2 | 67.7 | 100 | 75.5 |
Selected examples of ineffective explanations: explanations by expert servers that led to relatively low classification performance by naïve clients∗.
| Line # | Query embryo category | Server ID | Reported category | Explanation¶ | % Correct§ (servers) | SERS | % Correct§ (clients) | Mean OURNC† | Mean OERNC† |
|---|---|---|---|---|---|---|---|---|---|
| 1 | P | 11-16 | P | P has light and straight nerve tail | 100 | 93 | 0 | 91.3 | 6.9 |
| 2 | P | 11-58 | P | Q has harsher shading. Q has harsher shading than P | 100 | 100 | 0 | 99.2 | 11.2 |
| 3 | P | 00-17 | Q | Top right groove in P matches in length but not fully in shape | 0 | 91 | 0 | 89.1 | 23.5 |
| 4 | Q | 11-16 | P | P is smooth, dark and curvy | 0 | 84 | 0 | 99.4 | 17.2 |
| 5 | Q | 00-02 | P | Shading is pretty mild, more similar to P than Q | 0 | 99 | 2.5 | 96.6 | 2.4 |
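
The abstract reports the effect of explanations on naïve users' performance using paired t-tests. As a rough illustration of what such a test computes, the sketch below calculates a paired t statistic using only the Python standard library. The function name `paired_t_statistic` and all scores in it are hypothetical placeholders chosen for illustration; they are not data from the study, and the study's actual analysis pipeline is not described in this record.

```python
import math
import statistics


def paired_t_statistic(before, after):
    """Return (t, degrees_of_freedom) for a paired t-test on two matched samples."""
    if len(before) != len(after):
        raise ValueError("paired samples must have the same length")
    # A paired test works on the per-subject differences, not on the raw scores.
    diffs = [a - b for a, b in zip(after, before)]
    n = len(diffs)
    if n < 2:
        raise ValueError("need at least two pairs")
    sd_diff = statistics.stdev(diffs)  # sample standard deviation (n - 1 denominator)
    if sd_diff == 0:
        raise ValueError("all differences are identical; t is undefined")
    t = statistics.mean(diffs) / (sd_diff / math.sqrt(n))
    return t, n - 1


# HYPOTHETICAL percent-correct scores for eight naive subjects (not study data).
without_explanation = [50.0, 55.0, 48.0, 60.0, 52.0, 58.0, 45.0, 62.0]
with_explanation = [56.0, 59.0, 55.0, 63.0, 60.0, 61.0, 52.0, 66.0]

t_stat, df = paired_t_statistic(without_explanation, with_explanation)

# Two-tailed critical value of Student's t at alpha = 0.05 for 7 degrees of freedom.
T_CRITICAL_DF7 = 2.365
significant = abs(t_stat) > T_CRITICAL_DF7

print(f"t({df}) = {t_stat:.2f}, significant at p < 0.05: {significant}")
```

The test is paired because each subject serves as their own control, so between-subject variability in baseline performance does not inflate the error term. With SciPy available, `scipy.stats.ttest_rel` returns the same statistic together with an exact p-value.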