Adrian Erasmus, Tyler D P Brunet, Eyal Fisher.
Abstract
We argue that artificial networks are explainable and offer a novel theory of interpretability. Two sets of conceptual questions are prominent in theoretical engagements with artificial neural networks, especially in the context of medical artificial intelligence: (1) Are networks explainable, and if so, what does it mean to explain the output of a network? And (2) what does it mean for a network to be interpretable? We argue that accounts of "explanation" tailored specifically to neural networks have ineffectively reinvented the wheel. In response to (1), we show how four familiar accounts of explanation apply to neural networks as they would to any scientific phenomenon. We diagnose the confusion about explaining neural networks within the machine learning literature as an equivocation on "explainability," "understandability" and "interpretability." To remedy this, we distinguish between these notions, and answer (2) by offering a theory and typology of interpretation in machine learning. Interpretation is something one does to an explanation with the aim of producing another, more understandable, explanation. As with explanation, there are various concepts and methods involved in interpretation: Total or Partial, Global or Local, and Approximative or Isomorphic. Our account of "interpretability" is consistent with uses in the machine learning literature, in keeping with the philosophy of explanation and understanding, and pays special attention to medical artificial intelligence systems.
Keywords: Explainability; Interpretability; Medical AI; XAI
Year: 2020 PMID: 34966640 PMCID: PMC8654716 DOI: 10.1007/s13347-020-00435-2
Source DB: PubMed Journal: Philos Technol ISSN: 2210-5433
Fig. 1: The general structure of explanation
Fig. 2: The structures of DN and IS explanation
Fig. 3: The structures of CM and NM explanation
Fig. 4: The general structure of total interpretation
Fig. 5: A partial interpretation wherein the explanandum remains the same in both the interpretans and interpretandum
Fig. 6: A partial interpretation wherein the process of interpretation results in a new process of explanation for an explanans and explanandum