On the transferability of insights from the psychology of explanation to explainable AI

Abstract

Interest in automatically generated explanations for predictive AI systems has grown considerably in recent years (DARPA, 2016; Doshi-Velez and Kim, 2017; Gunning and Aha, 2019; Montavon et al., 2018; Rieger et al., 2018; Samek et al., 2017). It is argued that explanations provide transparency for what are often black-box procedures, and transparency is viewed as critical for fostering the acceptance of AI systems in real-world practice (Bansal et al., 2014; Chen et al., 2014; Fallon and Blaha, 2018; Hayes and Shah, 2017; Mercado et al., 2016; Wachter et al., 2017b), not least because transparency may be a necessary ingredient for dealing with legal liability (Felzmann et al., 2019; Doshi-Velez et al., 2017; Goodman and Flaxman, 2016; Wachter et al., 2017a). Explainable AI (XAI) has emerged as a field that addresses the need for AI systems' predictions to be accompanied by explanations of those predictions. A number of authors have called for XAI research to draw on the social sciences and the psychology of explanation for insights on building and evaluating machine-generated explanations (Byrne, 2019; Gunning and Aha, 2019; Miller, 2019; Miller et al., 2017; Molnar, 2020). The guiding idea is that incorporating psychological findings on the effects and roles of different kinds of explanations would help XAI researchers build methods that produce explanations better tailored to a human user. Examples of explanation methods for AI systems inspired by work in cognitive science and decision-making include example-based explanation methods (Kim et al., 2016) and interactive explainability systems (Sokol and Flach, 2020). In this work I point to some potential consequences of directly applying insights from the psychology of explanation literature, which mostly focuses on causal explanations, to explanations of the decisions and predictions of AI systems, which are based on associations.

Publication
Human Centered AI workshop at NeurIPS 2021
Marko Tešić