Explaining Latent Representations with a Corpus of Examples

10/28/2021
by Jonathan Crabbé, et al.

Modern machine learning models are complicated. Most of them rely on convoluted latent representations of their input to issue a prediction. To achieve greater transparency than a black box that connects inputs to predictions, it is necessary to gain a deeper understanding of these latent representations. To that end, we propose SimplEx: a user-centred method that provides example-based explanations with reference to a freely selected set of examples, called the corpus. SimplEx uses the corpus to improve the user's understanding of the latent space with post-hoc explanations answering two questions: (1) Which corpus examples explain the prediction issued for a given test example? (2) What features of these corpus examples are relevant for the model to relate them to the test example? SimplEx answers both questions by reconstructing the test latent representation as a mixture of corpus latent representations. Further, we propose a novel approach, the Integrated Jacobian, that allows SimplEx to make explicit the contribution of each corpus feature in the mixture. Through experiments on tasks ranging from mortality prediction to image classification, we demonstrate that these decompositions are robust and accurate. With illustrative use cases in medicine, we show that SimplEx empowers the user by highlighting relevant patterns in the corpus that explain model representations. Moreover, we demonstrate how the freedom in choosing the corpus allows the user to obtain personalized explanations in terms of examples that are meaningful to them.
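The central idea described above, reconstructing a test point's latent representation as a convex mixture of corpus latent representations, can be sketched in a few lines. The following is a minimal illustration of that decomposition only, not the authors' implementation: the softmax parameterisation of the simplex weights and the plain gradient-descent fit are assumptions made here for simplicity.

```python
import numpy as np

def corpus_decomposition(test_latent, corpus_latents, n_steps=3000, lr=0.5):
    """Approximate a test latent vector as a convex mixture of corpus
    latent vectors: minimise ||h(x) - sum_i w_i h(c_i)||^2 with the
    weights w constrained to the probability simplex.

    Sketch of the corpus-mixture idea, NOT the SimplEx authors' code.
    The simplex constraint is enforced here via a softmax over free
    logits (an assumption of this illustration).
    """
    n = corpus_latents.shape[0]
    logits = np.zeros(n)
    for _ in range(n_steps):
        # softmax keeps the weights non-negative and summing to one
        w = np.exp(logits - logits.max())
        w /= w.sum()
        residual = w @ corpus_latents - test_latent      # shape (d,)
        grad_w = corpus_latents @ residual               # dL/dw up to a factor 2
        # chain rule through the softmax Jacobian
        grad_logits = w * (grad_w - w @ grad_w)
        logits -= lr * grad_logits
    w = np.exp(logits - logits.max())
    return w / w.sum()
```

The returned weights indicate which corpus examples the model's latent space treats as most similar to the test example; a weight near zero means that corpus example plays no role in the reconstruction.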

Related research

02/02/2022
Analogies and Feature Attributions for Model Agnostic Explanation of Similarity Learners
Post-hoc explanations for black box models have been studied extensively...

09/30/2022
Contrastive Corpus Attribution for Explaining Representations
Despite the widespread use of unsupervised models, very few methods are ...

02/16/2016
"Why Should I Trust You?": Explaining the Predictions of Any Classifier
Despite widespread adoption, machine learning models remain mostly black...

06/14/2020
Explaining Predictions by Approximating the Local Decision Boundary
Constructing accurate model-agnostic explanations for opaque machine lea...

12/05/2022
Explaining Link Predictions in Knowledge Graph Embedding Models with Influential Examples
We study the problem of explaining link predictions in the Knowledge Gra...

10/17/2022
RbX: Region-based explanations of prediction models
We introduce region-based explanations (RbX), a novel, model-agnostic me...
