Hierarchical interpretations for neural network predictions

by Chandan Singh, et al.

Deep neural networks (DNNs) have achieved impressive predictive performance due to their ability to learn complex, non-linear relationships between variables. However, the inability to effectively visualize these relationships has led to DNNs being characterized as black boxes and consequently limited their applications. To ameliorate this problem, we introduce the use of hierarchical interpretations to explain DNN predictions through our proposed method, agglomerative contextual decomposition (ACD). Given a prediction from a trained DNN, ACD produces a hierarchical clustering of the input features, along with the contribution of each cluster to the final prediction. This hierarchy is optimized to identify clusters of features that the DNN learned are predictive. Using examples from Stanford Sentiment Treebank and ImageNet, we show that ACD is effective at diagnosing incorrect predictions and identifying dataset bias. Through human experiments, we demonstrate that ACD enables users both to identify the more accurate of two DNNs and to better trust a DNN's outputs. We also find that ACD's hierarchy is largely robust to adversarial perturbations, implying that it captures fundamental aspects of the input and ignores spurious noise.
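To make the idea concrete, the sketch below shows one way a hierarchy like ACD's could be built: start from singleton feature clusters over a 1-D input (e.g. tokens in a sentence) and greedily merge the adjacent pair whose combined contribution is highest. This is an illustrative toy, not the authors' implementation; the `score` function here is a hypothetical stand-in for contextual decomposition, which in the real method measures a feature group's contribution to the DNN's prediction.

```python
# Illustrative sketch only -- NOT the authors' ACD code.
# `score` is a hypothetical placeholder for contextual decomposition.

def score(cluster, weights):
    # Toy stand-in: a group's contribution is the sum of per-feature weights.
    return sum(weights[i] for i in cluster)

def agglomerate(n_features, weights):
    """Greedily merge adjacent feature clusters, recording each level.

    Mimics the flavor of ACD's hierarchy construction: at every step,
    merge the adjacent pair of clusters whose merged score is largest,
    so the most "predictive" groups form lowest in the hierarchy.
    """
    clusters = [(i,) for i in range(n_features)]
    hierarchy = [list(clusters)]
    while len(clusters) > 1:
        best = max(
            range(len(clusters) - 1),
            key=lambda j: score(clusters[j] + clusters[j + 1], weights),
        )
        merged = clusters[best] + clusters[best + 1]
        clusters = clusters[:best] + [merged] + clusters[best + 2:]
        hierarchy.append(list(clusters))
    return hierarchy
```

With toy weights `[0.1, 0.9, 0.8, 0.05]`, the two high-contribution middle features merge first, then the hierarchy grows outward until a single cluster covering the whole input remains, each level carrying the cluster contributions that ACD would display.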



