On the Benefits of Attributional Robustness

11/29/2019
by   Mayank Singh, et al.
Interpretability is an emerging area of research in trustworthy machine learning. Safe deployment of machine learning systems mandates that both the prediction and its explanation be reliable and robust. Recently, it was shown that one can craft perturbations that produce perceptually indistinguishable inputs with the same prediction, yet very different interpretations. We tackle the problem of attributional robustness (i.e., models having robust explanations) by maximizing the alignment between the input image and its saliency map using a soft-margin triplet loss. We propose a robust attribution training methodology that beats the state-of-the-art attributional robustness measure by a margin of approximately 6-18% on SVHN, CIFAR-10 and GTSRB. We further show the utility of the proposed robust model in the domain of weakly supervised object localization and segmentation. Our proposed robust model also achieves a new state-of-the-art object localization accuracy on the CUB-200 dataset.
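
To make the soft-margin triplet loss mentioned in the abstract concrete, here is a minimal sketch. The abstract does not state the distance function or how the triplet is formed, so everything specific below is an assumption for illustration: cosine distance, the flattened input image as the anchor, a saliency map for the true class as the positive, and a saliency map for some other class as the negative. The function names are hypothetical and are not taken from the paper.

```python
import math

def cosine_distance(u, v):
    """1 - cosine similarity between two flat vectors (assumed distance)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v + 1e-12)

def soft_margin_triplet_loss(anchor, positive, negative):
    """Soft-margin triplet loss: log(1 + exp(d(a, p) - d(a, n))).

    Assumed roles (not confirmed by the abstract):
      anchor   - flattened input image
      positive - saliency map for the true class
      negative - saliency map for another class
    """
    gap = cosine_distance(anchor, positive) - cosine_distance(anchor, negative)
    # Numerically stable softplus: log(1 + exp(gap)).
    return max(gap, 0.0) + math.log1p(math.exp(-abs(gap)))

# Toy vectors standing in for an image and two saliency maps.
image = [1.0, 0.0]
aligned_map = [1.0, 0.0]
misaligned_map = [0.0, 1.0]

loss_aligned = soft_margin_triplet_loss(image, aligned_map, misaligned_map)
loss_misaligned = soft_margin_triplet_loss(image, misaligned_map, aligned_map)
```

In this toy setup the loss is lower when the positive map is aligned with the image than when the roles are swapped, which is the behavior a training objective of this kind would reward. Unlike a hinge-based triplet loss, the soft-margin form needs no explicit margin hyperparameter and stays smooth everywhere.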

research
06/14/2020

On Saliency Maps and Adversarial Robustness

A very recent trend has emerged to couple the notion of interpretability...
research
12/28/2020

Enhanced Regularizers for Attributional Robustness

Deep neural networks are the default choice of learning models for compu...
research
05/23/2019

Robust Attribution Regularization

An emerging problem in trustworthy machine learning is to train models t...
research
11/28/2021

Learning a Weight Map for Weakly-Supervised Localization

In the weakly supervised localization setting, supervision is given as a...
research
05/09/2019

Learning Interpretable Features via Adversarially Robust Optimization

Neural networks are proven to be remarkably successful for classificatio...
research
09/12/2023

Certified Robust Models with Slack Control and Large Lipschitz Constants

Despite recent success, state-of-the-art learning-based models remain hi...
research
06/06/2019

Image Synthesis with a Single (Robust) Classifier

We show that the basic classification framework alone can be used to tac...
