On the Robustness of Removal-Based Feature Attributions

06/12/2023
by   Chris Lin, et al.

Many feature attribution methods have been developed to explain complex models by assigning importance scores to input features. However, recent work challenges the robustness of feature attributions, showing that these methods are sensitive to input and model perturbations, while other work addresses this issue by proposing robust attribution methods and model modifications. Previous work on attribution robustness, however, has focused primarily on gradient-based feature attributions; the robustness properties of removal-based attribution methods remain poorly understood. To bridge this gap, we theoretically characterize the robustness of removal-based feature attributions. Specifically, we provide a unified analysis of such methods and prove upper bounds on the difference between intact and perturbed attributions, under settings of both input and model perturbations. Our empirical experiments on synthetic and real-world data validate our theoretical results and demonstrate their practical implications.
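To make the setting concrete, here is a minimal sketch of a removal-based attribution (leave-one-out occlusion, one instance of the family the abstract refers to) and an empirical check of attribution stability under a small input perturbation. The toy linear model, the zero baseline, and the Lipschitz-style bound shown in the comments are illustrative assumptions, not the paper's actual construction or its proven bounds.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy linear model f(x) = w @ x (a hypothetical stand-in for any predictor).
w = rng.normal(size=5)
f = lambda x: w @ x

def occlusion_attributions(f, x, baseline):
    """Leave-one-out removal attribution: the importance of feature i is
    the change in f when x_i is replaced by a baseline value."""
    attrs = np.empty_like(x)
    for i in range(len(x)):
        x_removed = x.copy()
        x_removed[i] = baseline[i]
        attrs[i] = f(x) - f(x_removed)
    return attrs

x = rng.normal(size=5)
baseline = np.zeros(5)
delta = 1e-2 * rng.normal(size=5)  # small input perturbation

a = occlusion_attributions(f, x, baseline)
a_pert = occlusion_attributions(f, x + delta, baseline)

# For this linear model the attribution shift is w * delta elementwise,
# so ||a_pert - a||_2 <= max_i |w_i| * ||delta||_2: small perturbations
# can move the attributions only a bounded amount.
shift = np.linalg.norm(a_pert - a)
bound = np.max(np.abs(w)) * np.linalg.norm(delta)
assert shift <= bound + 1e-12
```

The paper's analysis covers general removal-based methods and model perturbations as well; this sketch only illustrates the input-perturbation case for the simplest possible model.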

