A General Framework for Defending Against Backdoor Attacks via Influence Graph

11/29/2021
by Xiaofei Sun, et al.

In this work, we propose a new and general framework to defend against backdoor attacks. It is inspired by the observation that attack triggers usually follow a specific attack pattern, so poisoned training examples have a greater impact on each other during training. We introduce the notion of the influence graph, whose nodes represent individual training points and whose edges represent the pair-wise influences between them. The influence between a pair of training points is the impact that removing one training point has on the prediction of the other, approximated by the influence function. Malicious training points are extracted by finding the maximum average sub-graph subject to a size constraint. Extensive experiments on computer vision and natural language processing tasks demonstrate the effectiveness and generality of the proposed framework.
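
The extraction step described above can be illustrated with a minimal sketch. It assumes the pair-wise influences have already been computed and stored in a square matrix, and it uses a simple greedy peeling heuristic (repeatedly dropping the node with the weakest total influence) as a stand-in for the search; this is not necessarily the procedure used in the paper. The function names, the toy matrix, and the size `k` are all hypothetical and chosen only for illustration.

```python
def greedy_max_average_subgraph(influence, k):
    """Greedy peeling heuristic (an assumption, not the paper's algorithm):
    drop the node with the lowest total influence until k nodes remain."""
    n = len(influence)
    # Symmetrise so each undirected edge carries the mean of both directions.
    w = [[(influence[i][j] + influence[j][i]) / 2.0 if i != j else 0.0
          for j in range(n)] for i in range(n)]
    remaining = set(range(n))
    # strength[i] = total edge weight between node i and the remaining nodes.
    strength = [sum(row) for row in w]
    while len(remaining) > k:
        # Ties are broken by the smaller node index.
        weakest = min(remaining, key=lambda i: (strength[i], i))
        remaining.remove(weakest)
        for j in remaining:
            strength[j] -= w[weakest][j]
    return sorted(remaining)


def average_edge_weight(influence, nodes):
    """Mean symmetrised influence over all node pairs in the sub-graph."""
    nodes = list(nodes)
    if len(nodes) < 2:
        return 0.0
    total = sum((influence[a][b] + influence[b][a]) / 2.0
                for idx, a in enumerate(nodes) for b in nodes[idx + 1:])
    return total / (len(nodes) * (len(nodes) - 1) // 2)


# Toy influence matrix (made-up values): points 0, 1, 2 play the role of
# poisoned examples that influence each other strongly (0.9); all other
# pairs have weak influence (0.1).
toy = [[0.0 if i == j else (0.9 if i < 3 and j < 3 else 0.1)
        for j in range(6)] for i in range(6)]

suspects = greedy_max_average_subgraph(toy, 3)
print(suspects)  # → [0, 1, 2]
print(average_edge_weight(toy, suspects))
```

On this toy matrix the heuristic recovers the strongly connected group. Greedy peeling does not guarantee the optimal sub-graph in general, and computing the influence values themselves is outside the scope of this sketch.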

research
10/14/2022

Characterizing the Influence of Graph Elements

Influence function, a method from robust statistics, measures the change...
research
01/25/2022

Identifying a Training-Set Attack's Target Using Renormalized Influence Estimation

Targeted training-set attacks inject malicious instances into the traini...
research
05/25/2023

On Influence Functions, Classification Influence, Relative Influence, Memorization and Generalization

Machine learning systems such as large scale recommendation systems or n...
research
10/24/2022

Analyzing the Use of Influence Functions for Instance-Specific Data Filtering in Neural Machine Translation

Customer feedback can be an important signal for improving commercial ma...
research
11/08/2021

Revisiting Methods for Finding Influential Examples

Several instance-based explainability methods for finding influential tr...
research
04/19/2022

Indiscriminate Data Poisoning Attacks on Neural Networks

Data poisoning attacks, in which a malicious adversary aims to influence...
research
04/06/2023

GIF: A General Graph Unlearning Strategy via Influence Function

With the greater emphasis on privacy and security in our society, the pr...
