AntidoteRT: Run-time Detection and Correction of Poison Attacks on Neural Networks

by   Muhammad Usman, et al.

We study backdoor poisoning attacks against image classification networks, whereby an attacker inserts a trigger into a subset of the training data, in such a way that at test time, this trigger causes the classifier to predict some target class. that aim to detect the attack but only a few also propose to defend against it, and they typically involve retraining the network which is not always possible in practice. We propose lightweight automated detection and correction techniques against poisoning attacks, which are based on neuron patterns mined from the network using a small set of clean and poisoned test samples with known labels. The patterns built based on the mis-classified samples are used for run-time detection of new poisoned inputs. For correction, we propose an input correction technique that uses a differential analysis to identify the trigger in the detected poisoned images, which is then reset to a neutral color. Our detection and correction are performed at run-time and input level, which is in contrast to most existing work that is focused on offline model-level defenses. We demonstrate that our technique outperforms existing defenses such as NeuralCleanse and STRIP on popular benchmarks such as MNIST, CIFAR-10, and GTSRB against the popular BadNets attack and the more complex DFST attack.


page 1

page 2

page 3

page 4


Test-Time Detection of Backdoor Triggers for Poisoned Deep Neural Networks

Backdoor (Trojan) attacks are emerging threats against deep neural netwo...

NTD: Non-Transferability Enabled Backdoor Detection

A backdoor deep learning (DL) model behaves normally upon clean inputs b...

Invisible Backdoor Attacks Using Data Poisoning in the Frequency Domain

With the broad application of deep neural networks (DNNs), backdoor atta...

Improved Activation Clipping for Universal Backdoor Mitigation and Test-Time Detection

Deep neural networks are vulnerable to backdoor attacks (Trojans), where...

Reflection Backdoor: A Natural Backdoor Attack on Deep Neural Networks

Recent studies have shown that DNNs can be compromised by backdoor attac...

RADAR: Run-time Adversarial Weight Attack Detection and Accuracy Recovery

Adversarial attacks on Neural Network weights, such as the progressive b...

Test-Time Adaptation for Backdoor Defense

Deep neural networks have played a crucial part in many critical domains...

Please sign up or login with your details

Forgot password? Click here to reset