You Only Query Once: Effective Black Box Adversarial Attacks with Minimal Repeated Queries

by Devin Willmott, et al.

Researchers have repeatedly shown that it is possible to craft adversarial attacks on deep classifiers (small perturbations that significantly change the class label), even in the "black-box" setting where one has only query access to the classifier. However, all prior work in the black-box setting attacks the classifier by repeatedly querying the same image with minor modifications, usually thousands of times or more, making it easy for defenders to detect an ensuing attack. In this work, we instead show that it is possible to craft (universal) adversarial perturbations in the black-box setting by querying a sequence of different images, each only once. This attack evades detection based on a high number of similar queries and produces a single perturbation that causes misclassification when applied to any input to the classifier. In experiments, we show that attacks adhering to this restriction can produce untargeted adversarial perturbations that fool the vast majority of MNIST and CIFAR-10 classifier inputs, as well as in excess of 60-70% of inputs on ImageNet classifiers. In the targeted setting, we exhibit targeted black-box universal attacks on ImageNet classifiers with success rates above 20% when allowed only one query per image, and 66% when allowed two queries per image.
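To make the one-query-per-image idea concrete, here is a toy sketch (not the paper's algorithm): a shared universal perturbation is refined with a zeroth-order update, where each image in a stream is queried exactly once along a random probe direction. The synthetic linear "black box", the score-based loss returned by `query_loss`, and all hyperparameters are illustrative assumptions; the actual method and the hard-label setting would differ.

```python
import numpy as np

rng = np.random.default_rng(0)
d, eps = 20, 0.5           # input dimension, L-inf perturbation budget

# Toy "black box": a fixed linear scorer; positive score = correct class.
w = rng.normal(size=d)
w /= np.linalg.norm(w)

def sample_image():
    # Synthetic images of the positive class, clean margin ~ 1.0.
    return 1.0 * w + 0.1 * rng.normal(size=d)

def query_loss(x):
    # Hypothetical query interface: returns a scalar attack loss (negated
    # score). Assumes score access; a hard-label black box is harder.
    return -float(w @ x)

delta = np.zeros(d)        # universal perturbation, shared across all images
baseline, c, lr = 0.0, 0.1, 0.05

for _ in range(3000):      # stream of DISTINCT images, one query each
    x = sample_image()
    u = rng.choice([-1.0, 1.0], size=d)        # random probe direction
    loss = query_loss(x + delta + c * u)       # the single query for x
    # Loss above the running baseline => the probe direction raised the
    # attack loss, so step the universal perturbation along it.
    delta = np.clip(delta + lr * (loss - baseline) * u, -eps, eps)
    baseline = 0.9 * baseline + 0.1 * loss     # running loss baseline

# Evaluate: clean accuracy vs. fooling rate under the universal perturbation.
test = np.stack([sample_image() for _ in range(200)])
clean_acc = float(np.mean(test @ w > 0))
fool_rate = float(np.mean((test + delta) @ w < 0))
```

Because every query uses a fresh image, no two queries are near-duplicates, which is the property that defeats detectors based on repeated similar queries; the price is that the feedback from each image must be aggregated into one shared perturbation rather than tailored per input.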


Related research:

- EvoBA: An Evolution Strategy as a Strong Baseline for Black-Box Adversarial Attacks
- Universal Adversarial Training
- One Sparse Perturbation to Fool them All, almost Always!
- Sparse-RS: a versatile framework for query-efficient sparse black-box adversarial attacks
- Towards Imperceptible Query-limited Adversarial Attacks with Perceptual Feature Fidelity Loss
- MagDR: Mask-guided Detection and Reconstruction for Defending Deepfakes
- Simple black-box universal adversarial attacks on medical image classification based on deep neural networks
