Small Input Noise is Enough to Defend Against Query-based Black-box Attacks

by Junyoung Byun, et al.

While deep neural networks show unprecedented performance in various tasks, their vulnerability to adversarial examples hinders their deployment in safety-critical systems. Many studies have shown that attacks are possible even in a black-box setting, where the adversary cannot access the target model's internal information. Most black-box attacks are query-based: each query obtains the target model's output for a given input, and many recent studies focus on reducing the number of required queries. In this paper, we pay attention to an implicit assumption of these attacks: that the target model's output corresponds exactly to the query input. If some randomness is introduced into the model to break this assumption, query-based attacks face tremendous difficulty in both gradient estimation and local search, which are the core of their attack process. Motivated by this, we observe that even small additive input noise can neutralize most query-based attacks, and we name this simple yet effective approach Small Noise Defense (SND). We analyze how SND defends against query-based black-box attacks and demonstrate its effectiveness against eight state-of-the-art attacks on the CIFAR-10 and ImageNet datasets. Despite its strong defense ability, SND almost fully preserves the original clean accuracy and computational speed. SND is readily applicable to pre-trained models by adding only one line of code at the inference stage, so we hope that it will serve as a baseline defense against query-based black-box attacks in the future.
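The abstract's claim that SND amounts to one added line at inference can be sketched as follows. This is an illustrative NumPy mock-up, not the authors' implementation: the noise scale sigma=0.01, the wrapper name snd_predict, and the toy linear model are all assumptions for demonstration.

```python
import numpy as np

def snd_predict(model, x, sigma=0.01, rng=None):
    """Small Noise Defense sketch: perturb the query input with small
    Gaussian noise before every inference call, so repeated queries on
    the same input no longer return deterministic outputs."""
    rng = np.random.default_rng() if rng is None else rng
    noisy_x = x + sigma * rng.standard_normal(x.shape)  # the "one line"
    return model(noisy_x)

# Toy stand-in for a classifier, just to make the sketch runnable.
def toy_model(x):
    return x.sum(axis=-1)

x = np.zeros((2, 4))
out = snd_predict(toy_model, x, sigma=0.01, rng=np.random.default_rng(0))
```

Because the injected noise is small, clean predictions are barely perturbed, while an attacker's finite-difference gradient estimates and local-search comparisons become unreliable since two identical queries yield slightly different outputs.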



Related research

Theoretical Study of Random Noise Defense against Query-Based Black-Box Attacks
Exploring Non-additive Randomness on ViT against Query-Based Black-Box Attacks
Pareto-Secure Machine Learning (PSML): Fingerprinting and Securing Inference Serving Systems
MetaSimulator: Simulating Unknown Target Models for Query-Efficient Black-box Attacks
BUZz: BUffer Zones for defending adversarial examples in image classification
Improved Adversarial Robustness via Logit Regularization Methods
Subspace Attack: Exploiting Promising Subspaces for Query-Efficient Black-box Attacks
