Stateful Detection of Black-Box Adversarial Attacks

07/12/2019
by Steven Chen, et al.

The problem of adversarial examples, evasion attacks on machine learning classifiers, has proven extremely difficult to solve. This is true even when, as is the case in many practical settings, the classifier is hosted as a remote service and the adversary therefore has no direct access to the model parameters. This paper argues that in such settings, defenders have a much larger space of actions than has previously been explored. Specifically, we deviate from the implicit assumption made by prior work that a defense must be a stateless function operating on individual examples, and explore the possibility of stateful defenses. To begin, we develop a defense designed to detect the process of adversarial example generation. By keeping a history of past queries, a defender can try to identify when a sequence of queries appears intended to generate an adversarial example. We then introduce query blinding, a new class of attacks designed to bypass defenses of this kind. We believe that expanding the study of adversarial examples from stateless classifiers to stateful systems is not only more realistic for many black-box settings, but also gives the defender a much-needed advantage in responding to the adversary.
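To make the query-history idea concrete, here is a minimal sketch of a stateful detector. The specific mechanism shown (flagging a client when a new query sits unusually close, in mean k-nearest-neighbor distance, to past queries) and all parameters (`k`, `threshold`, `history_size`) are illustrative assumptions, not the paper's exact design; the underlying intuition is that iterative black-box attacks issue many near-duplicate queries around one input, whereas benign queries tend to be well separated.

```python
import math
import random
from collections import deque


def l2(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


class StatefulDetector:
    """Illustrative stateful query-history defense (hypothetical parameters).

    Flags a query when its mean distance to the k nearest past queries
    from the same client falls below a threshold -- a pattern typical of
    iterative black-box adversarial-example generation.
    """

    def __init__(self, k=3, threshold=0.05, history_size=1000):
        self.k = k
        self.threshold = threshold
        self.history = deque(maxlen=history_size)  # per-client query buffer

    def query(self, x):
        """Record the query; return True if it looks like attack probing."""
        flagged = False
        if len(self.history) >= self.k:
            dists = sorted(l2(x, h) for h in self.history)
            # mean distance to the k nearest past queries
            if sum(dists[: self.k]) / self.k < self.threshold:
                flagged = True  # defender could refuse service or reset state
        self.history.append(list(x))
        return flagged


rng = random.Random(0)
base = [rng.random() for _ in range(32)]

detector = StatefulDetector()
# well-separated benign queries are never flagged
benign = [not detector.query([v + 0.5 * i for v in base]) for i in range(5)]
# tiny perturbations of one input (typical attack probing) eventually are
probes = [detector.query([v + 1e-4 * rng.random() for v in base]) for _ in range(10)]
print(all(benign), any(probes))  # -> True True
```

A query-blinding attacker would try to defeat exactly this similarity test, for example by applying random transformations to each query so that consecutive attack queries no longer look near-duplicate to the detector.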

Related research:

- Investigating Stateful Defenses Against Black-Box Adversarial Examples (03/11/2023): Defending machine-learning (ML) models against white-box adversarial att...
- Thwarting finite difference adversarial attacks with output randomization (05/23/2019): Adversarial examples pose a threat to deep neural network models in a va...
- AdvMind: Inferring Adversary Intent of Black-Box Attacks (06/16/2020): Deep neural networks (DNNs) are inherently susceptible to adversarial at...
- Theoretically Principled Trade-off for Stateful Defenses against Query-Based Black-Box Attacks (07/30/2023): Adversarial examples threaten the integrity of machine learning systems ...
- Towards Black-box Adversarial Example Detection: A Data Reconstruction-based Method (06/03/2023): Adversarial example detection is known to be an effective adversarial de...
- BUZz: BUffer Zones for defending adversarial examples in image classification (10/03/2019): We propose a novel defense against all existing gradient based adversari...
- An Adaptive Black-box Defense against Trojan Attacks (TrojDef) (09/05/2022): Trojan backdoor is a poisoning attack against Neural Network (NN) classi...
