Generating Adversarial Inputs Using A Black-box Differential Technique

Neural networks (NNs) are known to be vulnerable to adversarial attacks. A malicious agent initiates such an attack by perturbing an input into another one that the NN classifies differently. In this paper, we consider a special class of adversarial examples that can exhibit not only the weakness of a single NN model, as typical adversarial examples do, but also the behavioral differences between two NN models. We call them difference-inducing adversarial examples, or DIAEs. Specifically, we propose DAEGEN, the first black-box differential technique for adversarial input generation. DAEGEN takes as input two NN models trained for the same classification problem and reports as output an adversarial example. The obtained adversarial example is a DIAE, so it represents a point-wise difference between the two NN models in the input space. Algorithmically, DAEGEN uses a local-search-based optimization algorithm to find DIAEs by iteratively perturbing an input so as to maximize the difference between the two models' predictions on it. We conduct experiments on a spectrum of benchmark datasets (e.g., MNIST, ImageNet, and Driving) and NN models (e.g., LeNet, ResNet, Dave, and VGG). The experimental results are promising. First, we compare DAEGEN with two existing white-box differential techniques (DeepXplore and DLFuzz) and find that, under the same setting, DAEGEN is 1) effective, i.e., it is the only technique that succeeds in generating attacks in all cases; 2) precise, i.e., the adversarial attacks are very likely to fool both machines and humans; and 3) efficient, i.e., it requires a reasonable number of classification queries. Second, we compare DAEGEN with state-of-the-art black-box adversarial attack methods (SimBA and TREMBA), adapting them to the differential setting. The experimental results show that DAEGEN performs better than both of them.
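
To make the idea of black-box differential search concrete, the sketch below shows a minimal local-search loop that queries two classifiers only through their prediction outputs and greedily keeps perturbations that increase disagreement between them. The disagreement objective, the single-coordinate update rule, and the function names (query_a, query_b, differential_local_search) are illustrative assumptions for exposition; they are not the exact DAEGEN procedure from the paper.

```python
import numpy as np

def differential_local_search(x, query_a, query_b, eps=0.05, max_queries=10000, seed=0):
    """Illustrative black-box differential local search (not the exact DAEGEN algorithm).

    query_a / query_b are assumed to be black-box functions returning
    class-probability vectors for an input in [0, 1]^d.
    """
    rng = np.random.default_rng(seed)
    x_adv = x.astype(np.float64).copy()

    def score(z):
        pa, pb = query_a(z), query_b(z)
        la, lb = int(np.argmax(pa)), int(np.argmax(pb))
        # Objective: probability mass model B assigns outside model A's
        # predicted class -- larger means the two models disagree more.
        return 1.0 - float(pb[la]), la != lb

    best, found = score(x_adv)
    queries = 2
    while not found and queries + 2 <= max_queries:
        # Local move: perturb one random coordinate by +/- eps, clipped to [0, 1].
        cand = x_adv.copy()
        idx = rng.integers(cand.size)
        cand.flat[idx] = np.clip(cand.flat[idx] + eps * rng.choice((-1.0, 1.0)), 0.0, 1.0)

        cand_score, cand_found = score(cand)
        queries += 2
        # Greedy acceptance: keep the move if it yields a DIAE or improves the objective.
        if cand_found or cand_score >= best:
            x_adv, best, found = cand, cand_score, cand_found

    return x_adv, found, queries
```

In use, a caller would wrap two classifiers trained on the same task (e.g., two MNIST models) as probability-returning functions and pass a normalized input; the loop terminates when the two models assign different top-1 labels or the query budget is exhausted.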
