AI-GAN: Attack-Inspired Generation of Adversarial Examples

02/06/2020
by Tao Bai, et al.

Adversarial examples that fool deep models are mainly crafted by adding small perturbations imperceptible to the human eye. The literature contains various optimization-based methods for generating adversarial perturbations, most of which are time-consuming. AdvGAN, a method proposed by Xiao et al. at IJCAI 2018, employs Generative Adversarial Networks (GANs) to generate adversarial perturbations with original images as inputs, which makes it faster than optimization-based methods at inference time. AdvGAN, however, fixes the target classes during training, and we find it difficult to train when it is modified to take both original images and target classes as inputs. In this paper, we propose AI-GAN with a different training strategy to solve this problem. AI-GAN is a two-stage method: in the first stage, a projected gradient descent (PGD) attack inspires the training of the GAN; in the second stage, the GAN is trained in the standard way. Once trained, the generator can approximate the conditional distribution of adversarial instances and generate adversarial perturbations given different target classes. We conduct experiments and evaluate the performance of AI-GAN on MNIST and CIFAR-10. Compared with AdvGAN, AI-GAN achieves higher attack success rates with similar perturbation magnitudes.
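The abstract sketches a two-stage scheme: a PGD attack supervises the generator in stage one, and standard GAN training follows in stage two. Below is a minimal PyTorch sketch of one plausible reading of that scheme, not the authors' implementation: the stage-1 loss (regressing the generator's output onto PGD perturbations), the network architectures, the perturbation budget EPS, and the equal loss weights are all illustrative assumptions.

import torch
import torch.nn as nn
import torch.nn.functional as F

EPS = 0.3          # assumed L-inf perturbation budget (MNIST scale)
N_CLASSES = 10
IMG_DIM = 28 * 28

class Generator(nn.Module):
    """Maps (image, target class) -> bounded perturbation, as in the abstract."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(IMG_DIM + N_CLASSES, 256), nn.ReLU(),
            nn.Linear(256, IMG_DIM), nn.Tanh(),        # output in [-1, 1]
        )

    def forward(self, x, target):
        onehot = F.one_hot(target, N_CLASSES).float()
        h = torch.cat([x.flatten(1), onehot], dim=1)
        return EPS * self.net(h).view_as(x)            # scale into the budget

def pgd_attack(victim, x, target, steps=10, alpha=0.05):
    """Targeted PGD: descend the victim's loss on the *target* class."""
    delta = torch.zeros_like(x, requires_grad=True)
    for _ in range(steps):
        loss = F.cross_entropy(victim(x + delta), target)
        grad, = torch.autograd.grad(loss, delta)
        with torch.no_grad():
            delta -= alpha * grad.sign()               # move toward the target
            delta.clamp_(-EPS, EPS)
    return delta.detach()

def stage1_step(G, victim, opt_G, x, target):
    """Stage 1 (assumed reading): regress G's output onto PGD perturbations."""
    delta_pgd = pgd_attack(victim, x, target)
    loss = F.mse_loss(G(x, target), delta_pgd)
    opt_G.zero_grad(); loss.backward(); opt_G.step()
    return loss.item()

def stage2_step(G, D, victim, opt_G, opt_D, x, target):
    """Stage 2: standard GAN training plus a targeted-attack loss."""
    x_adv = (x + G(x, target)).clamp(0.0, 1.0)
    real, fake = torch.ones(len(x), 1), torch.zeros(len(x), 1)
    # Discriminator: separate clean images from adversarial ones.
    d_loss = (F.binary_cross_entropy_with_logits(D(x), real)
              + F.binary_cross_entropy_with_logits(D(x_adv.detach()), fake))
    opt_D.zero_grad(); d_loss.backward(); opt_D.step()
    # Generator: fool D while pushing the victim toward the target class.
    g_loss = (F.binary_cross_entropy_with_logits(D(x_adv), real)
              + F.cross_entropy(victim(x_adv), target))
    opt_G.zero_grad(); g_loss.backward(); opt_G.step()
    return d_loss.item(), g_loss.item()

# Toy usage with random data; real experiments would use MNIST loaders.
victim = nn.Sequential(nn.Flatten(), nn.Linear(IMG_DIM, N_CLASSES))
D = nn.Sequential(nn.Flatten(), nn.Linear(IMG_DIM, 1))
G = Generator()
opt_G = torch.optim.Adam(G.parameters(), lr=1e-3)
opt_D = torch.optim.Adam(D.parameters(), lr=1e-3)
x = torch.rand(8, 1, 28, 28)
target = torch.randint(0, N_CLASSES, (8,))
stage1_step(G, victim, opt_G, x, target)
stage2_step(G, D, victim, opt_G, opt_D, x, target)

Conditioning the generator on a one-hot target class is what lets a single trained model produce perturbations for any chosen target: at inference time an attack is one forward pass instead of an iterative PGD loop.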

