Structure Matters: Towards Generating Transferable Adversarial Images
Recent work on adversarial examples for image classification focuses on directly modifying pixels with minor perturbations. The small-perturbation requirement is imposed to ensure that the generated adversarial examples look natural and realistic to humans. However, it also restricts the attack space, which limits attack strength and transferability, especially against systems protected by a defense mechanism. In this paper, we propose the novel concepts of structure patterns and structure-aware perturbations, which relax the small-perturbation constraint while still keeping images natural. The key idea of our approach is to allow perceptible deviation in adversarial examples while preserving the structure patterns that are central to a human classifier. Building on these concepts, we propose a structure-preserving attack (SPA) for generating natural adversarial examples with extremely high transferability. Empirical results on the MNIST and CIFAR10 datasets show that SPA adversarial images can easily bypass strong PGD-based adversarial training and remain effective against SPA-based adversarial training. Further, they transfer well to other target models with little or no loss in attack success rate, thus exhibiting competitive black-box attack performance. Our code is available at <https://github.com/spasrccode/SPA>.