Efficient Black-Box Adversarial Attack Guided by the Distribution of Adversarial Perturbations

06/15/2020
by   Yan Feng, et al.
0

This work studied the score-based black-box adversarial attack problem, where only a continuous score is returned for each query, while the structure and parameters of the attacked model are unknown. A promising approach to solve this problem is evolution strategies (ES), which introduces a search distribution to sample perturbations that are likely to be adversarial. Gaussian distribution is widely adopted as the search distribution in the standard ES algorithm. However, it may not be flexible enough to capture the diverse distributions of adversarial perturbations around different benign examples. In this work, we propose to transform the Gaussian-distributed variable to another space through a conditional flow-based model, to enhance the capability and flexibility of capturing the intrinsic distribution of adversarial perturbations conditioned on the benign example. Besides, to further enhance the query efficiency, we propose to pre-train the conditional flow model based on some white-box surrogate models, utilizing the transferability of adversarial perturbations across different models, which has been widely observed in the literature of adversarial examples. Consequently, the proposed method could take advantage of both query-based and transfer-based attack methods, to achieve satisfying attack performance on both effectiveness and efficiency. Extensive experiments of attacking four target models on CIFAR-10 and Tiny-ImageNet verify the superior performance of the proposed method to state-of-the-art methods.

READ FULL TEXT

Please sign up or login with your details

Forgot password? Click here to reset