Switching Gradient Directions for Query-Efficient Black-Box Adversarial Attacks

09/15/2020
by Chen Ma, et al.

We propose a simple and highly query-efficient black-box adversarial attack named SWITCH, which achieves state-of-the-art performance under the ℓ_2 and ℓ_∞ norms in the score-based setting. In the black-box attack setting, designing query-efficient attacks remains an open problem. The high query efficiency of the proposed approach stems from combining transfer-based attacks with random-search-based ones. The surrogate model's gradient 𝐠̂ is used for guidance; if a query reveals that it does not point toward the adversarial region, our algorithm switches the direction, thereby keeping the objective loss function of the target model rising as much as possible. Two switch operations are available: SWITCH_neg and SWITCH_rnd. SWITCH_neg takes -𝐠̂ as the new direction, which is reasonable under an approximate local linearity assumption. SWITCH_rnd computes the gradient from another model, randomly selected from a large model set, to help bypass potential obstacles in optimization. Experimental results show that these strategies boost the optimization process, whereas following the original surrogate gradients does not work. In SWITCH, no query is used to estimate the gradient; all queries serve to determine whether to switch directions, resulting in unprecedented query efficiency. We demonstrate that our approach outperforms 10 state-of-the-art attacks on the CIFAR-10, CIFAR-100, and TinyImageNet datasets. SWITCH can serve as a strong baseline for future black-box attacks. The PyTorch source code is released at https://github.com/machanic/SWITCH.
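The switching idea behind SWITCH_neg can be illustrated with a minimal NumPy sketch. This is not the released implementation: the function name `switch_neg_step`, the sign-based ℓ_∞ step, the step size, the budget `eps`, the acceptance rule, and the toy linear target loss are all assumptions made for illustration. The sketch only shows the pattern described in the abstract: follow the surrogate gradient 𝐠̂, query the target loss, and fall back to -𝐠̂ when the loss does not rise.

```python
import numpy as np


def switch_neg_step(x, x_orig, g_hat, target_loss, step=0.01, eps=0.05):
    """One hypothetical SWITCH_neg-style step under an l_inf budget.

    x           : current adversarial example
    x_orig      : clean input (center of the l_inf ball)
    g_hat       : gradient of the loss from a surrogate model
    target_loss : black-box function returning the target model's loss
    Returns the new iterate and a flag telling whether the direction was switched.
    """
    def project(z):
        # Keep the iterate inside the l_inf ball around x_orig and in [0, 1].
        return np.clip(np.clip(z, x_orig - eps, x_orig + eps), 0.0, 1.0)

    direction = np.sign(g_hat)
    # Assumption: in a real attack this value would be cached from the previous
    # step, so only the candidate below costs a fresh query.
    base = target_loss(x)
    candidate = project(x + step * direction)
    if target_loss(candidate) >= base:
        # The surrogate direction raises the target loss: keep it.
        return candidate, False
    # Otherwise switch to the negated surrogate direction.
    return project(x - step * direction), True


# Toy setup (assumed): the target loss is linear, and the surrogate gradient
# either agrees with or opposes the true gradient w_target.
w_target = np.array([1.0, -1.0])


def toy_target_loss(z):
    return float(w_target @ z)


x0 = np.array([0.5, 0.5])
x_bad, switched_bad = switch_neg_step(x0, x0, -w_target, toy_target_loss)
x_good, switched_good = switch_neg_step(x0, x0, w_target, toy_target_loss)
```

In this toy case the loss is exactly linear, so negating a misleading surrogate gradient always recovers an ascent direction; on a real network this holds only approximately, which is the local linearity assumption the abstract refers to.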
