Adversarial Distillation for Ordered Top-k Attacks

05/25/2019
by Zekun Zhang, et al.

Deep Neural Networks (DNNs) are vulnerable to adversarial attacks, especially white-box targeted attacks. One scheme for learning attacks is to design a proper adversarial objective function that leads to imperceptible perturbations for any test image (e.g., the Carlini-Wagner (C&W) method). Most methods address targeted attacks in the Top-1 manner. In this paper, we propose to learn ordered Top-k attacks (k >= 1) for image classification tasks, that is, to enforce that the Top-k predicted labels of an adversarial example are the k (randomly) selected and ordered target labels, excluding the ground-truth label. To this end, we present an adversarial distillation framework: first, we compute an adversarial probability distribution for any given set of ordered Top-k target labels with respect to the ground-truth label of a test image; then, we learn adversarial examples by minimizing the Kullback-Leibler (KL) divergence to this distribution together with a perturbation-energy penalty, similar in spirit to network distillation. We also explore how to leverage label semantic similarities in computing the target distributions, leading to knowledge-oriented attacks. In experiments, we thoroughly test Top-1 and Top-5 attacks on the ImageNet-1000 validation set using two popular DNNs trained on the clean ImageNet-1000 training set, ResNet-50 and DenseNet-121. For both models, our proposed adversarial distillation approach outperforms the C&W method and other baselines in the Top-1 setting, and shows significant improvement in the Top-5 setting over a strong modified C&W method.
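To make the idea concrete, below is a minimal sketch of an ordered Top-k attack in this distillation style. The function name, hyper-parameters, and the simple linearly decaying target distribution are illustrative assumptions, not the paper's exact construction; the point is only to show the two ingredients named in the abstract: an adversarial target distribution over the ordered Top-k labels, and a loss combining KL divergence with a perturbation-energy penalty.

```python
import torch
import torch.nn.functional as F

def adversarial_distillation_attack(model, x, target_labels, num_classes,
                                    steps=200, lr=0.01, energy_weight=0.05):
    """Sketch of an ordered Top-k attack via adversarial distillation.

    `target_labels` is a list of k class indices (ordered, most-confident
    first) that the adversarial example should rank at the top. The target
    distribution below (linearly decaying mass over the k labels) is an
    assumed, simplified scheme for illustration.
    """
    # Build an adversarial target distribution: give the k chosen labels
    # strictly decreasing probability mass and spread a small remainder
    # over all other classes.
    k = len(target_labels)
    top_mass = 0.9
    weights = torch.arange(k, 0, -1, dtype=torch.float)   # k, k-1, ..., 1
    weights = top_mass * weights / weights.sum()
    target_dist = torch.full((num_classes,), (1.0 - top_mass) / (num_classes - k))
    target_dist[target_labels] = weights

    # Optimize an additive perturbation delta on the input image x
    # (x is assumed to have a leading batch dimension of 1).
    delta = torch.zeros_like(x, requires_grad=True)
    optimizer = torch.optim.Adam([delta], lr=lr)

    for _ in range(steps):
        logits = model(x + delta)
        log_probs = F.log_softmax(logits, dim=1)
        # KL(target || prediction) pulls the Top-k ranking toward the
        # ordered target labels; the squared-L2 term penalizes
        # perturbation energy.
        kl = F.kl_div(log_probs, target_dist.unsqueeze(0), reduction="batchmean")
        energy = delta.pow(2).sum()
        loss = kl + energy_weight * energy
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

    return (x + delta).detach()
```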

