Defensive Collaborative Multi-task Training - Defending against Adversarial Attack towards Deep Neural Networks

03/14/2018
by   Derek Wang, et al.

Deep neural networks (DNNs) have shown impressive performance on hard perceptual problems. However, researchers have found that DNN-based systems are vulnerable to adversarial examples which contain specially crafted, human-imperceptible perturbations. Such perturbations cause DNN-based systems to misclassify the adversarial examples, with potentially disastrous consequences where safety or security is crucial. This remains a major security concern, as state-of-the-art attacks can still bypass existing defensive methods. In this paper, we propose a novel defensive framework based on collaborative multi-task training to address this problem. The proposed defence first incorporates specific label pairs into the adversarial training process to enhance model robustness in the black-box setting. A novel collaborative multi-task training framework is then proposed to construct a detector which identifies adversarial examples based on the pairwise relationship of these label pairs. The detector can identify and reject high-confidence adversarial examples that bypass the black-box defence. The robustness-enhanced model works reciprocally with the detector on false-negative adversarial examples. Importantly, the proposed collaborative architecture can prevent the adversary from finding valid adversarial examples in a nearly-white-box setting.
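For intuition, the sketch below illustrates one way a shared backbone with a primary classification head and an auxiliary label-pair head could feed a pair-consistency detector that rejects suspected adversarial inputs. The backbone, head sizes, label-pair mapping, and rejection threshold are illustrative assumptions for this sketch, not the authors' exact architecture or training objective.

    # Minimal sketch of a collaborative multi-task classifier with a
    # pair-consistency detector. All architectural choices here (MLP
    # backbone, 28x28 inputs, threshold) are assumptions for illustration.
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class CollaborativeMultiTaskNet(nn.Module):
        def __init__(self, num_classes: int, feat_dim: int = 128):
            super().__init__()
            # Shared feature extractor (stand-in for any CNN backbone).
            self.backbone = nn.Sequential(
                nn.Flatten(),
                nn.Linear(28 * 28, feat_dim),
                nn.ReLU(),
            )
            # Primary classification head.
            self.classifier = nn.Linear(feat_dim, num_classes)
            # Auxiliary head trained to predict the paired label.
            self.pair_head = nn.Linear(feat_dim, num_classes)

        def forward(self, x):
            feats = self.backbone(x)
            return self.classifier(feats), self.pair_head(feats)

    def detect_adversarial(logits_main, logits_pair, pair_map, threshold=0.5):
        # pair_map[c] is the auxiliary label paired with class c during
        # training (a hypothetical mapping for this sketch). Inputs whose
        # auxiliary head assigns low probability to the expected paired
        # label are flagged for rejection.
        pred_main = logits_main.argmax(dim=1)
        expected_pair = pair_map[pred_main]
        pair_prob = F.softmax(logits_pair, dim=1)
        consistency = pair_prob.gather(1, expected_pair.unsqueeze(1)).squeeze(1)
        return consistency < threshold  # True -> reject as adversarial

The design intent mirrors the abstract: the classification head is hardened by adversarial training on the paired labels, while the detector exploits the learned pairwise relationship, so an example must simultaneously fool both heads in a consistent way to evade the combined defence.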

