On the transferability of adversarial examples between convex and 01 loss models

06/14/2020
by Yunzhe Xue, et al.

We show that white-box adversarial examples do not transfer effectively between convex-loss and 01 loss models, or between pairs of 01 loss models, compared to transfer between pairs of convex models. We also show that black-box attacks using a convex substitute model are less effective against 01 loss models than against convex models, and that attacks using a 01 loss substitute model are ineffective against both convex and 01 loss models. We illustrate by example how the presence of outliers can produce different decision boundaries under 01 and convex losses, which in turn yields adversaries that do not transfer. Consistent with this, we see that adversaries transfer between 01 loss and convex models more easily on MNIST than on CIFAR10 and ImageNet, datasets that are more likely to contain outliers. We also illustrate by example how the discontinuity of 01 loss makes adversaries non-transferable in a two-layer neural network.
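To make the outlier intuition concrete, here is a minimal, hypothetical 1-D sketch (not the authors' code or data): the toy dataset, the choice of logistic regression as the convex-loss stand-in, the brute-force threshold search as the 01 loss model, and the 0.2 perturbation margin are all illustrative assumptions.

```python
# Toy 1-D illustration: one outlier drags a convex-loss boundary toward
# it, while the 01 loss boundary simply absorbs that single error, so an
# adversary crafted against the convex model fails to transfer.
import numpy as np
from sklearn.linear_model import LogisticRegression

# Two well-separated clusters plus one class-1 outlier at x = -3.
X = np.array([0.0, 1.0, 2.0, 5.0, 6.0, 7.0, -3.0]).reshape(-1, 1)
y = np.array([0, 0, 0, 1, 1, 1, 1])

# Convex-loss model: logistic regression. Its cross-entropy loss grows
# with the outlier's margin, pulling the boundary toward x = -3.
convex = LogisticRegression().fit(X, y)
convex_boundary = -convex.intercept_[0] / convex.coef_[0, 0]

# 01 loss model: try every midpoint threshold (class 1 on the right,
# the correct orientation for this toy data) and keep the one with the
# fewest training errors; the misclassified outlier costs only 1/7.
xs = np.sort(X.ravel())
thresholds = (xs[:-1] + xs[1:]) / 2.0
errors = [np.mean((X.ravel() >= t) != (y == 1)) for t in thresholds]
zero_one_boundary = thresholds[int(np.argmin(errors))]

print(f"convex boundary  ~ {convex_boundary:.2f}")   # pulled toward the outlier
print(f"01 loss boundary = {zero_one_boundary:.2f}")  # stays at 3.5

# White-box "attack" on the convex model in 1-D: push the class-0 point
# at x = 0 just past the convex boundary (the minimal perturbation that
# flips the convex prediction) and test it against both models.
adv = convex_boundary + 0.2
print("fools convex model :", int(convex.predict([[adv]])[0]) == 1)
print("fools 01 loss model:", adv >= zero_one_boundary)
```

In this toy setup the convex boundary lands well to the left of the 01 loss boundary at 3.5, so the perturbed point flips the convex prediction but not the 01 loss one, mirroring the abstract's claim that outliers produce divergent boundaries and hence non-transferable adversaries.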
