Delving into the pixels of adversarial samples
Despite extensive research into adversarial attacks, we do not know how adversarial attacks affect image pixels. Knowing how image pixels are affected by adversarial attacks has the potential to lead us to better adversarial defenses. Motivated by instances that we find where strong attacks do not transfer, we delve into adversarial examples at pixel level to scrutinize how adversarial attacks affect image pixel values. We consider several ImageNet architectures, InceptionV3, VGG19 and ResNet50, as well as several strong attacks. We find that attacks can have different effects at pixel level depending on classifier architecture. In particular, input pre-processing plays a previously overlooked role in the effect that attacks have on pixels. Based on the insights of pixel-level examination, we find new ways to detect some of the strongest current attacks.
READ FULL TEXT