On the unreasonable vulnerability of transformers for image restoration – and an easy fix

by Shashank Agnihotri, et al.

Following their success in visual recognition tasks, Vision Transformers (ViTs) are increasingly being employed for image restoration. Since a few recent works claim that ViTs for image classification also have better robustness properties, we investigate whether the improved adversarial robustness of ViTs extends to image restoration. We consider the recently proposed Restormer model, as well as NAFNet and the "Baseline network", which are both simplified versions of a Restormer. For our robustness evaluation, we use Projected Gradient Descent (PGD) and CosPGD, a recently proposed adversarial attack tailored to pixel-wise prediction tasks. Our experiments are performed on real-world images from the GoPro dataset for image deblurring. Our analysis indicates that, contrary to what works on ViTs for image classification advocate, these models are highly susceptible to adversarial attacks. We attempt to improve their robustness through adversarial training. While this yields a significant increase in robustness for Restormer, results on the other networks are less promising. Interestingly, the design choices in NAFNet and the Baseline network, which were based on i.i.d. performance rather than on robust generalization, seem to be at odds with model robustness. We therefore investigate this further and find a fix.
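
For readers unfamiliar with PGD, the following is a minimal sketch of an L-infinity PGD attack on a pixel-wise regression loss. It is not the authors' implementation and it is not CosPGD. The "network" is a hypothetical per-pixel linear gain (`toy_restore`) chosen so that the input gradient can be written analytically; the budget `eps`, step size `alpha`, and number of `steps` are illustrative assumptions, not values taken from the paper.

```python
import random


def mse(pred, target):
    """Mean squared error between two equally long pixel lists."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def toy_restore(x, gains):
    # Hypothetical stand-in for a restoration network: one linear gain per pixel.
    return [g * v for g, v in zip(gains, x)]


def loss_grad(x, gains, target):
    # Analytic gradient of mse(toy_restore(x), target) with respect to x.
    n = len(x)
    return [2.0 * g * (g * v - t) / n for g, v, t in zip(gains, x, target)]


def sign(v):
    return (v > 0) - (v < 0)


def pgd_linf(x, target, gains, eps=8 / 255, alpha=2 / 255, steps=10, seed=0):
    """Maximise the restoration loss inside an L-infinity ball of radius eps."""
    rng = random.Random(seed)
    # Random start inside the eps-ball, clipped to the valid pixel range [0, 1].
    x_adv = [min(1.0, max(0.0, v + rng.uniform(-eps, eps))) for v in x]
    for _ in range(steps):
        grad = loss_grad(x_adv, gains, target)
        # Gradient ascent step using only the sign of the gradient.
        x_adv = [a + alpha * sign(g) for a, g in zip(x_adv, grad)]
        # Project back onto the eps-ball around the clean input ...
        x_adv = [min(v + eps, max(v - eps, a)) for a, v in zip(x_adv, x)]
        # ... and onto the valid pixel range.
        x_adv = [min(1.0, max(0.0, a)) for a in x_adv]
    return x_adv


if __name__ == "__main__":
    blurry = [0.2, 0.5, 0.8, 0.4]      # toy "degraded" input
    sharp = [0.25, 0.5, 0.8, 0.35]     # toy ground truth
    gains = [1.0, 0.9, 1.1, 1.0]       # toy model parameters
    adv = pgd_linf(blurry, sharp, gains)
    print("clean loss:", mse(toy_restore(blurry, gains), sharp))
    print("adversarial loss:", mse(toy_restore(adv, gains), sharp))
```

In a real evaluation, the analytic gradient would be replaced by automatic differentiation through the restoration network, and the loss would be computed against the ground-truth sharp image.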




On the interplay of adversarial robustness and architecture components: patches, convolution and attention

In recent years novel architecture components for image classification h...

CosPGD: a unified white-box adversarial attack for pixel-wise prediction tasks

While neural networks allow highly accurate predictions in many tasks, t...

Analyzing Adversarial Robustness of Vision Transformers against Spatial and Spectral Attacks

Vision Transformers have emerged as a powerful architecture that can out...

Inference Time Evidences of Adversarial Attacks for Forensic on Transformers

Vision Transformers (ViTs) are becoming a very popular paradigm for visi...

Pretrained Transformers Do not Always Improve Robustness

Pretrained Transformers (PT) have been shown to improve Out of Distribut...

Adversarial Pixel Restoration as a Pretext Task for Transferable Perturbations

Transferable adversarial attacks optimize adversaries from a pretrained ...

Equivariant Transformer Networks

How can prior knowledge on the transformation invariances of a domain be...
