Robust Evaluation of Diffusion-Based Adversarial Purification

03/16/2023
by Minjong Lee, et al.

We question the current evaluation practice for diffusion-based purification methods. Diffusion-based purification methods aim to remove adversarial effects from an input data point at test time. The approach is gaining attention as an alternative to adversarial training because it decouples the defense from classifier training. Well-known white-box attacks are often employed to measure the robustness of the purification. However, it is unclear whether these attacks are the most effective against diffusion-based purification, since they are typically tailored to adversarial training. We analyze current practices and provide a new guideline for measuring the robustness of purification methods against adversarial attacks. Based on our analysis, we further propose a new purification strategy that shows competitive results against state-of-the-art adversarial training approaches.
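As context for what "purification" means here, the sketch below illustrates the generic DDPM-style noise-and-denoise procedure that diffusion-based purification builds on: an (possibly adversarial) input is pushed forward to a noise level t_star and then denoised back with a pretrained diffusion model before classification. This is a minimal illustration, not the strategy proposed in the paper; the `score_model` interface, argument names, and the fixed-stride ancestral sampler are assumptions for the example.

```python
import torch

def diffusion_purify(x, score_model, betas, t_star):
    """Noise an input to step t_star, then denoise it back to t=0.

    x           : (B, C, H, W) images (possibly adversarially perturbed)
    score_model : hypothetical callable (x_t, t) -> predicted noise epsilon
    betas       : 1-D tensor holding the diffusion noise schedule
    t_star      : number of forward steps; larger values wash out more of the
                  adversarial perturbation but also more of the image content
    """
    alphas = 1.0 - betas
    alpha_bars = torch.cumprod(alphas, dim=0)

    # Forward process: jump directly to step t_star using the closed form
    # q(x_t | x_0) = N(sqrt(alpha_bar_t) * x_0, (1 - alpha_bar_t) * I).
    a_bar = alpha_bars[t_star - 1]
    x_t = torch.sqrt(a_bar) * x + torch.sqrt(1.0 - a_bar) * torch.randn_like(x)

    # Reverse process: standard DDPM ancestral sampling from t_star down to 1.
    for t in range(t_star, 0, -1):
        a_t, a_bar_t = alphas[t - 1], alpha_bars[t - 1]
        t_batch = torch.full((x.shape[0],), t - 1, device=x.device)
        eps = score_model(x_t, t_batch)
        mean = (x_t - (1.0 - a_t) / torch.sqrt(1.0 - a_bar_t) * eps) / torch.sqrt(a_t)
        noise = torch.randn_like(x_t) if t > 1 else torch.zeros_like(x_t)
        x_t = mean + torch.sqrt(betas[t - 1]) * noise

    return x_t  # purified image, passed to the downstream classifier
```

A robustness evaluation of such a defense has to attack the full purify-then-classify pipeline, which is exactly where the paper argues that attacks designed for adversarially trained classifiers may fall short.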

