Random Projections for Adversarial Attack Detection

by Nathan Drenkow, et al.

Whilst adversarial attack detection has received considerable attention, it remains a fundamentally challenging problem from two perspectives. First, while threat models can be well-defined, attacker strategies may still vary widely within those constraints. Therefore, detection should be considered as an open-set problem, standing in contrast to most current detection strategies. These methods take a closed-set view and train binary detectors, thus biasing detection toward attacks seen during detector training. Second, information is limited at test time and confounded by nuisance factors, including the label and underlying content of the image. Many of the current high-performing techniques use training sets to deal with some of these issues, but can be limited by the overall size and diversity of those sets during the detection step. We address these challenges via a novel strategy based on random subspace analysis. We present a technique that makes use of special properties of random projections, whereby we can characterize the behavior of clean and adversarial examples across a diverse set of subspaces. We then leverage the self-consistency (or inconsistency) of model activations to discern clean from adversarial examples. Performance evaluation demonstrates that our technique outperforms (>0.92 AUC) competing state-of-the-art (SOTA) detection strategies, while remaining truly agnostic to the attack method itself. It also requires significantly less training data, composed only of clean examples, than competing SOTA methods, which achieve only chance performance when evaluated in a more rigorous testing scenario.
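The core mechanism described in the abstract, checking whether a model's behavior stays self-consistent across many random subspaces, can be illustrated with a toy sketch. Everything below is a hypothetical stand-in rather than the paper's implementation: a simple norm-threshold classifier plays the role of the real network, and Gaussian matrices play the role of the random projections. The intuition is that an example far from the decision boundary yields the same prediction in nearly every random subspace, while a borderline (or adversarially perturbed) example yields inconsistent predictions; a detector would then threshold the consistency score using clean examples only.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_subspaces(dim, k, n_proj, rng):
    # Sample n_proj Gaussian projection matrices mapping R^dim -> R^k,
    # scaled so projected norms concentrate near the original norm
    # (Johnson-Lindenstrauss).
    return [rng.standard_normal((k, dim)) / np.sqrt(k) for _ in range(n_proj)]

def consistency_score(x, classify, projections):
    # Self-consistency: fraction of random subspaces whose prediction
    # agrees with the majority vote across all subspaces.
    preds = np.array([classify(P @ x) for P in projections])
    _, counts = np.unique(preds, return_counts=True)
    return counts.max() / len(preds)

# Toy stand-in for a trained model: thresholds the norm of its input.
classify = lambda z: int(np.linalg.norm(z) > 3.0)

projections = random_subspaces(dim=128, k=16, n_proj=50, rng=rng)

x_far = np.full(128, 0.5)                  # far from the boundary (norm ~5.7)
x_near = np.full(128, 3.0 / np.sqrt(128))  # exactly on the boundary (norm 3.0)

score_far = consistency_score(x_far, classify, projections)
score_near = consistency_score(x_near, classify, projections)
# score_far is near 1.0; score_near hovers near 0.5, signaling inconsistency.
# A detector flags inputs whose score falls below a threshold calibrated
# on clean examples only -- no adversarial examples needed for training.
```

This only sketches the consistency intuition; the actual method characterizes model activations across subspaces rather than re-running a classifier on projected inputs.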




Related research:

Attack-Agnostic Adversarial Detection

Using Anomaly Feature Vectors for Detecting, Classifying and Warning of Outlier Adversarial Examples

Towards Understanding Adversarial Examples Systematically: Exploring Data Size, Task and Model Factors

ADDMU: Detection of Far-Boundary Adversarial Examples with Data and Model Uncertainty Estimation

MEAD: A Multi-Armed Approach for Evaluation of Adversarial Examples Detectors

Open Set Adversarial Examples

Adversarial Sample Detection Through Neural Network Transport Dynamics
