TROJANZOO: Everything you ever wanted to know about neural backdoors (but were afraid to ask)

by Ren Pang, et al.

Neural backdoors represent a primary threat to the security of deep learning systems. Intensive research on this subject has produced a plethora of attacks and defenses, resulting in a constant arms race. However, due to the lack of evaluation benchmarks, many critical questions remain largely unexplored: (i) How effective, evasive, or transferable are different attacks? (ii) How robust, utility-preserving, or generic are different defenses? (iii) How do various factors (e.g., model architectures) impact their performance? (iv) What are the best practices (e.g., optimization strategies) to operate such attacks/defenses? (v) How can the existing attacks/defenses be further improved? To bridge this gap, we design and implement TROJANZOO, the first open-source platform for evaluating neural backdoor attacks/defenses in a unified, holistic, and practical manner. To this end, it incorporates 12 representative attacks, 15 state-of-the-art defenses, 6 attack performance metrics, 10 defense utility metrics, as well as rich tools for in-depth analysis of attack-defense interactions. Leveraging TROJANZOO, we conduct a systematic study of existing attacks/defenses, leading to a number of interesting findings: (i) different attacks manifest various trade-offs among multiple desiderata (e.g., effectiveness, evasiveness, and transferability); (ii) one-pixel triggers often suffice; (iii) optimizing trigger patterns and trojan models jointly improves both attack effectiveness and evasiveness; (iv) sanitizing trojan models often introduces new vulnerabilities; (v) most defenses are ineffective against adaptive attacks, but integrating complementary ones significantly enhances defense robustness. We envision that such findings will help users select the right defense solutions and facilitate future research on neural backdoors.
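To make finding (ii) concrete, a backdoor trigger can be as small as a single pixel stamped onto the input at poisoning and inference time. The sketch below is illustrative only and is not TROJANZOO's API; the function name, trigger value, and position are hypothetical assumptions, and real attacks vary trigger shape, size, transparency, and location.

```python
import numpy as np

def stamp_trigger(image, trigger_value=1.0, position=(0, 0)):
    """Stamp a one-pixel backdoor trigger onto an image.

    Hypothetical sketch (not TROJANZOO's API): overwrites a single
    pixel across all channels with a fixed value, the simplest case
    of the trigger patterns evaluated in the paper.
    """
    poisoned = image.copy()
    r, c = position
    poisoned[r, c, :] = trigger_value  # set one pixel in every channel
    return poisoned

# Poison a dummy 32x32 RGB image (CIFAR-10-sized) at the top-left corner.
clean = np.zeros((32, 32, 3), dtype=np.float32)
poisoned = stamp_trigger(clean, trigger_value=1.0, position=(0, 0))
```

During an attack, the same stamp is applied to a fraction of the training set (with labels flipped to the target class) and again to test inputs to activate the backdoor.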



