Constrained Optimization with Dynamic Bound-scaling for Effective NLP Backdoor Defense

02/11/2022
by   Guangyu Shen, et al.

We develop a novel optimization method for NLP backdoor inversion. We leverage a dynamically reducing temperature coefficient in the softmax function to provide changing loss landscapes to the optimizer, such that the process gradually focuses on the ground-truth trigger, which is denoted as a one-hot value in a convex hull. Our method also features a temperature rollback mechanism to step away from local optima, exploiting the observation that local optima can be easily determined in NLP trigger inversion (while not in general optimization). We evaluate the technique on over 1600 models (with roughly half of them having injected backdoors) on 3 prevailing NLP tasks, with 4 different backdoor attacks and 7 architectures. Our results show that the technique is able to effectively and efficiently detect and remove backdoors, outperforming 4 baseline methods.
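The interplay between temperature reduction and rollback described in the abstract can be illustrated with a small toy loop. This is only a sketch under stated assumptions, not the authors' implementation: the per-token cost vector `token_costs`, the thresholds, the decay and rollback factors, and the function names are all hypothetical stand-ins chosen for illustration. A relaxed trigger is represented as `softmax(logits / temperature)`, which approaches a one-hot vector as the temperature shrinks; if the distribution collapses onto a token whose cost is still high, the temperature is raised again.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Return softmax(logits / temperature), computed in a numerically stable way."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def invert_toy_trigger(token_costs, steps=300, lr=0.5, t_init=2.0,
                       t_min=0.05, decay=0.95, rollback=4.0, good_cost=0.1):
    """Toy relaxation loop with temperature decay and rollback.

    token_costs[i] is a hypothetical stand-in for the attack loss obtained
    when token i is used as the trigger (lower is better). All default
    values are illustrative assumptions.
    """
    logits = [0.0] * len(token_costs)
    temperature = t_init
    for _ in range(steps):
        probs = softmax_with_temperature(logits, temperature)
        # Expected cost of the relaxed (soft) trigger.
        loss = sum(p * c for p, c in zip(probs, token_costs))
        # Gradient of the expected cost with respect to each logit.
        grads = [p * (c - loss) / temperature
                 for p, c in zip(probs, token_costs)]
        logits = [z - lr * g for z, g in zip(logits, grads)]

        best = max(range(len(probs)), key=probs.__getitem__)
        if probs[best] > 0.99 and token_costs[best] > good_cost:
            # Nearly one-hot on a poor token: treat this as a local optimum
            # and raise the temperature to flatten the landscape again.
            temperature = min(t_init, temperature * rollback)
        else:
            # Otherwise keep sharpening the distribution toward one-hot.
            temperature = max(t_min, temperature * decay)

    probs = softmax_with_temperature(logits, temperature)
    best = max(range(len(probs)), key=probs.__getitem__)
    return best, temperature
```

For instance, `invert_toy_trigger([0.9, 0.8, 0.05, 0.7])` returns the index of the selected token together with the final temperature. In this easy case the search never collapses onto a poor token, so the rollback branch is not exercised; it matters only when the distribution becomes nearly one-hot on a token whose cost stays above `good_cost`.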


