Exploring Memorization in Adversarial Training

06/03/2021
by   Yinpeng Dong, et al.
It is well known that deep learning models have a propensity for fitting the entire training set even with random labels, which requires memorization of every training sample. In this paper, we investigate the memorization effect in adversarial training (AT) to promote a deeper understanding of the capacity, convergence, generalization, and especially the robust overfitting of adversarially trained classifiers. We first demonstrate that deep networks have sufficient capacity to memorize adversarial examples of training data with completely random labels, but that not all AT algorithms can converge under this extreme circumstance. Our study of AT with random labels motivates further analyses of the convergence and generalization of AT. We find that some AT methods suffer from a gradient instability issue, and that recently proposed complexity measures fail to explain robust generalization once models trained on random labels are taken into account. Furthermore, we identify a significant drawback of memorization in AT: it can result in robust overfitting. We then propose a new mitigation algorithm motivated by detailed memorization analyses. Extensive experiments on various datasets validate the effectiveness of the proposed method.
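The random-label memorization experiment described above can be illustrated with a minimal sketch: run PGD-style adversarial training on inputs whose labels are assigned completely at random, and check how well the model fits them. This is an illustrative toy (a linear classifier on synthetic data, not the paper's deep-network setup); the hyperparameters `eps`, `alpha`, and `steps` are arbitrary choices for the sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: random inputs with completely random labels, mimicking the
# random-label memorization experiment (a sketch, not the paper's setup).
X = rng.normal(size=(8, 10))
y = rng.integers(0, 2, size=8).astype(float)  # labels carry no signal

w = np.zeros(10)
b = 0.0

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-np.clip(z, -30.0, 30.0)))

def pgd_attack(x, label, w, b, eps=0.1, alpha=0.02, steps=5):
    """PGD in the L-inf ball of radius eps around x (illustrative)."""
    x_adv = x.copy()
    for _ in range(steps):
        p = sigmoid(x_adv @ w + b)
        grad_x = (p - label) * w                   # d(BCE)/d(input)
        x_adv = x_adv + alpha * np.sign(grad_x)    # ascent step on the loss
        x_adv = x + np.clip(x_adv - x, -eps, eps)  # project back into the ball
    return x_adv

# PGD-AT-style loop: train on the worst-case perturbed examples.
lr = 0.5
for epoch in range(400):
    X_adv = np.array([pgd_attack(X[i], y[i], w, b) for i in range(len(X))])
    p = sigmoid(X_adv @ w + b)
    w -= lr * (X_adv.T @ (p - y)) / len(X)
    b -= lr * float(np.mean(p - y))

# With enough capacity relative to the data, even random labels get fit.
train_acc = float(np.mean((sigmoid(X @ w + b) > 0.5) == (y > 0.5)))
```

Even though the labels are pure noise, the model drives its training accuracy up by memorizing each perturbed sample, which is the capacity question the abstract raises for deep networks under AT.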

Related research

- 11/25/2022, Boundary Adversarial Examples Against Adversarial Overfitting: Standard adversarial training approaches suffer from robust overfitting ...
- 09/23/2019, Robust Local Features for Improving the Generalization of Adversarial Training: Adversarial training has been demonstrated as one of the most effective ...
- 06/17/2022, Understanding Robust Overfitting of Adversarial Training and Beyond: Robust overfitting widely exists in adversarial training of deep network...
- 12/31/2021, Benign Overfitting in Adversarially Robust Linear Classification: "Benign overfitting", where classifiers memorize noisy training data yet...
- 09/13/2022, Adversarial Coreset Selection for Efficient Robust Training: Neural networks are vulnerable to adversarial attacks: adding well-craft...
- 09/29/2022, Regularizing Neural Network Training via Identity-wise Discriminative Feature Suppression: It is well-known that a deep neural network has a strong fitting capabil...
- 05/07/2021, Uniform Convergence, Adversarial Spheres and a Simple Remedy: Previous work has cast doubt on the general framework of uniform converg...
