Pairwise Margin Maximization for Deep Neural Networks

10/09/2021
by Berry Weinstein, et al.

The weight decay regularization term is widely used during training to constrain expressivity, avoid overfitting, and improve generalization. Historically, this concept was borrowed from the SVM maximum margin principle and extended to multi-class deep networks. Careful inspection of this principle reveals that it is not optimal for multi-class classification in general, and in particular when using deep neural networks. In this paper, we explain why this commonly used principle is not optimal and propose a new regularization scheme, called Pairwise Margin Maximization (PMM), which measures the minimal displacement an instance must undergo before its predicted classification switches. In deep neural networks, PMM can be computed in the vector space preceding the network's output layer, i.e., in the deep feature space, where we add a normalization term to avoid convergence to a trivial solution. We empirically demonstrate a substantial improvement when training a deep neural network with PMM compared to the standard regularization terms.
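
The abstract describes PMM as penalizing the minimal displacement an instance needs in the deep feature space before its prediction flips. The paper's exact formulation is not reproduced here; the sketch below is only an illustration of that idea, assuming a linear output layer z = Wf + b and a hinge on the normalized pairwise margin (z_y - z_j) / ||w_y - w_j||. The function name `pairwise_margin_penalty` and the margin parameter `gamma` are placeholders, not the authors' API.

```python
# Illustrative sketch only (assumed formulation, not the paper's code):
# a pairwise-margin penalty in the deep feature space, given a linear
# output layer z = W f + b. For a competing class j, the displacement of
# the feature vector f needed to flip the prediction from y to j is
# (z_y - z_j) / ||w_y - w_j||; the penalty pushes the smallest such
# distance above a margin gamma.
import torch
import torch.nn.functional as F

def pairwise_margin_penalty(logits, weight, targets, gamma=1.0):
    """logits: (B, C) class scores, weight: (C, D) output-layer weights,
    targets: (B,) ground-truth class indices."""
    num_classes = logits.size(1)
    z_y = logits.gather(1, targets[:, None])                    # (B, 1) true-class score
    w_y = weight[targets]                                       # (B, D) true-class weight rows
    # ||w_y - w_j|| for every class j                            (B, C)
    w_norm = (w_y[:, None, :] - weight[None, :, :]).norm(dim=-1).clamp_min(1e-12)
    dist = (z_y - logits) / w_norm                               # distance to each pairwise boundary
    same = F.one_hot(targets, num_classes=num_classes).bool()
    dist = dist.masked_fill(same, float("inf"))                  # ignore j == y
    min_dist = dist.min(dim=1).values                            # closest competing boundary
    return F.relu(gamma - min_dist).mean()                       # hinge: push margin above gamma
```

In training, such a term would typically be added to the cross-entropy loss with a small coefficient, together with a normalization of the feature norms, which the abstract notes is needed to keep the optimization from drifting toward a trivial solution.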

Related research:

11/16/2019 · Selective sampling for accelerating training of deep neural networks
We present a selective sampling method designed to accelerate the traini...

09/13/2020 · Margin-Based Regularization and Selective Sampling in Deep Neural Networks
We derive a new margin-based regularization formulation, termed multi-ma...

10/09/2018 · Average Margin Regularization for Classifiers
Adversarial robustness has become an important research topic given empi...

04/29/2018 · SHADE: Information-Based Regularization for Deep Learning
Regularization is a big issue for training deep neural networks. In this...

04/29/2018 · SHARE: Regularization for Deep Learning
Regularization is a big issue for training deep neural networks. In this...

05/29/2017 · Feature Incay for Representation Regularization
Softmax loss is widely used in deep neural networks for multi-class clas...

03/07/2020 · AL2: Progressive Activation Loss for Learning General Representations in Classification Neural Networks
The large capacity of neural networks enables them to learn complex func...
