Improving Generalization of Batch Whitening by Convolutional Unit Optimization

08/24/2021
by Yooshin Cho, et al.

Batch Whitening is a technique that accelerates and stabilizes training by transforming input features to have zero mean (Centering) and unit variance (Scaling), and by removing linear correlation between channels (Decorrelation). In commonly used architectures, which were empirically optimized with Batch Normalization, the normalization layer appears between the convolution and the activation function. Subsequent Batch Whitening studies have adopted the same structure without further analysis, even though Batch Whitening has been analyzed on the premise that the input of a linear layer is whitened. To bridge this gap, we propose a new Convolutional Unit that is in line with the theory, and our method generally improves the performance of Batch Whitening. Moreover, we show the inefficacy of the original Convolutional Unit by investigating the rank and correlation of features. Since our method can employ off-the-shelf whitening modules, we use Iterative Normalization (IterNorm), the state-of-the-art whitening module, and obtain significantly improved performance on five image classification datasets: CIFAR-10, CIFAR-100, CUB-200-2011, Stanford Dogs, and ImageNet. Notably, we verify that our method improves the stability and performance of whitening when using a large learning rate, group size, and iteration number.
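To make the operations concrete, below is a minimal sketch (in PyTorch) of a ZCA-style Batch Whitening layer that performs Centering, Scaling, and Decorrelation over the channel dimension, together with two unit orderings: the conventional Conv → Whitening → ReLU block and an arrangement in which the convolution directly receives whitened input, as the theory assumes. The class and variable names (ZCAWhitening, conventional_unit, theory_aligned_unit) are illustrative assumptions; this is not the paper's implementation, which uses IterNorm rather than an explicit eigendecomposition, and the exact placement of the activation in the proposed unit is specified in the full text.

```python
# Minimal sketch of Batch Whitening (ZCA) on convolutional features.
# Illustrative only: names and the eigendecomposition-based whitening are
# assumptions; the paper itself builds on IterNorm.
import torch
import torch.nn as nn


class ZCAWhitening(nn.Module):
    """Centering, Scaling, and Decorrelation of channels within a mini-batch."""

    def __init__(self, eps: float = 1e-5):
        super().__init__()
        self.eps = eps

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        n, c, h, w = x.shape
        # Treat every spatial position of every sample as one observation: (C, N*H*W)
        feats = x.permute(1, 0, 2, 3).reshape(c, -1)
        mean = feats.mean(dim=1, keepdim=True)
        centered = feats - mean                                   # Centering: zero mean per channel
        cov = centered @ centered.t() / centered.shape[1]         # channel covariance
        eigval, eigvec = torch.linalg.eigh(cov + self.eps * torch.eye(c, device=x.device))
        # ZCA transform Sigma^{-1/2} both scales (unit variance) and decorrelates channels
        whitening = eigvec @ torch.diag(eigval.rsqrt()) @ eigvec.t()
        whitened = whitening @ centered
        return whitened.reshape(c, n, h, w).permute(1, 0, 2, 3)


# Conventional unit, empirically tuned for Batch Normalization: Conv -> Whitening -> ReLU
conventional_unit = nn.Sequential(
    nn.Conv2d(16, 32, kernel_size=3, padding=1), ZCAWhitening(), nn.ReLU()
)

# One arrangement consistent with the theoretical premise: the whitening layer
# immediately precedes the convolution, so the linear layer sees whitened input.
theory_aligned_unit = nn.Sequential(
    nn.ReLU(), ZCAWhitening(), nn.Conv2d(16, 32, kernel_size=3, padding=1)
)

x = torch.randn(8, 16, 28, 28)
print(conventional_unit(x).shape, theory_aligned_unit(x).shape)
```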


Related research

04/06/2019  Iterative Normalization: Beyond Standardization towards Efficient Whitening
Batch Normalization (BN) is ubiquitously employed for accelerating neura...

09/04/2018  Understanding Regularization in Batch Normalization
Batch Normalization (BN) makes the output of hidden neurons have zero mean and...

07/08/2016  Adjusting for Dropout Variance in Batch Normalization and Weight Initialization
We show how to adjust for the variance introduced by dropout with correc...

04/23/2018  Decorrelated Batch Normalization
Batch Normalization (BN) is capable of accelerating the training of deep...

12/04/2020  Batch Group Normalization
Deep Convolutional Neural Networks (DCNNs) are hard and time-consuming t...

02/20/2017  Cosine Normalization: Using Cosine Similarity Instead of Dot Product in Neural Networks
Traditionally, multi-layer neural networks use dot product between the o...

02/27/2019  Equi-normalization of Neural Networks
Modern neural networks are over-parametrized. In particular, each rectif...
