Invariance reduces Variance: Understanding Data Augmentation in Deep Learning and Beyond

by Shuxiao Chen, et al.

Many complex deep learning models have found success by exploiting symmetries in data. Convolutional neural networks (CNNs), for example, are ubiquitous in image classification because they exploit translation symmetry: image identity is roughly invariant to translations. Many other symmetries, such as rotation, scale, and color shift, are commonly exploited via data augmentation, in which transformed images are added to the training set. However, a clear framework for understanding data augmentation is not available. One may even say it is somewhat mysterious: how can we improve performance simply by adding transformed copies of our data to the training set? Is that even possible, information-theoretically? In this paper, we develop a theoretical framework that begins to shed light on these questions. We explain data augmentation as averaging over the orbits of the group that keeps the data distribution invariant, and show that it leads to variance reduction. We study finite-sample and asymptotic empirical risk minimization, using results from stochastic convex optimization, Rademacher complexity, and asymptotic statistical theory. As examples, we work out the variance reduction in exponential families, linear regression, and certain two-layer neural networks under shift invariance, using discrete Fourier analysis. We also discuss how data augmentation could be used in problems with symmetry where other approaches are prevalent, such as cryo-electron microscopy (cryo-EM).
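The orbit-averaging idea can be seen in a toy simulation (a sketch of ours, not one of the paper's worked examples): to estimate the mean of a distribution whose law is invariant under cyclic shifts of the coordinates, replace each sample by the average over its orbit under the shift group before estimating. The names and setup below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n, trials = 8, 20, 2000

# True mean is a constant vector, so the distribution x = mu + i.i.d. noise
# is invariant under cyclic shifts of the coordinates.
mu = np.full(d, 1.5)

def orbit_average(X):
    # Average each sample over its orbit under the cyclic-shift group.
    # This is the projection onto shift-invariant (constant) vectors,
    # i.e., "augmenting" with every shift and averaging.
    return np.stack([np.mean([np.roll(x, k) for k in range(d)], axis=0)
                     for x in X])

mse_plain, mse_aug = 0.0, 0.0
for _ in range(trials):
    X = mu + rng.normal(size=(n, d))
    mse_plain += np.mean((X.mean(axis=0) - mu) ** 2)          # plain sample mean
    mse_aug += np.mean((orbit_average(X).mean(axis=0) - mu) ** 2)  # augmented estimator

print(mse_plain / trials, mse_aug / trials)
```

Analytically, the plain sample mean has per-coordinate variance 1/n here, while the orbit-averaged estimator pools all n·d entries and has variance 1/(n·d): the variance drops by roughly the orbit size, matching the paper's "invariance reduces variance" message.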



Related papers:

On the Benefits of Invariance in Neural Networks
Many real-world data analysis problems exhibit invariant structure, and ...

Learning Invariances in Neural Networks
Invariances to translations have imbued convolutional neural networks ...

On discrete symmetries of robotics systems: A group-theoretic and data-driven analysis
In this work, we study discrete morphological symmetries of dynamical ...

A Theory of PAC Learnability under Transformation Invariances
Transformation invariances are present in many real-world problems. For ...

Learning Augmentation Distributions using Transformed Risk Minimization
Adapting to the structure of data distributions (such as symmetry and ...

Augmenting learning using symmetry in a biologically-inspired domain
Invariances to translation, rotation and other spatial transformations ...

Radial Prediction Domain Adaption Classifier for the MIDOG 2022 challenge
In this paper, we describe our contribution to the MIDOG 2022 challenge ...
