The Lottery Ticket Hypothesis: Finding Small, Trainable Neural Networks

03/09/2018
by Jonathan Frankle, et al.

Neural network compression techniques are able to reduce the parameter counts of trained networks by over 90%--improving inference performance--without compromising accuracy. However, contemporary experience is that it is difficult to train small architectures from scratch, which would similarly improve training performance. We articulate a new conjecture to explain why it is easier to train large networks: the "lottery ticket hypothesis." It states that large networks that train successfully contain subnetworks that--when trained in isolation--converge in a comparable number of iterations to comparable accuracy. These subnetworks, which we term "winning tickets," have won the initialization lottery: their connections have initial weights that make training particularly effective. We find that a standard technique for pruning unnecessary network weights naturally uncovers a subnetwork which, at the start of training, comprised a winning ticket. We present an algorithm to identify winning tickets and a series of experiments that support the lottery ticket hypothesis. We consistently find winning tickets that are less than 20% of the size of fully-connected, convolutional, and residual architectures for MNIST and CIFAR10. Furthermore, winning tickets at moderate levels of pruning (20-50% of the original network size) converge up to 6.7x faster than the original network and exhibit higher test accuracy.
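
The identification procedure described in the abstract combines magnitude-based pruning with a reset of the surviving weights to their original initialization. The sketch below illustrates that loop under stated assumptions: a PyTorch model, a caller-supplied placeholder train_fn, and per-layer pruning of the lowest-magnitude weights. It is not the authors' released implementation, and a complete run would also need to keep pruned weights at zero during training.

    import copy
    import torch


    def find_winning_ticket(model, train_fn, prune_fraction=0.2, rounds=5):
        """Sketch of iterative magnitude pruning with reset to the initialization.

        model          -- an untrained torch.nn.Module
        train_fn       -- placeholder routine that trains `model` in place
        prune_fraction -- fraction of surviving weights removed each round
        rounds         -- number of prune/reset cycles
        """
        # Save the initial weights; the winning ticket is reset to these values.
        init_state = copy.deepcopy(model.state_dict())
        masks = {name: torch.ones_like(p) for name, p in model.named_parameters()
                 if "weight" in name}

        for _ in range(rounds):
            # Train to convergence (in practice the mask must also be enforced here).
            train_fn(model)

            # Prune the lowest-magnitude surviving weights in each layer.
            for name, p in model.named_parameters():
                if name not in masks:
                    continue
                magnitudes = (p.detach() * masks[name]).abs()
                surviving = magnitudes[masks[name].bool()]
                threshold = torch.quantile(surviving, prune_fraction)
                masks[name] = torch.where(magnitudes > threshold,
                                          masks[name],
                                          torch.zeros_like(masks[name]))

            # Rewind surviving weights to their initial values and re-apply the mask.
            model.load_state_dict(init_state)
            with torch.no_grad():
                for name, p in model.named_parameters():
                    if name in masks:
                        p.mul_(masks[name])

        return model, masks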


research (03/09/2018)
The Lottery Ticket Hypothesis: Training Pruned Neural Networks
Recent work on neural network pruning indicates that, at training time, ...

research (05/10/2022)
Robust Learning of Parsimonious Deep Neural Networks
We propose a simultaneous learning and pruning algorithm capable of iden...

research (06/23/2020)
Principal Component Networks: Parameter Reduction Early in Training
Recent works show that overparameterized networks contain small subnetwo...

research (05/03/2019)
Deconstructing Lottery Tickets: Zeros, Signs, and the Supermask
The recent "Lottery Ticket Hypothesis" paper by Frankle & Carbin showed ...

research (01/31/2022)
Signing the Supermask: Keep, Hide, Invert
The exponential growth in numbers of parameters of neural networks over ...

research (12/13/2021)
On the Compression of Natural Language Models
Deep neural networks are effective feature extractors but they are prohi...

research (02/24/2022)
Rare Gems: Finding Lottery Tickets at Initialization
It has been widely observed that large neural networks can be pruned to ...
