The Elastic Lottery Ticket Hypothesis

03/30/2021
by   Xiaohan Chen, et al.

The Lottery Ticket Hypothesis has drawn keen attention to identifying sparse, trainable subnetworks, or winning tickets, at the initialization (or an early stage) of training, which can be trained in isolation to achieve performance similar to or even better than that of the full models. Despite many efforts, the most effective method for identifying such winning tickets remains Iterative Magnitude-based Pruning (IMP), which is computationally expensive and must be run from scratch for every different network. A natural question arises: can we "transform" the winning ticket found in one network into one for another network with a different architecture, yielding a winning ticket for the latter at initialization, without re-doing the expensive IMP? Answering this question is not only practically relevant for efficient "once-for-all" winning-ticket finding, but also theoretically appealing for uncovering inherently scalable sparse patterns in networks. We conduct extensive experiments on CIFAR-10 and ImageNet, and propose a variety of strategies to tweak the winning tickets found in different networks of the same model family (e.g., ResNets). Based on these results, we articulate the Elastic Lottery Ticket Hypothesis (E-LTH): by mindfully replicating (or dropping) and re-ordering layers of one network, its corresponding winning ticket can be stretched (or squeezed) into a subnetwork for another deeper (or shallower) network from the same family, whose performance is nearly as competitive as the latter's winning ticket found directly by IMP. We have also thoroughly compared E-LTH with pruning-at-initialization and dynamic sparse training methods, and discussed the generalizability of E-LTH to different model families, layer types, and even across datasets. Our code is publicly available at https://github.com/VITA-Group/ElasticLTH.
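To make the "stretching" operation concrete, below is a minimal, hypothetical sketch of how a winning-ticket mask from a shallower ResNet might be replicated block-by-block to cover a deeper ResNet in the same family. The function name `stretch_ticket`, the stage/block grouping, and the round-robin replication policy are illustrative assumptions for exposition, not the authors' released implementation (see the repository above for that).

```python
# Hypothetical sketch of E-LTH-style "stretching": replicate per-block ticket
# masks of a shallower ResNet so they cover the extra blocks of a deeper
# ResNet from the same family. Names and replication policy are assumptions.
from typing import Dict, List

def stretch_ticket(stage_masks: Dict[str, List[dict]],
                   target_blocks_per_stage: int) -> Dict[str, List[dict]]:
    """Stretch a winning-ticket mask from a shallow ResNet to a deeper one.

    stage_masks maps a stage name (e.g. "stage1") to an ordered list of
    per-block mask dicts (each dict maps parameter names to binary masks).
    Blocks are replicated in round-robin order, preserving their original
    ordering, until the deeper network's per-stage block count is reached.
    """
    stretched = {}
    for stage, blocks in stage_masks.items():
        new_blocks = []
        for i in range(target_blocks_per_stage):
            # Reuse (replicate) existing block masks cyclically.
            new_blocks.append(blocks[i % len(blocks)])
        stretched[stage] = new_blocks
    return stretched

# Toy usage: a ResNet-20-style ticket with 3 blocks per stage stretched to a
# ResNet-32-style layout with 5 blocks per stage.
if __name__ == "__main__":
    import numpy as np
    toy_ticket = {
        f"stage{s}": [{"conv1.weight": np.random.rand(16, 16, 3, 3) > 0.8}
                      for _ in range(3)]
        for s in (1, 2, 3)
    }
    deeper = stretch_ticket(toy_ticket, target_blocks_per_stage=5)
    print({k: len(v) for k, v in deeper.items()})  # {'stage1': 5, ...}
```

The reverse ("squeezing") direction would analogously drop trailing block masks per stage; which blocks to replicate or drop is exactly the design choice the paper's strategies explore.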


