Dynamical Isometry: The Missing Ingredient for Neural Network Pruning

05/12/2021
by Huan Wang, et al.

Several recent works [40, 24] have observed an interesting phenomenon in neural network pruning: a larger finetuning learning rate can significantly improve the final performance. Unfortunately, the reason behind it has remained elusive to date. This paper explains it through the lens of dynamical isometry [42]. Specifically, we examine neural network pruning from an unusual perspective: pruning as initialization for finetuning, and ask whether the inherited weights serve as a good initialization for finetuning. The insights from dynamical isometry suggest a negative answer. Despite its critical role, this issue has not been well recognized by the community so far. In this paper, we show that understanding this problem is important: beyond explaining the aforementioned mystery about the larger finetuning learning rate, it also unveils the mystery about the value of pruning [5, 30]. Besides a clearer theoretical understanding of pruning, resolving the problem can also bring considerable performance benefits in practice.
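To make the "pruning breaks the initialization" intuition concrete, here is a minimal sketch (not the paper's method, just an illustration of the dynamical isometry criterion) using NumPy. For a deep linear network, dynamical isometry means every singular value of the input-output Jacobian is close to 1; orthogonal weights achieve this exactly, and magnitude pruning destroys it:

```python
import numpy as np

rng = np.random.default_rng(0)

def jacobian_singular_values(weights):
    # For a deep *linear* network, the input-output Jacobian is
    # simply the product of the layer weight matrices.
    J = weights[0]
    for W in weights[1:]:
        J = W @ J
    return np.linalg.svd(J, compute_uv=False)

d, depth = 64, 8
# Orthogonal initialization gives exact dynamical isometry:
# every singular value of the Jacobian equals 1.
ortho = [np.linalg.qr(rng.standard_normal((d, d)))[0] for _ in range(depth)]

def magnitude_prune(W, sparsity=0.5):
    # Zero out the smallest-magnitude entries (illustrative pruning).
    thresh = np.quantile(np.abs(W), sparsity)
    return np.where(np.abs(W) >= thresh, W, 0.0)

pruned = [magnitude_prune(W) for W in ortho]

sv_dense = jacobian_singular_values(ortho)
sv_pruned = jacobian_singular_values(pruned)
print("dense  sv spread:", sv_dense.max() / sv_dense.min())
print("pruned sv spread:", sv_pruned.max() / sv_pruned.min())
```

The dense orthogonal network reports a singular-value spread of exactly 1, while the pruned network's spread blows up, i.e., the inherited weights no longer satisfy dynamical isometry and are, in that sense, a poor initialization for finetuning.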


Related research:

- 01/12/2023: Why is the State of Neural Network Pruning so Confusing? On the Fairness, Comparison Setup, and Trainability in Network Pruning
- 07/25/2022: Trainability Preserving Neural Structured Pruning
- 06/14/2019: A Signal Propagation Perspective for Pruning Neural Networks at Initialization
- 03/11/2021: Emerging Paradigms of Neural Network Pruning
- 01/01/2023: Theoretical Characterization of How Neural Network Pruning Affects its Generalization
- 12/09/2022: Optimizing Learning Rate Schedules for Iterative Pruning of Deep Neural Networks
- 06/19/2020: Exploring Weight Importance and Hessian Bias in Model Pruning
