CProp: Adaptive Learning Rate Scaling from Past Gradient Conformity

12/24/2019
by Konpat Preechakul, et al.

Most optimizers, including stochastic gradient descent (SGD) and its adaptive gradient derivatives, face the same problem: the effective learning rate varies widely over the course of training. In practice, a learning rate schedule, mostly tuned by hand, is usually employed. In this paper, we propose CProp, a gradient scaling method that acts as a second-level learning rate, adapting throughout the training process based on cues from past gradient conformity. When the past gradients agree on direction, CProp keeps the original learning rate. Conversely, if the gradients do not agree on direction, CProp scales down the gradient in proportion to its uncertainty. Because it works by scaling, it can be applied to any existing optimizer, extending its learning rate scheduling capability. We put CProp through a series of tests, which show a significant gain in training speed on both SGD and adaptive gradient methods such as Adam. Code is available at https://github.com/phizaz/cprop .
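The idea of scaling a gradient by how well past gradients agree can be illustrated with a minimal sketch. This is not the formula from the paper or the code in the linked repository: the class name `ConformityScaler`, the parameters `beta` and `eps`, and the agreement score |mean| / sqrt(mean of squares) are all assumptions chosen only to show the behavior described in the abstract (scale near 1 when past gradients agree in direction, scale near 0 when they conflict).

```python
import math


class ConformityScaler:
    """Hypothetical helper (not the authors' implementation).

    Scales each gradient coordinate by an agreement score in [0, 1]
    computed from running averages of past gradients.
    """

    def __init__(self, dim, beta=0.9, eps=1e-12):
        self.beta = beta          # decay rate of the running averages (assumed)
        self.eps = eps            # guards against division by zero
        self.m = [0.0] * dim      # running mean of gradients
        self.v = [0.0] * dim      # running mean of squared gradients
        self.t = 0                # number of steps seen so far

    def scale(self, grad):
        """Update the running averages and return the scaled gradient."""
        self.t += 1
        correction = 1.0 - self.beta ** self.t  # bias correction for zero init
        scaled = []
        for i, g in enumerate(grad):
            self.m[i] = self.beta * self.m[i] + (1.0 - self.beta) * g
            self.v[i] = self.beta * self.v[i] + (1.0 - self.beta) * g * g
            m_hat = self.m[i] / correction
            v_hat = self.v[i] / correction
            # Close to 1 when past gradients share a sign, close to 0 when
            # they cancel out. This is an assumed proxy for "conformity".
            conformity = min(1.0, abs(m_hat) / (math.sqrt(v_hat) + self.eps))
            scaled.append(conformity * g)
        return scaled


# Gradients that always agree keep (almost exactly) their full size.
agree = ConformityScaler(dim=1)
for _ in range(20):
    out_agree = agree.scale([1.0])

# Gradients that flip sign every step are scaled down strongly.
conflict = ConformityScaler(dim=1)
for step in range(20):
    out_conflict = conflict.scale([1.0 if step % 2 == 0 else -1.0])
```

Because the scaler only rescales the gradient, its output can be passed to any base update rule; for plain SGD this would be `w[i] -= lr * scaled[i]`, which is the sense in which such a method acts as a second-level learning rate.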


research
06/12/2021

Decreasing scaling transition from adaptive gradient descent to stochastic gradient descent

Currently, researchers have proposed the adaptive gradient descent algor...
research
09/07/2021

Tom: Leveraging trend of the observed gradients for faster convergence

The success of deep learning can be attributed to various factors such a...
research
03/05/2021

Unintended Effects on Adaptive Learning Rate for Training Neural Network with Output Scale Change

A multiplicative constant scaling factor is often applied to the model o...
research
08/08/2019

On the Variance of the Adaptive Learning Rate and Beyond

The learning rate warmup heuristic achieves remarkable success in stabil...
research
12/04/2019

Domain-independent Dominance of Adaptive Methods

From a simplified analysis of adaptive methods, we derive AvaGrad, a new...
research
07/28/2019

ROAM: Recurrently Optimizing Tracking Model

Online updating a tracking model to adapt to object appearance variation...
research
05/13/2019

Scaling Distributed Training of Flood-Filling Networks on HPC Infrastructure for Brain Mapping

Mapping all the neurons in the brain requires automatic reconstruction o...
