Achieving acceleration despite very noisy gradients

02/10/2023
by Kanan Gupta, et al.

We present AGNES, a novel momentum-based first-order optimization method that provably achieves acceleration for convex minimization, even when the stochastic noise in the gradient estimates is many orders of magnitude larger than the gradient itself. We model the noise as having a variance proportional to the magnitude of the underlying gradient, and we argue, based on empirical evidence, that this model is appropriate for mini-batch gradients in overparameterized deep learning. Furthermore, we demonstrate that the method achieves competitive performance when training CNNs on MNIST and CIFAR-10.
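The abstract does not spell out the AGNES update rule, so the sketch below illustrates only the noise regime it describes: a plain Nesterov-style momentum loop on a convex quadratic, where each gradient estimate is corrupted by zero-mean multiplicative noise whose scale is proportional to the gradient's magnitude. This is a minimal sketch under stated assumptions, not the paper's algorithm; the function name noisy_grad and the values of eta, rho, and sigma are illustrative choices, not taken from the paper.

```python
# Minimal sketch (illustrative, NOT the AGNES update from the paper):
# Nesterov-style momentum on f(x) = 0.5 * ||x||^2 with multiplicative
# gradient noise, i.e. noise that scales with the gradient's magnitude.
# eta, rho, and sigma are hypothetical values chosen for the demo.
import numpy as np

rng = np.random.default_rng(0)

def noisy_grad(x, sigma=10.0):
    # Exact gradient of the quadratic is x; each component is corrupted
    # by zero-mean noise with standard deviation sigma * |component|,
    # so the estimate stays unbiased while the noise dwarfs the signal.
    g = x
    return g * (1.0 + sigma * rng.standard_normal(g.shape))

x = np.ones(10)        # initial iterate
v = np.zeros_like(x)   # velocity (momentum) variable
eta, rho = 1e-3, 0.9   # step size and momentum coefficient (illustrative)

for step in range(5000):
    y = x + rho * v    # look-ahead point, as in Nesterov's method
    v = rho * v - eta * noisy_grad(y)
    x = x + v

print(f"|x| after 5000 steps: {np.linalg.norm(x):.2e}")  # decays toward 0
```

Even with sigma = 10, i.e. noise roughly ten times the signal, the unbiased multiplicative structure lets the averaged momentum dynamics contract, which is the regime the abstract's claim concerns.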
