Non-Convex Stochastic Optimization via Non-Reversible Stochastic Gradient Langevin Dynamics
Stochastic gradient Langevin dynamics (SGLD) is a powerful algorithm for optimizing a non-convex objective, in which controlled, properly scaled Gaussian noise is added to the stochastic gradients to steer the iterates toward a global minimum. SGLD is based on the overdamped Langevin diffusion, which is reversible in time. Adding an anti-symmetric matrix to the drift term of the overdamped Langevin diffusion yields a non-reversible diffusion that converges to the same stationary distribution at a faster rate. In this paper, we study non-reversible stochastic gradient Langevin dynamics (NSGLD), which is based on the discretization of this non-reversible Langevin diffusion. We provide finite-time performance bounds for the global convergence of NSGLD in stochastic non-convex optimization. Our results yield non-asymptotic guarantees for both population and empirical risk minimization. Numerical experiments on a simple polynomial function optimization, Bayesian independent component analysis, and neural network models show that NSGLD can outperform SGLD with a proper choice of the anti-symmetric matrix.
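For concreteness, the snippet below is a minimal sketch of one NSGLD iteration under the standard Euler discretization, assuming a stochastic gradient oracle `stoch_grad`, a step size `eta`, an inverse temperature `beta`, and a user-chosen anti-symmetric matrix `J` (with `J.T == -J`); these names are illustrative and do not follow the paper's notation. Setting `J` to zero recovers the usual SGLD update.

```python
import numpy as np


def nsgld_step(x, stoch_grad, J, eta=1e-3, beta=10.0, rng=None):
    """One NSGLD iteration: Euler discretization of the non-reversible
    Langevin diffusion dX_t = -(I + J) grad f(X_t) dt + sqrt(2/beta) dB_t,
    where J is anti-symmetric. Illustrative sketch, not the paper's code."""
    rng = rng or np.random.default_rng()
    d = x.shape[0]
    drift = (np.eye(d) + J) @ stoch_grad(x)            # non-reversible drift
    noise = np.sqrt(2.0 * eta / beta) * rng.standard_normal(d)
    return x - eta * drift + noise                     # J = 0 recovers SGLD
```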