A Nonstochastic Control Approach to Optimization
Tuning optimizer hyperparameters, notably the learning rate, to a particular optimization instance is an important but nonconvex problem. Therefore, iterative optimization methods such as hypergradient descent lack global optimality guarantees in general. We propose an online nonstochastic control methodology for mathematical optimization. The choice of hyperparameters for gradient-based methods, including the learning rate, momentum parameter, and preconditioner, is described as feedback control. The optimal solution to this control problem is shown to encompass preconditioned adaptive gradient methods with varying acceleration and momentum parameters. Although the optimal control problem by itself is nonconvex, we show how recent methods from online nonstochastic control based on convex relaxation can be applied to compete with the best offline solution. This guarantees that in episodic optimization, we converge to the best optimization method in hindsight.
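To make the feedback-control view concrete, the following minimal sketch writes a gradient method as an update x_{t+1} = x_t + u_t, where the control u_t is a linear function of the previous step and the current gradient. It is an illustration under stated assumptions, not the algorithm proposed in the paper: the names `run_controlled_optimizer` and `heavy_ball_gain`, the gain matrix `K`, and the quadratic test objective are all hypothetical. With this parameterization, the learning rate, momentum parameter, and preconditioner are simply blocks of `K`.

```python
import numpy as np

def run_controlled_optimizer(grad, x0, K, steps):
    """Iterate x_{t+1} = x_t + u_t with u_t = K @ [x_t - x_{t-1}, grad(x_t)].

    K is a hypothetical fixed linear feedback gain of shape (d, 2d).
    """
    x_prev = np.array(x0, dtype=float)
    x = x_prev.copy()
    for _ in range(steps):
        g = grad(x)
        # Observation fed to the controller: previous step and current gradient.
        features = np.concatenate([x - x_prev, g])
        u = K @ features  # control signal = the optimizer's update
        x_prev, x = x, x + u
    return x

def heavy_ball_gain(dim, lr, momentum):
    """Gain reproducing heavy-ball momentum: u = momentum * step - lr * grad.

    Replacing the -lr * I block with -P for a matrix P would give a
    preconditioned variant.
    """
    I = np.eye(dim)
    return np.hstack([momentum * I, -lr * I])

# Assumed test problem: f(x) = 0.5 * x^T A x, so grad f(x) = A x.
A = np.diag([1.0, 10.0])
grad = lambda x: A @ x

K = heavy_ball_gain(2, lr=0.05, momentum=0.5)
x_final = run_controlled_optimizer(grad, [1.0, 1.0], K, steps=200)
print(np.linalg.norm(x_final))  # distance to the minimizer at the origin
```

In this sketch the gain is fixed by hand. The abstract's setting instead concerns choosing such a controller across episodes so as to compete with the best one in hindsight, which this example does not attempt.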