A Fast Anderson-Chebyshev Mixing Method for Nonlinear Optimization

09/07/2018
by   Zhize Li, et al.

Anderson mixing (or Anderson acceleration) is an efficient acceleration method for fixed-point iterations x_{t+1} = G(x_t); for example, gradient descent can be viewed as iteratively applying the operator G(x) = x - α∇f(x). Anderson mixing is known to be quite efficient in practice and can be viewed as an extension of Krylov subspace methods to nonlinear problems. In this paper, we show that Anderson mixing with Chebyshev polynomial parameters achieves the optimal convergence rate O(√κ log(1/ε)) for quadratic functions, improving the previous O(κ log(1/ε)) result of [Toth and Kelley, 2015]. We then provide a convergence analysis for minimizing general nonlinear problems. Moreover, if the hyperparameters (e.g., the Lipschitz smoothness parameter L) are not available, we propose an algorithm that guesses them dynamically and prove a similar convergence rate. Finally, experimental results demonstrate that the proposed Anderson-Chebyshev mixing method converges significantly faster than other algorithms such as vanilla gradient descent (GD) and Nesterov's accelerated GD, and that these algorithms combined with the proposed guessing algorithm (which guesses the hyperparameters dynamically) achieve much better performance.
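To make the fixed-point view concrete, here is a minimal sketch of standard type-II Anderson mixing (with mixing parameter β = 1) applied to the gradient-descent map G(x) = x - α∇f(x) on a quadratic. This is the generic textbook formulation, not the paper's exact Anderson-Chebyshev variant; the function names, memory size m, and the test problem are illustrative choices.

```python
import numpy as np

def anderson_mixing(G, x0, m=5, iters=100, tol=1e-12):
    # Generic type-II Anderson mixing (beta = 1) for x_{t+1} = G(x_t).
    # NOT the paper's Anderson-Chebyshev variant; a standard baseline sketch.
    x = np.asarray(x0, dtype=float)
    f = G(x) - x                      # residual f_t = G(x_t) - x_t
    dX_hist, dF_hist = [], []         # histories of Delta x and Delta f
    for _ in range(iters):
        if np.linalg.norm(f) < tol:
            break
        if dF_hist:
            dX = np.column_stack(dX_hist[-m:])
            dF = np.column_stack(dF_hist[-m:])
            # Least-squares combination of recent residual differences
            gamma, *_ = np.linalg.lstsq(dF, f, rcond=None)
            x_new = x + f - (dX + dF) @ gamma
        else:
            x_new = x + f             # plain fixed-point (gradient) step
        f_new = G(x_new) - x_new
        dX_hist.append(x_new - x)
        dF_hist.append(f_new - f)
        x, f = x_new, f_new
    return x

# Gradient descent as a fixed-point map on an ill-conditioned quadratic
# f(x) = 0.5 x^T A x - b^T x, so the fixed point solves A x = b.
A = np.diag([1.0, 10.0, 100.0])      # condition number kappa = 100
b = np.array([1.0, 1.0, 1.0])
alpha = 1.0 / 100.0                  # step size 1/L
G = lambda x: x - alpha * (A @ x - b)

x_star = anderson_mixing(G, np.zeros(3))
```

On quadratics this reduces to a Krylov-subspace method, which is why Anderson mixing converges far faster here than the plain iteration x ← G(x) with the same step size.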

