On Underdamped Nesterov's Acceleration

04/28/2023
by Shuo Chen, et al.

The high-resolution differential equation framework has been proven to be tailor-made for Nesterov's accelerated gradient descent method (NAG) and its proximal counterpart, the class of fast iterative shrinkage-thresholding algorithms (FISTA). However, the theory is still not complete, since the underdamped case (r < 2) has not been included. In this paper, based on the high-resolution differential equation framework, we construct new Lyapunov functions for the underdamped case, motivated by including a power of the time, t^γ, or of the iteration number, k^γ, in the mixed term. When the momentum parameter r is 2, the new Lyapunov functions are identical to the previous ones. The new proofs not only recover the convergence rate of the objective value previously obtained via the low-resolution differential equation framework, but also characterize the convergence rate of the squared minimal gradient norm. All the convergence rates obtained for the underdamped case depend continuously on the parameter r. In addition, it is observed that the high-resolution differential equation approximately simulates the convergence behavior of NAG for the critical case r = -1, while the low-resolution differential equation degenerates to the conservative Newton equation. The high-resolution differential equation framework also theoretically characterizes the convergence rates in this critical case, which are consistent with those obtained for the underdamped case at r = -1.
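For orientation, the two ODE limits mentioned above can be sketched for a general momentum parameter r. The exact scaling conventions below are an assumption following the standard high-resolution literature (with the classical damping coefficient 3 replaced by r + 1), not equations quoted from this paper:

```latex
% Low-resolution ODE for NAG with momentum parameter r.
% r = 2 recovers the classical 3/t damping; at the critical value
% r = -1 the friction term vanishes, leaving the conservative
% Newton equation \ddot{X} + \nabla f(X) = 0.
\ddot{X}(t) + \frac{r+1}{t}\,\dot{X}(t) + \nabla f\big(X(t)\big) = 0

% High-resolution ODE (step size s): a Hessian-driven gradient
% correction and an O(\sqrt{s}/t) term are retained.
\ddot{X}(t) + \frac{r+1}{t}\,\dot{X}(t)
  + \sqrt{s}\,\nabla^2 f\big(X(t)\big)\dot{X}(t)
  + \Big(1 + \frac{(r+1)\sqrt{s}}{2t}\Big)\nabla f\big(X(t)\big) = 0
```

The gradient-correction term in the high-resolution equation is what distinguishes the two frameworks at the critical value r = -1: the low-resolution equation loses all friction, while the high-resolution equation retains Hessian-driven damping.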
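To make the discrete scheme concrete, here is a minimal sketch of NAG with a general momentum parameter r, where the momentum coefficient (k - 1)/(k + r) reduces to Nesterov's classical (k - 1)/(k + 2) at r = 2 and the regime r < 2 is the underdamped case studied above. The step size, iteration count, and quadratic test problem are illustrative assumptions, not taken from the paper:

```python
import numpy as np

def nag(grad, x0, s=0.01, r=2, n_iters=500):
    """Nesterov's accelerated gradient descent with a general
    momentum parameter r (r = 2 recovers the classical scheme;
    r < 2 is the underdamped regime)."""
    x = np.asarray(x0, dtype=float)
    y = x.copy()
    for k in range(1, n_iters + 1):
        y_next = x - s * grad(x)                 # gradient step
        # momentum coefficient (k - 1) / (k + r)
        x = y_next + (k - 1) / (k + r) * (y_next - y)
        y = y_next
    return y

# usage: minimize the test function f(x) = 0.5 * ||x||^2, grad f(x) = x
x_star = nag(lambda x: x, x0=np.array([1.0, -2.0]))
```

On this strongly convex quadratic the iterates oscillate while their envelope decays, which is the discrete counterpart of the damped-oscillator behavior of the ODEs.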


