Proximal Gradient Descent-Ascent: Variable Convergence under KŁ Geometry

by Ziyi Chen, et al.

The gradient descent-ascent (GDA) algorithm has been widely applied to solve minimax optimization problems. To obtain convergent policy parameters in minimax optimization, it is important that GDA generates convergent variable sequences, rather than merely convergent sequences of function values or gradient norms. However, the variable convergence of GDA has been proved only under convexity geometries, and it is not yet understood for general nonconvex minimax optimization. This paper fills this gap by studying the convergence of a more general proximal-GDA for regularized nonconvex-strongly-concave minimax optimization. Specifically, we show that proximal-GDA admits a novel Lyapunov function, which decreases monotonically during the minimax optimization process and drives the variable sequence to a critical point. By leveraging this Lyapunov function and the KŁ geometry, which parameterizes the local geometries of general nonconvex functions, we formally establish the variable convergence of proximal-GDA to a critical point x^*, i.e., x_t → x^*, y_t → y^*(x^*). Furthermore, over the full spectrum of the KŁ-parameterized geometry, we show that proximal-GDA achieves different types of convergence rates, ranging from sublinear convergence up to finite-step convergence, depending on the geometry associated with the KŁ parameter. This is the first theoretical result on variable convergence for nonconvex minimax optimization.
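
To make the kind of iteration discussed in the abstract concrete, here is a minimal sketch of a proximal gradient descent-ascent loop on a toy scalar problem. Everything specific in it is an assumption chosen for illustration and is not taken from the paper: the objective f(x, y) = y·sin(x) − y²/2 (nonconvex in x, strongly concave in y, so y^*(x) = sin(x)), the regularizer g(x) = λ|x|, the absence of a regularizer on y, the step sizes, and the choice to use the updated x in the ascent step. The paper's exact update rule and step-size conditions may differ.

```python
import math


def soft_threshold(v, tau):
    """Proximal operator of tau * |.| (soft-thresholding)."""
    return math.copysign(max(abs(v) - tau, 0.0), v)


def proximal_gda(x0, y0, eta_x=0.05, eta_y=0.5, lam=0.1, iters=500):
    """Toy proximal GDA for min_x max_y  y*sin(x) - y**2/2 + lam*|x|.

    All constants here are illustrative assumptions, not values from the paper.
    """
    x, y = x0, y0
    trajectory = [(x, y)]
    for _ in range(iters):
        # Descent step on x: gradient step on f, then prox of the l1 regularizer.
        grad_x = y * math.cos(x)
        x = soft_threshold(x - eta_x * grad_x, eta_x * lam)
        # Ascent step on y, evaluated at the updated x (an assumed ordering).
        # There is no regularizer on y in this toy problem, so its prox is the identity.
        grad_y = math.sin(x) - y
        y = y + eta_y * grad_y
        trajectory.append((x, y))
    return x, y, trajectory


x_final, y_final, traj = proximal_gda(x0=1.0, y0=0.0)
print(x_final, y_final)
```

On this toy problem the iterates themselves settle, with x_t approaching the critical point x^* = 0 and y_t approaching y^*(x^*) = sin(0) = 0, which is the sense of variable convergence described above. This is only a numerical illustration on one example, not evidence for the general result.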


Escaping Saddle Points in Nonconvex Minimax Optimization via Cubic-Regularized Gradient Descent-Ascent

The gradient descent-ascent (GDA) algorithm has been widely applied to s...

Accelerated Proximal Alternating Gradient-Descent-Ascent for Nonconvex Minimax Machine Learning

Alternating gradient-descent-ascent (AltGDA) is an optimization algorith...

SGDA with shuffling: faster convergence for nonconvex-PŁ minimax optimization

Stochastic gradient descent-ascent (SGDA) is one of the main workhorses ...

Zeroth-Order Algorithms for Nonconvex Minimax Problems with Improved Complexities

In this paper, we study zeroth-order algorithms for minimax optimization...

Variable Metric Proximal Gradient Method with Diagonal Barzilai-Borwein Stepsize

Variable metric proximal gradient (VM-PG) is a widely used class of conv...

On the Proximal Gradient Algorithm with Alternated Inertia

In this paper, we investigate the attractive properties of the proximal ...

Asynchronous Delay-Aware Accelerated Proximal Coordinate Descent for Nonconvex Nonsmooth Problems

Nonconvex and nonsmooth problems have recently attracted considerable at...
