Projection-Free Adaptive Gradients for Large-Scale Optimization

09/29/2020
by Cyrille W. Combettes et al.

The complexity in large-scale optimization can lie both in handling the objective function and in handling the constraint set. In this respect, stochastic Frank-Wolfe algorithms occupy a unique position, as they alleviate both computational burdens: they query only approximate first-order information from the objective and maintain feasibility of the iterates without using projections. In this paper, we improve the quality of their first-order information by blending in adaptive gradients. Starting from the design of adaptive gradient algorithms, we propose to solve the resulting constrained optimization subproblems only very incompletely, via a fixed and small number of iterations of the Frank-Wolfe algorithm (often only 2 iterations), in order to preserve the low per-iteration complexity. We derive convergence rates and demonstrate the computational advantage of our method over state-of-the-art stochastic Frank-Wolfe algorithms.
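To make the idea concrete, the sketch below shows one plausible instantiation under stated assumptions: a diagonal AdaGrad-style preconditioner is accumulated from stochastic gradients, and the resulting constrained quadratic model is then minimized only approximately with K Frank-Wolfe steps. The function names (`adaptive_fw`, `lmo_l1_ball`), the step-size schedule, and the choice of taking the inner iterate as the next point are illustrative assumptions, not the paper's exact algorithm.

```python
import numpy as np

def lmo_l1_ball(grad, radius=1.0):
    """Linear minimization oracle over the l1 ball:
    argmin_{||v||_1 <= radius} <grad, v> is a signed vertex
    along the largest-magnitude coordinate of grad."""
    v = np.zeros_like(grad)
    i = np.argmax(np.abs(grad))
    v[i] = -radius * np.sign(grad[i])
    return v

def adaptive_fw(stoch_grad, lmo, x0, T=1000, K=2, eta=1.0, delta=1e-8):
    """Hypothetical sketch of a projection-free adaptive-gradient method.

    At each outer iteration, a diagonal matrix H_t accumulates squared
    stochastic gradients (AdaGrad-style); the quadratic model
        Q_t(y) = <g_t, y - x_t> + (1/(2*eta)) * ||y - x_t||_{H_t}^2
    is then minimized *incompletely* over the feasible set with only
    K Frank-Wolfe steps (the abstract reports K as small as 2), so the
    method only ever calls the LMO and never a projection.
    """
    x = x0.copy()                 # x0 is assumed feasible
    G2 = np.zeros_like(x0)        # running sum of squared gradients
    for t in range(1, T + 1):
        g = stoch_grad(x)
        G2 += g ** 2
        H = delta + np.sqrt(G2)   # diagonal AdaGrad preconditioner
        y = x.copy()
        for k in range(K):        # K inner Frank-Wolfe steps on the model
            model_grad = g + (H / eta) * (y - x)  # gradient of Q_t at y
            v = lmo(model_grad)                   # feasible vertex
            gamma = 2.0 / (k + 2)                 # standard FW step size
            y += gamma * (v - y)                  # convex combination stays feasible
        x = y
    return x

if __name__ == "__main__":
    # Toy usage: least squares over the l1 ball; the exact gradient
    # stands in here for a stochastic gradient estimator.
    rng = np.random.default_rng(0)
    A, b = rng.standard_normal((50, 20)), rng.standard_normal(50)
    stoch_grad = lambda x: A.T @ (A @ x - b) / len(b)
    x_star = adaptive_fw(stoch_grad, lmo_l1_ball, x0=np.zeros(20))
```

Because each inner step calls only the linear minimization oracle, the per-iteration cost stays at K oracle calls; this is what lets the adaptive-gradient update respect the constraint set without the projection it would otherwise require.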
