Improved Regret Bounds for Projection-free Bandit Convex Optimization

10/08/2019

We revisit the challenge of designing online algorithms for the bandit convex optimization (BCO) problem that also scale to high-dimensional problems. To this end, we consider projection-free algorithms, i.e., algorithms based on the conditional gradient method, whose only access to the feasible decision set is through a linear optimization oracle (as opposed to other methods, which require potentially much more computationally expensive subprocedures, such as computing Euclidean projections). We present the first such algorithm that attains O(T^{3/4}) expected regret using only O(T) overall calls to the linear optimization oracle, in expectation, where T is the number of prediction rounds. This improves over the O(T^{4/5}) expected regret bound recently obtained by <cit.>, and matches the current best regret bound for projection-free online learning in the full information setting.
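To illustrate what "projection-free" means here, the following is a minimal sketch of a generic conditional gradient (Frank-Wolfe) loop. It is not the algorithm of the paper: it assumes full gradient access rather than bandit feedback, a fixed offline objective, and the probability simplex as the feasible set. The names `lo_oracle_simplex`, `frank_wolfe`, and `target` are hypothetical and chosen only for this example. The point is that the feasible set is touched only through a linear optimization oracle, never through a Euclidean projection.

```python
import numpy as np

def lo_oracle_simplex(g):
    """Linear optimization oracle (assumed feasible set: probability simplex).

    Returns argmin_v <g, v> over the simplex, which is always a vertex:
    the standard basis vector at the smallest coordinate of g.
    """
    v = np.zeros_like(g, dtype=float)
    v[np.argmin(g)] = 1.0
    return v

def frank_wolfe(grad, x0, steps=200):
    """Generic conditional gradient loop with the standard 2/(t+2) step size.

    Each iterate is a convex combination of feasible points, so it stays
    feasible without any projection step.
    """
    x = np.asarray(x0, dtype=float)
    for t in range(steps):
        v = lo_oracle_simplex(grad(x))   # one oracle call per iteration
        eta = 2.0 / (t + 2.0)
        x = (1.0 - eta) * x + eta * v    # move toward the oracle's vertex
    return x

# Illustrative objective: f(x) = ||x - target||^2 with target inside the simplex.
target = np.array([0.2, 0.5, 0.3])
x_hat = frank_wolfe(lambda x: 2.0 * (x - target), np.array([1.0, 0.0, 0.0]))
```

In the bandit setting described above, the gradient is not available and would have to be replaced by an estimate built from observed losses; that part is deliberately left out of this sketch.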
