Online Newton Step Algorithm with Estimated Gradient

11/25/2018
by Binbin Liu et al.

Online learning with limited information feedback (bandit feedback) addresses the setting in which an online learner receives only partial feedback from the environment during learning. In this setting, Flaxman et al. extend Zinkevich's classical Online Gradient Descent (OGD) algorithm [Zinkevich, 2003] by proposing the Online Gradient Descent with Expected Gradient (OGDEG) algorithm. Specifically, OGDEG uses a simple trick to approximate the gradient of the loss function f_t from a single function evaluation, and its expected regret is bounded by O(T^{5/6}) [Flaxman et al., 2005]. It has been shown that, compared with first-order algorithms, second-order online learning algorithms such as Online Newton Step (ONS) [Hazan et al., 2007] can significantly accelerate convergence in traditional online learning. Motivated by this, this paper aims to exploit second-order information to speed up the convergence of OGDEG. In particular, we extend the ONS algorithm with the expected-gradient trick and develop a novel second-order online learning algorithm, Online Newton Step with Expected Gradient (ONSEG). Theoretically, we show that the proposed ONSEG algorithm significantly reduces the expected regret of OGDEG from O(T^{5/6}) to O(T^{2/3}) in the bandit feedback scenario. Empirically, we demonstrate the advantages of the proposed algorithm on several real-world datasets.
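To make the two ingredients the abstract combines more concrete, the following is a minimal Python sketch of a Flaxman-style one-point gradient estimator driving an Online-Newton-Step-style update. It is not the paper's exact ONSEG pseudocode: the loss stream, the exploration radius delta, the parameters gamma and eps, and the simple Euclidean-ball projection (in place of the A_t-weighted projection used in the ONS analysis) are all illustrative assumptions.

import numpy as np

rng = np.random.default_rng(0)


def one_point_gradient(f, x, delta):
    """Play one perturbed point and return (observed loss, gradient estimate).

    Flaxman et al. (2005): draw a uniform unit direction u and use
    (d / delta) * f(x + delta * u) * u as an unbiased-in-expectation estimate
    of the gradient of a smoothed version of f at x.
    """
    d = x.shape[0]
    u = rng.normal(size=d)
    u /= np.linalg.norm(u)                 # uniform direction on the unit sphere
    val = f(x + delta * u)                 # the only feedback the learner receives
    return val, (d / delta) * val * u


def project_to_ball(x, radius=1.0):
    """Simplified Euclidean projection onto a ball of the given radius."""
    norm = np.linalg.norm(x)
    return x if norm <= radius else (radius / norm) * x


def onseg_sketch(losses, d, delta=0.1, gamma=1.0, eps=1.0):
    """Newton-step-style updates driven by one-point gradient estimates."""
    x = np.zeros(d)
    A = eps * np.eye(d)                    # regularized second-order accumulator
    total_loss = 0.0
    for f in losses:
        val, g_hat = one_point_gradient(f, x, delta)
        total_loss += val
        A += np.outer(g_hat, g_hat)        # rank-one curvature update, as in ONS
        x = project_to_ball(x - (1.0 / gamma) * np.linalg.solve(A, g_hat))
    return x, total_loss


# Toy usage: a stream of quadratic losses centered near a fixed target.
target = np.array([0.3, -0.2])
losses = [lambda x, c=target + 0.01 * rng.normal(size=2): np.sum((x - c) ** 2)
          for _ in range(200)]
x_final, cumulative = onseg_sketch(losses, d=2)
print(x_final, cumulative)

The design point the sketch illustrates is the one the abstract makes: the same single-evaluation gradient estimate that OGDEG feeds into a gradient-descent step can instead be fed into an ONS-style update that accumulates the outer products of the estimates, which is what the paper analyzes to obtain the improved O(T^{2/3}) expected regret.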
