Geometric Exploration for Online Control

10/25/2020
by Orestis Plevrakis, et al.

We study the control of an unknown linear dynamical system under general convex costs. The objective is to minimize regret against the class of disturbance-feedback controllers, which encompasses all stabilizing linear dynamical controllers. We first consider the case of known cost functions, for which we design the first polynomial-time algorithm with $n^3\sqrt{T}$ regret, where $n$ is the dimension of the state plus the dimension of the control input. The $\sqrt{T}$ dependence on the horizon is optimal and improves upon the previous best known bound of $T^{2/3}$. The main component of our algorithm is a novel geometric exploration strategy: we adaptively construct a sequence of barycentric spanners in the policy space. Second, we consider the case of bandit feedback, for which we give the first polynomial-time algorithm with $\mathrm{poly}(n)\sqrt{T}$ regret, building on Stochastic Bandit Convex Optimization.
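To make the notion of a barycentric spanner concrete, the sketch below computes a C-approximate barycentric spanner of a finite set of points using the classical determinant-swapping procedure of Awerbuch and Kleinberg. It is not the adaptive construction in the policy space that the abstract describes; it only illustrates the underlying object. The names `det`, `barycentric_spanner`, `coefficients`, and the sample `points` are hypothetical and chosen for illustration, and the sketch assumes the points span the whole space and that C > 1.

```python
def det(rows):
    """Determinant via Gaussian elimination with partial pivoting."""
    a = [list(map(float, r)) for r in rows]
    n = len(a)
    d = 1.0
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(a[r][c]))
        if abs(a[p][c]) < 1e-12:
            return 0.0
        if p != c:
            a[c], a[p] = a[p], a[c]
            d = -d
        d *= a[c][c]
        for r in range(c + 1, n):
            f = a[r][c] / a[c][c]
            for k in range(c, n):
                a[r][k] -= f * a[c][k]
    return d


def barycentric_spanner(points, C=2.0):
    """Return d points from `points` forming a C-approximate barycentric spanner.

    Assumes `points` is a finite set spanning R^d and C > 1.
    """
    d = len(points[0])
    # Start from the standard basis; every row is replaced in the first pass.
    basis = [[1.0 if i == j else 0.0 for j in range(d)] for i in range(d)]

    def swapped(i, x):
        # Copy of the current basis with row i replaced by x.
        return basis[:i] + [list(x)] + basis[i + 1:]

    # Pass 1: greedily pick, for each slot, the point maximizing |det|.
    for i in range(d):
        basis[i] = list(max(points, key=lambda x: abs(det(swapped(i, x)))))

    # Pass 2: swap while some point increases |det| by more than a factor C.
    improved = True
    while improved:
        improved = False
        for i in range(d):
            for x in points:
                if abs(det(swapped(i, x))) > C * abs(det(basis)):
                    basis[i] = list(x)
                    improved = True
    return basis


def coefficients(basis, x):
    """Cramer's rule: coefficients c with x = sum_i c[i] * basis[i]."""
    base = det(basis)
    return [det(basis[:i] + [list(x)] + basis[i + 1:]) / base
            for i in range(len(basis))]


# Illustrative data only (not from the paper).
points = [(1, 0, 0), (0, 1, 0), (0, 0, 1), (1, 1, 1),
          (2, 1, 0), (0.5, 3, 1), (-1, 2, 4)]
basis = barycentric_spanner(points, C=2.0)
```

When the second pass stops, no swap can increase the absolute determinant by more than a factor of C, so by Cramer's rule every point in the set is a linear combination of the returned basis with coefficients in [-C, C]. This bounded-coefficient property is what makes spanners useful as exploration bases.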


research
07/01/2020

Bandit Linear Control

We consider the problem of controlling a known linear dynamical system u...
research
05/24/2023

Optimal Rates for Bandit Nonstochastic Control

Linear Quadratic Regulator (LQR) and Linear Quadratic Gaussian (LQG) con...
research
07/11/2016

Kernel-based methods for bandit convex optimization

We consider the adversarial convex bandit problem and we build the first...
research
07/15/2020

Improved algorithms for online load balancing

We consider an online load balancing problem and its extensions in the f...
research
12/21/2013

Volumetric Spanners: an Efficient Exploration Basis for Learning

Numerous machine learning problems require an exploration basis - a mech...
research
03/16/2021

Taming Wild Price Fluctuations: Monotone Stochastic Convex Optimization with Bandit Feedback

Prices generated by automated price experimentation algorithms often dis...
research
07/23/2020

Explore More and Improve Regret in Linear Quadratic Regulators

Stabilizing the unknown dynamics of a control system and minimizing regr...
