Non-stationary Online Learning with Memory and Non-stochastic Control

02/07/2021
by Peng Zhao, et al.

We study the problem of Online Convex Optimization (OCO) with memory, which allows loss functions to depend on past decisions and thus captures temporal effects in learning problems. In this paper, we introduce dynamic policy regret as the performance measure for designing algorithms that are robust to non-stationary environments; it compares the algorithm's decisions against a sequence of changing comparators. We propose a novel algorithm for OCO with memory that provably enjoys an optimal dynamic policy regret. The key technical challenge is controlling the switching cost, i.e., the cumulative movement of the player's decisions, which is addressed by a novel decomposition of dynamic policy regret and an appropriate meta-expert structure. Furthermore, we generalize the results to the problem of online non-stochastic control, i.e., controlling a linear dynamical system with adversarial disturbances and convex loss functions. We derive a novel gradient-based controller with dynamic policy regret guarantees, which is the first controller competitive with a sequence of changing policies.
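To make the quantities named in the abstract concrete, the following minimal Python sketch computes a switching cost and a dynamic policy regret for scalar decisions. It is illustrative only and is not the algorithm proposed in the paper: the helper names (`switching_cost`, `dynamic_policy_regret`, `make_loss`, `projected_ogd`), the toy loss, and the use of plain projected online gradient descent in place of the meta-expert structure are all assumptions made for this example.

```python
from typing import Callable, List, Sequence

# A loss with memory maps a window of the last m+1 decisions to a real value.
MemoryLoss = Callable[[Sequence[float]], float]


def switching_cost(decisions: Sequence[float]) -> float:
    """Cumulative movement of the decisions: sum of |x_t - x_{t-1}|."""
    return sum(abs(b - a) for a, b in zip(decisions, decisions[1:]))


def dynamic_policy_regret(
    losses: Sequence[MemoryLoss],
    decisions: Sequence[float],
    comparators: Sequence[float],
    m: int,
) -> float:
    """Loss of the player's windows minus loss of the comparators' windows.

    Rounds before t = m are skipped because a full window is not yet available.
    """
    regret = 0.0
    for t in range(m, len(decisions)):
        regret += losses[t](decisions[t - m : t + 1])
        regret -= losses[t](comparators[t - m : t + 1])
    return regret


def make_loss(target: float, weight: float = 0.5) -> MemoryLoss:
    """Hypothetical toy loss: tracking error plus a penalty on movement."""
    def loss(window: Sequence[float]) -> float:
        movement = sum(abs(b - a) for a, b in zip(window, window[1:]))
        return (window[-1] - target) ** 2 + weight * movement
    return loss


def projected_ogd(
    targets: Sequence[float], step: float, lo: float = -1.0, hi: float = 1.0
) -> List[float]:
    """Projected online gradient descent on the unary loss f_t(x, ..., x).

    For the toy loss above the movement term vanishes when all window
    entries are equal, so the unary gradient is 2 * (x - target).
    """
    x = 0.0
    decisions = []
    for target in targets:
        decisions.append(x)          # commit to x before seeing the loss
        grad = 2.0 * (x - target)    # gradient of the unary loss at x
        x = min(hi, max(lo, x - step * grad))  # step, then project to [lo, hi]
    return decisions
```

With a changing target sequence, `dynamic_policy_regret` can be evaluated against any comparator sequence of the same length, for example the targets themselves, while `switching_cost` reports how much the player moved along the way.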


research
07/07/2020

Dynamic Regret of Convex and Smooth Functions

We investigate online convex optimization in non-stationary environments...
research
01/16/2021

Blind Optimal User Association in Small-Cell Networks

We learn optimal user association policies for traffic from different lo...
research
09/12/2019

Nonstationary Nonparametric Online Learning: Balancing Dynamic Regret and Model Parsimony

An open challenge in supervised learning is conceptual drift: a data poi...
research
01/31/2023

Online Learning in Dynamically Changing Environments

We study the problem of online learning and online regret minimization w...
research
10/09/2019

Derivative-Free Order-Robust Optimisation

In this paper, we formalise order-robust optimisation as an instance of ...
research
06/30/2021

Koopman Spectrum Nonlinear Regulator and Provably Efficient Online Learning

Most modern reinforcement learning algorithms optimize a cumulative sing...
research
02/18/2021

Online Optimization and Learning in Uncertain Dynamical Environments with Performance Guarantees

We propose a new framework to solve online optimization and learning pro...
