Safety Aware Changepoint Detection for Piecewise i.i.d. Bandits

05/27/2022
by Subhojyoti Mukherjee, et al.

In this paper, we consider the setting of piecewise i.i.d. bandits under a safety constraint. In this piecewise i.i.d. setting, there exists a finite number of changepoints at which the means of some or all arms change simultaneously. We introduce the safety constraint studied in <cit.> to this setting, requiring that at every round the cumulative reward stays above a constant factor of the default action reward. We propose two actively adaptive algorithms for this setting that satisfy the safety constraint, detect changepoints, and restart without knowledge of the number of changepoints or their locations. We provide regret bounds for our algorithms and show that the bounds are comparable to their counterparts from the safe bandit and piecewise i.i.d. bandit literature. We also provide the first matching lower bounds for this setting. Empirically, we show that our safety-aware algorithms perform similarly to state-of-the-art actively adaptive algorithms that do not satisfy the safety constraint.
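
The safety constraint mentioned in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm; it only checks, for a given reward sequence, whether the cumulative reward at every round is at least a constant factor of what the default action would have accumulated. The function name, the `factor` parameter, and the use of a known `default_mean` are assumptions made for illustration, and the precise form of the constraint in the cited work may differ.

```python
from typing import Sequence


def satisfies_safety_constraint(
    rewards: Sequence[float],
    default_mean: float,
    factor: float,
) -> bool:
    """Check a reward sequence against a round-by-round safety constraint.

    Assumed (hypothetical) form of the constraint: at every round t,
        sum of rewards up to t  >=  factor * t * default_mean,
    where default_mean is the mean reward of the default action.
    """
    cumulative = 0.0
    for t, reward in enumerate(rewards, start=1):
        cumulative += reward
        # The constraint must hold at every round, not only at the end.
        if cumulative < factor * t * default_mean:
            return False
    return True


if __name__ == "__main__":
    # A sequence that always matches the default action passes the check.
    print(satisfies_safety_constraint([0.5, 0.5, 0.5], default_mean=0.5, factor=0.9))
    # Exploring a zero-reward arm in the first round violates it immediately.
    print(satisfies_safety_constraint([0.0, 1.0, 1.0], default_mean=0.5, factor=0.9))
```

The second call shows why a safe algorithm cannot explore freely at the start: with no accumulated reward to spend, a single poor round already breaks the constraint, even though the sequence catches up later.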

