From Finite to Countable-Armed Bandits

05/22/2021
by   Anand Kalvit, et al.

We consider a stochastic bandit problem with countably many arms that belong to a finite set of types, each characterized by a unique mean reward. In addition, there is a fixed distribution over types which sets the proportion of each type in the population of arms. The decision maker is oblivious to the type of any arm and to the aforementioned distribution over types, but perfectly knows the total number of types occurring in the population of arms. We propose a fully adaptive online learning algorithm that achieves O(log n) distribution-dependent expected cumulative regret after any number of plays n, and show that this order of regret is best possible. The analysis of our algorithm relies on newly discovered concentration and convergence properties of optimism-based policies like UCB in finite-armed bandit problems with "zero gap," which may be of independent interest.
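To make the setting concrete, the following is a minimal simulation sketch of the arm reservoir described above, paired with a plain UCB1 baseline run on a fixed number of freshly drawn arms. It is not the algorithm proposed in the paper. The Bernoulli reward model, the function names, and every numeric value are assumptions chosen purely for illustration.

```python
import math
import random


def draw_arm_type(type_probs, rng):
    """Sample the (hidden) type index of a fresh arm from the type distribution."""
    return rng.choices(range(len(type_probs)), weights=type_probs, k=1)[0]


def ucb1_on_fresh_arms(type_means, type_probs, num_arms, horizon, seed=0):
    """Run UCB1 on `num_arms` arms drawn once from the reservoir.

    Illustrative baseline only (assumed Bernoulli rewards); the learner
    never observes arm types or `type_probs`.
    Returns (cumulative pseudo-regret, pull counts, hidden arm means).
    """
    rng = random.Random(seed)
    # Each drawn arm inherits the mean reward of its hidden type.
    arm_means = [type_means[draw_arm_type(type_probs, rng)] for _ in range(num_arms)]
    best_mean = max(type_means)  # benchmark: best type in the whole population

    counts = [0] * num_arms
    sums = [0.0] * num_arms
    regret = 0.0
    for t in range(1, horizon + 1):
        if t <= num_arms:
            arm = t - 1  # initial round-robin: play each arm once
        else:
            # UCB1 index: empirical mean plus exploration bonus.
            arm = max(
                range(num_arms),
                key=lambda i: sums[i] / counts[i]
                + math.sqrt(2.0 * math.log(t) / counts[i]),
            )
        reward = 1.0 if rng.random() < arm_means[arm] else 0.0
        counts[arm] += 1
        sums[arm] += reward
        regret += best_mean - arm_means[arm]
    return regret, counts, arm_means


if __name__ == "__main__":
    # Hypothetical instance: two types with means 0.7 and 0.5, 30% of arms optimal.
    regret, counts, arm_means = ucb1_on_fresh_arms(
        type_means=[0.7, 0.5], type_probs=[0.3, 0.7], num_arms=4, horizon=5000
    )
    print("hidden arm means:", arm_means)
    print("pull counts:", counts)
    print("cumulative pseudo-regret:", round(regret, 2))
```

In this sketch, regret is measured against the best type in the population, not the best arm among those drawn. If none of the sampled arms happens to be of the optimal type, this baseline incurs regret that grows linearly in the number of plays, which hints at why a policy for the countable-armed setting has to decide adaptively when to discard arms and draw new ones.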


