Correlated Multi-armed Bandits with a Latent Random Source

08/17/2018
by Samarth Gupta, et al.

We consider a novel multi-armed bandit framework where the rewards obtained by pulling the arms are functions of a common latent random variable. The correlation between arms due to this common random source can be exploited to design a generalized upper-confidence-bound (UCB) algorithm that identifies certain arms as non-competitive and avoids exploring them. As a result, we reduce a K-armed bandit problem to a (C+1)-armed problem, whose arms are the best arm and C competitive arms. Our regret analysis shows that the competitive arms need to be pulled O(log T) times, while the non-competitive arms are pulled only O(1) times. Consequently, there are regimes where our algorithm achieves an O(1) regret, as opposed to the typical logarithmic regret scaling of multi-armed bandit algorithms. We also derive lower bounds on the expected regret and prove that our correlated-UCB algorithm is order-wise optimal.
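The core algorithmic idea (restricting UCB exploration to arms that remain competitive given the known reward functions of the latent variable) can be sketched in a few lines. The sketch below is illustrative only: the competitiveness test uses a crude Monte-Carlo bound over the latent prior as a stand-in for the paper's pseudo-reward construction, and the names and parameters (reward_fns, latent_sampler, the sampling budget) are assumptions rather than the authors' implementation.

```python
import numpy as np


def correlated_ucb_sketch(reward_fns, latent_sampler, T, seed=0):
    """Minimal sketch of a UCB variant that skips 'non-competitive' arms.

    reward_fns[k](x): known reward function of arm k in the latent sample x.
    latent_sampler(rng, size=None): draws samples of the latent variable X.
    The competitiveness test here is a crude surrogate, not the exact
    pseudo-reward construction of the paper.
    """
    rng = np.random.default_rng(seed)
    K = len(reward_fns)
    counts = np.zeros(K)
    sums = np.zeros(K)

    # Pull every arm once to initialize empirical means.
    for k in range(K):
        sums[k] += reward_fns[k](latent_sampler(rng))
        counts[k] += 1

    for t in range(K, T):
        means = sums / counts
        leader = int(np.argmax(means))          # empirically best arm so far

        # Surrogate upper bound on each arm's achievable reward, obtained by
        # sampling the latent variable and applying the known reward functions.
        xs = latent_sampler(rng, 500)
        upper = np.array([max(f(x) for x in xs) for f in reward_fns])

        # Arms whose best conceivable reward falls below the leader's mean are
        # treated as non-competitive and excluded from exploration this round.
        competitive = upper >= means[leader]
        competitive[leader] = True

        ucb = means + np.sqrt(2.0 * np.log(t + 1) / counts)
        ucb[~competitive] = -np.inf             # never explore non-competitive arms
        k = int(np.argmax(ucb))

        # Every arm's reward is driven by the same latent draw; only the
        # pulled arm's reward is observed.
        sums[k] += reward_fns[k](latent_sampler(rng))
        counts[k] += 1

    return counts  # pull counts; non-competitive arms should stay near O(1)


if __name__ == "__main__":
    # Toy example: latent X ~ Uniform(0, 1), three known reward functions of X.
    fns = [lambda x: x, lambda x: (1.0 - x) ** 2, lambda x: 0.3 * x]
    sampler = lambda rng, size=None: rng.uniform(0.0, 1.0, size)
    print(correlated_ucb_sketch(fns, sampler, T=20000))
```

In this sketch the third arm's reward can never exceed 0.3, so once the leader's empirical mean rises above that level the arm stops being explored, which is the behavior the regret analysis attributes to non-competitive arms.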
