A General Framework of Multi-Armed Bandit Processes by Switching Restrictions

08/20/2018
by Wenqing Bao, et al.

This paper proposes a general framework of multi-armed bandit (MAB) processes by imposing a class of restrictions on the switches among arms evolving in continuous time. The Gittins index process is developed for any single arm subject to the corresponding restrictions on stopping times, and the optimality of the resulting Gittins index rule is established. The Gittins indices defined in this paper are consistent with those for MAB processes in continuous time, in discrete time, and in the semi-Markovian setting, so the new theory covers the classical models as special cases and also applies to many situations not yet treated in the literature. While the proof of the optimality of Gittins index policies benefits from ideas in the existing theory of MAB processes in continuous time, new techniques are introduced that drastically simplify the proof.
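As a concrete illustration of the index rule in the simplest setting the paper covers, the sketch below computes classical discrete-time Gittins indices for a finite-state arm via the Katehakis-Veinott restart-in-state formulation. This is a minimal sketch of that special case only, not the paper's construction under switching restrictions; the names P, r, and beta are illustrative assumptions, not notation from the paper.

```python
# Minimal sketch (assumption: classical discrete-time special case, not the
# paper's restricted-switching construction). Gittins index of state x equals
# (1 - beta) times the optimal value, at x, of the MDP in which one may either
# continue the chain or restart it in state x (Katehakis-Veinott identity).
import numpy as np

def gittins_indices(P, r, beta, tol=1e-10, max_iter=10_000):
    """Gittins index of every state of a finite Markov reward chain.

    P    : (n, n) transition matrix of the arm
    r    : (n,) one-step rewards
    beta : discount factor in (0, 1)
    """
    n = len(r)
    G = np.empty(n)
    for x in range(n):
        V = np.zeros(n)
        for _ in range(max_iter):
            cont = r + beta * P @ V            # keep playing from each state
            restart = r[x] + beta * P[x] @ V   # or restart the arm in state x
            V_new = np.maximum(cont, restart)
            done = np.max(np.abs(V_new - V)) < tol
            V = V_new
            if done:
                break
        G[x] = (1.0 - beta) * V[x]             # Katehakis-Veinott identity
    return G

# Hypothetical two-state arm; at each decision epoch the Gittins index rule
# plays the arm whose current state has the largest index.
P = np.array([[0.7, 0.3],
              [0.4, 0.6]])
r = np.array([1.0, 0.0])
print(gittins_indices(P, r, beta=0.9))
```

In the paper's framework, the analogous index for an arm is defined with the supremum taken only over the admissible (restricted) stopping times, and the index rule again plays the arm whose current Gittins index is largest.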


