research
∙
04/20/2023
Optimal Activation of Halting Multi-Armed Bandit Models
We study new types of dynamic allocation problems, the Halting Bandit mod...
research
∙
09/28/2019
Accelerating the Computation of UCB and Related Indices for Reinforcement Learning
In this paper we derive an efficient method for computing the indices as...
research
∙
09/13/2019
Reinforcement Learning: a Comparison of UCB Versus Alternative Adaptive Policies
In this paper we consider the basic version of Reinforcement Learning (R...
research
∙
10/07/2015
Asymptotically Optimal Sequential Experimentation Under Generalized Ranking
We consider the classical problem of a controller activating (or samplin...
research
∙
05/12/2015
Asymptotic Behavior of Minimal-Exploration Allocation Policies: Almost Sure, Arbitrarily Slow Growing Regret
The purpose of this paper is to provide further understanding of the s...
research
∙
05/08/2015
An Asymptotically Optimal Policy for Uniform Bandits of Unknown Support
Consider the problem of a controller sampling sequentially from a finite...
research
∙
04/22/2015