Batched Multi-Armed Bandits with Optimal Regret

10/11/2019

We present a simple and efficient algorithm for the batched stochastic multi-armed bandit problem. We prove a bound for its expected regret that improves over the best-known regret bound, for any number of batches. In particular, our algorithm achieves the optimal expected regret by using only a logarithmic number of batches.
