Wesley Cowan

Featured Co-authors

Junya Honda
29 publications
Michael N. Katehakis
10 publications
Sheldon M Ross
2 publications
Daniel Pirutinsky
2 publications

research

∙ 04/20/2023

Optimal Activation of Halting Multi-Armed Bandit Models

We study new types of dynamic allocation problems the Halting Bandit mod...

0 Wesley Cowan, et al. ∙

research

∙ 09/28/2019

Accelerating the Computation of UCB and Related Indices for Reinforcement Learning

In this paper we derive an efficient method for computing the indices as...

17 Wesley Cowan, et al. ∙

research

∙ 09/13/2019

Reinforcement Learning: a Comparison of UCB Versus Alternative Adaptive Policies

In this paper we consider the basic version of Reinforcement Learning (R...

1 Wesley Cowan, et al. ∙

research

∙ 10/07/2015

Asymptotically Optimal Sequential Experimentation Under Generalized Ranking

We consider the classical problem of a controller activating (or samplin...

0 Wesley Cowan, et al. ∙

research

∙ 05/12/2015

Asymptotic Behavior of Minimal-Exploration Allocation Policies: Almost Sure, Arbitrarily Slow Growing Regret

The purpose of this paper is to provide further understanding into the s...

0 Wesley Cowan, et al. ∙

research

∙ 05/08/2015

An Asymptotically Optimal Policy for Uniform Bandits of Unknown Support

Consider the problem of a controller sampling sequentially from a finite...

0 Wesley Cowan, et al. ∙

research

∙ 04/22/2015

Normal Bandits of Unknown Means and Variances: Asymptotic Optimality, Finite Horizon Regret Bounds, and a Solution to an Open Problem

Consider the problem of sampling sequentially from a finite number of N ...

0 Wesley Cowan, et al. ∙
