Curiosity for machine agents has been a focus of lively research activit...
Curiosity for machine agents has been a focus of intense research. The s...
Large-scale Markov decision processes (MDPs) require planning algorithms...
Discounted reinforcement learning is fundamentally incompatible with fun...
We study the contextual linear bandit problem, a version of the standard...
We study a novel multi-armed bandit problem that models the challenge fa...