Reinforcement learning has been widely adopted to model dialogue manager...
Task-oriented dialogue systems are designed to achieve specific goals wh...
The task of verbalization of RDF triples has known a growth in popularit...
A learning dialogue agent can infer its behaviour from interactions with...
Can we learn a control policy able to adapt its behaviour in real time s...
We study a variant of the stochastic multi-armed bandit (MAB) problem in...
We present an approach to structured prediction from bandit feedback, ca...