We consider an improper reinforcement learning setting where a learner i...
We consider an improper reinforcement learning setting where the learner...
We study the problem of best arm identification in linearly parameterise...
We give a new algorithm for best arm identification in linearly paramete...