research ∙ 12/12/2022
Nearly Minimax Optimal Reinforcement Learning for Linear Markov Decision Processes
We study reinforcement learning (RL) with linear function approximation....
research ∙ 02/28/2022
Bandit Learning with General Function Classes: Heteroscedastic Noise and Variance-dependent Regret Bounds
We consider learning a stochastic bandit model, where the reward functio...
research ∙ 10/25/2021