research ∙ 07/19/2022
    Actor-Critic based Improper Reinforcement Learning
We consider an improper reinforcement learning setting where a learner i...
          
research ∙ 02/16/2021
    Improper Learning with Gradient-based Policy Optimization
We consider an improper reinforcement learning setting where the learner...
          
research ∙ 06/13/2020
    Explicit Best Arm Identification in Linear Bandits Using No-Regret Learners
We study the problem of best arm identification in linearly parameterise...
          
research ∙ 11/05/2019
     
             
  
  
     
                             