Tight Regret Bounds for Model-Based Reinforcement Learning with Greedy Policies

05/27/2019
by Yonathan Efroni, et al.

State-of-the-art efficient model-based Reinforcement Learning (RL) algorithms typically act by iteratively solving empirical models, i.e., by performing full-planning on Markov Decision Processes (MDPs) built from the gathered experience. In this paper, we focus on model-based RL in the finite-state finite-horizon MDP setting and establish that exploring with greedy policies, i.e., acting by 1-step planning, can achieve tight minimax performance in terms of regret, Õ(√(HSAT)). Thus, full-planning in model-based RL can be avoided altogether without any performance degradation, and, by doing so, the computational complexity decreases by a factor of S. The results are based on a novel analysis of real-time dynamic programming, which is then extended to model-based RL. Specifically, we generalize existing algorithms that perform full-planning to ones that act by 1-step planning. For these generalizations, we prove regret bounds with the same rate as their full-planning counterparts.
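
To make the contrast between full-planning and 1-step planning concrete, the following is a minimal sketch of real-time dynamic programming on a small finite-horizon MDP. It assumes a known model with rewards in [0, 1] and is not the paper's model-based algorithm, which would use estimated transitions and exploration bonuses. The function names (`optimistic_init`, `rtdp_episode`, `backward_induction`) and the array layout are illustrative assumptions, not taken from the paper.

```python
import numpy as np


def optimistic_init(H, S):
    """Optimistic values V[h, s] = H - h (valid when rewards lie in [0, 1])."""
    return np.repeat(np.arange(H, -1, -1, dtype=float)[:, None], S, axis=1)


def rtdp_episode(P, R, V, s0, rng):
    """Run one episode of greedy 1-step planning; V is updated in place.

    P: (S, A, S) transition probabilities (assumed known in this sketch).
    R: (S, A) rewards in [0, 1].
    V: (H + 1, S) optimistic value estimates with V[H] = 0.
    """
    H = V.shape[0] - 1
    s = s0
    for h in range(H):
        # 1-step lookahead at the current state only: O(S * A) work.
        q = R[s] + P[s] @ V[h + 1]
        a = int(np.argmax(q))
        # Back up only the visited state; min keeps the estimate non-increasing.
        V[h, s] = min(V[h, s], q[a])
        s = rng.choice(P.shape[0], p=P[s, a])
    return V


def backward_induction(P, R, H):
    """Full planning for comparison: backs up every state, O(S^2 * A) per step."""
    V = np.zeros((H + 1, R.shape[0]))
    for h in reversed(range(H)):
        V[h] = np.max(R + P @ V[h + 1], axis=1)
    return V
```

Per episode, the greedy variant performs H backups of cost O(SA) each, whereas backward induction performs H sweeps of cost O(S²A) each, which is one way to read the factor-of-S saving stated in the abstract. In this sketch the estimates start as upper bounds on the optimal values and only decrease as states are visited.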

research
04/11/2022

Settling the Sample Complexity of Model-Based Offline Reinforcement Learning

This paper is concerned with offline reinforcement learning (RL), which ...
research
06/27/2012

Incremental Model-based Learners With Formal Learning-Time Guarantees

Model-based learning algorithms have been shown to use experience effici...
research
06/24/2020

Towards Minimax Optimal Reinforcement Learning in Factored Markov Decision Processes

We study minimax optimal reinforcement learning in episodic factored Mar...
research
06/02/2023

Efficient RL with Impaired Observability: Learning to Act with Delayed and Missing State Observations

In real-world reinforcement learning (RL) systems, various forms of impa...
research
02/08/2023

Predictable MDP Abstraction for Unsupervised Model-Based RL

A key component of model-based reinforcement learning (RL) is a dynamics...
research
06/16/2022

Understanding Decision-Time vs. Background Planning in Model-Based Reinforcement Learning

In model-based reinforcement learning, an agent can leverage a learned m...
research
10/19/2016

A Reinforcement Learning Approach to the View Planning Problem

We present a Reinforcement Learning (RL) solution to the view planning p...
