Restricted Value Iteration: Theory and Algorithms

06/30/2011
by N. L. Zhang, et al.

Value iteration is a popular algorithm for finding near-optimal policies for POMDPs. It is inefficient because it must account for the entire belief space, which requires solving large numbers of linear programs. In this paper, we study value iteration restricted to belief subsets. We show that, with properly chosen belief subsets, restricted value iteration yields near-optimal policies, and we give a condition for determining whether a given belief subset would bring about savings in space and time. We also apply restricted value iteration to two interesting classes of POMDPs, namely informative POMDPs and near-discernible POMDPs.
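The abstract describes restricted value iteration only at a high level. As a concrete illustration of the core idea, performing Bellman backups only at a chosen belief subset rather than over the whole belief simplex, here is a minimal point-based sketch in Python. The tiger-style toy model, the belief set B, and all numbers are hypothetical stand-ins; the paper's actual algorithm, its method for choosing belief subsets, and its linear-programming machinery are not reproduced here.

```python
import numpy as np

# Minimal sketch of value iteration restricted to a finite belief subset
# (a point-based backup). The toy POMDP below and the belief subset B are
# hypothetical; this is an illustration of the idea, not the paper's method.

gamma = 0.95
S, A, O = 2, 2, 2  # states, actions, observations

# T[a, s, s']: transition probabilities; Z[a, s', o]: observation
# probabilities; R[a, s]: immediate reward. All numbers are made up.
T = np.array([
    [[1.0, 0.0], [0.0, 1.0]],      # action 0 "listen": state unchanged
    [[0.5, 0.5], [0.5, 0.5]],      # action 1 "open":   state resets uniformly
])
Z = np.array([
    [[0.85, 0.15], [0.15, 0.85]],  # listening gives a noisy state reading
    [[0.50, 0.50], [0.50, 0.50]],  # opening is uninformative
])
R = np.array([
    [-1.0, -1.0],                  # listening has a small cost
    [10.0, -100.0],                # opening pays off only in state 0
])

# The restricted belief subset: backups happen only at these points.
B = np.array([[1.0, 0.0], [0.75, 0.25], [0.5, 0.5], [0.25, 0.75], [0.0, 1.0]])

def backup(b, V):
    """One Bellman backup at belief point b, given the alpha-vector set V."""
    best = None
    for a in range(A):
        g = R[a].copy()
        for o in range(O):
            # Candidates g_{a,o}(s) = sum_{s'} T[a,s,s'] Z[a,s',o] alpha(s')
            cand = np.array([T[a] @ (Z[a][:, o] * alpha) for alpha in V])
            g = g + gamma * cand[np.argmax(cand @ b)]  # best vector at b
        if best is None or g @ b > best @ b:
            best = g
    return best

V = [np.zeros(S)]                      # start from the zero value function
for _ in range(200):                   # fixed iteration budget, for brevity
    V = [backup(b, V) for b in B]

for b in B:
    print(b, round(max(float(alpha @ b) for alpha in V), 2))
```

Restricting backups to B keeps each iteration's cost proportional to the sizes of B and V, and avoids the pruning linear programs needed when value iteration must cover the full belief simplex; this is the kind of space and time saving the abstract refers to.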


