MAGIC: Learning Macro-Actions for Online POMDP Planning using Generator-Critic

11/07/2020
by Yiyuan Lee, et al.

When robots operate in the real world, they must handle uncertainty in sensing, acting, and the environment. Many tasks also require reasoning about the long-term consequences of robot decisions. The partially observable Markov decision process (POMDP) offers a principled approach to planning under uncertainty, but its computational complexity grows exponentially with the planning horizon. We propose using temporally extended macro-actions to reduce the effective planning horizon and thus the exponential factor in the complexity. We present Macro-Action Generator-Critic (MAGIC), an algorithm that learns a macro-action generator from data and uses the learned macro-actions to perform long-horizon planning. MAGIC learns the generator from experience provided by an online planner and, in turn, conditions the planner on the generated macro-actions. We evaluate MAGIC on several long-term planning tasks, showing that it significantly outperforms planning with primitive actions, planning with hand-crafted macro-actions, and naive reinforcement learning, both in simulation and on a real robot.
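
To make the horizon argument concrete, the following is a minimal sketch, not the planner used in the paper. It only counts open-loop action sequences and ignores observation branching, which a real POMDP planner must also handle. The function name `num_action_sequences` and the numbers (4 choices per decision, horizon 12, macro-actions of length 4) are illustrative assumptions.

```python
import math


def num_action_sequences(num_choices: int, horizon: int, macro_length: int = 1) -> int:
    """Count open-loop plans covering `horizon` primitive steps.

    Each decision picks one of `num_choices` candidates and commits to it
    for `macro_length` primitive steps (macro_length=1 means primitive actions).
    Observation branching is deliberately ignored (simplifying assumption).
    """
    decisions = math.ceil(horizon / macro_length)  # effective planning depth
    return num_choices ** decisions


# Hypothetical numbers chosen only for illustration.
primitive = num_action_sequences(num_choices=4, horizon=12)
macro = num_action_sequences(num_choices=4, horizon=12, macro_length=4)
print(primitive, macro)  # 16777216 64
```

With the same number of candidates per decision, committing to each choice for 4 steps shrinks the search depth from 12 to 3, so the count drops from 4^12 to 4^3. The saving depends on the macro-actions being good ones, which is the motivation for learning them.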
