Efficient Robotic Manipulation Through Offline-to-Online Reinforcement Learning and Goal-Aware State Information

by Jin Li, et al.

End-to-end learning of robotic manipulation with high data efficiency is one of the key challenges in robotics. Recent methods that utilize human demonstration data and unsupervised representation learning have proven to be a promising direction for improving the learning efficiency of RL. The use of demonstration data also allows "warming up" RL policies on offline data with imitation learning or the recently emerged offline reinforcement learning algorithms. However, existing works often treat offline policy learning and online exploration as two separate processes, and the offline-to-online transition is often accompanied by a severe performance drop. Furthermore, many robotic manipulation tasks involve complex sub-task structures, which are very challenging to solve with RL under sparse rewards. In this work, we propose a unified offline-to-online RL framework that resolves the transition performance drop issue. Additionally, we introduce goal-aware state information to the RL agent, which can greatly reduce task complexity and accelerate policy learning. Combined with an advanced unsupervised representation learning module, our framework achieves high training efficiency and strong performance compared with state-of-the-art methods on multiple robotic manipulation tasks.




