Self-Improving Robots: End-to-End Autonomous Visuomotor Reinforcement Learning

03/02/2023
by Archit Sharma, et al.

In imitation and reinforcement learning, the cost of human supervision limits the amount of data that robots can be trained on. An aspirational goal is to construct self-improving robots: robots that can learn and improve on their own, from autonomous interaction with minimal human supervision or oversight. Such robots could collect and train on much larger datasets, and thus learn more robust and performant policies. While reinforcement learning offers a framework for such autonomous learning via trial and error, practical realizations end up requiring extensive human supervision for reward function design and for repeatedly resetting the environment between episodes of interaction. In this work, we propose MEDAL++, a novel design for self-improving robotic systems: given a small set of expert demonstrations at the start, the robot autonomously practices the task by learning to both do and undo the task, simultaneously inferring the reward function from the demonstrations. The policy and reward function are learned end-to-end from high-dimensional visual inputs, bypassing the need for the explicit state estimation or task-specific pre-training of visual encoders used in prior work. We first evaluate our proposed algorithm on the simulated non-episodic benchmark EARL, finding that MEDAL++ is both more data-efficient and gets up to 30% better final performance compared to state-of-the-art vision-based methods. Our real-robot experiments show that MEDAL++ can be applied to manipulation problems in larger environments than those considered in prior work, and that autonomous self-improvement can improve the success rate by 30-70% over behavior cloning on just the expert data. Code, training and evaluation videos, and a brief overview are available at: https://architsharma97.github.io/self-improving-robots/
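The "do and undo" practice loop described in the abstract can be illustrated with a toy sketch. This is not the MEDAL++ implementation: it assumes a 1-D line world instead of camera images, swaps in tabular Q-learning for the visuomotor policy learning, and uses a hand-written distance-to-demonstration score in place of a reward function inferred from demonstrations. All names (`make_reward`, `QAgent`, `practice`, `switch_every`) are hypothetical. The only point it shows is the control flow: a forward agent and a backward agent take turns in one uninterrupted stream of interaction, and the environment is never reset.

```python
import random

N_STATES = 11        # positions 0..10 on a line (toy stand-in for a workspace)
ACTIONS = (-1, +1)   # step left or step right


def make_reward(demo_states):
    """Assumed stand-in for a learned reward: 1.0 on demo states, decaying with distance."""
    def reward(state):
        return 1.0 / (1.0 + min(abs(state - d) for d in demo_states))
    return reward


class QAgent:
    """Tabular Q-learning agent (a swapped-in technique, chosen for brevity)."""

    def __init__(self, reward_fn, rng, lr=0.5, gamma=0.9, eps=0.2):
        self.q = [[0.0, 0.0] for _ in range(N_STATES)]
        self.reward_fn, self.rng = reward_fn, rng
        self.lr, self.gamma, self.eps = lr, gamma, eps

    def greedy(self, state):
        # Index of the best action; ties go to "step right".
        return 1 if self.q[state][1] >= self.q[state][0] else 0

    def act(self, state):
        # Epsilon-greedy exploration.
        if self.rng.random() < self.eps:
            return self.rng.randrange(2)
        return self.greedy(state)

    def update(self, state, action, next_state):
        target = self.reward_fn(next_state) + self.gamma * max(self.q[next_state])
        self.q[state][action] += self.lr * (target - self.q[state][action])


def step(state, action):
    """Deterministic move, clipped to the ends of the line."""
    return min(N_STATES - 1, max(0, state + ACTIONS[action]))


def practice(total_steps=20000, switch_every=20, seed=0):
    rng = random.Random(seed)
    forward = QAgent(make_reward([10]), rng)     # "do": reach the demonstrated goal
    backward = QAgent(make_reward([0, 1]), rng)  # "undo": return near demonstrated starts
    state = 0
    for t in range(total_steps):
        # Alternate which agent is in control; the state carries over between phases.
        agent = forward if (t // switch_every) % 2 == 0 else backward
        action = agent.act(state)
        next_state = step(state, action)
        agent.update(state, action, next_state)
        state = next_state                       # no environment reset anywhere
    return forward, backward


if __name__ == "__main__":
    fwd, bwd = practice()
    print("forward greedy action at state 5:", ACTIONS[fwd.greedy(5)])
    print("backward greedy action at state 5:", ACTIONS[bwd.greedy(5)])
```

In this sketch the backward agent plays the role that a human reset would otherwise play: each forward phase starts from wherever the backward phase left the state, so practice can continue without intervention.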


