Learning to Play by Imitating Humans

06/11/2020
by Rostam Dinyari, et al.

Acquiring multiple skills has commonly required either collecting a large number of expert demonstrations per task or engineering custom reward functions. Recent work has shown that a diverse set of skills can instead be acquired by self-supervising control on top of human teleoperated play data. Play is rich in state-space coverage, and a policy trained on it can generalize to specific tasks at test time, outperforming policies trained on individual expert task demonstrations. In this work, we explore whether robots can learn to play, autonomously generating play data that ultimately enhances performance. By training a behavioral cloning policy on a relatively small quantity of human play, we autonomously generate a large quantity of cloned play data that can be used as additional training data. We demonstrate that a general-purpose goal-conditioned policy trained on this augmented dataset substantially outperforms one trained only on the original human data on 18 difficult user-specified manipulation tasks in a simulated robotic tabletop environment. A video example of a robot imitating human play can be seen here: https://learning-to-play.github.io/videos/undirected_play1.mp4
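The data flow described in the abstract (clone a small amount of human play, roll the clone out to generate more play, then build a goal-conditioned training set from both) can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not the authors' implementation: the 1-D random-walk "environment", the lookup-table cloning policy, the hindsight goal relabelling step, and all function names (`human_play`, `fit_bc_policy`, `rollout`, `relabel_goals`) are hypothetical stand-ins for the simulated tabletop environment and the neural policies used in the paper.

```python
import random

ACTIONS = [-1.0, 1.0]  # assumed toy action set: step left or right


def human_play(num_steps, rng):
    """Stand-in for teleoperated play: a random walk on a 1-D line."""
    state, trajectory = 0.0, []
    for _ in range(num_steps):
        action = rng.choice(ACTIONS)
        trajectory.append((state, action))
        state += action
    return trajectory


def fit_bc_policy(trajectory):
    """Toy behavioral cloning: remember which actions were seen in each state."""
    table = {}
    for state, action in trajectory:
        table.setdefault(state, []).append(action)

    def policy(state, rng):
        # Sample from the empirical action distribution for this state;
        # fall back to a uniformly random action in unseen states.
        return rng.choice(table.get(state, ACTIONS))

    return policy


def rollout(policy, num_steps, rng):
    """Run the cloned policy on its own to generate 'cloned play'."""
    state, trajectory = 0.0, []
    for _ in range(num_steps):
        action = policy(state, rng)
        trajectory.append((state, action))
        state += action
    return trajectory


def relabel_goals(trajectory, horizon):
    """Assumed self-supervision step: a state reached `horizon` steps later
    is treated as the goal for the current (state, action) pair."""
    examples = []
    for i in range(len(trajectory) - horizon):
        state, action = trajectory[i]
        goal, _ = trajectory[i + horizon]
        examples.append((state, goal, action))
    return examples


rng = random.Random(0)
human = human_play(200, rng)            # small quantity of human play
bc_policy = fit_bc_policy(human)        # behavioral cloning on human play
cloned = rollout(bc_policy, 2000, rng)  # large quantity of cloned play

human_only = relabel_goals(human, horizon=8)
augmented = human_only + relabel_goals(cloned, horizon=8)
print(len(human_only), len(augmented))
```

In this sketch, `human_only` and `augmented` correspond to the two training sets the abstract compares; a goal-conditioned policy would then be fit to the `(state, goal, action)` tuples in each.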

research
03/05/2019

Learning Latent Plans from Play

We propose learning from teleoperated play data (LfP) as a way to scale ...
research
05/10/2023

Learning Video-Conditioned Policies for Unseen Manipulation Tasks

The ability to specify robot commands by a non-expert user is critical f...
research
03/10/2022

PLATO: Predicting Latent Affordances Through Object-Centric Play

Constructing a diverse repertoire of manipulation skills in a scalable f...
research
11/30/2018

Hierarchical Policy Design for Sample-Efficient Learning of Robot Table Tennis Through Self-Play

Training robots with physical bodies requires developing new methods and...
research
01/13/2021

Asymmetric self-play for automatic goal discovery in robotic manipulation

We train a single, goal-conditioned policy that can solve many robotic m...
research
05/15/2020

Grounding Language in Play

Natural language is perhaps the most versatile and intuitive way for hum...
research
10/18/2022

From Play to Policy: Conditional Behavior Generation from Uncurated Robot Data

While large-scale sequence modeling from offline data has led to impress...
