Generating Automatic Curricula via Self-Supervised Active Domain Randomization

Goal-directed Reinforcement Learning (RL) traditionally considers an agent interacting with an environment that assigns a real-valued reward proportional to the agent's progress toward some goal. Goal-directed RL has seen large gains in sample efficiency, because proposing goals makes it easy to reuse existing experience or generate new experience. In this work, we build on the framework of self-play, allowing an agent to interact with itself in order to make progress on some unknown task. We use Active Domain Randomization and self-play to create a novel, coupled environment-goal curriculum, in which agents learn through progressively more difficult tasks and environment variations. Our method, Self-Supervised Active Domain Randomization (SS-ADR), generates a growing curriculum that encourages the agent to try tasks just outside its current capabilities, while building a domain-randomization curriculum that enables state-of-the-art results on various sim2real transfer tasks. Our results show that co-evolving the environment difficulty with the difficulty of the goals set in each environment provides practical benefits in the goal-directed tasks tested.
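
To make the idea of a coupled environment-goal curriculum concrete, the following is a toy sketch only. It is not the SS-ADR algorithm, which relies on self-play and Active Domain Randomization. It substitutes a simple success-rate threshold heuristic to illustrate how goal difficulty and environment variation might be raised together when the agent succeeds often and lowered together when it struggles. The names `CurriculumState`, `update_curriculum`, and the parameters `step`, `low`, and `high` are all hypothetical and do not come from the paper.

```python
from dataclasses import dataclass


@dataclass
class CurriculumState:
    # Both fields are hypothetical scalars in [0, 1], used only for illustration.
    goal_difficulty: float  # how hard the proposed goals are
    env_variation: float    # how wide the environment randomization range is


def clamp(x, lo=0.0, hi=1.0):
    """Keep a value inside the closed interval [lo, hi]."""
    return max(lo, min(hi, x))


def update_curriculum(state, success_rate, step=0.1, low=0.3, high=0.7):
    """Threshold heuristic (an assumption, not the paper's method).

    If the agent succeeds more often than `high`, make both the goals and
    the environment harder; if it succeeds less often than `low`, make both
    easier; otherwise leave the curriculum unchanged.
    """
    if success_rate > high:
        delta = step
    elif success_rate < low:
        delta = -step
    else:
        return state
    return CurriculumState(
        goal_difficulty=clamp(state.goal_difficulty + delta),
        env_variation=clamp(state.env_variation + delta),
    )
```

For example, calling `update_curriculum(CurriculumState(0.5, 0.5), success_rate=0.9)` moves both difficulty values up by one step, keeping the agent near tasks just outside its current capabilities. A learned goal proposer and a learned environment sampler would replace this fixed rule in a self-play setting.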




Automatic Goal Generation using Dynamical Distance Learning

Reinforcement Learning (RL) agents can learn to solve complex sequential...

Automatic Curriculum Learning through Value Disagreement

Continually solving new, unsolved tasks is the key to learning diverse b...

Outcome-directed Reinforcement Learning by Uncertainty Temporal Distance-Aware Curriculum Goal Generation

Current reinforcement learning (RL) often suffers when solving a challen...

It Takes Four to Tango: Multiagent Selfplay for Automatic Curriculum Generation

We are interested in training general-purpose reinforcement learning age...

Effects of Reward Shaping on Curriculum Learning in Goal Conditioned Tasks

Real-time control for robotics is a popular research area in the reinfor...

Situated Dialogue Learning through Procedural Environment Generation

We teach goal-driven agents to interactively act and speak in situated e...

A Song of Ice and Fire: Analyzing Textual Autotelic Agents in ScienceWorld

Building open-ended agents that can autonomously discover a diversity of...
