One After Another: Learning Incremental Skills for a Changing World

by Nur Muhammad Shafiullah et al.

Reward-free, unsupervised discovery of skills is an attractive alternative to the bottleneck of hand-designing rewards in environments where task supervision is scarce or expensive. However, current skill pre-training methods, like many RL techniques, make a fundamental assumption: that the environment is stationary during training. Traditional methods learn all of their skills simultaneously, which makes it difficult for them both to adapt quickly to changes in the environment and to avoid forgetting earlier skills after such adaptation. In an evolving or expanding environment, however, skill learning must adapt quickly to new situations without forgetting previously learned skills; these two requirements make it difficult for classic skill discovery to perform well. In this work, we propose a new framework for skill discovery in which skills are learned one after another in an incremental fashion. This framework allows newly learned skills to adapt to new environment or agent dynamics, while the frozen old skills ensure that the agent does not forget what it has already learned. We demonstrate experimentally that, in both evolving and static environments, incremental skills significantly outperform current state-of-the-art skill discovery methods in both skill quality and the ability to solve downstream tasks. Videos of learned skills and code are made public on
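The incremental scheme described above can be sketched in miniature. The toy code below is an illustrative assumption, not the paper's actual objective or training loop: skills are reduced to parameter vectors, the "environment" simply returns a skill's parameters as its final state, and each new skill is hill-climbed against a diversity reward (distance to the nearest state reached by an earlier, frozen skill). The key structural idea from the abstract is preserved: skills are learned one at a time, and earlier skills are never modified, so they cannot be forgotten.

```python
import numpy as np

rng = np.random.default_rng(0)

def rollout(skill_params):
    # Toy "environment": the final state is just the skill's parameters.
    return skill_params

def diversity_reward(state, frozen_states):
    # Reward a state by its distance to the nearest state already
    # covered by a previously learned (frozen) skill.
    if not frozen_states:
        return 0.0
    return min(np.linalg.norm(state - s) for s in frozen_states)

def learn_one_skill(frozen_states, n_candidates=256, dim=2):
    # Crude optimizer (random search) standing in for RL training:
    # pick the candidate skill whose rollout is most diverse.
    candidates = rng.normal(size=(n_candidates, dim))
    rewards = [diversity_reward(rollout(c), frozen_states) for c in candidates]
    return candidates[int(np.argmax(rewards))]

# Learn skills one after another, freezing each before the next begins.
skills = []
for _ in range(4):
    new_skill = learn_one_skill([rollout(s) for s in skills])
    skills.append(new_skill)  # frozen from now on; never updated again

print(len(skills))  # prints 4
```

Because only the newest skill is being optimized at any time, adapting to a changed environment means training one more skill rather than retraining the whole repertoire, which is the adaptation-without-forgetting property the abstract argues for.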

