CoMPS: Continual Meta Policy Search

12/08/2021
by Glen Berseth, et al.

We develop a new continual meta-learning method to address challenges in sequential multi-task learning. In this setting, the agent's goal is to quickly achieve high reward on any sequence of tasks. Beyond simply transferring past experience to new tasks, our goal is to devise continual reinforcement learning algorithms that learn to learn, using their experience on previous tasks to learn new tasks more quickly. Prior meta-reinforcement learning algorithms have demonstrated promising results in accelerating the acquisition of new tasks, but they require access to all tasks during training. We introduce a new method, continual meta-policy search (CoMPS), that removes this limitation by meta-training incrementally over each task in a sequence, without revisiting prior tasks. CoMPS repeatedly alternates between two subroutines: learning a new task using RL, and using the experience from that RL run to perform completely offline meta-learning in preparation for subsequent tasks. We find that CoMPS outperforms prior continual learning and off-policy meta-reinforcement learning methods on several sequences of challenging continuous control tasks.
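
The alternation between the two subroutines described in the abstract can be sketched as a short loop. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `continual_meta_loop`, `toy_learn_task`, and `toy_offline_meta_update` are hypothetical, each "task" is just a scalar target, the RL step is replaced by plain gradient descent on a quadratic loss, and the offline meta-update simply averages the stored per-task solutions.

```python
from typing import Callable, List, Sequence, Tuple


def continual_meta_loop(
    tasks: Sequence[float],
    meta_params: float,
    learn_task: Callable[[float, float], Tuple[float, float]],
    offline_meta_update: Callable[[float, List[float]], float],
) -> Tuple[float, List[float]]:
    """Visit tasks one at a time, never returning to an earlier task."""
    buffer: List[float] = []  # experience stored from tasks seen so far
    for task in tasks:
        # Subroutine 1: learn the new task, starting from the meta-learned init.
        _adapted, experience = learn_task(meta_params, task)
        buffer.append(experience)
        # Subroutine 2: meta-learn from stored experience only (offline);
        # the task itself is not queried again here.
        meta_params = offline_meta_update(meta_params, buffer)
    return meta_params, buffer


def toy_learn_task(init: float, target: float,
                   steps: int = 50, lr: float = 0.1) -> Tuple[float, float]:
    """Toy stand-in for RL (assumption): gradient descent on (theta - target)^2."""
    theta = init
    for _ in range(steps):
        theta -= lr * 2.0 * (theta - target)
    # The "experience" kept for meta-learning is simply the final solution.
    return theta, theta


def toy_offline_meta_update(meta_params: float, buffer: List[float]) -> float:
    """Toy stand-in for offline meta-learning (assumption): average stored solutions."""
    return sum(buffer) / len(buffer)
```

Calling `continual_meta_loop([1.0, 2.0, 3.0], 0.0, toy_learn_task, toy_offline_meta_update)` walks through the three toy tasks in order and returns the final initialization together with the stored experience. The point of the sketch is the control flow: each task is learned once, and every meta-update reads only from the buffer.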

research
03/19/2019

Efficient Off-Policy Meta-Reinforcement Learning via Probabilistic Context Variables

Deep reinforcement learning algorithms require large amounts of experien...
research
05/29/2023

Continual Task Allocation in Meta-Policy Network via Sparse Prompting

How to train a generalizable meta-policy by continually learning a seque...
research
11/20/2017

Modular Continual Learning in a Unified Visual Environment

A core aspect of human intelligence is the ability to learn new tasks qu...
research
07/13/2022

Continual Meta-Reinforcement Learning for UAV-Aided Vehicular Wireless Networks

Unmanned aerial base stations (UABSs) can be deployed in vehicular wirel...
research
10/21/2022

Continual Reinforcement Learning with Group Symmetries

Continual reinforcement learning (RL) aims to learn a sequence of tasks ...
research
02/22/2022

Continual Auxiliary Task Learning

Learning auxiliary tasks, such as multiple predictions about the world, ...
research
04/01/2019

Guided Meta-Policy Search

Reinforcement learning (RL) algorithms have demonstrated promising resul...