Discovering Diverse Solutions in Deep Reinforcement Learning

03/12/2021
by Takayuki Osa, et al.

Reinforcement learning (RL) algorithms are typically limited to learning a single solution to a specified task, even though diverse solutions to a given task often exist. Compared with learning a single solution, learning a set of diverse solutions is beneficial because diverse solutions enable robust few-shot adaptation and allow the user to select a preferred solution. Although previous studies have shown that diverse behaviors can be modeled with a policy conditioned on latent variables, an approach for modeling an infinite set of diverse solutions with continuous latent variables has not been investigated. In this study, we propose an RL method that can learn infinitely many solutions by training a policy conditioned on a continuous or discrete low-dimensional latent variable. Through continuous control tasks, we demonstrate that our method can learn diverse solutions in a data-efficient manner and that the solutions can be used for few-shot adaptation to solve unseen tasks.
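To make the idea of a latent-conditioned policy concrete, the following is a minimal sketch and not the authors' method. It assumes a toy linear-tanh policy, a uniform prior over the latent variable, and a hypothetical stand-in for the episode return on an unseen task; all function names and dimensions are illustrative. It shows how a single set of policy parameters yields different behaviors for different latent values, and how few-shot adaptation can reduce to choosing a latent value from a handful of trials.

```python
import math
import random


def policy(state, z, weights):
    """Toy policy (assumption): action = tanh(W . [state, z])."""
    features = list(state) + list(z)
    return [math.tanh(sum(w * f for w, f in zip(row, features)))
            for row in weights]


def sample_latent(dim, rng):
    """Draw a continuous latent from a uniform prior on [-1, 1]^dim (assumption)."""
    return [rng.uniform(-1.0, 1.0) for _ in range(dim)]


def few_shot_select(candidates, evaluate):
    """Return the candidate latent with the highest evaluated return."""
    return max(candidates, key=evaluate)


rng = random.Random(0)
state_dim, latent_dim, action_dim = 3, 2, 1

# Fixed random weights stand in for a trained policy (hypothetical).
weights = [[rng.gauss(0.0, 1.0) for _ in range(state_dim + latent_dim)]
           for _ in range(action_dim)]
state = [0.5, -0.2, 0.1]

# A few latent samples correspond to a few trial behaviors on the new task.
candidates = [sample_latent(latent_dim, rng) for _ in range(5)]


def toy_return(z):
    """Hypothetical return on an unseen task: prefers actions near 0.3."""
    action = policy(state, z, weights)[0]
    return -(action - 0.3) ** 2


best_z = few_shot_select(candidates, toy_return)
```

In this sketch the policy parameters stay fixed during adaptation and only the latent value changes, which is why a small number of trials can be enough.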

