Stein Variational Goal Generation For Reinforcement Learning in Hard Exploration Problems

by Nicolas Castanet, et al.

Multi-goal Reinforcement Learning has recently attracted a large amount of research interest. By allowing experience to be shared between related training tasks, this setting favors generalization to new tasks at test time, whenever some smoothness exists in the considered representation space of goals. However, in settings with discontinuities in state or goal spaces (e.g., walls in a maze), a majority of goals are difficult to reach, due to the sparsity of rewards in the absence of expert knowledge. This implies hard exploration, for which some curriculum of goals must be discovered, to help agents learn by adapting training tasks to their current capabilities. Building on recent automatic curriculum learning techniques for goal-conditioned policies, we propose a novel approach: Stein Variational Goal Generation (SVGG), which seeks to preferentially sample new goals in the zone of proximal development of the agent, by leveraging a learned model of its abilities and a goal distribution modeled as particles in the exploration space. Our approach relies on Stein Variational Gradient Descent to dynamically attract the goal sampling distribution toward areas of appropriate difficulty. We demonstrate the performance of the approach, in terms of success coverage in the goal space, compared to recent state-of-the-art RL methods for hard exploration problems.
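To make the particle-based idea concrete, here is a minimal sketch of a generic Stein Variational Gradient Descent update applied to a set of 2D goal particles. It is not the authors' implementation. The function names (`rbf_kernel`, `svgd_step`, `toy_score`, `run_demo`), the RBF kernel with a fixed bandwidth, the step size, and especially the target density are all assumptions made for illustration: `toy_score` uses a hand-written ring-shaped density as a stand-in for "goals of appropriate difficulty", whereas the abstract describes a target derived from a learned model of the agent's abilities.

```python
import numpy as np


def rbf_kernel(particles, bandwidth):
    """Return the RBF kernel matrix and its gradients for a set of particles."""
    # diffs[i, j] = x_i - x_j, shape (n, n, d)
    diffs = particles[:, None, :] - particles[None, :, :]
    sq_dists = np.sum(diffs ** 2, axis=-1)
    k = np.exp(-sq_dists / bandwidth)
    # grad_k[i, j] = gradient of k(x_j, x_i) with respect to x_j
    #              = (2 / bandwidth) * (x_i - x_j) * k[i, j]
    grad_k = (2.0 / bandwidth) * diffs * k[:, :, None]
    return k, grad_k


def svgd_step(particles, grad_log_p, step_size=0.05, bandwidth=0.5):
    """Apply one SVGD update to all particles."""
    n = particles.shape[0]
    k, grad_k = rbf_kernel(particles, bandwidth)
    scores = grad_log_p(particles)      # (n, d): gradient of log target density
    attraction = k @ scores             # pulls particles toward high density
    repulsion = grad_k.sum(axis=1)      # keeps particles spread out
    return particles + step_size * (attraction + repulsion) / n


def toy_score(goals, target_radius=1.0, width=0.3):
    """Score of a hypothetical ring-shaped density (assumption, not from the paper).

    log p(g) = -(||g|| - target_radius)^2 / (2 * width^2) + const
    """
    radii = np.linalg.norm(goals, axis=1, keepdims=True)
    directions = goals / np.maximum(radii, 1e-8)
    return -(radii - target_radius) / width ** 2 * directions


def run_demo(n_particles=50, n_steps=500, seed=0):
    """Move goal particles from a central blob toward the toy ring."""
    rng = np.random.default_rng(seed)
    goals = 0.5 * rng.standard_normal((n_particles, 2))
    for _ in range(n_steps):
        goals = svgd_step(goals, toy_score)
    return goals


if __name__ == "__main__":
    final_goals = run_demo()
    print("mean goal radius:", np.linalg.norm(final_goals, axis=1).mean())
```

The attraction term moves particles toward regions where the target density is high, while the repulsion term prevents them from collapsing onto a single goal. In a goal-generation setting the toy ring would presumably be replaced by a density built from the agent's predicted success, which is the part this sketch leaves out.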



Curriculum goal masking for continuous deep reinforcement learning

Deep reinforcement learning has recently gained a focus on problems wher...

It Takes Four to Tango: Multiagent Selfplay for Automatic Curriculum Generation

We are interested in training general-purpose reinforcement learning age...

Automated curricula through setter-solver interactions

Reinforcement learning algorithms use correlations between policies and ...

Scaling Goal-based Exploration via Pruning Proto-goals

One of the gnarliest challenges in reinforcement learning (RL) is explor...

State-Aware Variational Thompson Sampling for Deep Q-Networks

Thompson sampling is a well-known approach for balancing exploration and...

Variational Empowerment as Representation Learning for Goal-Based Reinforcement Learning

Learning to reach goal states and learning diverse skills through mutual...

Maximum Entropy Gain Exploration for Long Horizon Multi-goal Reinforcement Learning

What goals should a multi-goal reinforcement learning agent pursue durin...
