The Dreaming Variational Autoencoder for Reinforcement Learning Environments

10/02/2018
by Per-Arne Andersen, et al.

Reinforcement learning has shown great potential in generalizing over raw sensory data using only a single neural network for value optimization. However, current state-of-the-art reinforcement learning algorithms face several challenges that prevent them from converging towards the global optimum. The solution to these problems likely lies in short- and long-term planning, exploration, and memory management. Games are often used to benchmark reinforcement learning algorithms because they provide flexible, reproducible, and easy-to-control environments. Nevertheless, few games feature a state-space in which results in exploration, memory, and planning are easily perceived. This paper presents the Dreaming Variational Autoencoder (DVAE), a neural-network-based generative modeling architecture for exploration in environments with sparse feedback. We further present Deep Maze, a novel and flexible maze engine that challenges DVAE in partially and fully observable state-spaces, long-horizon tasks, and deterministic and stochastic problems. We show initial findings and encourage further work in reinforcement learning driven by generative exploration.

research
09/01/2021

A Survey of Exploration Methods in Reinforcement Learning

Exploration is an essential component of reinforcement learning algorith...
research
09/10/2016

Episodic Exploration for Deep Deterministic Policies: An Application to StarCraft Micromanagement Tasks

We consider scenarios from the real-time strategy game StarCraft as new ...
research
06/15/2018

Improving width-based planning with compact policies

Optimal action selection in decision problems characterized by sparse, d...
research
01/06/2019

What Should I Do Now? Marrying Reinforcement Learning and Symbolic Planning

Long-term planning poses a major difficulty to many reinforcement learni...
research
01/19/2020

FRESH: Interactive Reward Shaping in High-Dimensional State Spaces using Human Feedback

Reinforcement learning has been successful in training autonomous agents...
research
09/08/2020

Deep Active Inference for Partially Observable MDPs

Deep active inference has been proposed as a scalable approach to percep...
research
05/08/2023

Reinforcement Learning for Topic Models

We apply reinforcement learning techniques to topic modeling by replacin...
