Self-Adapting Goals Allow Transfer of Predictive Models to New Tasks

by Kai Olav Ellefsen, et al.

A long-standing challenge in Reinforcement Learning is enabling agents to learn a model of their environment that can be transferred to solve other problems in a world with the same underlying rules. One reason this is difficult is that accurate models of an environment are hard to learn. If the model is inaccurate, the agent's plans and actions will likely be sub-optimal and lead to the wrong outcomes. Recent progress in model-based reinforcement learning has improved agents' ability to learn and use predictive models. In this paper, we extend a recent deep learning architecture that learns a predictive model of the environment which predicts only the values of a few key measurements that are indicative of an agent's performance. Predicting only a few measurements, rather than the entire future state of an environment, makes it more feasible to learn a valuable predictive model. We extend this predictive model with a small, evolving neural network that suggests the best goals to pursue in the current state. We demonstrate that this allows the predictive model to transfer to new scenarios where goals are different, and that the adaptive goals can even adjust agent behavior on-line, changing its strategy to fit the current context.
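The idea of state-dependent goals steering a fixed predictive model can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the two measurements (health and ammo), the hand-written table of predicted measurement changes standing in for the learned predictive model, the one-layer `goal_network`, and its hand-picked weights, which in the paper's setting would be found by evolution rather than set manually.

```python
import math

def goal_network(measurements, weights, biases):
    """Hypothetical one-layer network: current measurements -> goal vector."""
    return [
        math.tanh(sum(w * m for w, m in zip(row, measurements)) + b)
        for row, b in zip(weights, biases)
    ]

def select_action(predicted_changes, goal):
    """Return the index of the action whose predicted measurement
    changes have the largest goal-weighted sum."""
    scores = [sum(g * p for g, p in zip(goal, pred)) for pred in predicted_changes]
    return max(range(len(scores)), key=scores.__getitem__)

# Stand-in for the predictive model's output (assumed values):
# one row per action, one column per measurement [health, ammo].
predicted_changes = [
    [0.5, -0.1],   # action 0: gains health, spends a little ammo
    [-0.2, 0.6],   # action 1: loses some health, gains ammo
]

# A fixed goal that only values health always picks action 0.
action_fixed = select_action(predicted_changes, [1.0, 0.0])

# Hand-picked goal-network weights (an assumption; not evolved here):
# value health when health is low, value ammo when health is high.
weights = [[-2.0, 0.0], [2.0, 0.0]]
biases = [1.0, -1.0]

goal_low = goal_network([0.1, 0.5], weights, biases)   # low health
goal_high = goal_network([0.9, 0.5], weights, biases)  # high health

action_low = select_action(predicted_changes, goal_low)
action_high = select_action(predicted_changes, goal_high)
print(action_fixed, action_low, action_high)
```

In this toy setup the predictions never change; only the goal vector does, which is enough to switch the chosen action between contexts.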


Learning to Predict Without Looking Ahead: World Models Without Forward Prediction

Much of model-based reinforcement learning involves learning a model of ...

Zero Shot Learning on Simulated Robots

In this work we present a method for leveraging data from one source to ...

Iterative Model-Based Reinforcement Learning Using Simulations in the Differentiable Neural Computer

We propose a lifelong learning architecture, the Neural Computer Agent (...

Reinforcement learning and inverse reinforcement learning with system 1 and system 2

Inferring a person's goal from their behavior is an important problem in...

Power-seeking can be probable and predictive for trained agents

Power-seeking behavior is a key source of risk from advanced AI, but our...

Partner Approximating Learners (PAL): Simulation-Accelerated Learning with Explicit Partner Modeling in Multi-Agent Domains

Mixed cooperative-competitive control scenarios such as human-machine in...

Scaling All-Goals Updates in Reinforcement Learning Using Convolutional Neural Networks

Being able to reach any desired location in the environment can be a val...
