Learning Parsimonious Dynamics for Generalization in Reinforcement Learning

by Tankred Saanum, et al.

Humans are skillful navigators: we aptly maneuver through new places, realize when we are back at a location we have seen before, and can even conceive of shortcuts that go through parts of our environment we have never visited. Current methods in model-based reinforcement learning, by contrast, struggle to generalize about environment dynamics outside the training distribution. We argue that two principles can help bridge this gap: latent learning and parsimonious dynamics. Humans tend to think about environment dynamics in simple terms: we reason about trajectories not in reference to what we expect to see along a path, but rather in an abstract latent space containing information about the places' spatial coordinates. Moreover, we assume that moving around in novel parts of our environment works the same way as in parts we are familiar with. These two principles work in tandem: it is in the latent space that the dynamics show parsimonious characteristics. We develop a model that learns such parsimonious dynamics. Using a variational objective, our model is trained to reconstruct experienced transitions in a latent space using locally linear transformations, while being encouraged to invoke as few distinct transformations as possible. Using our framework, we demonstrate the utility of learning parsimonious latent dynamics models in a range of policy learning and planning tasks.
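The abstract's core idea, locally linear latent dynamics drawn from a small dictionary of transformations, with a parsimony pressure toward using few distinct ones, can be illustrated with a minimal sketch. The paper's actual architecture and variational objective are not given here, so everything below is an assumption for illustration: the latent dimension, the dictionary size, the soft mixture over transformations, and the use of an entropy term as a stand-in for the parsimony penalty.

```python
import numpy as np

rng = np.random.default_rng(0)

latent_dim, n_ops = 2, 4  # hypothetical latent size and transformation-dictionary size

# Dictionary of locally linear transformations (A_k, b_k): z' = A_k @ z + b_k.
# Initialized near the identity so each op is a small, smooth latent motion.
A = np.stack([np.eye(latent_dim) + 0.1 * rng.standard_normal((latent_dim, latent_dim))
              for _ in range(n_ops)])
b = 0.1 * rng.standard_normal((n_ops, latent_dim))

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def predict_next(z, logits):
    """Predict z' as a soft mixture over the dictionary of linear maps.

    w is the (inferred) distribution over which transformation explains
    this transition; at convergence it should be close to one-hot.
    """
    w = softmax(logits)
    z_next = sum(w[k] * (A[k] @ z + b[k]) for k in range(n_ops))
    return z_next, w

def parsimony_penalty(w, eps=1e-8):
    """Entropy of the selection weights: lower entropy means the model
    explains transitions with fewer distinct transformations."""
    return -np.sum(w * np.log(w + eps))

# One training-style step: reconstruct a transition, penalize non-parsimony.
z = rng.standard_normal(latent_dim)
logits = np.array([3.0, 0.0, 0.0, 0.0])   # strongly prefers transformation 0
z_next, w = predict_next(z, logits)
target = A[0] @ z + b[0]                   # the transition actually experienced
loss = np.sum((z_next - target) ** 2) + 0.1 * parsimony_penalty(w)
```

In a full model, the encoder, the transformation parameters, and the selection logits would all be trained jointly under the variational objective; the entropy-style term above only gestures at how "as few distinct transformations as possible" can be expressed as a differentiable penalty.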




