DARLA: Improving Zero-Shot Transfer in Reinforcement Learning

by Irina Higgins, et al.

Domain adaptation is an important open problem in deep reinforcement learning (RL). In many scenarios of interest, data is hard to obtain, so agents may learn a source policy in a setting where data is readily available, with the hope that it generalises well to the target domain. We propose a new multi-stage RL agent, DARLA (DisentAngled Representation Learning Agent), which learns to see before learning to act. DARLA's vision is based on learning a disentangled representation of the observed environment. Once DARLA can see, it is able to acquire source policies that are robust to many domain shifts, even with no access to the target domain. DARLA significantly outperforms conventional baselines in zero-shot domain adaptation scenarios, an effect that holds across a variety of RL environments (Jaco arm, DeepMind Lab) and base RL algorithms (DQN, A3C and EC).




