Cross-domain Random Pre-training with Prototypes for Reinforcement Learning

by Xin Liu, et al.

Task-agnostic cross-domain pre-training shows great potential in image-based Reinforcement Learning (RL) but poses a big challenge. In this paper, we propose CRPTpro, a Cross-domain self-supervised Random Pre-Training framework with prototypes for image-based RL. CRPTpro employs a cross-domain random policy to easily and quickly sample diverse data from multiple domains, improving pre-training efficiency. Moreover, prototypical representation learning with a novel intrinsic loss is proposed to pre-train an effective and generic encoder across different domains. Without finetuning, the cross-domain encoder can be applied efficiently to challenging downstream visual-control RL tasks defined in different domains. Compared with prior art such as APT and Proto-RL, CRPTpro achieves better performance on cross-domain downstream RL tasks without extra training of exploration agents for expert data collection, greatly reducing the burden of pre-training. Experiments on DeepMind Control suite (DMControl) demonstrate that CRPTpro outperforms APT significantly on 11/12 cross-domain RL tasks with only 39 state-of-the-art cross-domain pre-training method in both policy learning performance and pre-training efficiency. The complete code will be released at
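The data-collection idea in the abstract, a random policy sampling image observations from several domains without any trained exploration agent, can be illustrated with a minimal sketch. This is not the released CRPTpro code: `ToyDomain`, `collect_random_data`, the domain names, the 3x84x84 frame shape, the episode length, and the [-1, 1] action range are all assumptions made for illustration, and the toy domains return noise images rather than rendered DMControl frames.

```python
import numpy as np


class ToyDomain:
    """Hypothetical stand-in for an image-based control domain (not DMControl)."""

    def __init__(self, name, action_dim, image_shape=(3, 84, 84), seed=0):
        self.name = name
        self.action_dim = action_dim
        self.image_shape = image_shape
        self._rng = np.random.default_rng(seed)

    def reset(self):
        return self._render()

    def step(self, action):
        # A real domain would advance its physics here; this toy only checks
        # the action shape and returns a new noise frame.
        assert action.shape == (self.action_dim,)
        return self._render()

    def _render(self):
        return self._rng.integers(0, 256, size=self.image_shape, dtype=np.uint8)


def collect_random_data(domains, steps_per_domain, episode_len=50, seed=0):
    """Gather frames from every domain using uniformly random actions."""
    rng = np.random.default_rng(seed)
    frames, domain_ids = [], []
    for idx, domain in enumerate(domains):
        obs = domain.reset()
        for t in range(steps_per_domain):
            frames.append(obs)
            domain_ids.append(idx)
            if (t + 1) % episode_len == 0:
                obs = domain.reset()  # start a fresh episode
            else:
                # Random policy: no learned agent, assumed action range [-1, 1].
                action = rng.uniform(-1.0, 1.0, size=domain.action_dim)
                obs = domain.step(action)
    return np.stack(frames), np.array(domain_ids)


# Three hypothetical domains with different action dimensions.
domains = [
    ToyDomain("domain_a", action_dim=6, seed=1),
    ToyDomain("domain_b", action_dim=4, seed=2),
    ToyDomain("domain_c", action_dim=2, seed=3),
]
frames, domain_ids = collect_random_data(domains, steps_per_domain=100)
```

In this sketch the pooled `frames` array would be the input to encoder pre-training; the domain ids are kept only for bookkeeping, since the abstract describes the pre-training as task-agnostic.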


CrossNER: Evaluating Cross-Domain Named Entity Recognition

Cross-domain named entity recognition (NER) models are able to cope with...

An Investigation into Pre-Training Object-Centric Representations for Reinforcement Learning

Unsupervised object-centric representation (OCR) learning has recently d...

Decoupling Representation Learning from Reinforcement Learning

In an effort to overcome limitations of reward-driven feature learning i...

RePreM: Representation Pre-training with Masked Model for Reinforcement Learning

Inspired by the recent success of sequence modeling in RL and the use of...

Omni-Training for Data-Efficient Deep Learning

Learning a generalizable deep model from a few examples in a short time ...

Cross-Domain Style Mixing for Face Cartoonization

Cartoon domain has recently gained increasing popularity. Previous studi...

Parrot: Data-Driven Behavioral Priors for Reinforcement Learning

Reinforcement learning provides a general framework for flexible decisio...
