Diffusion Model is an Effective Planner and Data Synthesizer for Multi-Task Reinforcement Learning

05/29/2023
by   Haoran He, et al.
0

Diffusion models have demonstrated highly-expressive generative capabilities in vision and NLP. Recent studies in reinforcement learning (RL) have shown that diffusion models are also powerful in modeling complex policies or trajectories in offline datasets. However, these works have been limited to single-task settings where a generalist agent capable of addressing multi-task predicaments is absent. In this paper, we aim to investigate the effectiveness of a single diffusion model in modeling large-scale multi-task offline data, which can be challenging due to diverse and multimodal data distribution. Specifically, we propose Multi-Task Diffusion Model (MTDiff), a diffusion-based method that incorporates Transformer backbones and prompt learning for generative planning and data synthesis in multi-task offline settings. MTDiff leverages vast amounts of knowledge available in multi-task data and performs implicit knowledge sharing among tasks. For generative planning, we find MTDiff outperforms state-of-the-art algorithms across 50 tasks on Meta-World and 8 maps on Maze2D. For data synthesis, MTDiff generates high-quality data for testing tasks given a single demonstration as a prompt, which enhances the low-quality datasets for even unseen tasks.

READ FULL TEXT

page 9

page 18

page 19

research
09/16/2021

Conservative Data Sharing for Multi-Task Offline Reinforcement Learning

Offline reinforcement learning (RL) algorithms have shown promising resu...
research
02/03/2023

AdaptDiffuser: Diffusion Models as Adaptive Self-evolving Planners

Diffusion models have demonstrated their powerful generative capability ...
research
07/30/2023

MTD-GPT: A Multi-Task Decision-Making GPT Model for Autonomous Driving at Unsignalized Intersections

Autonomous driving technology is poised to transform transportation syst...
research
05/31/2023

Offline Meta Reinforcement Learning with In-Distribution Online Adaptation

Recent offline meta-reinforcement learning (meta-RL) methods typically u...
research
02/01/2022

Planner-Reasoner Inside a Multi-task Reasoning Agent

We consider the problem of multi-task reasoning (MTR), where an agent ca...
research
11/28/2022

Offline Q-Learning on Diverse Multi-Task Data Both Scales And Generalizes

The potential of offline reinforcement learning (RL) is that high-capaci...
research
11/19/2021

Generalized Decision Transformer for Offline Hindsight Information Matching

How to extract as much learning signal from each trajectory data has bee...

Please sign up or login with your details

Forgot password? Click here to reset