Continual Reinforcement Learning with Diversity Exploration and Adversarial Self-Correction

06/21/2019
by Fengda Zhu, et al.

Deep reinforcement learning has made significant progress in continuous control, including physical control and autonomous driving. However, it is challenging for a reinforcement learning model to learn a policy for each task in a sequence because of catastrophic forgetting: the model forgets knowledge it learned in the past when it is trained on a new task. We consider this challenge from two perspectives: i) acquiring task-specific skills is difficult because task information and rewards are not highly related; ii) learning knowledge from previous experience is difficult in continuous control domains. In this paper, we introduce an end-to-end framework named Continual Diversity Adversarial Network (CDAN). We first develop a diversity exploration method that learns task-specific skills using an unsupervised objective. We then propose an adversarial self-correction mechanism that learns knowledge by exploiting past experience. The two learning procedures are presumably reciprocal. To evaluate the proposed method, we propose a new continuous reinforcement learning environment named Continual Ant Maze (CAM) and a new metric termed Normalized Shorten Distance (NSD). The experimental results confirm the effectiveness of diversity exploration and self-correction. It is worth noting that our final result outperforms the baseline by 18.35

research
11/29/2022

The Surprising Effectiveness of Latent World Models for Continual Reinforcement Learning

We study the use of model-based reinforcement learning methods, in parti...
research
02/01/2019

Policy Consolidation for Continual Reinforcement Learning

We propose a method for tackling catastrophic forgetting in deep reinfor...
research
02/25/2019

S-TRIGGER: Continual State Representation Learning via Self-Triggered Generative Replay

We consider the problem of building a state representation model for con...
research
09/04/2018

Transferring Deep Reinforcement Learning with Adversarial Objective and Augmentation

In the past few years, deep reinforcement learning has been proven to so...
research
09/15/2021

Life-Long Multi-Task Learning of Adaptive Path Tracking Policy for Autonomous Vehicle

This paper proposes a life-long adaptive path tracking policy learning m...
research
11/11/2020

Behaviorally Diverse Traffic Simulation via Reinforcement Learning

Traffic simulators are important tools in autonomous driving development...
research
11/10/2018

Diversity-Driven Extensible Hierarchical Reinforcement Learning

Hierarchical reinforcement learning (HRL) has recently shown promising a...
