Offline Reinforcement Learning for Autonomous Driving with Safety and Exploration Enhancement

10/13/2021
by Tianyu Shi, et al.

Reinforcement learning (RL) is a powerful data-driven control method that has been widely explored in autonomous driving tasks. However, conventional RL approaches learn control policies through trial-and-error interactions with the environment and may therefore cause disastrous consequences, such as collisions, when tested in real-world traffic. Offline RL has recently emerged as a promising framework for learning effective policies from previously collected, static datasets without requiring active interaction, which makes it especially appealing for autonomous driving applications. Although promising, existing offline RL algorithms such as Batch-Constrained deep Q-learning (BCQ) generally lead to rather conservative policies with limited exploration efficiency. To address these issues, this paper presents an enhanced BCQ algorithm that employs a learnable parameter noise scheme in the perturbation model to increase the diversity of observed actions. In addition, a Lyapunov-based safety enhancement strategy is incorporated to constrain the explorable state space to a safe region. Experimental results in highway and parking traffic scenarios show that the approach outperforms a conventional RL method as well as state-of-the-art offline RL algorithms.
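
To make the idea of parameter noise in a BCQ-style perturbation model more concrete, the following is a minimal pure-Python sketch, not the authors' implementation. It assumes a hypothetical one-layer perturbation model whose weights are drawn as mu + sigma * eps, and a hand-written candidate list standing in for the actions that BCQ would sample from a generative model fitted to the dataset. The names NoisyPerturbation, select_action, max_perturb, and sigma_init are illustrative assumptions, and no training loop or Lyapunov constraint is shown.

```python
import math
import random


class NoisyPerturbation:
    """Hypothetical one-layer perturbation model with parameter noise.

    Each weight is used as mu + sigma * eps with eps ~ N(0, 1). In a real
    implementation both mu and sigma would be trained; here they are fixed.
    """

    def __init__(self, state_dim, action_dim, max_perturb=0.05,
                 sigma_init=0.1, seed=0):
        self.rng = random.Random(seed)
        in_dim = state_dim + action_dim
        bound = 1.0 / math.sqrt(in_dim)
        self.mu = [[self.rng.uniform(-bound, bound) for _ in range(in_dim)]
                   for _ in range(action_dim)]
        self.sigma = [[sigma_init] * in_dim for _ in range(action_dim)]
        self.max_perturb = max_perturb

    def __call__(self, state, action, noisy=True):
        x = list(state) + list(action)
        out = []
        for mu_row, sigma_row, a in zip(self.mu, self.sigma, action):
            z = 0.0
            for m, s, xi in zip(mu_row, sigma_row, x):
                # Fresh noise per weight; disabled when noisy=False.
                eps = self.rng.gauss(0.0, 1.0) if noisy else 0.0
                z += (m + s * eps) * xi
            # tanh keeps the correction within [-max_perturb, max_perturb].
            delta = self.max_perturb * math.tanh(z)
            # Clip the perturbed action to the assumed range [-1, 1].
            out.append(max(-1.0, min(1.0, a + delta)))
        return out


def select_action(state, candidates, perturb, q_value):
    """BCQ-style selection: perturb each candidate, keep the highest-Q one."""
    perturbed = [perturb(state, a) for a in candidates]
    return max(perturbed, key=lambda a: q_value(state, a))


if __name__ == "__main__":
    model = NoisyPerturbation(state_dim=2, action_dim=1, seed=1)
    toy_q = lambda s, a: -(a[0] - 0.5) ** 2  # hypothetical Q-function
    print(select_action([0.3, -0.2], [[-0.5], [0.0], [0.5]], model, toy_q))
```

Because the correction is bounded by max_perturb, the selected action stays close to the candidates drawn from the data, while the noise on the weights varies which nearby action is actually tried.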


research
09/18/2023

Guided Online Distillation: Promoting Safe Reinforcement Learning by Offline Demonstration

Safe Reinforcement Learning (RL) aims to find a policy that achieves hig...
research
02/18/2021

Continuous Doubly Constrained Batch Reinforcement Learning

Reliant on too many experiments to learn good actions, current Reinforce...
research
11/12/2021

DriverGym: Democratising Reinforcement Learning for Autonomous Driving

Despite promising progress in reinforcement learning (RL), developing al...
research
07/21/2022

Addressing Optimism Bias in Sequence Modeling for Reinforcement Learning

Impressive results in natural language processing (NLP) based on the Tra...
research
12/05/2022

Bi-Level Optimization Augmented with Conditional Variational Autoencoder for Autonomous Driving in Dense Traffic

Autonomous driving has a natural bi-level structure. The goal of the upp...
research
04/11/2022

Automatically Learning Fallback Strategies with Model-Free Reinforcement Learning in Safety-Critical Driving Scenarios

When learning to behave in a stochastic environment where safety is crit...
research
06/15/2023

Datasets and Benchmarks for Offline Safe Reinforcement Learning

This paper presents a comprehensive benchmarking suite tailored to offli...
