Curriculum Proximal Policy Optimization with Stage-Decaying Clipping for Self-Driving at Unsignalized Intersections

08/31/2023
by Zengqi Peng, et al.

Unsignalized intersections are typically considered one of the most representative and challenging scenarios for self-driving vehicles. To tackle autonomous driving in such scenarios, this paper proposes a curriculum proximal policy optimization (CPPO) framework with stage-decaying clipping. By adjusting the clipping parameter of proximal policy optimization (PPO) across training stages, the vehicle can first rapidly search for an approximately optimal policy or its neighborhood with a large parameter, and then converge to the optimal policy with a small one. In particular, stage-based curriculum learning is incorporated into the proposed framework to improve generalization performance and further accelerate training. Moreover, the reward function is specially designed for the different curriculum settings. A series of comparative experiments is conducted in intersection-crossing scenarios with bi-lane carriageways to verify the effectiveness of the proposed CPPO method. The results show that the proposed approach adapts better to different dynamic and complex environments and trains faster than baseline methods.
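
The idea of stage-decaying clipping can be illustrated with a minimal sketch. The abstract does not state the stage boundaries or clipping values, so the schedule `STAGES` and the helper names `clip_epsilon` and `clipped_surrogate` below are assumptions made for illustration only; the surrogate itself is the standard per-sample PPO clipped objective, not necessarily the exact loss used in the paper.

```python
from bisect import bisect_right

# Hypothetical schedule: (first training episode of the stage, clip epsilon).
# These numbers are illustrative; the abstract does not give the real values.
STAGES = [(0, 0.3), (500, 0.2), (1000, 0.1)]


def clip_epsilon(episode, stages=STAGES):
    """Return the clipping parameter of the stage that contains `episode`."""
    starts = [start for start, _ in stages]
    index = bisect_right(starts, episode) - 1
    return stages[max(index, 0)][1]


def clipped_surrogate(ratio, advantage, epsilon):
    """Standard PPO clipped objective for one sample.

    `ratio` is pi_new(a|s) / pi_old(a|s); `advantage` is the advantage estimate.
    """
    clipped_ratio = min(max(ratio, 1.0 - epsilon), 1.0 + epsilon)
    return min(ratio * advantage, clipped_ratio * advantage)


if __name__ == "__main__":
    for episode in (0, 600, 1500):
        eps = clip_epsilon(episode)
        value = clipped_surrogate(ratio=1.5, advantage=1.0, epsilon=eps)
        print(episode, eps, value)
```

With a large epsilon in the early stage, a policy ratio far from 1 still contributes to the objective, which permits larger policy updates; as epsilon decays in later stages, the same ratio is clipped more tightly, which restricts updates to a smaller neighborhood of the current policy.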

research
07/10/2022

State Dropout-Based Curriculum Reinforcement Learning for Self-Driving at Unsignalized Intersections

Traversing intersections is a challenging problem for autonomous vehicle...
research
03/07/2023

Chance-Aware Lane Change with High-Level Model Predictive Control Through Curriculum Reinforcement Learning

Lane change in dense traffic is considered a challenging problem that ty...
research
07/15/2019

Proximal Policy Optimization with Mixed Distributed Training

Instability and slowness are two main problems in deep reinforcement lea...
research
12/01/2019

Automated curriculum generation for Policy Gradients from Demonstrations

In this paper, we present a technique that improves the process of train...
research
09/12/2021

Encoding Distributional Soft Actor-Critic for Autonomous Driving in Multi-lane Scenarios

In this paper, we propose a new reinforcement learning (RL) algorithm, c...
research
03/16/2021

Sparse Curriculum Reinforcement Learning for End-to-End Driving

Deep reinforcement Learning for end-to-end driving is limited by the nee...
research
09/15/2020

Soft policy optimization using dual-track advantage estimator

In reinforcement learning (RL), we always expect the agent to explore as...
