Hybrid Actor-Critic Reinforcement Learning in Parameterized Action Space

03/04/2019
by Zhou Fan, et al.

In this paper we propose a hybrid architecture of actor-critic algorithms for reinforcement learning in parameterized action space. It consists of multiple parallel sub-actor networks, which decompose the structured action space into simpler action spaces, and a critic network that guides the training of all sub-actor networks. Although this paper focuses mainly on parameterized action spaces, the proposed architecture, which we call hybrid actor-critic, can be extended to more general action spaces that have a hierarchical structure. We present an instance of the hybrid actor-critic architecture based on proximal policy optimization (PPO), which we refer to as hybrid proximal policy optimization (H-PPO). Our experiments test H-PPO on a collection of tasks with parameterized action spaces, where H-PPO outperforms previous methods for parameterized-action reinforcement learning.
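To make the architecture concrete, the following is a minimal sketch of how parallel sub-actors and a single critic might fit together. It is not the authors' implementation: the class name `HybridActorCritic`, the single linear layer per network, the one scalar parameter per discrete action, the fixed `param_std`, and the state-value critic are all assumptions made for illustration, and no training step is shown.

```python
import math
import random


def linear(weights, bias, x):
    """Apply one linear layer: returns W x + b as a list of floats."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(l - m) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]


def init_layer(rng, n_out, n_in):
    """Small random weights and zero bias (hypothetical initialization)."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


class HybridActorCritic:
    """Illustrative sketch: two parallel sub-actors plus one critic."""

    def __init__(self, state_dim, num_discrete, seed=0):
        self.rng = random.Random(seed)
        self.num_discrete = num_discrete
        # Sub-actor 1: logits over the discrete actions.
        self.discrete_actor = init_layer(self.rng, num_discrete, state_dim)
        # Sub-actor 2: one continuous parameter mean per discrete action
        # (assumption: each discrete action takes a single scalar parameter).
        self.param_actor = init_layer(self.rng, num_discrete, state_dim)
        # Critic: a scalar value estimate used to guide both sub-actors
        # (assumption: a state-value critic).
        self.critic = init_layer(self.rng, 1, state_dim)
        self.param_std = 0.1  # assumed fixed exploration noise

    def act(self, state):
        probs = softmax(linear(*self.discrete_actor, state))
        param_means = linear(*self.param_actor, state)
        # Sample the discrete action, then the parameter that belongs to it.
        k = self.rng.choices(range(self.num_discrete), weights=probs)[0]
        x_k = self.rng.gauss(param_means[k], self.param_std)
        value = linear(*self.critic, state)[0]
        return k, x_k, probs, value


model = HybridActorCritic(state_dim=3, num_discrete=4)
k, x_k, probs, value = model.act([0.5, -0.2, 1.0])
print(k, x_k, value)
```

The executed action is the pair `(k, x_k)`. Both sub-actors read the same state and run in parallel, so the parameter sub-actor proposes a value for every discrete action and only the one matching the sampled `k` is used.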

research
12/20/2021

Variational Quantum Soft Actor-Critic

Quantum computing has a superior advantage in tackling specific problems...
research
06/07/2023

Adaptive Frequency Green Light Optimal Speed Advisory based on Hybrid Actor-Critic Reinforcement Learning

Green Light Optimal Speed Advisory (GLOSA) system suggests speeds to veh...
research
02/15/2021

Distributionally-Constrained Policy Optimization via Unbalanced Optimal Transport

We consider constrained policy optimization in Reinforcement Learning, w...
research
11/13/2020

Scaffolding Reflection in Reinforcement Learning Framework for Confinement Escape Problem

This paper formulates an application of reinforcement learning for an ev...
research
07/25/2022

Flowsheet synthesis through hierarchical reinforcement learning and graph neural networks

Process synthesis experiences a disruptive transformation accelerated by...
research
07/09/2018

Partial Policy-based Reinforcement Learning for Anatomical Landmark Localization in 3D Medical Images

Deploying the idea of long-term cumulative return, reinforcement learnin...
research
01/22/2020

On Simple Reactive Neural Networks for Behaviour-Based Reinforcement Learning

We present a behaviour-based reinforcement learning approach, inspired b...
