PACER: A Fully Push-forward-based Distributional Reinforcement Learning Algorithm

06/11/2023
by Wensong Bai, et al.

In this paper, we propose the first fully push-forward-based distributional reinforcement learning algorithm, called Push-forward-based Actor-Critic EncourageR (PACER). Specifically, PACER establishes a stochastic utility value policy gradient theorem and leverages the push-forward operator in the construction of both the actor and the critic. Moreover, a novel sample-based encourager, built on the maximum mean discrepancy (MMD), is designed to incentivize exploration. Experimental evaluations on various continuous-control benchmarks demonstrate that our algorithm outperforms state-of-the-art methods.
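The two computational ingredients named in the abstract, a push-forward policy that transforms base noise into action samples and a sample-based MMD estimate used as an exploration signal, can be sketched concretely. Below is a minimal PyTorch sketch under assumed shapes; the class name PushForwardActor, the layer sizes, the noise dimension, the Gaussian-kernel bandwidth, and the uniform reference distribution are all illustrative assumptions, not the architecture or estimator reported in the paper.

```python
import torch
import torch.nn as nn


class PushForwardActor(nn.Module):
    """Implicit stochastic policy: pushes base Gaussian noise forward
    through a network to produce action samples. Layer sizes and the
    noise dimension here are illustrative assumptions."""

    def __init__(self, state_dim, action_dim, noise_dim=8, hidden=256):
        super().__init__()
        self.noise_dim = noise_dim
        self.net = nn.Sequential(
            nn.Linear(state_dim + noise_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, action_dim), nn.Tanh(),  # actions in [-1, 1]
        )

    def forward(self, state, n_samples=1):
        # (B, state_dim) -> (B, n_samples, state_dim), then append noise.
        s = state.unsqueeze(1).expand(-1, n_samples, -1)
        z = torch.randn(*s.shape[:-1], self.noise_dim, device=state.device)
        # Push the noise forward: (B, n_samples, action_dim).
        return self.net(torch.cat([s, z], dim=-1))


def mmd_squared(x, y, bandwidth=1.0):
    """Biased (V-statistic) sample estimate of squared MMD between two
    sample sets x: (n, d) and y: (m, d), with a Gaussian kernel."""
    def k(a, b):
        return torch.exp(-torch.cdist(a, b).pow(2) / (2.0 * bandwidth ** 2))
    return k(x, x).mean() + k(y, y).mean() - 2.0 * k(x, y).mean()


# Usage sketch (shapes are hypothetical): compare pooled policy samples
# against a uniform reference over the Tanh action box.
actor = PushForwardActor(state_dim=17, action_dim=6)
states = torch.randn(32, 17)
actions = actor(states, n_samples=16)             # (32, 16, 6)
reference = torch.rand_like(actions) * 2.0 - 1.0  # Uniform[-1, 1]
bonus = mmd_squared(actions.flatten(0, 1), reference.flatten(0, 1))
```

One plausible reading of the "encourager" is to subtract this MMD term from the actor objective: the penalty is large when the pushed-forward action samples collapse toward a deterministic point, so minimizing it keeps the policy spread out. The paper's actual construction may differ in kernel choice, reference distribution, and how the term enters the objective.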

Related research

05/08/2021 · Generative Actor-Critic: An Off-policy Algorithm Using the Push-forward Model
Model-free deep reinforcement learning has achieved great success in man...

10/04/2020 · FORK: A Forward-Looking Actor For Model-Free Reinforcement Learning
In this paper, we propose a new type of Actor, named forward-looking Act...

01/15/2022 · Recursive Least Squares Advantage Actor-Critic Algorithms
As an important algorithm in deep reinforcement learning, advantage acto...

07/13/2020 · Implicit Distributional Reinforcement Learning
To improve the sample efficiency of policy-gradient based reinforcement ...

05/24/2021 · GMAC: A Distributional Perspective on Actor-Critic Framework
In this paper, we devise a distributional framework on actor-critic as a...

11/29/2022 · Autotuning PID control using Actor-Critic Deep Reinforcement Learning
This work is an exploratory research concerned with determining in what ...

05/07/2020 · Curious Hierarchical Actor-Critic Reinforcement Learning
Hierarchical abstraction and curiosity-driven exploration are two common...
