PAC-Bayesian Soft Actor-Critic Learning

01/30/2023
by Bahareh Tasdighi, et al.

Actor-critic algorithms address the dual goals of reinforcement learning, policy evaluation and policy improvement, via two separate function approximators. The practicality of this approach comes at the expense of training instability, caused mainly by the destructive effect of the critic's approximation errors on the actor. We tackle this bottleneck by employing, for the first time, an existing Probably Approximately Correct (PAC) Bayesian bound as the critic training objective of the Soft Actor-Critic (SAC) algorithm. We further demonstrate that online learning performance improves significantly when a stochastic actor explores multiple futures by critic-guided random search. The resulting algorithm compares favorably to the state of the art on multiple classical control and locomotion tasks in both sample efficiency and asymptotic performance.
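
The abstract does not spell out how critic-guided random search works, so the following is only one plausible reading, not the authors' procedure: draw several candidate actions from a stochastic actor and keep the one the critic scores highest. The function name `critic_guided_action`, the Gaussian actor with tanh squashing, and the toy actor and critic are all hypothetical, introduced for illustration.

```python
import numpy as np


def critic_guided_action(state, policy_mean, policy_std, critic,
                         num_candidates=8, rng=None):
    """Sample candidate actions from a Gaussian actor and return the one
    with the highest critic score, together with all candidate scores.

    This is an illustrative assumption about "critic-guided random search",
    not the algorithm from the paper.
    """
    rng = np.random.default_rng() if rng is None else rng
    mean = np.asarray(policy_mean(state), dtype=float)
    std = np.asarray(policy_std(state), dtype=float)

    # Draw candidates with shape (num_candidates, action_dim).
    raw = rng.normal(loc=mean, scale=std, size=(num_candidates, mean.shape[0]))
    # Squash to [-1, 1], as is common in SAC implementations.
    candidates = np.tanh(raw)

    # Score every candidate with the critic and keep the best one.
    scores = np.array([critic(state, a) for a in candidates])
    best = int(np.argmax(scores))
    return candidates[best], scores


# Hypothetical toy actor and critic, used only to exercise the function.
def toy_mean(state):
    return np.zeros(2)


def toy_std(state):
    return np.ones(2)


def toy_critic(state, action):
    # Prefers actions close to (0.5, 0.5).
    return -float(np.sum((action - 0.5) ** 2))


if __name__ == "__main__":
    action, scores = critic_guided_action(
        np.zeros(3), toy_mean, toy_std, toy_critic,
        num_candidates=16, rng=np.random.default_rng(0),
    )
    print("chosen action:", action)
    print("best score:", scores.max())
```

In this sketch the critic only ranks candidates that the actor itself proposed, so exploration stays within the support of the stochastic policy. How the paper balances this selection against exploration is not stated in the abstract.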

research
11/11/2019

Provably Convergent Off-Policy Actor-Critic with Function Approximation

We present the first provably convergent off-policy actor-critic algorit...
research
03/21/2023

SACPlanner: Real-World Collision Avoidance with a Soft Actor Critic Local Planner and Polar State Representations

We study the training performance of ROS local planners based on Reinfor...
research
12/08/2021

Hyper-parameter optimization based on soft actor critic and hierarchical mixture regularization

Hyper-parameter optimization is a crucial problem in machine learning as...
research
10/14/2019

Actor Critic with Differentially Private Critic

Reinforcement learning algorithms are known to be sample inefficient, an...
research
12/29/2017

Boosting the Actor with Dual Critic

This paper proposes a new actor-critic-style algorithm called Dual Actor...
research
03/07/2023

A Strategy-Oriented Bayesian Soft Actor-Critic Model

Adopting reasonable strategies is challenging but crucial for an intelli...
research
12/20/2022

Variational Quantum Soft Actor-Critic for Robotic Arm Control

Deep Reinforcement Learning is emerging as a promising approach for the ...
