Robust Reinforcement Learning with Wasserstein Constraint

06/01/2020
by Linfang Hou, et al.

Robust reinforcement learning aims to find an optimal policy that retains some degree of robustness to changes in the environmental dynamics. Existing learning algorithms usually obtain robustness by disturbing the current state or simulating environmental parameters in a heuristic way, which does not quantify robustness to the system dynamics (i.e., the transition probability). To overcome this issue, we use the Wasserstein distance to measure the disturbance to the reference transition kernel. With the Wasserstein distance, we are able to connect transition kernel disturbance to state disturbance, i.e., to reduce an infinite-dimensional optimization problem to a finite-dimensional risk-aware problem. Through the derived risk-aware optimal Bellman equation, we show the existence of optimal robust policies, provide a sensitivity analysis for the perturbations, and then design a novel robust learning algorithm, the Wasserstein Robust Advantage Actor-Critic algorithm (WRAAC). The effectiveness of the proposed algorithm is verified in the Cart-Pole environment.
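To make the idea of measuring a disturbance to a reference transition kernel concrete, here is a minimal sketch of the 1-Wasserstein distance between a reference next-state distribution and a perturbed one. It is an illustration only, not the paper's algorithm: the function name `wasserstein_1d`, the three-state support, and the probabilities are all assumptions chosen for the example, and the state space is assumed to be one-dimensional and sorted.

```python
import numpy as np

def wasserstein_1d(states, p, q):
    """W1 distance between distributions p and q on a shared, sorted 1-D support.

    Hypothetical helper for illustration; not taken from the paper.
    """
    states = np.asarray(states, dtype=float)
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    # In one dimension, W1 equals the area between the two CDFs.
    cdf_gap = np.abs(np.cumsum(p) - np.cumsum(q))[:-1]
    return float(np.sum(cdf_gap * np.diff(states)))

# Assumed toy next-state distributions for a single (state, action) pair.
states = [0.0, 1.0, 2.0]
reference = [0.5, 0.5, 0.0]   # reference transition probabilities
perturbed = [0.0, 0.5, 0.5]   # disturbed transition probabilities

distance = wasserstein_1d(states, reference, perturbed)
print(distance)
```

In this toy case the perturbed distribution moves half of the probability mass from state 0 to state 2, which is the kind of link between a kernel disturbance and a state disturbance that the abstract describes. A Wasserstein constraint would bound this distance by a chosen radius.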

research
12/20/2021

Learning Robust Policy against Disturbance in Transition Dynamics via State-Conservative Policy Optimization

Deep reinforcement learning algorithms can perform poorly in real-world ...
research
06/20/2020

Entropic Risk Constrained Soft-Robust Policy Optimization

Having a perfect model to compute the optimal policy is often infeasible...
research
03/11/2018

Soft-Robust Actor-Critic Policy-Gradient

Robust Reinforcement Learning aims to derive an optimal behavior that ac...
research
12/18/2022

Risk-Sensitive Reinforcement Learning with Exponential Criteria

While risk-neutral reinforcement learning has shown experimental success...
research
05/18/2023

Bayesian Risk-Averse Q-Learning with Streaming Observations

We consider a robust reinforcement learning problem, where a learning ag...
research
07/30/2019

Wasserstein Robust Reinforcement Learning

Reinforcement learning algorithms, though successful, tend to over-fit t...
research
04/15/2020

On Linear Optimization over Wasserstein Balls

Wasserstein balls, which contain all probability measures within a pre-s...
