Risk-Sensitive Policy with Distributional Reinforcement Learning

12/30/2022
by Thibaut Théate, et al.

Classical reinforcement learning (RL) techniques are generally concerned with the design of decision-making policies driven by the maximisation of the expected outcome. However, this approach does not take into account the potential risk associated with the actions taken, which may be critical in certain applications. To address that issue, the present research work introduces a novel methodology based on distributional RL to derive sequential decision-making policies that are sensitive to risk, the latter being modelled by the tail of the return probability distribution. The core idea is to replace the Q function, which generally stands at the core of learning schemes in RL, with another function that takes into account both the expected return and the risk. Named the risk-based utility function U, it can be extracted from the random return distribution Z naturally learnt by any distributional RL algorithm. This makes it possible to span the complete trade-off between risk minimisation and expected return maximisation, in contrast to fully risk-averse methodologies. Fundamentally, this research yields a practical and accessible solution for learning risk-sensitive policies with minimal modification to the distributional RL algorithm, and with an emphasis on the interpretability of the resulting decision-making process.
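
To make the idea of a risk-based utility concrete, here is a minimal sketch in Python. It is not the authors' implementation: the abstract does not give the exact form of U, so the sketch assumes a convex combination of the expected return and a lower-tail mean of sampled returns, weighted by a hypothetical trade-off parameter `alpha`, with a hypothetical tail fraction `rho`. The function names and the toy return samples are likewise illustrative only.

```python
import math
from typing import Dict, Sequence


def expected_return(samples: Sequence[float]) -> float:
    """Sample mean of the return distribution Z (the usual Q-value estimate)."""
    return sum(samples) / len(samples)


def lower_tail_mean(samples: Sequence[float], rho: float) -> float:
    """Mean of the worst ceil(rho * n) sampled returns.

    Assumption: the 'tail of the return distribution' is summarised by this
    empirical lower-tail average; the source does not fix a specific measure.
    """
    if not 0.0 < rho <= 1.0:
        raise ValueError("rho must lie in (0, 1]")
    k = max(1, math.ceil(rho * len(samples)))
    worst = sorted(samples)[:k]
    return sum(worst) / k


def risk_based_utility(samples: Sequence[float], alpha: float, rho: float) -> float:
    """Assumed form: U = alpha * E[Z] + (1 - alpha) * tail risk measure."""
    return alpha * expected_return(samples) + (1.0 - alpha) * lower_tail_mean(samples, rho)


def greedy_action(returns: Dict[str, Sequence[float]], alpha: float, rho: float) -> str:
    """Pick the action maximising U instead of the expected return alone."""
    return max(returns, key=lambda a: risk_based_utility(returns[a], alpha, rho))


# Toy return samples per action (made up for illustration).
returns = {
    "safe": [1.0, 1.0, 1.0, 1.0],
    "risky": [-4.0, 3.0, 3.0, 4.0],
}

print(greedy_action(returns, alpha=1.0, rho=0.25))  # risk-neutral choice
print(greedy_action(returns, alpha=0.5, rho=0.25))  # risk-sensitive choice
```

With `alpha = 1.0` the rule reduces to ordinary expected-return maximisation, while lower values of `alpha` put more weight on the tail, which is one way to move along the trade-off between risk minimisation and expected return maximisation that the abstract describes.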

research
07/12/2021

Conservative Offline Distributional Reinforcement Learning

Many reinforcement learning (RL) problems in practice are offline, learn...
research
03/15/2012

Parametric Return Density Estimation for Reinforcement Learning

Most conventional Reinforcement Learning (RL) algorithms aim to optimize...
research
03/01/2022

Distributional Reinforcement Learning for Scheduling of (Bio)chemical Production Processes

Reinforcement Learning (RL) has recently received significant attention ...
research
11/12/2021

Two steps to risk sensitivity

Distributional reinforcement learning (RL) – in which agents learn about...
research
11/05/2019

Being Optimistic to Be Conservative: Quickly Learning a CVaR Policy

While maximizing expected return is the goal in most reinforcement learn...
research
05/09/2023

Distributional Multi-Objective Decision Making

For effective decision support in scenarios with conflicting objectives,...
research
01/13/2023

Risk Sensitive Dead-end Identification in Safety-Critical Offline Reinforcement Learning

In safety-critical decision-making scenarios being able to identify wors...
