Deep RBF Value Functions for Continuous Control

02/05/2020
by Kavosh Asadi, et al.

A core operation in reinforcement learning (RL) is finding an action that is optimal with respect to a learned state-action value function. This operation is often challenging when the learned value function takes continuous actions as input. We introduce deep RBF value functions: state-action value functions learned using a deep neural network with a radial-basis function (RBF) output layer. We show that the optimal action with respect to a deep RBF value function can be easily approximated up to any desired accuracy. Moreover, deep RBF value functions can represent any true value function up to any desired accuracy owing to their support for universal function approximation. By learning a deep RBF value function, we extend the standard DQN algorithm to continuous control, and demonstrate that the resultant agent, RBF-DQN, outperforms standard baselines on a set of continuous-action RL problems.
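
The abstract does not spell out the output layer, so the following is only a minimal sketch under stated assumptions: a normalized RBF layer in which Q(s, a) is a softmax-weighted average of per-centroid values, with weights decaying in the distance between the action and each centroid. In an actual agent the centroid locations and values would be produced by a deep network from the state; here the fixed arrays `centroids` and `values`, the smoothing parameter `beta`, and both function names are hypothetical stand-ins. The sketch illustrates why the greedy action is easy to approximate: evaluating Q only at the centroids and taking the best one replaces a search over the continuous action space.

```python
import numpy as np


def rbf_q(action, centroids, values, beta):
    """Assumed normalized RBF value: a softmax-weighted average of centroid values."""
    # Distance from the query action to every centroid.
    dists = np.linalg.norm(centroids - action, axis=1)
    # Closer centroids get larger weight; subtract the max for numerical stability.
    logits = -beta * dists
    weights = np.exp(logits - logits.max())
    weights /= weights.sum()
    return float(weights @ values)


def approx_greedy_action(centroids, values, beta):
    """Approximate argmax_a Q(s, a) by evaluating Q only at the centroids."""
    q_at_centroids = np.array(
        [rbf_q(c, centroids, values, beta) for c in centroids]
    )
    best = int(np.argmax(q_at_centroids))
    return centroids[best], float(q_at_centroids[best])


# Hypothetical network outputs for a single state (2-D actions, 3 centroids).
centroids = np.array([[-1.0, 0.0], [0.5, 0.5], [1.0, -1.0]])
values = np.array([0.2, 1.0, -0.3])

action, q = approx_greedy_action(centroids, values, beta=50.0)
print("approximate greedy action:", action, "with value", q)
```

Because the weights form a convex combination, Q under this assumed form can never exceed the largest centroid value, and as `beta` grows the value at each centroid approaches that centroid's own value. That is the intuition behind approximating the maximum by a search over a finite set of centroids.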

