Reinforcement Learning with Feedback-modulated TD-STDP

08/29/2020
by Stephen Chung, et al.

Spiking neuron networks have been used successfully to solve simple reinforcement learning tasks with continuous action sets by applying learning rules based on spike-timing-dependent plasticity (STDP). However, most of these models cannot be applied to reinforcement learning tasks with discrete action sets, since they assume that the selected action is a deterministic function of the firing rate of neurons, which is continuous. In this paper, we propose a new STDP-based learning rule for spiking neuron networks that incorporates feedback modulation. We show that this learning rule can solve reinforcement learning tasks with discrete action sets at a speed similar to that of standard reinforcement learning algorithms when applied to the CartPole and LunarLander tasks. Moreover, we demonstrate that the agent is unable to solve these tasks if feedback modulation is omitted from the learning rule. We conclude that feedback modulation allows better credit assignment when only the units contributing to the executed action and TD error participate in learning.
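
To make the idea of restricting learning to the units that contributed to the executed action more concrete, the following is a minimal sketch and not the learning rule from the paper. It combines the standard one-step TD error with a generic three-factor weight update. The function names, the eligibility array, the binary feedback vector, and all parameter values are hypothetical and chosen only for illustration.

```python
import numpy as np

def td_error(reward, value, next_value, gamma=0.99, done=False):
    """Standard one-step TD error: r + gamma * V(s') - V(s)."""
    target = reward + (0.0 if done else gamma * next_value)
    return target - value

def modulated_update(weights, eligibility, delta, feedback, lr=0.01):
    """Hypothetical three-factor update (an assumption, not the paper's rule).

    weights, eligibility: arrays of shape (n_post, n_pre)
    feedback: array of shape (n_post,), 1.0 for units assumed to have
              contributed to the executed action, 0.0 otherwise
    """
    # Broadcasting the feedback vector over rows zeroes the update
    # for units that did not contribute to the executed action.
    return weights + lr * delta * feedback[:, None] * eligibility

# Illustrative values only: two postsynaptic units, three inputs.
w = np.zeros((2, 3))
e = np.ones((2, 3))            # stand-in for an STDP-derived eligibility trace
fb = np.array([1.0, 0.0])      # only unit 0 supported the executed action
delta = td_error(reward=1.0, value=0.5, next_value=0.0, done=True)
w_new = modulated_update(w, e, delta, fb, lr=0.1)
```

In this sketch, setting every entry of the feedback vector to 1.0 recovers an unmodulated update in which all units learn from the same TD error, which is the kind of ablation the abstract contrasts against.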

research
10/15/2020

An Alternative to Backpropagation in Deep Reinforcement Learning

State-of-the-art deep learning algorithms mostly rely on gradient backpr...
research
05/19/2022

Reinforcement Learning with Brain-Inspired Modulation can Improve Adaptation to Environmental Changes

Developments in reinforcement learning (RL) have allowed algorithms to a...
research
08/08/2017

Learning Feedforward and Recurrent Deterministic Spiking Neuron Network Feedback Controllers

We consider the problem of feedback control when the controller is const...
research
10/15/2019

Reinforcement learning with spiking coagents

Neuroscientific theory suggests that dopaminergic neurons broadcast glob...
research
06/04/2019

Reinforcement Learning with Low-Complexity Liquid State Machines

We propose reinforcement learning on simple networks consisting of rando...
research
03/08/2023

Using Memory-Based Learning to Solve Tasks with State-Action Constraints

Tasks where the set of possible actions depend discontinuously on the st...
research
10/19/2020

Every Hidden Unit Maximizing Output Weights Maximizes The Global Reward

For a network of stochastic units trained on a reinforcement learning ta...
