Hindsight Network Credit Assignment: Efficient Credit Assignment in Networks of Discrete Stochastic Units

10/14/2021
by Kenny Young, et al.

Training neural networks with discrete stochastic variables presents a unique challenge. Backpropagation is not directly applicable, nor are the reparameterization tricks used in networks with continuous stochastic variables. To address this challenge, we present Hindsight Network Credit Assignment (HNCA), a novel learning algorithm for networks of discrete stochastic units. HNCA works by assigning credit to each unit based on the degree to which its output influences its immediate children in the network. We prove that HNCA produces unbiased gradient estimates with reduced variance compared to the REINFORCE estimator, while the computational cost is similar to that of backpropagation. We first apply HNCA in a contextual bandit setting to optimize a reward function that is unknown to the agent. In this setting, we empirically demonstrate that HNCA significantly outperforms REINFORCE, indicating that the variance reduction established by our theoretical analysis is substantial in practice. We then show how HNCA can be extended to optimize a more general function of the outputs of a network of stochastic units, where the function is known to the agent. We apply this extended version of HNCA to train a discrete variational auto-encoder and empirically show it compares favourably to other strong methods. We believe that the ideas underlying HNCA can help stimulate new ways of thinking about efficient credit assignment in stochastic compute graphs.
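To make the variance-reduction claim concrete, here is a minimal sketch in plain Python, not taken from the paper: a single hidden Bernoulli unit h (parameter w) feeding a Bernoulli output y, with reward depending only on y. The network, parameter values, and reward table are all illustrative assumptions. The REINFORCE estimator scores the sampled value of h, while the HNCA-style estimator marginalizes over h's possible values, weighting each by how likely it would have made the observed child output y. By exact enumeration over all outcomes, both estimators match the true gradient (unbiasedness), and the HNCA-style estimator has strictly lower variance.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Illustrative toy setup (not from the paper): hidden unit h ~ Bernoulli(sigmoid(w)),
# child y ~ Bernoulli(sigmoid(v0 + v1 * h)), reward r depends only on y.
w, v0, v1 = 0.3, -0.5, 1.5
r = {0: 0.0, 1: 1.0}

pi1 = sigmoid(w)                      # p(h = 1)
dpi1 = pi1 * (1.0 - pi1)              # d p(h = 1) / dw
p_h = {0: 1.0 - pi1, 1: pi1}
dp_h = {0: -dpi1, 1: dpi1}

def p_y_given_h(y, h):
    q = sigmoid(v0 + v1 * h)
    return q if y == 1 else 1.0 - q

# Exact gradient of E[R] w.r.t. w, by enumerating all (h, y) outcomes.
true_grad = sum(dp_h[h] * p_y_given_h(y, h) * r[y]
                for h in (0, 1) for y in (0, 1))

def reinforce(h, y):
    # REINFORCE: reward times the score function of the sampled h.
    return r[y] * dp_h[h] / p_h[h]

def hnca(h, y):
    # HNCA-style estimator: marginalize over the unit's own output,
    # weighting each value by its probability of producing the observed child y.
    p_y = sum(p_h[b] * p_y_given_h(y, b) for b in (0, 1))
    return r[y] * sum(dp_h[b] * p_y_given_h(y, b) for b in (0, 1)) / p_y

def mean_var(est):
    # Exact mean and variance of an estimator under the joint distribution.
    joint = {(h, y): p_h[h] * p_y_given_h(y, h)
             for h in (0, 1) for y in (0, 1)}
    m = sum(p * est(h, y) for (h, y), p in joint.items())
    v = sum(p * (est(h, y) - m) ** 2 for (h, y), p in joint.items())
    return m, v

m_r, v_r = mean_var(reinforce)
m_h, v_h = mean_var(hnca)
```

In this toy model the HNCA-style estimator is exactly the conditional expectation of the REINFORCE estimator given the child's output, so the variance reduction follows from Rao-Blackwellization; the paper's contribution is making this local marginalization tractable throughout a network at a cost comparable to backpropagation.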

Related research

- Hindsight Network Credit Assignment (11/24/2020): We present Hindsight Network Credit Assignment (HNCA), a novel learning ...
- Structural Credit Assignment with Coordinated Exploration (07/25/2023): A biologically plausible method for training an Artificial Neural Networ...
- MuProp: Unbiased Backpropagation for Stochastic Neural Networks (11/16/2015): Deep neural networks are powerful parametric models that can be trained ...
- Difference Target Propagation (12/23/2014): Back-propagation has been the workhorse of recent successes of deep lear...
- Who's responsible? Jointly quantifying the contribution of the learning algorithm and training data (10/09/2019): A fancy learning algorithm A outperforms a baseline method B when they a...
- Replacing Backpropagation with Biological Plausible Top-down Credit Assignment in Deep Neural Networks Training (08/01/2022): Top-down connections in the biological brain has been shown to be import...
- Credit Assignment in Adaptive Evolutionary Algorithms (07/03/2009): In this paper, a new method for assigning credit to search operators is ...
