Mathematically Modeling the Lexicon Entropy of Emergent Language

11/28/2022
by Brendon Boldt, et al.
Carnegie Mellon University

We formulate a stochastic process, FiLex, as a mathematical model of lexicon entropy in deep learning-based emergent language systems. Defining the model mathematically lets it generate clear predictions that can be tested directly and decisively. Across four different environments, we empirically verify that FiLex predicts the correct correlation between each hyperparameter (training steps, lexicon size, learning rate, rollout buffer size, and Gumbel-Softmax temperature) and the emergent language's entropy in 20 out of 20 environment-hyperparameter combinations. Furthermore, our experiments reveal that different environments show diverse relationships between their hyperparameters and entropy, which demonstrates the need for a model that can make well-defined predictions at a precise level of granularity.
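
For orientation, the quantity being modeled can be illustrated with a minimal sketch, assuming that "lexicon entropy" means the Shannon entropy of the empirical distribution of word use. This is not FiLex itself and not the authors' code; the function name `lexicon_entropy` and the toy word lists are hypothetical and introduced only for illustration.

```python
import math
from collections import Counter


def lexicon_entropy(words):
    """Shannon entropy (in bits) of the empirical word-use distribution.

    `words` is any non-empty iterable of hashable tokens, one per observed
    use. Assumption: entropy is measured over relative word frequencies.
    """
    counts = Counter(words)
    total = sum(counts.values())
    # Sum -p * log2(p) over the words that actually occur.
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Toy data (hypothetical): four words used equally often vs. one dominant word.
uniform_use = ["a", "b", "c", "d"]
skewed_use = ["a", "a", "a", "b"]

print(lexicon_entropy(uniform_use))  # 2.0 bits: four equally likely words
print(lexicon_entropy(skewed_use))   # lower, since one word dominates
```

Under this reading, a hyperparameter that pushes agents toward reusing a few words lowers the entropy, while one that spreads use across the lexicon raises it.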

06/22/2022

Modeling Emergent Lexicon Formation with a Self-Reinforcing Stochastic Process

We introduce FiLex, a self-reinforcing stochastic process which models f...
10/14/2020

Temperature check: theory and practice for training models with softmax-cross-entropy losses

The softmax function combined with a cross-entropy loss is a principled ...
07/19/2019

Hyperparameter Optimisation with Early Termination of Poor Performers

It is typical for a machine learning system to have numerous hyperparame...
04/11/2014

Bayesian image segmentations by Potts prior and loopy belief propagation

This paper presents a Bayesian image segmentation model based on Potts p...
09/09/2019

Training Deep Neural Networks by optimizing over nonlocal paths in hyperparameter space

Hyperparameter optimization is both a practical issue and an interesting...
05/24/2021

Guided Hyperparameter Tuning Through Visualization and Inference

For deep learning practitioners, hyperparameter tuning for optimizing mo...
05/22/2021

AutoLRS: Automatic Learning-Rate Schedule by Bayesian Optimization on the Fly

The learning rate (LR) schedule is one of the most important hyper-param...
