On (Emergent) Systematic Generalisation and Compositionality in Visual Referential Games with Straight-Through Gumbel-Softmax Estimator

by Kevin Denamganai, et al.

The drivers of compositionality in artificial languages that emerge when two (or more) agents play a non-visual referential game have previously been investigated using approaches based on the REINFORCE algorithm and the (Neural) Iterated Learning Model. Following the more recent introduction of the Straight-Through Gumbel-Softmax (ST-GS) approach, this paper investigates to what extent the drivers of compositionality identified so far in the field apply in the ST-GS context, and to what extent they translate into (emergent) systematic generalisation abilities when playing a visual referential game. Compositionality and the generalisation abilities of the emergent languages are assessed using topographic similarity and zero-shot compositional tests. Firstly, we provide evidence that the test-train split strategy significantly impacts the zero-shot compositional tests when dealing with visual stimuli, whilst it does not when dealing with symbolic ones. Secondly, empirical evidence shows that using the ST-GS approach with small batch sizes and an overcomplete communication channel improves compositionality in the emerging languages. Nevertheless, while shown to be robust with symbolic stimuli, the effect of the batch size is not so clear-cut when dealing with visual stimuli. Our results also show that not all overcomplete communication channels are created equal. Indeed, while increasing the maximum sentence length is found to benefit both compositionality and generalisation abilities, increasing the vocabulary size is found to be detrimental. Finally, a lack of correlation between the language compositionality at training time and the agents' generalisation abilities is observed in the context of discriminative referential games with visual stimuli. This is similar to previous observations in the field using the generative variant with symbolic stimuli.
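
To make the topographic similarity metric mentioned in the abstract concrete, here is a minimal sketch. It assumes the common formulation from the emergent-communication literature: the Spearman rank correlation between pairwise distances in meaning space and pairwise distances in message space. The abstract does not state which distance functions the paper uses, so Hamming distance over attribute tuples and Levenshtein distance over messages are assumptions, and all function names and the toy data are hypothetical.

```python
from itertools import combinations


def hamming(a, b):
    """Number of positions at which two equal-length attribute tuples differ."""
    return sum(x != y for x, y in zip(a, b))


def edit_distance(a, b):
    """Levenshtein distance between two symbol sequences (lengths may differ)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,              # deletion
                            curr[j - 1] + 1,          # insertion
                            prev[j - 1] + (x != y)))  # substitution or match
        prev = curr
    return prev[-1]


def ranks(values):
    """1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            out[order[k]] = mean_rank
        i = j + 1
    return out


def pearson(x, y):
    """Pearson correlation; returns NaN if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return float("nan")
    return cov / (vx * vy) ** 0.5


def topographic_similarity(meanings, messages):
    """Spearman correlation between meaning-space and message-space distances."""
    pairs = list(combinations(range(len(meanings)), 2))
    d_meaning = [hamming(meanings[i], meanings[j]) for i, j in pairs]
    d_message = [edit_distance(messages[i], messages[j]) for i, j in pairs]
    # Spearman correlation is the Pearson correlation of the ranks.
    return pearson(ranks(d_meaning), ranks(d_message))


# Toy data (hypothetical): two binary attributes per stimulus.
meanings = [(0, 0), (0, 1), (1, 0), (1, 1)]
compositional = [(0, 0), (0, 1), (1, 0), (1, 1)]  # one symbol per attribute
scrambled = [(0, 0), (1, 1), (0, 1), (1, 0)]      # same messages, reassigned

print(topographic_similarity(meanings, compositional))
print(topographic_similarity(meanings, scrambled))
```

In this toy setting, a language whose messages mirror the attribute structure scores higher than one where the same messages are reassigned arbitrarily. The ranking helpers are written out by hand only to keep the sketch dependency-free; a statistics library's Spearman routine would serve equally well.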




