Grounding and Distinguishing Conceptual Vocabulary Through Similarity Learning in Embodied Simulations

by Sadaf Ghaffari, et al.

We present a novel method for using agent experiences gathered through an embodied simulation to ground contextualized word vectors to object representations. We use similarity learning to make comparisons between different object types based on their properties when interacted with, and to extract common features pertaining to the objects' behavior. We then use an affine transformation to calculate a projection matrix that transforms contextualized word vectors from different transformer-based language models into this learned space, and evaluate whether new test instances of transformed token vectors identify the correct concept in the object embedding space. Our results expose properties of the embedding spaces of four different transformer models and show that grounding object token vectors is usually more helpful to grounding verb and attribute token vectors than the reverse, which reflects earlier conclusions in the analogical reasoning and psycholinguistic literature.
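The projection step described in the abstract can be illustrated with a minimal sketch. It assumes the affine map is fit by ordinary least squares and that a projected token vector "identifies" a concept when its nearest concept centroid, by cosine similarity, has the matching label. Neither choice is confirmed by the abstract. The function names (`fit_affine_map`, `nearest_concept`), the dimensions, and the synthetic data are hypothetical stand-ins for real word vectors and learned object embeddings.

```python
import numpy as np

def fit_affine_map(word_vecs, object_vecs):
    """Fit object_vecs ~ word_vecs @ W + b by least squares (an assumed fitting method)."""
    ones = np.ones((word_vecs.shape[0], 1))
    design = np.hstack([word_vecs, ones])      # append a bias column
    sol, *_ = np.linalg.lstsq(design, object_vecs, rcond=None)
    return sol[:-1], sol[-1]                   # W: (d_word, d_obj), b: (d_obj,)

def nearest_concept(projected, centroids):
    """Return, for each row, the index of the centroid with highest cosine similarity."""
    p = projected / np.linalg.norm(projected, axis=1, keepdims=True)
    c = centroids / np.linalg.norm(centroids, axis=1, keepdims=True)
    return np.argmax(p @ c.T, axis=1)

# Synthetic stand-in data (hypothetical): 3 concepts, 16-d word vectors, 4-d object space.
rng = np.random.default_rng(0)
n_concepts, d_word, d_obj = 3, 16, 4
centroids = rng.normal(size=(n_concepts, d_obj))   # one object embedding per concept
mixing = rng.normal(size=(d_obj, d_word))          # hidden relation used to fake word vectors

def make_split(n):
    labels = rng.integers(0, n_concepts, size=n)
    words = centroids[labels] @ mixing + 0.01 * rng.normal(size=(n, d_word))
    return words, labels

train_words, train_labels = make_split(60)
test_words, test_labels = make_split(30)

# Fit on training pairs, then check whether held-out token vectors land on the right concept.
W, b = fit_affine_map(train_words, centroids[train_labels])
predicted = nearest_concept(test_words @ W + b, centroids)
accuracy = float(np.mean(predicted == test_labels))
print(f"held-out concept accuracy: {accuracy:.2f}")
```

With real data, `train_words` would hold contextualized token vectors from a language model and `centroids` would come from the similarity-learned object space; the evaluation loop would otherwise be the same.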


Image Captioning with Visual Object Representations Grounded in the Textual Modality

We present our work in progress exploring the possibilities of a shared ...

Interpreting Embedding Spaces by Conceptualization

One of the main methods for semantic interpretation of text is mapping i...

Exploiting Embodied Simulation to Detect Novel Object Classes Through Interaction

In this paper we present a novel method for a naive agent to detect nove...

Addressing Token Uniformity in Transformers via Singular Value Transformation

Token uniformity is commonly observed in transformer-based models, in wh...

Meta-Personalizing Vision-Language Models to Find Named Instances in Video

Large-scale vision-language models (VLM) have shown impressive results f...

Analyzing Transformer Dynamics as Movement through Embedding Space

Transformer language models exhibit intelligent behaviors such as unders...

Linear Spaces of Meanings: the Compositional Language of VLMs

We investigate compositional structures in vector data embeddings from p...
