Learning Multilingual Word Embeddings Using Image-Text Data

05/29/2019
by Karan Singhal, et al.

There has been significant recent interest in learning multilingual word embeddings, in which semantically similar words across languages have similar embeddings. State-of-the-art approaches have relied on expensive labeled data, which is unavailable for low-resource languages, or have involved post-hoc unification of monolingual embeddings. In the present paper, we investigate the efficacy of multilingual embeddings learned from weakly-supervised image-text data. In particular, we propose methods for learning multilingual embeddings from image-text data by enforcing similarity between the representation of an image and that of its associated text. Our experiments reveal that, even without using any expensive labeled data, a bag-of-words-based embedding model trained on image-text data achieves performance comparable to the state of the art on crosslingual semantic similarity tasks.
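The idea of enforcing similarity between an image representation and a bag-of-words text representation can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the averaging encoder `bow_embed`, the in-batch softmax objective `contrastive_loss`, the `temperature` value, and the toy two-dimensional vectors are all hypothetical, and the paper's actual architecture and loss may differ.

```python
import math

def bow_embed(tokens, table):
    """Bag-of-words encoder (assumed): average the vectors of known tokens."""
    vecs = [table[t] for t in tokens if t in table]
    if not vecs:
        raise ValueError("no known tokens in input")
    dim = len(vecs[0])
    return [sum(v[i] for v in vecs) / len(vecs) for i in range(dim)]

def cosine(u, v):
    """Cosine similarity; assumes neither vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def contrastive_loss(text_vecs, image_vecs, temperature=0.1):
    """Mean softmax cross-entropy over a batch (assumed objective).

    The i-th text is treated as the match for the i-th image; the other
    images in the batch act as negatives.
    """
    total = 0.0
    for i, text in enumerate(text_vecs):
        logits = [cosine(text, img) / temperature for img in image_vecs]
        peak = max(logits)  # subtract the max for numerical stability
        log_z = peak + math.log(sum(math.exp(x - peak) for x in logits))
        total += log_z - logits[i]
    return total / len(text_vecs)

# Toy shared vocabulary: words from two languages live in one table
# (hypothetical values chosen by hand, not learned).
table = {"dog": [1.0, 0.1], "chien": [0.9, 0.2], "car": [0.1, 1.0]}
image_vecs = [[1.0, 0.0], [0.0, 1.0]]  # stand-ins for image features
text_vecs = [bow_embed(["chien"], table), bow_embed(["car"], table)]
print(contrastive_loss(text_vecs, image_vecs))
```

In a sketch like this, captions in different languages that describe the same image are pulled toward the same image vector, which is one way words such as "dog" and "chien" could end up close together without any parallel text. Training would update the entries of `table` to reduce the loss; the gradient step is omitted here.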


research
12/14/2016

Multilingual Word Embeddings using Multigraphs

We present a family of neural-network--inspired models for computing con...
research
11/01/2018

GlobalTrait: Personality Alignment of Multilingual Word Embeddings

We propose a multilingual model to recognize Big Five Personality traits...
research
02/18/2023

RetVec: Resilient and Efficient Text Vectorizer

This paper describes RetVec, a resilient multilingual embedding scheme d...
research
01/08/2014

Learning Multilingual Word Representations using a Bag-of-Words Autoencoder

Recent work on learning multilingual word representations usually relies...
research
08/03/2018

Efficient Purely Convolutional Text Encoding

In this work, we focus on a lightweight convolutional architecture that ...
research
02/25/2020

Language-Independent Tokenisation Rivals Language-Specific Tokenisation for Word Similarity Prediction

Language-independent tokenisation (LIT) methods that do not require labe...
research
07/31/2020

Evaluating Semantic Interaction on Word Embeddings via Simulation

Semantic interaction (SI) attempts to learn the user's cognitive intents...
