Unsupervised Semantic Variation Prediction using the Distribution of Sibling Embeddings

by Taichi Aida, et al.

Languages are dynamic entities, where the meanings associated with words constantly change over time. Detecting the semantic variation of words is an important task for various NLP applications that must make time-sensitive predictions. Existing work on semantic variation prediction has predominantly focused on comparing some form of averaged contextualised representation of a target word computed from a given corpus. However, some of the previously associated meanings of a target word can become obsolete over time (e.g. the meaning of gay as happy), while novel usages of existing words are observed (e.g. the meaning of cell as a mobile phone). We argue that mean representations alone cannot accurately capture such semantic variations and propose a method that uses the entire cohort of the contextualised embeddings of the target word, which we refer to as the sibling distribution. Experimental results on the SemEval-2020 Task 1 benchmark dataset for semantic variation prediction show that our method outperforms prior work that considers only the mean embeddings, and is comparable to the current state-of-the-art. Moreover, a qualitative analysis shows that our method detects important semantic changes in words that are not captured by the existing methods. Source code is available at https://github.com/a1da4/svp-gauss.
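To make the contrast between mean embeddings and the sibling distribution concrete, the following is a minimal sketch and not the authors' implementation. It assumes, purely for illustration, that the sibling embeddings of a target word in each time period are summarised by a diagonal Gaussian and that the two periods are compared with a symmetrised KL divergence. The function names, the choice of divergence, and the synthetic data are all hypothetical.

```python
import numpy as np


def fit_diag_gaussian(embeddings, eps=1e-6):
    """Summarise sibling embeddings by a mean and per-dimension variance.

    The diagonal-Gaussian summary is an illustrative assumption.
    """
    x = np.asarray(embeddings, dtype=float)
    return x.mean(axis=0), x.var(axis=0) + eps


def kl_diag_gaussians(mu1, var1, mu2, var2):
    """KL(N1 || N2) for two Gaussians with diagonal covariance."""
    terms = np.log(var2 / var1) + (var1 + (mu1 - mu2) ** 2) / var2 - 1.0
    return 0.5 * float(np.sum(terms))


def distribution_score(emb_t1, emb_t2):
    """Symmetrised KL divergence between the two sibling distributions."""
    mu1, var1 = fit_diag_gaussian(emb_t1)
    mu2, var2 = fit_diag_gaussian(emb_t2)
    return (kl_diag_gaussians(mu1, var1, mu2, var2)
            + kl_diag_gaussians(mu2, var2, mu1, var1))


def mean_score(emb_t1, emb_t2):
    """Baseline: Euclidean distance between the two mean embeddings."""
    diff = np.mean(emb_t1, axis=0) - np.mean(emb_t2, axis=0)
    return float(np.linalg.norm(diff))


# Synthetic stand-ins for contextualised embeddings (hypothetical data).
rng = np.random.default_rng(0)
emb_t1 = rng.normal(size=(200, 8))
centre = emb_t1.mean(axis=0)
# Same mean as emb_t1, but usages are spread three times wider.
emb_t2 = centre + 3.0 * (emb_t1 - centre)

print("mean-based score:", mean_score(emb_t1, emb_t2))
print("distribution-based score:", distribution_score(emb_t1, emb_t2))
```

In this toy setting the two periods share the same mean, so the mean-based baseline reports almost no change, while the distribution-based score is clearly positive because the spread of usages differs.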




Autoencoding Word Representations through Time for Semantic Change Detection

Semantic change detection concerns the task of identifying words whose m...

A Dataset To Evaluate The Representations Learned By Video Prediction Models

We present a parameterized synthetic dataset called Moving Symbols to su...

Contextually Propagated Term Weights for Document Representation

Word embeddings predict a word from its neighbours by learning small, de...

DUKweb: Diachronic word representations from the UK Web Archive corpus

Lexical semantic change (detecting shifts in the meaning and usage of wo...

Enriching Word Embeddings with Temporal and Spatial Information

The meaning of a word is closely linked to sociocultural factors that ca...

SST-BERT at SemEval-2020 Task 1: Semantic Shift Tracing by Clustering in BERT-based Embedding Spaces

Lexical semantic change detection (also known as semantic shift tracing)...

DialectGram: Detecting Dialectal Variation at Multiple Geographic Resolutions

Several computational models have been developed to detect and analyze d...
