EVE: Explainable Vector Based Embedding Technique Using Wikipedia

by M. Atif Qureshi, et al.

We present an unsupervised explainable word embedding technique, called EVE, which is built upon the structure of Wikipedia. The proposed model defines the dimensions of a semantic vector representing a word using human-readable labels, thereby making it readily interpretable. Specifically, each vector is constructed using the Wikipedia category graph structure together with the Wikipedia article link structure. To test the effectiveness of the proposed word embedding model, we consider its usefulness in three fundamental tasks: 1) intruder detection - to evaluate its ability to identify a non-coherent vector from a list of coherent vectors, 2) ability to cluster - to evaluate its tendency to group related vectors together while keeping unrelated vectors in separate clusters, and 3) sorting relevant items first - to evaluate its ability to rank vectors (items) relevant to the query at the top of the result list. For each task, we also propose a strategy to generate a task-specific human-interpretable explanation from the model. These demonstrate the overall effectiveness of the explainable embeddings generated by EVE. Finally, we compare EVE with the Word2Vec, FastText, and GloVe embedding techniques across the three tasks, and report improvements over the state-of-the-art.
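To make the idea of labeled dimensions concrete, the following is a minimal sketch, not EVE itself. It assumes each word is already represented as a sparse vector keyed by human-readable labels; the toy labels and weights, and the helper names `cosine`, `find_intruder`, `rank`, and `explain`, are all hypothetical and invented for illustration. Cosine similarity and the "lowest mean similarity" intruder rule are assumptions of this sketch, not details confirmed by the abstract.

```python
import math

# Hypothetical toy vectors. Each dimension is a human-readable label;
# labels and weights are invented and do not come from Wikipedia or EVE.
VECTORS = {
    "apple":  {"Category:Fruits": 0.9, "Article:Orchard": 0.4},
    "banana": {"Category:Fruits": 0.8, "Article:Tropics": 0.5},
    "cherry": {"Category:Fruits": 0.7, "Article:Orchard": 0.6},
    "hammer": {"Category:Tools": 0.9, "Article:Carpentry": 0.5},
}

def cosine(u, v):
    """Cosine similarity between two sparse label -> weight vectors."""
    dot = sum(w * v.get(label, 0.0) for label, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

def find_intruder(words, vectors):
    """Return the word with the lowest mean similarity to the others
    (an assumed decision rule for the intruder detection task)."""
    def mean_similarity(word):
        others = [o for o in words if o != word]
        return sum(cosine(vectors[word], vectors[o]) for o in others) / len(others)
    return min(words, key=mean_similarity)

def rank(query, candidates, vectors):
    """Sort candidates by decreasing similarity to the query word."""
    return sorted(candidates,
                  key=lambda c: cosine(vectors[query], vectors[c]),
                  reverse=True)

def explain(u, v, top_k=3):
    """List the shared labels that contribute most to the dot product."""
    shared = {label: u[label] * v[label] for label in u if label in v}
    return sorted(shared, key=shared.get, reverse=True)[:top_k]

words = ["apple", "banana", "cherry", "hammer"]
print(find_intruder(words, VECTORS))
print(rank("apple", ["hammer", "banana", "cherry"], VECTORS))
print(explain(VECTORS["apple"], VECTORS["cherry"]))
```

Because every dimension carries a label, the output of `explain` can be read directly as the reason two words were judged similar, which is the property the abstract attributes to explainable embeddings.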




Vector Embedding of Wikipedia Concepts and Entities

Using deep learning for different machine learning tasks such as image c...

Lex2vec: making Explainable Word Embedding via Distant Supervision

In this technical report we propose an algorithm, called Lex2vec, that e...

Evaluation method of word embedding by roots and affixes

Word embedding has been shown to be remarkably effective in a lot of Nat...

Beyond Word Embeddings: Learning Entity and Concept Representations from Large Scale Knowledge Bases

Text representation using neural word embeddings has proven efficacy in ...

An Unbiased Approach to Quantification of Gender Inclination using Interpretable Word Representations

Recent advances in word embedding provide significant benefit to various...

Representation Learning of Image Schema

Image schema is a recurrent pattern of reasoning where one entity is map...

Uncovering and Displaying the Coherent Groups of Rank Data by Exploratory Riffle Shuffling

Let n respondents rank order d items, and suppose that d << n. Our main ...
