Lexical Sememe Prediction using Dictionary Definitions by Capturing Local Semantic Correspondence

by Jiaju Du, et al.

Sememes, defined in linguistics as the minimum semantic units of human languages, have proven useful in many NLP tasks. Because manually constructing and updating sememe knowledge bases (KBs) is costly, the task of automatic sememe prediction has been proposed to assist sememe annotation. In this paper, we explore using dictionary definitions to predict sememes for unannotated words. We find that the sememes of a word are usually semantically matched to different words in its dictionary definition, and we name this matching relationship local semantic correspondence. Accordingly, we propose the Sememe Correspondence Pooling (SCorP) model, which captures this kind of matching to predict sememes. We evaluate our model and baseline methods on HowNet, a well-known sememe KB, and find that our model achieves state-of-the-art performance. Moreover, further quantitative analysis shows that our model properly learns the local semantic correspondence between sememes and words in dictionary definitions, which explains its effectiveness. The source code for this paper is available at https://github.com/thunlp/scorp.
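The idea of local semantic correspondence can be illustrated with a small sketch. This is not the authors' implementation: the function name `correspondence_pooling`, the dot-product scoring, and the toy two-dimensional vectors are all assumptions made for illustration. Each candidate sememe is scored against every word in the definition, and the score is max-pooled over the words, so a sememe is rated by its best-matching definition word.

```python
def dot(u, v):
    """Plain dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def correspondence_pooling(definition_vecs, sememe_vecs):
    """Hypothetical helper: for each sememe, return (best score, index of
    the best-matching definition word), i.e. max-pooling over the words."""
    pooled = {}
    for sememe, s_vec in sememe_vecs.items():
        per_word = [dot(w_vec, s_vec) for w_vec in definition_vecs]
        best = max(range(len(per_word)), key=per_word.__getitem__)
        pooled[sememe] = (per_word[best], best)
    return pooled


# Toy data (made up): a definition of "teacher" and three candidate sememes.
definition = ["a", "person", "who", "teaches"]
definition_vecs = [[0.0, 0.0], [1.0, 0.0], [0.0, 0.0], [0.0, 1.0]]
sememe_vecs = {
    "human": [0.9, 0.1],
    "teach": [0.1, 0.9],
    "animal": [-0.5, -0.5],
}

pooled = correspondence_pooling(definition_vecs, sememe_vecs)
# Rank sememes by pooled score, highest first.
ranked = sorted(pooled, key=lambda s: pooled[s][0], reverse=True)
for sememe in ranked:
    score, idx = pooled[sememe]
    print(sememe, round(score, 2), "matched to", definition[idx])
```

In this toy setup, "human" pools its score from "person" and "teach" from "teaches", which mirrors the observation that different sememes match different definition words. A trained model would learn the word and sememe representations rather than use fixed vectors.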



