Learning Neural Word Salience Scores

09/04/2017
by Krasen Samardzhiev, et al.

Measuring the salience of a word is an essential step in numerous NLP tasks. Heuristic approaches such as tf-idf have been used so far to estimate the salience of words. We propose Neural Word Salience (NWS) scores, which, unlike heuristics, are learnt from a corpus. Specifically, we learn word salience scores such that, using pre-trained word embeddings as the input, we can accurately predict the words that appear in a sentence, given the words that appear in the sentences preceding or succeeding that sentence. Experimental results on sentence similarity prediction show that the learnt word salience scores perform comparably to or better than some of the state-of-the-art approaches for representing sentences on benchmark datasets for sentence similarity, while using only a fraction of the training and prediction times required by prior methods. Moreover, our NWS scores correlate positively with psycholinguistic measures such as concreteness and imageability, implying a close connection to salience as perceived by humans.
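As a rough illustration of how per-word salience scores could be combined with pre-trained embeddings for sentence similarity, the sketch below builds a sentence vector as a salience-weighted sum of word vectors and compares two sentences by cosine similarity. This is only one plausible reading of the abstract, not the authors' implementation: the function names, the toy embeddings, and the toy salience values are all hypothetical, and the training step that would actually learn the scores is not shown.

```python
import math


def sentence_vector(tokens, embeddings, salience):
    """Salience-weighted sum of word vectors; out-of-vocabulary words are skipped."""
    dim = len(next(iter(embeddings.values())))
    vec = [0.0] * dim
    for tok in tokens:
        if tok in embeddings:
            # Assumption: words without a salience score get weight 1.0.
            weight = salience.get(tok, 1.0)
            for i, x in enumerate(embeddings[tok]):
                vec[i] += weight * x
    return vec


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical 2-dimensional embeddings and salience scores, for illustration only.
TOY_EMBEDDINGS = {
    "the": [1.0, 0.0],
    "cat": [0.0, 1.0],
    "dog": [0.0, 1.0],
    "sat": [1.0, 1.0],
}
TOY_SALIENCE = {"the": 0.1, "cat": 1.0, "dog": 1.0, "sat": 0.5}

s1 = sentence_vector(["the", "cat", "sat"], TOY_EMBEDDINGS, TOY_SALIENCE)
s2 = sentence_vector(["the", "dog", "sat"], TOY_EMBEDDINGS, TOY_SALIENCE)
similarity = cosine(s1, s2)
```

In this toy setup a low score for "the" means the function word contributes little to the sentence vector, which is the same role a tf-idf weight would play in a heuristic baseline.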
