Keywords lie far from the mean of all words in local vector space

08/21/2020
by Eirini Papagiannopoulou, et al.

Keyword extraction is an important document-processing task that aims to find a small set of terms that concisely describe a document's topics. The most popular state-of-the-art unsupervised approaches are graph-based methods, which build a graph-of-words and use various centrality measures to score the nodes (candidate keywords). In this work, we follow a different path to detect the keywords of a text document: we model the main distribution of the document's words using local word vector representations. We then rank the candidates based on their position in the text and the distance between their local vectors and the center of the main distribution. An extended experimental study, which also investigates the properties of the local representations, confirms the high performance of our approach compared to strong baselines and state-of-the-art unsupervised keyword extraction methods.
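The ranking idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how the main distribution is modeled or how position and distance are combined. The sketch assumes that the local word vectors are already available as a plain dictionary, that the center is the mean over all token occurrences, and that the score is the Euclidean distance divided by the position of the first occurrence. The function name `rank_candidates` and its parameters are hypothetical.

```python
import math


def rank_candidates(tokens, vectors):
    """Rank unique tokens, highest score first.

    tokens:  list of words in document order.
    vectors: dict mapping word -> list of floats (assumed to be local
             vectors, i.e. trained on this document only).
    """
    in_vocab = [t for t in tokens if t in vectors]
    if not in_vocab:
        return []
    dim = len(vectors[in_vocab[0]])

    # Assumption: the center of the main distribution is the mean vector
    # taken over every token occurrence in the document.
    center = [
        sum(vectors[t][i] for t in in_vocab) / len(in_vocab)
        for i in range(dim)
    ]

    # Record the first position at which each word occurs.
    first_pos = {}
    for pos, t in enumerate(in_vocab):
        first_pos.setdefault(t, pos)

    # Assumption: words far from the center and early in the text score
    # higher; the exact combination here is illustrative only.
    scores = {
        t: math.dist(vectors[t], center) / (pos + 1)
        for t, pos in first_pos.items()
    }
    return sorted(scores, key=scores.get, reverse=True)
```

Calling `rank_candidates(tokens, vectors)` with a tokenized document and its vectors returns the candidate words ordered from most to least keyword-like under these assumptions. Frequent words pull the mean toward themselves, so they end up close to the center and rank low.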

research
11/27/2018

sCAKE: Semantic Connectivity Aware Keyword Extraction

Keyword Extraction is an important task in several text analysis endeavo...
research
07/15/2019

RaKUn: Rank-based Keyword extraction via Unsupervised learning and Meta vertex aggregation

Keyword extraction is used for summarizing the content of a document and...
research
01/31/2021

Extending Neural Keyword Extraction with TF-IDF tagset matching

Keyword extraction is the task of identifying words (or multi-word expre...
research
08/15/2022

Retrieval-efficiency trade-off of Unsupervised Keyword Extraction

Efficiently identifying keyphrases that represent a given document is a ...
research
07/26/2023

Unsupervised extraction of local and global keywords from a single text

We propose an unsupervised, corpus-independent method to extract keyword...
research
09/05/2017

Semantic Document Distance Measures and Unsupervised Document Revision Detection

In this paper, we model the document revision detection problem as a min...
