Semantic Search as Extractive Paraphrase Span Detection

12/09/2021
by Jenna Kanerva, et al.

In this paper, we approach semantic search by framing it as paraphrase span detection: given a segment of text as a query phrase, the task is to identify its paraphrase in a given document. This is the same modelling setup typically used in extractive question answering. On the Turku Paraphrase Corpus of 100,000 manually extracted Finnish paraphrase pairs, which include their original document context, we find that our paraphrase span detection model outperforms two strong retrieval baselines (lexical similarity and BERT sentence embeddings) by 31.9 pp and 22.4 pp respectively in terms of exact match, and by 22.3 pp and 12.9 pp in terms of token-level F-score. This demonstrates a strong advantage of modelling the task as span retrieval rather than sentence similarity. Additionally, we introduce a method for creating artificial paraphrase data through back-translation, suitable for languages where manually annotated paraphrase resources for training the span detection model are not available.
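
The two reported metrics, exact match and token-level F-score, can be illustrated with a minimal sketch in the style commonly used for extractive question answering. The abstract does not specify the tokenisation or normalisation the authors applied, so whitespace tokenisation is an assumption here, and the function names `exact_match` and `token_f1` are hypothetical.

```python
from collections import Counter


def exact_match(predicted: str, gold: str) -> bool:
    """True when the predicted span has exactly the same tokens as the gold span.

    Assumption: tokens are whitespace-separated; no case or punctuation
    normalisation is applied.
    """
    return predicted.split() == gold.split()


def token_f1(predicted: str, gold: str) -> float:
    """Token-level F-score between a predicted span and a gold span.

    Overlap is counted as a multiset intersection, so a repeated token
    only matches as many times as it occurs in both spans.
    """
    pred_tokens = predicted.split()
    gold_tokens = gold.split()

    # If either span is empty, score 1.0 only when both are empty.
    if not pred_tokens or not gold_tokens:
        return float(pred_tokens == gold_tokens)

    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0

    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)
```

A predicted span that overlaps the gold span only partially scores zero on exact match but still earns partial credit on the F-score, which is why the two metrics are reported side by side.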


