Extractive is not Faithful: An Investigation of Broad Unfaithfulness Problems in Extractive Summarization

by Shiyue Zhang, et al.

The problem of unfaithful summaries has been widely discussed in the context of abstractive summarization. Though extractive summarization is less prone to the common unfaithfulness issues of abstractive summaries, does that mean extractive is equal to faithful? It turns out that the answer is no. In this work, we define a typology with five types of broad unfaithfulness problems (including and beyond not-entailment) that can appear in extractive summaries: incorrect coreference, incomplete coreference, incorrect discourse, incomplete discourse, and other misleading information. We ask humans to label these problems in 1500 English summaries produced by 15 diverse extractive systems. We find that 33% of the summaries have at least one of the five issues. For automatic detection of these problems, we find that 5 existing faithfulness evaluation metrics for summarization correlate poorly with human judgment. To remedy this, we propose a new metric, ExtEval, that is designed for detecting unfaithful extractive summaries and is shown to have the best performance. We hope our work can increase the awareness of unfaithfulness problems in extractive summarization and help future work to evaluate and resolve these issues. Our data and code are publicly available at https://github.com/ZhangShiyue/extractive_is_not_faithful


