Towards Interpretable and Efficient Automatic Reference-Based Summarization Evaluation

by Yixin Liu, et al.
Carnegie Mellon University
Yale University

Interpretability and efficiency are two important considerations for the adoption of neural automatic metrics. In this work, we develop strong-performing automatic metrics for reference-based summarization evaluation, based on a two-stage evaluation pipeline that first extracts basic information units from one text sequence and then checks for the extracted units in another sequence. The metrics we develop include two-stage metrics that provide high interpretability at both the fine-grained unit level and the summary level, and one-stage metrics that strike a balance between efficiency and interpretability. We make the developed tools publicly available as a Python package and on GitHub.
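The two-stage pipeline in the abstract can be illustrated with a minimal sketch. All function names here are hypothetical, and the two stages are stood in by simple heuristics (sentence splitting and token overlap); the paper's actual metrics rely on learned models for unit extraction and unit checking.

```python
def extract_units(text):
    """Stage 1 (stand-in): split a text into basic information units.
    Naively, one unit per sentence; the real pipeline extracts
    finer-grained units with a learned extractor."""
    return [s.strip() for s in text.split(".") if s.strip()]

def check_unit(unit, summary, threshold=0.5):
    """Stage 2 (stand-in): decide whether a unit is supported by the
    other sequence, here via token overlap rather than a learned
    entailment-style checker."""
    unit_tokens = set(unit.lower().split())
    summary_tokens = set(summary.lower().split())
    if not unit_tokens:
        return False
    return len(unit_tokens & summary_tokens) / len(unit_tokens) >= threshold

def two_stage_score(reference, candidate):
    """Summary-level score: the fraction of reference units supported by
    the candidate. The per-unit decisions are what give the metric its
    fine-grained interpretability."""
    units = extract_units(reference)
    checks = [(u, check_unit(u, candidate)) for u in units]
    score = sum(ok for _, ok in checks) / len(checks) if checks else 0.0
    return score, checks
```

A one-stage metric, by contrast, would map the (reference, candidate) pair directly to a score in a single model pass, trading the per-unit decisions for efficiency.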


