Distributional Discrepancy: A Metric for Unconditional Text Generation

by Ping Cai, et al.

The goal of unconditional text generation is to train a model on real sentences so that it generates novel sentences of the same quality and diversity as the training data. However, when different metrics are used to compare such models, contradictory conclusions are drawn. The difficulty is that sample diversity and sample quality must be taken into account simultaneously when a generative model is evaluated. To address this issue, a novel metric, distributional discrepancy (DD), is designed to evaluate generators according to the discrepancy between the generated sentences and the real training sentences. A challenge is that DD cannot be computed directly, because the distribution of real sentences is unavailable. Thus, we propose a method to estimate DD by training a neural-network-based text classifier. For comparison, three existing metrics, Bilingual Evaluation Understudy (BLEU) versus self-BLEU, language model score versus reverse language model score, and Fréchet Embedding Distance (FED), together with the proposed DD, are used to evaluate two popular generative models, LSTM and GPT-2, on both synthetic and real data. Experimental results show that DD is much better than the three existing metrics at ranking these generative models.
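As a rough illustration of how a classifier can stand in for an unavailable distribution, the sketch below turns held-out classifier scores into a discrepancy estimate. The abstract does not state the formula for DD, so this is an assumption rather than the authors' estimator: it uses the standard relation that, for balanced classes, the total variation distance between two distributions equals twice the best achievable classification accuracy minus one. The function name `discrepancy_from_classifier` and its parameters are hypothetical, and the scores are assumed to be the classifier's probability that a sentence is real.

```python
from typing import Sequence


def discrepancy_from_classifier(
    real_scores: Sequence[float],
    gen_scores: Sequence[float],
    threshold: float = 0.5,
) -> float:
    """Estimate a discrepancy in [0, 1] from held-out classifier scores.

    Assumption (not taken from the abstract): the discrepancy is treated as a
    total variation distance, estimated as 2 * balanced_accuracy - 1.
    Each score is the classifier's probability that a sentence is real.
    """
    if len(real_scores) == 0 or len(gen_scores) == 0:
        raise ValueError("both score lists must be non-empty")

    # Fraction of real sentences the classifier labels as real.
    real_acc = sum(s > threshold for s in real_scores) / len(real_scores)
    # Fraction of generated sentences the classifier labels as generated.
    gen_acc = sum(s <= threshold for s in gen_scores) / len(gen_scores)

    balanced_acc = 0.5 * (real_acc + gen_acc)
    # A classifier at or below chance carries no evidence of discrepancy.
    return max(0.0, 2.0 * balanced_acc - 1.0)


if __name__ == "__main__":
    # Hypothetical scores from a classifier evaluated on held-out sentences.
    real = [0.9, 0.8, 0.4, 0.7]
    generated = [0.1, 0.6, 0.2, 0.3]
    print(discrepancy_from_classifier(real, generated))
```

Under this reading, a generator whose samples the classifier cannot tell apart from real sentences scores near 0, and one whose samples are trivially separable scores near 1. Because a trained classifier is generally weaker than the optimal one, such an estimate tends to understate the true discrepancy.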




Jointly Measuring Diversity and Quality in Text Generation Models

Text generation is an important Natural Language Processing task with va...

On the Relation between Quality-Diversity Evaluation and Distribution-Fitting Goal in Text Generation

The goal of text generation models is to fit the underlying real probabi...

Random Network Distillation as a Diversity Metric for Both Image and Text Generation

Generative models are increasingly able to produce remarkably high quali...

Eval all, trust a few, do wrong to none: Comparing sentence generation models

In this paper, we study recent neural generative models for text generat...

Sparse Text Generation

Current state-of-the-art text generators build on powerful language mode...

Separating the Human Touch from AI-Generated Text using Higher Criticism: An Information-Theoretic Approach

We propose a method to determine whether a given article was entirely wr...

Detecting Levels of Depression in Text Based on Metrics

Depression is one of the most common and a major concern for society. Pr...
