Assessing Discourse Relations in Language Generation from Pre-trained Language Models

by Wei-Jen Ko, et al.
The University of Texas at Austin

Recent advances in NLP have been attributed to the emergence of large-scale pre-trained language models. GPT-2, in particular, is suited for generation tasks given its left-to-right language modeling objective, yet the linguistic quality of its generated text has largely remained unexplored. Our work takes a step toward understanding GPT-2's outputs in terms of discourse coherence. We perform a comprehensive study on the validity of explicit discourse relations in GPT-2's outputs under both organic generation and fine-tuned scenarios. Results show that GPT-2 does not always generate text containing valid discourse relations; nevertheless, its text is more aligned with human expectation in the fine-tuned scenario. We propose a decoupled strategy to mitigate these problems and highlight the importance of explicitly modeling discourse information.

