SLM: Learning a Discourse Language Representation with Sentence Unshuffling

10/30/2020
by Haejun Lee et al.

We introduce Sentence-level Language Modeling, a new pre-training objective for learning a discourse language representation in a fully self-supervised manner. Recent pre-training methods in NLP focus on learning either bottom- or top-level language representations: contextualized word representations derived from language-model objectives at one extreme, and a whole-sequence representation learned by order classification of two given textual segments at the other. However, these models are not directly encouraged to capture representations of the intermediate-size structures that exist in natural language, such as sentences and the relationships among them. To that end, we propose a new approach that encourages learning of a contextualized sentence-level representation: we shuffle the sequence of input sentences and train a hierarchical transformer model to reconstruct the original ordering. Through experiments on downstream tasks such as GLUE, SQuAD, and DiscoEval, we show that this feature of our model improves the performance of the original BERT by large margins.
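The self-supervised objective described above can be sketched as a simple data-construction step: split a document into sentences, permute them, and keep the permutation as the reconstruction target. The snippet below is a minimal illustration of that idea, not the paper's actual pipeline; the helper name `make_unshuffling_example` and the plain-Python representation of the target are assumptions for illustration (the paper feeds such examples into a hierarchical transformer that predicts the ordering).

```python
import random

def make_unshuffling_example(sentences, seed=None):
    """Build one sentence-unshuffling training example.

    Returns the shuffled sentence list and, for each shuffled
    position, the index that sentence held in the original
    document. A model trained on such pairs must recover the
    original discourse order from the shuffled input.
    """
    rng = random.Random(seed)
    order = list(range(len(sentences)))
    rng.shuffle(order)  # order[j] = original index of the j-th shuffled sentence
    shuffled = [sentences[i] for i in order]
    return shuffled, order

doc = [
    "The storm knocked out power overnight.",
    "Crews restored electricity by morning.",
    "Officials praised the fast response.",
]
shuffled, target = make_unshuffling_example(doc, seed=0)

# Reconstructing the document means sorting shuffled sentences
# by their predicted original positions (here, the gold target).
restored = [s for _, s in sorted(zip(target, shuffled))]
assert restored == doc
```

Because the target is a permutation of sentence indices rather than token identities, the supervision signal operates at the discourse level: the model can only solve the task by representing each sentence and its relations to its neighbors.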


