Does Pretraining for Summarization Require Knowledge Transfer?

09/10/2021
by Kundan Krishna, et al.

Pretraining techniques leveraging enormous datasets have driven recent advances in text summarization. While folk explanations suggest that knowledge transfer accounts for pretraining's benefits, little is known about why it works or what makes a pretraining task or dataset suitable. In this paper, we challenge the knowledge transfer story, showing that by pretraining on documents consisting of character n-grams selected at random, we can nearly match the performance of models pretrained on real corpora. This work holds the promise of eliminating upstream corpora, which may alleviate some concerns over offensive language, bias, and copyright issues. To see whether the small residual benefit of using real data could be accounted for by the structure of the pretraining task, we design several tasks motivated by a qualitative study of summarization corpora. However, these tasks confer no appreciable benefit, leaving open the possibility of a small role for knowledge transfer.
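To make the idea of a synthetic upstream corpus concrete, here is a minimal sketch of how one might generate documents built from randomly selected character n-grams. The n-gram length, alphabet, and document size below are illustrative assumptions for this sketch, not the configuration reported in the paper.

    import random
    import string

    def random_ngram_document(n=5, num_ngrams=200, alphabet=string.ascii_lowercase, seed=None):
        """Build one synthetic 'document' by concatenating character n-grams
        sampled uniformly at random from the given alphabet.

        All parameter values here are illustrative, not the paper's settings.
        """
        rng = random.Random(seed)
        ngrams = ("".join(rng.choice(alphabet) for _ in range(n)) for _ in range(num_ngrams))
        return " ".join(ngrams)

    # A tiny synthetic "corpus"; actual pretraining would use a far larger collection.
    corpus = [random_ngram_document(n=5, num_ngrams=50, seed=i) for i in range(3)]
    print(corpus[0][:60])

Documents produced this way carry no real-world knowledge by construction, which is what makes them a useful probe of whether knowledge transfer is required for pretraining to help downstream summarization.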


Related research

09/23/2019 · Multi-stage Pretraining for Abstractive Summarization
Neural models for abstractive summarization tend to achieve the best per...

09/28/2022 · Downstream Datasets Make Surprisingly Good Pretraining Corpora
For most natural language processing tasks, the dominant practice is to ...

12/20/2022 · Socratic Pretraining: Question-Driven Pretraining for Controllable Summarization
In long document controllable summarization, where labeled data is scarc...

09/30/2021 · Compositional generalization in semantic parsing with pretrained transformers
Large-scale pretraining instills large amounts of knowledge in deep neur...

06/05/2023 · LIBERO: Benchmarking Knowledge Transfer for Lifelong Robot Learning
Lifelong learning offers a promising paradigm of building a generalist a...

02/27/2020 · Masking Orchestration: Multi-task Pretraining for Multi-role Dialogue Representation Learning
Multi-role dialogue understanding comprises a wide range of diverse task...

09/08/2023 · When Less is More: Investigating Data Pruning for Pretraining LLMs at Scale
Large volumes of text data have contributed significantly to the develop...
