To Pretrain or Not to Pretrain: Examining the Benefits of Pretraining on Resource Rich Tasks

06/15/2020
by Sinong Wang, et al.

Pretraining NLP models with variants of the Masked Language Model (MLM) objective has recently led to significant improvements on many tasks. This paper examines the benefits of pretrained models as a function of the number of training samples used in the downstream task. On several text classification tasks, we show that as the number of training examples grows into the millions, the accuracy gap between finetuning a BERT-based model and training a vanilla LSTM from scratch narrows to within 1%. Our findings indicate that MLM-based models might reach a diminishing return point as the supervised data size increases significantly.
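For a concrete sense of the comparison the abstract describes, the sketch below finetunes a pretrained transformer and trains a vanilla LSTM from scratch on progressively larger subsets of a text classification dataset, then reports the accuracy gap at each size. This is not the authors' code: the dataset (IMDB via Hugging Face datasets), the bert-base-uncased checkpoint, the subset sizes, and all hyperparameters are assumptions chosen only to illustrate the learning-curve setup.

```python
# Minimal sketch of the learning-curve comparison the abstract describes:
# finetune a pretrained transformer and train a small LSTM from scratch on
# progressively larger subsets of a text classification set, then track the
# accuracy gap. The dataset (IMDB), checkpoint (bert-base-uncased), subset
# sizes, and hyperparameters are illustrative assumptions, not the paper's setup.
import torch
import torch.nn as nn
from datasets import load_dataset
from transformers import AutoModelForSequenceClassification, AutoTokenizer

device = "cuda" if torch.cuda.is_available() else "cpu"
tok = AutoTokenizer.from_pretrained("bert-base-uncased")
data = load_dataset("imdb")
train = data["train"].shuffle(seed=0)          # IMDB is sorted by label; shuffle first
test = data["test"].shuffle(seed=0)
test_texts, test_labels = test["text"][:2000], test["label"][:2000]


class LSTMClassifier(nn.Module):
    """Vanilla LSTM classifier with randomly initialized embeddings (no pretraining)."""
    def __init__(self, vocab_size, num_labels=2, dim=256):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, dim, padding_idx=tok.pad_token_id)
        self.lstm = nn.LSTM(dim, dim, batch_first=True)
        self.head = nn.Linear(dim, num_labels)

    def forward(self, input_ids, attention_mask=None):
        h, _ = self.lstm(self.emb(input_ids))
        return self.head(h[:, -1])             # last hidden state -> class logits


def batches(texts, labels, bs=16):
    for i in range(0, len(texts), bs):
        enc = tok(texts[i:i + bs], padding=True, truncation=True,
                  max_length=128, return_tensors="pt").to(device)
        y = torch.tensor(labels[i:i + bs], device=device)
        yield enc["input_ids"], enc["attention_mask"], y


def get_logits(model, ids, mask, is_transformer):
    out = model(input_ids=ids, attention_mask=mask)
    return out.logits if is_transformer else out


def train_and_eval(model, tr_texts, tr_labels, is_transformer):
    model.to(device).train()
    opt = torch.optim.AdamW(model.parameters(), lr=2e-5 if is_transformer else 1e-3)
    for ids, mask, y in batches(tr_texts, tr_labels):        # single epoch for brevity
        loss = nn.functional.cross_entropy(get_logits(model, ids, mask, is_transformer), y)
        opt.zero_grad()
        loss.backward()
        opt.step()
    model.eval()
    correct = 0
    with torch.no_grad():
        for ids, mask, y in batches(test_texts, test_labels):
            preds = get_logits(model, ids, mask, is_transformer).argmax(-1)
            correct += (preds == y).sum().item()
    return correct / len(test_labels)


for n in (1_000, 5_000, 25_000):               # grow the supervised training set
    tr_texts, tr_labels = train["text"][:n], train["label"][:n]
    bert = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)
    lstm = LSTMClassifier(tok.vocab_size)
    acc_b = train_and_eval(bert, tr_texts, tr_labels, is_transformer=True)
    acc_l = train_and_eval(lstm, tr_texts, tr_labels, is_transformer=False)
    print(f"n={n}: finetuned BERT {acc_b:.3f}  LSTM from scratch {acc_l:.3f}  gap {acc_b - acc_l:.3f}")
```

At these toy scales the pretrained model typically wins by a wide margin; the paper's point is that on its much larger datasets the gap narrows to within 1% as the number of examples grows into the millions.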

Related research

Back-Translated Task Adaptive Pretraining: Improving Accuracy and Robustness on Text Classification (07/22/2021)
Language models (LMs) pretrained on a large text corpus and fine-tuned o...

Audiovisual Masked Autoencoders (12/09/2022)
Can we leverage the audiovisual information already present in video to ...

Aligning the Pretraining and Finetuning Objectives of Language Models (02/05/2020)
We demonstrate that explicitly aligning the pretraining objectives to th...

Sentence Encoders on STILTs: Supplementary Training on Intermediate Labeled-data Tasks (11/02/2018)
Pretraining with language modeling and related unsupervised tasks has re...

Adding Instructions during Pretraining: Effective Way of Controlling Toxicity in Language Models (02/14/2023)
Pretrained large language models have become indispensable for solving v...

RoBERTa: A Robustly Optimized BERT Pretraining Approach (07/26/2019)
Language model pretraining has led to significant performance gains but ...

Joint Energy-based Model Training for Better Calibrated Natural Language Understanding Models (01/18/2021)
In this work, we explore joint energy-based model (EBM) training during ...
