NSP-BERT: A Prompt-based Zero-Shot Learner Through an Original Pre-training Task–Next Sentence Prediction

09/08/2021
by Yi Sun et al.

Using prompts to direct language models to perform various downstream tasks, also known as prompt-based learning or prompt-learning, has recently achieved significant success compared with the pre-train-then-fine-tune paradigm. Nonetheless, virtually all prompt-based methods are token-level: they rely on GPT's left-to-right language model or BERT's masked language model to perform cloze-style tasks. In this paper, we attempt to accomplish several NLP tasks in the zero-shot scenario using BERT's original pre-training task, Next Sentence Prediction (NSP), which RoBERTa and other models abandoned. Unlike token-level techniques, our sentence-level prompt-based method, NSP-BERT, does not need to fix the length of the prompt or the position to be predicted, allowing it to handle tasks such as entity linking with ease. Based on the characteristics of NSP-BERT, we offer quick-to-build templates for various downstream tasks. In particular, we suggest a two-stage prompt method for word sense disambiguation tasks. Our label-mapping strategies significantly enhance the model's performance on sentence-pair tasks. On the FewCLUE benchmark, NSP-BERT outperforms other zero-shot methods on most tasks and comes close to the few-shot methods.
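To make the sentence-level idea concrete, the sketch below shows how zero-shot classification could be framed as NSP scoring: one prompt sentence is built per candidate label, and the label whose prompt the NSP head judges most likely to follow the input wins. The `nsp_score` function here is a self-contained toy stand-in (word overlap), not the paper's model; in practice it would be replaced by the NSP head of a pre-trained BERT (e.g. `BertForNextSentencePrediction` in the `transformers` library), and the template shown is a hypothetical example, not one from the paper.

```python
# Sketch of sentence-level zero-shot classification via Next Sentence
# Prediction. nsp_score is a toy stand-in for a real NSP head so the
# example runs without downloading a model: a pre-trained BERT would
# instead return P(sentence_b follows sentence_a).

def nsp_score(sentence_a: str, sentence_b: str) -> float:
    """Toy stand-in: fraction of prompt words that also appear in the input."""
    a = set(sentence_a.lower().split())
    b = set(sentence_b.lower().split())
    return len(a & b) / max(len(b), 1)

def classify(text: str, labels: list[str],
             template: str = "This text is about {}") -> str:
    """Build one prompt sentence per label and return the label whose
    prompt the (stand-in) NSP head rates most likely to follow the text.
    Note the prompt is a whole sentence: no fixed length, no fixed
    position to predict."""
    prompts = {label: template.format(label) for label in labels}
    return max(prompts, key=lambda label: nsp_score(text, prompts[label]))

print(classify("The match ended with a late goal for the home sports team",
               ["sports", "finance", "politics"]))
```

Because the prompt is scored as a complete second sentence rather than a single masked token, multi-word labels or entity descriptions of arbitrary length drop into the same template unchanged, which is what lets the method handle tasks like entity linking.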

