A Universal Discriminator for Zero-Shot Generalization

11/15/2022
by Haike Xu et al.

Generative modeling has been the dominant approach for large-scale pretraining and zero-shot generalization. In this work, we challenge this convention by showing that discriminative approaches perform substantially better than generative ones on a large number of NLP tasks. Technically, we train a single universal discriminator (UD) to predict whether a text sample comes from the true data distribution, similar to GANs. Since many NLP tasks can be formulated as selecting from a few options, we use this discriminator to predict the option with the highest probability. This simple formulation achieves state-of-the-art zero-shot results on the T0 benchmark, outperforming T0 by 16.0%, 7.8%, and 11.5% on different model scales, respectively. In the finetuning setting, our approach also achieves new state-of-the-art results on a wide range of NLP tasks, with only 1/4 of the parameters of previous methods. Meanwhile, our approach requires minimal prompting effort, which largely improves robustness and is essential for real-world applications. Furthermore, we jointly train a generalized UD together with generative tasks; the resulting model maintains its advantage on discriminative tasks while simultaneously handling generative tasks.
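To make the selection mechanism concrete, below is a minimal sketch of how an option-selection task can be scored with a binary "does this look like real text?" discriminator. The backbone checkpoint, the prompt formatting, and the pick_option helper are illustrative assumptions for this sketch, not the paper's released code or checkpoints.

```python
# Minimal sketch: score each candidate completion with a binary
# discriminator and select the option judged most likely to be
# real text. The checkpoint name and template are placeholders.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Placeholder backbone; a single logit acts as the "realness" score.
tokenizer = AutoTokenizer.from_pretrained("roberta-large")
model = AutoModelForSequenceClassification.from_pretrained(
    "roberta-large", num_labels=1
)
model.eval()

def pick_option(context: str, options: list[str]) -> str:
    """Fill each option into the context and return the one the
    discriminator scores as the most plausible piece of real text."""
    candidates = [f"{context} {opt}" for opt in options]
    batch = tokenizer(candidates, padding=True, return_tensors="pt")
    with torch.no_grad():
        scores = model(**batch).logits.squeeze(-1)  # one score per option
    return options[scores.argmax().item()]

# Example: a sentiment task recast as option selection.
print(pick_option("The movie was a waste of time. It was", ["great.", "terrible."]))
```

Under this framing, a new task only needs its options filled into a simple template rather than a carefully engineered prompt, which is consistent with the abstract's claim that the approach requires minimal prompting effort.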


Related research

01/18/2022
ZeroPrompt: Scaling Prompt-Based Pretraining to 1,000 Tasks Improves Zero-Shot Generalization
We propose a multitask pretraining approach ZeroPrompt for zero-shot gen...

04/29/2022
Prompt Consistency for Zero-Shot Task Generalization
One of the most impressive results of recent NLP history is the ability ...

06/01/2023
Systematic Evaluation of GPT-3 for Zero-Shot Personality Estimation
Very large language models (LLMs) perform extremely well on a spectrum o...

05/21/2023
Model-Generated Pretraining Signals Improves Zero-Shot Generalization of Text-to-Text Transformers
This paper explores the effectiveness of model-generated signals in impr...

11/25/2022
Learning with Silver Standard Data for Zero-shot Relation Extraction
The superior performance of supervised relation extraction (RE) methods ...

03/16/2023
Self-Consistent Learning: Cooperation between Generators and Discriminators
Using generated data to improve the performance of downstream discrimina...

09/13/2023
How (Not) to Use Sociodemographic Information for Subjective NLP Tasks
Annotators' sociodemographic backgrounds (i.e., the individual compositi...
