Meta-tuning Language Models to Answer Prompts Better

04/10/2021
by Ruiqi Zhong, et al.

Large pretrained language models like GPT-3 have acquired a surprising ability to perform zero-shot classification (ZSC). For example, to classify review sentiments, we can "prompt" the language model with the review and the question "Is the review positive?" as the context, and ask it to predict whether the next word is "Yes" or "No". However, these models are not specialized for answering these prompts. To address this weakness, we propose meta-tuning, which trains the model to specialize in answering prompts but still generalize to unseen tasks. To create the training data, we aggregated 43 existing datasets, annotated 441 label descriptions in total, and unified them into the above question answering (QA) format. After meta-tuning, our model outperforms a same-sized QA model for most labels on unseen tasks, and we forecast that the performance would improve for even larger models. Therefore, measuring ZSC performance on non-specialized language models might underestimate their true capability, and community-wide efforts on aggregating datasets and unifying their formats can help build models that understand prompts better.
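The zero-shot classification scheme the abstract describes can be sketched in a few lines: prepend the review, append the question, and compare the model's score for "Yes" versus "No" as the next token. The `next_token_logprob` scorer below is a hypothetical stand-in for a real language model's next-token log-probability function, used here only to make the sketch runnable.

```python
# Minimal sketch of zero-shot classification (ZSC) via next-token scoring,
# following the prompt format described in the abstract.
# `next_token_logprob` is a hypothetical stand-in for a real LM scorer.

def zero_shot_classify(review: str, next_token_logprob) -> str:
    prompt = f"{review}\nIs the review positive?\n"
    # Predict whichever of "Yes"/"No" the model scores higher as the next word.
    yes = next_token_logprob(prompt, "Yes")
    no = next_token_logprob(prompt, "No")
    return "Yes" if yes >= no else "No"

# Toy scorer for illustration only: favors "Yes" when the review
# contains an obviously positive word.
def toy_logprob(prompt: str, token: str) -> float:
    positive = "great" in prompt.lower()
    if token == "Yes":
        return 0.0 if positive else -2.0
    return -2.0 if positive else 0.0

print(zero_shot_classify("The movie was great!", toy_logprob))  # prints "Yes"
print(zero_shot_classify("The plot was dull.", toy_logprob))    # prints "No"
```

With a real model, `next_token_logprob` would instead read the log-probability of the candidate token from the model's output distribution over the prompt; meta-tuning trains the model so that these "Yes"/"No" answers are better calibrated across unseen tasks.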


