What is Not in the Context? Evaluation of Few-shot Learners with Informative Demonstrations

12/03/2022
by   Michal Štefánik, et al.

Large language models demonstrate an emergent ability to learn a new task from a small number of input-output demonstrations, referred to as in-context few-shot learning. However, recent work shows that in such settings, models mostly learn to mimic the new task's distribution rather than the mechanics of the task itself. We argue that the commonly used evaluation setting for few-shot models, which selects in-context demonstrations at random, cannot disentangle a model's ability to learn new skills from demonstrations, since most randomly selected demonstrations are not informative for prediction beyond exposing the new task's input and output distribution. We therefore introduce an evaluation technique that isolates few-shot learners' gain from in-context learning by picking demonstrations that share a specific, informative concept with the predicted sample, and comparing the result to the performance reached with mostly non-informative samples. We find that, regardless of model size, existing few-shot learners are unable to benefit from observing such informative concepts in demonstrations. We also find that this ability is not obtained trivially by exposing informative demonstrations during training, leaving the challenge of training true in-context learners open.
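The contrast the abstract draws, random demonstration selection versus selection by a shared informative concept, can be sketched roughly as follows. This is an illustrative assumption, not the authors' implementation: the `concepts` annotations, the helper names, and the toy pool are all hypothetical stand-ins for whatever concept labels a real benchmark would provide.

```python
import random

def select_random_demonstrations(pool, k):
    """Baseline setting: sample k demonstrations uniformly at random,
    so most demonstrations only expose the task's input/output format."""
    return random.sample(pool, k)

def select_informative_demonstrations(pool, target_concept, k):
    """Evaluation setting sketched in the abstract: prefer demonstrations
    that share a specific concept with the predicted sample, and fall back
    to random picks only if too few concept-sharing demonstrations exist."""
    sharing = [d for d in pool if target_concept in d["concepts"]]
    chosen = sharing[:k]
    if len(chosen) < k:
        remaining = [d for d in pool if d not in chosen]
        chosen += random.sample(remaining, k - len(chosen))
    return chosen

# Toy pool: each demonstration is tagged with the reasoning concepts
# it exercises (hypothetical annotations for illustration only).
pool = [
    {"text": "Q1 ... A1", "concepts": {"negation"}},
    {"text": "Q2 ... A2", "concepts": {"comparison"}},
    {"text": "Q3 ... A3", "concepts": {"negation", "comparison"}},
    {"text": "Q4 ... A4", "concepts": {"counting"}},
]

# If the predicted sample is known to rely on "negation", pick demonstrations
# sharing that concept; the gain over the random baseline (if any) is what
# the proposed evaluation attributes to true in-context concept learning.
informative = select_informative_demonstrations(pool, "negation", k=2)
baseline = select_random_demonstrations(pool, k=2)
```

Comparing model accuracy under `informative` versus `baseline` selections is the disentanglement the abstract describes: if a model truly learns from demonstrated concepts, it should score higher with concept-sharing demonstrations.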


