Boosting Natural Language Generation from Instructions with Meta-Learning

by   Budhaditya Deb, et al.

Recent work has shown that language models (LMs) trained with multi-task instructional learning (MTIL) can solve diverse NLP tasks in zero- and few-shot settings with improved performance compared to prompt tuning. MTIL illustrates that LMs can extract and use information about the task from instructions beyond the surface patterns of the inputs and outputs. This suggests that meta-learning may further enhance the utilization of instructions for effective task transfer. In this paper we investigate whether meta-learning applied to MTIL can further improve generalization to unseen tasks in a zero-shot setting. Specifically, we propose to adapt meta-learning to MTIL in three directions: 1) Model Agnostic Meta Learning (MAML), 2) Hyper-Network (HNet) based adaptation to generate task specific parameters conditioned on instructions, and 3) an approach combining HNet and MAML. Through extensive experiments on the large scale Natural Instructions V2 dataset, we show that our proposed approaches significantly improve over strong baselines in zero-shot settings. In particular, meta-learning improves the effectiveness of instructions and is most impactful when the test tasks are strictly zero-shot (i.e. no similar tasks in the training set) and are "hard" for LMs, illustrating the potential of meta-learning for MTIL for out-of-distribution tasks.


page 8

page 9

page 15

page 16

page 17


Meta-Learning via Classifier(-free) Guidance

State-of-the-art meta-learning techniques do not optimize for zero-shot ...

Continued Pretraining for Better Zero- and Few-Shot Promptability

Recently introduced language model prompting methods can achieve high ac...

Meta-learning framework with applications to zero-shot time-series forecasting

Can meta-learning discover generic ways of processing time-series (TS) f...

Learning to Initialize: Can Meta Learning Improve Cross-task Generalization in Prompt Tuning?

Prompt tuning (PT) which only tunes the embeddings of an additional sequ...

Meta-Optimization for Higher Model Generalizability in Single-Image Depth Prediction

Model generalizability to unseen datasets, concerned with in-the-wild ro...

Zero-shot meta-learning for small-scale data from human subjects

While developments in machine learning led to impressive performance gai...

Meta-Learning for Effective Multi-task and Multilingual Modelling

Natural language processing (NLP) tasks (e.g. question-answering in Engl...

Please sign up or login with your details

Forgot password? Click here to reset