Hyperdecoders: Instance-specific decoders for multi-task NLP

03/15/2022
by Hamish Ivison, et al.

We investigate input-conditioned hypernetworks for multi-tasking in NLP, generating parameter-efficient adaptations for a decoder using a hypernetwork conditioned on the output of an encoder. This approach produces a unique decoder for every input instance, allowing the network a larger degree of flexibility than prior work that specializes the decoder for each task. We apply our method to sequence classification tasks, extractive QA, and summarisation, and find that it often outperforms fully fine-tuning the underlying model and surpasses previous parameter-efficient fine-tuning methods. Gains are particularly large when evaluated out-of-domain on the MRQA benchmark. In addition, as the pretrained model is frozen, our method eliminates negative interference among unrelated tasks, a common failure mode in fully fine-tuned approaches. An analysis of the embeddings produced by our model suggests that a large benefit of our approach is giving the encoder more effective control over the decoder, enabling it to map from hidden representations to a final text-based label without interference from other tasks' output formats or labels.
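The abstract describes the core mechanism: a hypernetwork reads the encoder's output and emits adapter parameters for the decoder, so each input instance gets its own lightweight decoder adaptation while the underlying pretrained model stays frozen. The sketch below illustrates that idea in PyTorch; the class name, mean-pooling choice, and dimensions are illustrative assumptions, not the paper's actual implementation.

```python
import torch
import torch.nn as nn

class HyperDecoderAdapter(nn.Module):
    """Hypothetical sketch of an instance-specific decoder adapter.

    A small hypernetwork maps a pooled encoder representation to the
    weights of a bottleneck adapter applied to one decoder layer's
    hidden states. Only the hypernetwork is trained; the pretrained
    encoder-decoder itself would stay frozen.
    """

    def __init__(self, enc_dim=768, bottleneck=64, hyper_dim=128):
        super().__init__()
        self.enc_dim = enc_dim
        self.bottleneck = bottleneck
        # Hypernetwork: conditioning vector -> flat adapter parameters
        # (down-projection and up-projection, concatenated).
        self.hyper = nn.Sequential(
            nn.Linear(enc_dim, hyper_dim),
            nn.ReLU(),
            nn.Linear(hyper_dim, 2 * enc_dim * bottleneck),
        )

    def forward(self, decoder_hidden, encoder_states):
        # Pool the encoder output into one conditioning vector per instance
        # (mean pooling is an assumption here).
        cond = encoder_states.mean(dim=1)                    # (batch, enc_dim)
        params = self.hyper(cond)                            # (batch, 2*enc_dim*bottleneck)
        w_down, w_up = params.split(self.enc_dim * self.bottleneck, dim=-1)
        w_down = w_down.view(-1, self.enc_dim, self.bottleneck)
        w_up = w_up.view(-1, self.bottleneck, self.enc_dim)
        # Per-instance bottleneck adapter with a residual connection.
        h = torch.relu(torch.bmm(decoder_hidden, w_down))    # (batch, tgt_len, bottleneck)
        return decoder_hidden + torch.bmm(h, w_up)           # (batch, tgt_len, enc_dim)

# Example usage with random tensors standing in for real model states.
adapter = HyperDecoderAdapter()
enc = torch.randn(2, 50, 768)   # encoder output: (batch, src_len, dim)
dec = torch.randn(2, 20, 768)   # decoder hidden states: (batch, tgt_len, dim)
out = adapter(dec, enc)         # (2, 20, 768)
```

Because the generated weights depend on the encoder's representation of each input, the decoder adaptation varies per instance rather than per task, which is what distinguishes this setup from task-conditioned hypernetwork approaches.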
