Different Tunes Played with Equal Skill: Exploring a Unified Optimization Subspace for Delta Tuning

by Jing Yi, et al.
Tsinghua University

Delta tuning (DET, also known as parameter-efficient tuning) is regarded as a new paradigm for using pre-trained language models (PLMs). Various DETs with distinct design elements have been proposed to date, achieving performance on par with fine-tuning. However, the mechanisms behind this success remain under-explored, especially the connections among different DETs. To investigate, we hypothesize that the adaptations of different DETs can all be reparameterized as low-dimensional optimizations in a unified optimization subspace, which can be found by jointly decomposing independent solutions of different DETs. We then explore the connections among DETs by conducting optimization within this subspace. In experiments, we find that, for a given DET, optimizing only in the subspace achieves performance comparable to optimizing in its original space, and that the solution found in the subspace can be transferred to another DET and achieve non-trivial performance. We also visualize the performance landscape of the subspace and find a substantial region in which different DETs all perform well. Finally, we extend our analysis and show strong connections between fine-tuning and DETs.
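To make the subspace idea concrete, the following is a toy sketch only, not the paper's method. The abstract does not say how the joint decomposition is performed, so a plain truncated SVD on synthetic data stands in for it here. The two "DETs" are just random linear maps from a shared task code, and every name and size (`d_a`, `d_b`, `k`, `fit_joint_subspace`, `transfer`) is a hypothetical choice made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: two delta-tuning methods with different parameter
# counts (d_a, d_b) and a shared subspace of dimension k.
d_a, d_b, k, n_tasks = 40, 60, 3, 50

# Toy assumption (not a result from the paper): each method's solution for a
# task is a linear image of the same low-dimensional task code z.
A = rng.normal(size=(d_a, k))
B = rng.normal(size=(d_b, k))
Z = rng.normal(size=(n_tasks, k))
theta_a = Z @ A.T  # independent solutions of method A, shape (n_tasks, d_a)
theta_b = Z @ B.T  # independent solutions of method B, shape (n_tasks, d_b)


def fit_joint_subspace(theta_a, theta_b, k):
    """Jointly decompose both methods' solutions with a truncated SVD.

    Returns one basis block per method; both blocks share the same
    k-dimensional coordinates.
    """
    stacked = np.concatenate([theta_a, theta_b], axis=1)
    _, _, vt = np.linalg.svd(stacked, full_matrices=False)
    basis = vt[:k].T  # shape (d_a + d_b, k)
    split = theta_a.shape[1]
    return basis[:split], basis[split:]


def transfer(theta_src, basis_src, basis_dst):
    """Project a source-method solution into the subspace, then decode it
    as a solution for the destination method."""
    coords, *_ = np.linalg.lstsq(basis_src, theta_src, rcond=None)
    return basis_dst @ coords


basis_a, basis_b = fit_joint_subspace(theta_a, theta_b, k)

# A new task, seen only through method A's solution.
z_new = rng.normal(size=k)
theta_a_new = A @ z_new
theta_b_true = B @ z_new

theta_b_pred = transfer(theta_a_new, basis_a, basis_b)
rel_err = np.linalg.norm(theta_b_pred - theta_b_true) / np.linalg.norm(theta_b_true)
print(f"relative transfer error: {rel_err:.2e}")
```

In this idealized linear setting the transfer is essentially exact by construction. Real DET solutions are noisy and the relationship between methods need not be linear, so the sketch illustrates only the mechanics of "decompose jointly, optimize or transfer in shared coordinates", not the reported performance.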


