Improving Generalization in Meta-Learning via Meta-Gradient Augmentation

06/14/2023
by Ren Wang, et al.

Meta-learning methods typically follow a two-loop framework in which each loop potentially suffers from notorious overfitting, hindering rapid adaptation and generalization to new tasks. Existing schemes address this by enhancing the mutual exclusivity or diversity of training samples, but such data-manipulation strategies are data-dependent and insufficiently flexible. This work alleviates overfitting in meta-learning from the perspective of gradient regularization and proposes a data-independent Meta-Gradient Augmentation (MGAug) method. The key idea is to first break rote memories by network pruning to address memorization overfitting in the inner loop; the gradients of the pruned sub-networks then naturally form a high-quality augmentation of the meta-gradient that alleviates learner overfitting in the outer loop. Specifically, we explore three pruning strategies: random width pruning, random parameter pruning, and a newly proposed catfish pruning that measures a Meta-Memorization Carrying Amount (MMCA) score for each parameter and prunes high-scoring ones to break rote memories as much as possible. MGAug is theoretically supported by a generalization bound derived in the PAC-Bayes framework. In addition, we extend a lightweight version, called MGAug-MaxUp, as a trade-off between performance gains and resource overhead. Extensive experiments on multiple few-shot learning benchmarks validate MGAug's effectiveness and its significant improvement over various meta-baselines. The code is publicly available at <https://github.com/xxLifeLover/Meta-Gradient-Augmentation>.
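To make the two-loop idea concrete, the sketch below (PyTorch ≥ 2.0) illustrates the abstract's core mechanism: several randomly pruned sub-networks are adapted in the inner loop, and the meta-gradients they produce on the query set are averaged into an augmented meta-gradient for the outer update. It uses random parameter pruning and a first-order (FOMAML-style) inner loop on toy sine-regression tasks; the helper names (`random_parameter_mask`, `inner_adapt`, `mgaug_meta_gradient`) and all hyperparameters are illustrative assumptions, not the authors' released implementation.

```python
import torch
import torch.nn as nn


def random_parameter_mask(model: nn.Module, keep_ratio: float = 0.8):
    # Random parameter pruning: keep each weight with probability `keep_ratio`.
    return {n: (torch.rand_like(p) < keep_ratio).float()
            for n, p in model.named_parameters()}


def inner_adapt(model, masks, support_x, support_y, lr=0.01, steps=1):
    # First-order inner-loop adaptation of a pruned sub-network on the support set.
    loss_fn = nn.MSELoss()
    params = {n: (p * masks[n]).detach().requires_grad_(True)
              for n, p in model.named_parameters()}
    for _ in range(steps):
        pred = torch.func.functional_call(model, params, (support_x,))
        grads = torch.autograd.grad(loss_fn(pred, support_y), list(params.values()))
        params = {n: (p - lr * g * masks[n]).detach().requires_grad_(True)
                  for (n, p), g in zip(params.items(), grads)}
    return params


def mgaug_meta_gradient(model, tasks, num_subnets=4, keep_ratio=0.8):
    # Outer loop: query-set gradients from several pruned sub-networks are
    # averaged into a single augmented meta-gradient for the base parameters.
    loss_fn = nn.MSELoss()
    meta_grads = {n: torch.zeros_like(p) for n, p in model.named_parameters()}
    for support_x, support_y, query_x, query_y in tasks:
        for _ in range(num_subnets):
            masks = random_parameter_mask(model, keep_ratio)
            adapted = inner_adapt(model, masks, support_x, support_y)
            query_pred = torch.func.functional_call(model, adapted, (query_x,))
            grads = torch.autograd.grad(loss_fn(query_pred, query_y),
                                        list(adapted.values()))
            for (n, _), g in zip(adapted.items(), grads):
                meta_grads[n] += g / (len(tasks) * num_subnets)
    return meta_grads


if __name__ == "__main__":
    model = nn.Sequential(nn.Linear(1, 32), nn.ReLU(), nn.Linear(32, 1))
    # Toy sine-regression tasks: (support_x, support_y, query_x, query_y).
    tasks = []
    for _ in range(4):
        amp = torch.rand(1) * 4 + 1
        xs, xq = torch.rand(10, 1) * 10 - 5, torch.rand(10, 1) * 10 - 5
        tasks.append((xs, amp * torch.sin(xs), xq, amp * torch.sin(xq)))
    meta_grads = mgaug_meta_gradient(model, tasks)
    with torch.no_grad():  # plain SGD meta-update using the augmented gradient
        for n, p in model.named_parameters():
            p -= 1e-3 * meta_grads[n]
```

The catfish pruning described in the abstract would replace the random mask with one built from per-parameter MMCA scores, pruning the highest-scoring parameters; that scoring rule is defined in the full paper and is not reproduced here.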

