Does Interference Exist When Training a Once-For-All Network?

04/20/2022
by Jordan Shipard, et al.

The Once-For-All (OFA) method offers an excellent pathway to deploy a trained neural network model to multiple target platforms by utilising the supernet-subnet architecture. Once trained, a subnet can be derived from the supernet (both architecture and trained weights) and deployed directly to the target platform with little to no retraining or fine-tuning. To train the subnet population, OFA uses a novel training method called Progressive Shrinking (PS), which is designed to limit the negative impact of interference during training. It is believed that higher interference during training results in lower subnet population accuracies. In this work we take a second look at this interference effect. Surprisingly, we find that interference mitigation strategies do not have a large impact on overall subnet population performance. Instead, we find that the bias in subnet architecture selection during training is a more important factor. To show this, we propose a simple-yet-effective method called Random Subnet Sampling (RSS), which makes no attempt to mitigate interference. Despite this, RSS produces a better-performing subnet population than PS on four small-to-medium-sized datasets, suggesting that the interference effect does not play a pivotal role in these datasets. Due to its simplicity, RSS provides a 1.9× reduction in training time compared to PS. A 6.1× reduction can also be achieved with a reasonable drop in performance when the number of RSS training epochs is reduced. Code available at https://github.com/Jordan-HS/RSS-Interference-CVPRW2022.
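The abstract describes RSS only at a high level: at each training step a single subnet is sampled uniformly at random from the supernet and updated, with no progressive shrinking schedule or other interference mitigation. The sketch below illustrates that idea in PyTorch. The `supernet` interface (`sample_random_subnet()`, `set_active_subnet(cfg)`) is a hypothetical placeholder, not the paper's or the OFA library's actual API.

```python
# Minimal sketch of a Random Subnet Sampling (RSS) training loop.
# Assumes a hypothetical `supernet` object exposing:
#   - sample_random_subnet(): returns a random architecture config
#   - set_active_subnet(cfg): activates that config for forward/backward
# These names are illustrative only.
import torch
import torch.nn as nn


def train_rss(supernet, loader, epochs, lr=0.05, device="cuda"):
    supernet.to(device)
    criterion = nn.CrossEntropyLoss()
    optimizer = torch.optim.SGD(supernet.parameters(), lr=lr, momentum=0.9)

    for _ in range(epochs):
        supernet.train()
        for images, labels in loader:
            images, labels = images.to(device), labels.to(device)

            # Sample one subnet uniformly at random per step --
            # no progressive shrinking, no interference mitigation.
            cfg = supernet.sample_random_subnet()  # e.g. random depth/width/kernel size
            supernet.set_active_subnet(cfg)

            optimizer.zero_grad()
            loss = criterion(supernet(images), labels)
            loss.backward()  # gradients only flow through the active subnet's weights
            optimizer.step()

    return supernet
```

Because only one randomly chosen subnet is updated per step, each iteration is cheaper than multi-subnet schemes, which is consistent with the reported reduction in training time.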

