ADMP: An Adversarial Double Masks Based Pruning Framework For Unsupervised Cross-Domain Compression

by Xiaoyu Feng, et al.

Despite recent progress in network pruning, directly applying it to Internet of Things (IoT) applications still faces two challenges: the distribution divergence between end-device and cloud data, and the absence of data labels on end devices. One straightforward solution is to combine an unsupervised domain adaptation (UDA) technique with pruning; for example, the model is first pruned on the cloud and then transferred from cloud to end device by UDA. However, such a naive combination suffers severe performance degradation. This work therefore proposes an Adversarial Double Masks based Pruning (ADMP) framework for such cross-domain compression. In ADMP, we construct a knowledge distillation framework that not only produces pseudo labels but also measures domain divergence as the output difference between the full-size teacher and the pruned student. Unlike existing mask-based pruning works, ADMP adopts two adversarial masks, a soft mask and a hard mask, so the model can be pruned effectively while still extracting strong domain-invariant features and robust classification boundaries. During training, the Alternating Direction Method of Multipliers (ADMM) is used to handle the binary constraint of the 0/1 masks. On the Office31 and ImageCLEF-DA datasets, the proposed ADMP can prune 60% of channels with only small accuracy loss. Compared with the state of the art, it achieves about 1.63x parameter reduction and a 4.1% accuracy improvement.
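The abstract mentions using ADMM to handle the binary constraint of the 0/1 pruning masks. A minimal sketch of that idea is shown below: a continuous (soft) mask is alternately nudged toward a binary (hard) mask obtained by projection onto the set of 0/1 vectors with a fixed sparsity budget, with a dual variable coupling the two. The function names, the proximal update, and the 60% sparsity target are illustrative assumptions for this sketch, not the paper's exact formulation.

```python
import numpy as np

def project_binary(v, sparsity):
    """Project scores onto 0/1 masks: keep the top (1 - sparsity) fraction."""
    k = int(round(len(v) * (1 - sparsity)))
    z = np.zeros_like(v)
    if k > 0:
        z[np.argsort(v)[-k:]] = 1.0
    return z

def admm_binarize(m, sparsity=0.6, rho=1.0, steps=50):
    """ADMM-style relaxation driving a soft mask m toward a 0/1 hard mask.

    m : continuous per-channel mask scores (updated in place of a task loss,
        a simple proximal step is used here as a placeholder).
    """
    z = project_binary(m, sparsity)   # hard (binary) mask
    u = np.zeros_like(m)              # scaled dual variable
    for _ in range(steps):
        # Proximal update: pull the soft mask toward z - u
        m = (m + rho * (z - u)) / (1 + rho)
        # Projection step: re-binarize under the sparsity budget
        z = project_binary(m + u, sparsity)
        # Dual update: accumulate the soft/hard disagreement
        u = u + m - z
    return m, z

rng = np.random.default_rng(0)
scores = rng.random(10)               # per-channel importance scores
soft, hard = admm_binarize(scores.copy(), sparsity=0.6)
print(int(hard.sum()))                # 4 of 10 channels survive at 60% sparsity
```

In the actual framework the soft-mask update would also descend the task and adversarial distillation losses; this sketch isolates only the binary-constraint handling.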



