A Contrastive Knowledge Transfer Framework for Model Compression and Transfer Learning

03/14/2023
by Kaiqi Zhao, et al.

Knowledge Transfer (KT) achieves competitive performance and is widely used for image classification tasks in model compression and transfer learning. Existing KT works transfer information from a large model ("teacher") to train a small model ("student") by minimizing the difference between their conditionally independent output distributions. However, these works overlook the high-dimensional structural knowledge contained in the teacher's intermediate representations, which limits their effectiveness, and they are motivated by various heuristic intuitions, which makes them difficult to generalize. This paper proposes a novel Contrastive Knowledge Transfer Framework (CKTF), which enables the transfer of sufficient structural knowledge from the teacher to the student by optimizing multiple contrastive objectives across their intermediate representations. CKTF also generalizes existing KT techniques: they can be derived as specific cases of CKTF, and this formulation increases their performance significantly. The extensive evaluation shows that CKTF consistently outperforms existing KT works by 0.04% to 11.59% in model compression and by 0.4% to 4.75% in transfer learning on various datasets.
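
To make the core idea concrete, below is a minimal sketch of one contrastive objective between a single pair of teacher/student intermediate feature maps, written in PyTorch. It uses an InfoNCE-style loss with linear projection heads and in-batch negatives; the projection heads, temperature, pooling, and layer pairing are illustrative assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class ContrastiveTransferLoss(nn.Module):
    """Illustrative InfoNCE-style objective between one pair of
    intermediate teacher/student representations (a sketch, not CKTF's exact loss)."""

    def __init__(self, student_dim, teacher_dim, embed_dim=128, temperature=0.07):
        super().__init__()
        # Projection heads map both representations into a shared embedding space.
        self.proj_s = nn.Linear(student_dim, embed_dim)
        self.proj_t = nn.Linear(teacher_dim, embed_dim)
        self.temperature = temperature

    def forward(self, feat_s, feat_t):
        # Global-average-pool spatial feature maps (N, C, H, W) -> (N, C).
        if feat_s.dim() == 4:
            feat_s = feat_s.mean(dim=(2, 3))
        if feat_t.dim() == 4:
            feat_t = feat_t.mean(dim=(2, 3))
        z_s = F.normalize(self.proj_s(feat_s), dim=1)
        z_t = F.normalize(self.proj_t(feat_t), dim=1)
        # Similarity of each student embedding to every teacher embedding in the batch;
        # the matching (diagonal) pair is the positive, all other pairs are negatives.
        logits = z_s @ z_t.t() / self.temperature
        targets = torch.arange(z_s.size(0), device=z_s.device)
        return F.cross_entropy(logits, targets)
```

In a full training loop, one such loss would be attached to several teacher/student layer pairs and summed with the usual task loss (and optionally a logit-level KT loss); the choice of layer pairs and loss weights above is assumed for illustration only.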
