QKD: Quantization-aware Knowledge Distillation

by Jangho Kim, et al.

Quantization and knowledge distillation (KD) methods are widely used to reduce the memory and power consumption of deep neural networks (DNNs), especially on resource-constrained edge devices. Although combining them is a promising way to meet these requirements, the combination may not work as desired, mainly because the regularization effect of KD further diminishes the already reduced representation power of a quantized model. To address this shortcoming, we propose Quantization-aware Knowledge Distillation (QKD), in which quantization and KD are carefully coordinated in three phases. First, the Self-studying (SS) phase fine-tunes a quantized low-precision student network without KD to obtain a good initialization. Second, the Co-studying (CS) phase trains the teacher so that it becomes more quantization-friendly and more powerful than a fixed teacher. Finally, the Tutoring (TU) phase transfers knowledge from the trained teacher to the student. We extensively evaluate our method on the ImageNet and CIFAR-10/100 datasets and present an ablation study on networks with both standard and depthwise-separable convolutions. The proposed QKD outperforms existing state-of-the-art methods (e.g., a 1.3% improvement on MobileNetV2 with W4A4). Additionally, QKD recovers full-precision accuracy at quantization levels as low as W3A3 on ResNet and W6A6 on MobileNetV2.
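The three-phase coordination described above can be pictured with a small schedule sketch. This is only an illustration under stated assumptions, not the authors' implementation: the function names (`quantize_uniform`, `phase_for_epoch`, `training_plan`), the uniform quantizer, the temperature-scaled KL term, and the choice to keep updating the student during Co-studying and to freeze the teacher during Tutoring are all hypothetical choices that the abstract does not specify.

```python
import math

def quantize_uniform(x, bits, lo=-1.0, hi=1.0):
    """Clamp x to [lo, hi] and snap it to one of 2**bits evenly spaced levels.
    Assumption: a plain uniform quantizer, chosen only for illustration."""
    steps = 2 ** bits - 1
    x = min(max(x, lo), hi)
    step = (hi - lo) / steps
    return lo + round((x - lo) / step) * step

def softmax(logits, temperature=1.0):
    """Numerically stable softmax with a distillation temperature."""
    m = max(logits)
    exps = [math.exp((z - m) / temperature) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kd_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) on softened outputs (a common KD term,
    assumed here; the abstract does not state the exact loss)."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def phase_for_epoch(epoch, ss_epochs, cs_epochs):
    """Map an epoch index to Self-studying, Co-studying, or Tutoring."""
    if epoch < ss_epochs:
        return "SS"
    if epoch < ss_epochs + cs_epochs:
        return "CS"
    return "TU"

def training_plan(phase):
    """Which networks are updated and whether a KD term is used.
    Assumption: the student keeps training in CS and the teacher is frozen in TU."""
    plans = {
        "SS": {"update_student": True, "update_teacher": False, "use_kd": False},
        "CS": {"update_student": True, "update_teacher": True, "use_kd": True},
        "TU": {"update_student": True, "update_teacher": False, "use_kd": True},
    }
    return plans[phase]

if __name__ == "__main__":
    # Hypothetical schedule: 2 SS epochs, 3 CS epochs, then TU.
    for epoch in range(7):
        phase = phase_for_epoch(epoch, ss_epochs=2, cs_epochs=3)
        print(epoch, phase, training_plan(phase))
```

The point of the ordering is that the student first adapts to low precision on its own, so the KD term is only introduced once the quantized network has a reasonable starting point.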


Stochastic Precision Ensemble: Self-Knowledge Distillation for Quantized Deep Neural Networks

The quantization of deep neural networks (QDNNs) has been actively studi...

PQK: Model Compression via Pruning, Quantization, and Knowledge Distillation

As edge devices become prevalent, deploying Deep Neural Networks (DNN) o...

Model compression via distillation and quantization

Deep neural networks (DNNs) continue to make significant advances, solvi...

Divide and Conquer: Leveraging Intermediate Feature Representations for Quantized Training of Neural Networks

The deep layers of modern neural networks extract a rather rich set of f...

Feature Affinity Assisted Knowledge Distillation and Quantization of Deep Neural Networks on Label-Free Data

In this paper, we propose a feature affinity (FA) assisted knowledge dis...

Quantization Mimic: Towards Very Tiny CNN for Object Detection

In this paper, we propose a simple and general framework for training ve...
