Deep Collective Knowledge Distillation

04/18/2023
by Jihyeon Seo, et al.

Many existing studies on knowledge distillation have focused on methods in which a student model mimics a teacher model well. Simply imitating the teacher's knowledge, however, is not sufficient for the student to surpass the teacher. We explore a method that harnesses the knowledge of other students to complement that of the teacher. We propose deep collective knowledge distillation (DCKD) for model compression, a method for training student models to acquire rich knowledge not only from their teacher model but also from other student models. The knowledge collected from several student models carries a wealth of information about the correlations between classes, and DCKD is designed to increase this class-correlation knowledge during training. Collecting knowledge in this way enables us to create better-performing student models. This simple yet powerful method achieves state-of-the-art performance in many experiments. For example, on ImageNet, ResNet18 trained with DCKD achieves 72.27% top-1 accuracy, outperforming the pretrained ResNet18 by 2.52%. On CIFAR-100, the ShuffleNetV1 student model trained with DCKD achieves 6.55% higher top-1 accuracy than the pretrained ShuffleNetV1.
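The abstract does not spell out the exact DCKD objective, but a common way to combine a teacher signal with the collective knowledge of peer students is to add a distillation term toward the peers' averaged soft predictions. The sketch below is a minimal, hypothetical PyTorch illustration of that idea; the function names (dckd_style_loss, kd_term), the temperature T, and the weights alpha and beta are assumptions, not the paper's actual formulation.

```python
# Hypothetical sketch of a DCKD-style loss: the student is trained with
# (i) cross-entropy on the labels, (ii) a distillation term toward the teacher,
# and (iii) a distillation term toward the averaged ("collective") predictions
# of the other student models. Weights and temperature are assumptions.
import torch
import torch.nn.functional as F

def kd_term(student_logits, target_logits, T):
    """Temperature-scaled KL divergence used in standard knowledge distillation."""
    p_target = F.softmax(target_logits / T, dim=1)
    log_p_student = F.log_softmax(student_logits / T, dim=1)
    return F.kl_div(log_p_student, p_target, reduction="batchmean") * (T * T)

def dckd_style_loss(student_logits, teacher_logits, peer_logits_list,
                    labels, T=4.0, alpha=0.5, beta=0.5):
    """Combine supervised, teacher-distillation, and peer-distillation terms."""
    ce = F.cross_entropy(student_logits, labels)
    teacher_kd = kd_term(student_logits, teacher_logits.detach(), T)
    # Collective knowledge: average the (detached) logits of the other students,
    # which preserves their shared information about class correlations.
    collective = torch.stack([p.detach() for p in peer_logits_list], dim=0).mean(dim=0)
    peer_kd = kd_term(student_logits, collective, T)
    return ce + alpha * teacher_kd + beta * peer_kd
```

In such a setup, each student would typically be updated with its own loss of this form while the teacher stays frozen; this is only a sketch under those assumptions, not the authors' implementation.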

Related research

- Learning Student-Friendly Teacher Networks for Knowledge Distillation (02/12/2021): We propose a novel knowledge distillation approach to facilitate the tra...
- Generalized Knowledge Distillation via Relationship Matching (05/04/2022): The knowledge of a well-trained deep neural network (a.k.a. the "teacher...
- Online Ensemble Model Compression using Knowledge Distillation (11/15/2020): This paper presents a novel knowledge distillation based model compressi...
- Noisy Self-Knowledge Distillation for Text Summarization (09/15/2020): In this paper we apply self-knowledge distillation to text summarization...
- Information Theoretic Representation Distillation (12/01/2021): Despite the empirical success of knowledge distillation, there still lac...
- 1st Place Solution to the EPIC-Kitchens Action Anticipation Challenge 2022 (07/10/2022): In this report, we describe the technical details of our submission to t...
- ORC: Network Group-based Knowledge Distillation using Online Role Change (06/01/2022): In knowledge distillation, since a single, omnipotent teacher network ca...
