ShrinkTeaNet: Million-scale Lightweight Face Recognition via Shrinking Teacher-Student Networks

by Chi Nhan Duong, et al.

Large-scale face recognition in the wild has recently achieved mature performance in many real-world applications. However, such systems are built on GPU platforms and mostly deploy heavy deep network architectures. Given a high-performance heavy network as a teacher, this work presents a simple and elegant teacher-student learning paradigm, named ShrinkTeaNet, to train a portable student network that has significantly fewer parameters and competitive accuracy against the teacher network. Unlike prior teacher-student frameworks, which mainly focus on accuracy and compression ratios in closed-set problems, our proposed teacher-student network is shown to be more robust on the open-set problem, i.e., large-scale face recognition. In addition, this work introduces a novel Angular Distillation Loss for distilling the feature directions and sample distributions of the teacher's hypersphere to its student. The ShrinkTeaNet framework can then efficiently guide the student's learning process with the teacher's knowledge, presented in both the intermediate and final stages of the feature embedding. Evaluations on LFW, CFP-FP, AgeDB, the IJB-B and IJB-C Janus benchmarks, and MegaFace with one million distractors demonstrate the efficiency of the proposed approach in learning robust student networks with satisfying accuracy and compact sizes. ShrinkTeaNet enables light-weight architectures to achieve high performance, with 99.77% on LFW.
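The core idea behind an angular distillation objective is to match the *direction* of the student's embedding to the teacher's on the unit hypersphere, rather than matching raw feature values. The sketch below is an illustrative simplification, not the paper's exact formulation; the function name and NumPy-based setup are assumptions for demonstration.

```python
import numpy as np

def angular_distillation_loss(student_feats, teacher_feats):
    """Illustrative angular distillation loss (not the paper's exact form):
    penalize the cosine deviation between student and teacher embedding
    directions, ignoring feature magnitudes.

    Both inputs are (batch, dim) arrays of embeddings.
    """
    # L2-normalize each embedding so only its direction matters
    s = student_feats / np.linalg.norm(student_feats, axis=1, keepdims=True)
    t = teacher_feats / np.linalg.norm(teacher_feats, axis=1, keepdims=True)
    # 1 - cos(theta) per sample: 0 when directions coincide, 2 when opposite
    return float(np.mean(1.0 - np.sum(s * t, axis=1)))
```

Because the loss depends only on direction, the student is free to produce embeddings of any scale, which suits margin-based softmax losses that also operate on the hypersphere.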




