Network Binarization via Contrastive Learning

07/06/2022
by Yuzhang Shang, et al.

Neural network binarization accelerates deep models by quantizing their weights and activations to 1-bit. However, a large performance gap remains between Binary Neural Networks (BNNs) and their full-precision (FP) counterparts. Since the quantization error caused by weight binarization has been reduced in earlier works, activation binarization has become the major obstacle to further accuracy improvement. BNNs exhibit a unique and interesting structure in which the binary and latent FP activations coexist in the same forward pass (i.e., Binarize(𝐚_F) = 𝐚_B). To mitigate the information degradation caused by binarizing FP activations, we establish a novel contrastive learning framework for training BNNs through the lens of Mutual Information (MI) maximization. MI is introduced as the metric to measure the information shared between binary and FP activations, and it guides binarization with contrastive learning. Specifically, the representation ability of BNNs is greatly strengthened by pulling together positive pairs formed by the binary and FP activations of the same input sample, and pushing apart negative pairs formed from different samples (the number of negative pairs can be exponentially large). This benefits downstream tasks beyond classification, such as segmentation and depth estimation. Experimental results show that our method can be implemented as a pile-up module on existing state-of-the-art binarization methods, remarkably improving their performance on CIFAR-10/100 and ImageNet, and that it generalizes well to NYUD-v2.
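As a concrete illustration of the positive/negative pairing described above, the sketch below implements an InfoNCE-style contrastive loss between the latent FP activations and their binarized counterparts. This is a minimal PyTorch sketch based on the abstract, not the authors' released code; the function name, the simple flatten-and-normalize projection, and the temperature value are assumptions.

```python
# Minimal sketch (not the authors' code): an InfoNCE-style contrastive loss
# that pulls together the binary/FP activation pair of the same input and
# pushes apart pairs from different inputs in the batch.
import torch
import torch.nn.functional as F

def contrastive_bnn_loss(a_fp, a_bin, temperature=0.1):
    """a_fp, a_bin: (batch, ...) latent FP and binary activations of one layer."""
    z_fp = F.normalize(a_fp.flatten(1), dim=1)    # unit-norm FP embeddings
    z_bin = F.normalize(a_bin.flatten(1), dim=1)  # unit-norm binary embeddings
    logits = z_bin @ z_fp.t() / temperature       # (batch, batch) similarity matrix
    targets = torch.arange(z_fp.size(0), device=z_fp.device)  # positives on the diagonal
    # Every off-diagonal entry acts as a negative pair (batch - 1 negatives per sample).
    return F.cross_entropy(logits, targets)

# Hypothetical usage: add the contrastive term to the task loss while training a BNN,
# where a_bin = Binarize(a_fp) and lambda_mi weights the MI-maximization term.
# loss = task_loss + lambda_mi * contrastive_bnn_loss(a_fp, a_bin)
```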

research
05/03/2022

Do More Negative Samples Necessarily Hurt in Contrastive Learning?

Recent investigations in noise contrastive estimation suggest, both empi...
research
11/20/2022

Towards Generalizable Graph Contrastive Learning: An Information Theory Perspective

Graph contrastive learning (GCL) emerges as the most representative appr...
research
08/17/2022

AdaBin: Improving Binary Neural Networks with Adaptive Binary Sets

This paper studies the Binary Neural Networks (BNNs) in which weights an...
research
09/26/2019

Balanced Binary Neural Networks with Gated Residual

Binary neural networks have attracted much attention in recent years...
research
12/20/2022

Redistribution of Weights and Activations for AdderNet Quantization

Adder Neural Network (AdderNet) provides a new way for developing energy...
research
04/04/2022

Soft Threshold Ternary Networks

Large neural networks are difficult to deploy on mobile devices because ...
research
05/13/2023

GSB: Group Superposition Binarization for Vision Transformer with Limited Training Samples

Affected by the massive amount of parameters, ViT usually suffers from s...
