FATNN: Fast and Accurate Ternary Neural Networks

08/12/2020
by Peng Chen, et al.

Ternary Neural Networks (TNNs) have received much attention because they are potentially orders of magnitude faster at inference, and far more power efficient, than their full-precision counterparts. However, encoding the ternary representation requires 2 bits, even though only 3 of the 4 available quantization levels are used. As a result, conventional TNNs have memory consumption and speed similar to standard 2-bit models, but weaker representational capability. Moreover, a significant accuracy gap remains between TNNs and full-precision networks, hampering their deployment in real applications. To tackle these two challenges, in this work we first show that, under some mild constraints, the computational complexity of the ternary inner product can be reduced by a factor of 2. Second, to mitigate the performance gap, we carefully design an implementation-dependent ternary quantization algorithm. The proposed framework is termed Fast and Accurate Ternary Neural Networks (FATNN). Experiments on image classification demonstrate that FATNN surpasses the state of the art in accuracy by a significant margin. More importantly, we evaluate speedups against various precisions on several platforms, which serves as a strong benchmark for further research.
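To make the ternary arithmetic concrete, below is a minimal sketch of how a ternary inner product can be computed with bitwise operations and popcounts, using a two-bitmask (positive/negative) encoding of the levels {-1, 0, +1}. This is only an illustration of the general technique: the function names (`ternarize`, `encode`, `ternary_dot`) and the threshold-based quantizer are assumptions for the example, not the FATNN algorithm or the constrained encoding that yields the 2x reduction described in the abstract.

```python
# Illustrative sketch only (not the FATNN method): threshold ternarization
# plus a bitwise ternary dot product over a (pos, neg) bitmask encoding.

def ternarize(values, threshold):
    """Map each float to {-1, 0, +1} with a symmetric threshold (assumed quantizer)."""
    return [0 if abs(v) < threshold else (1 if v > 0 else -1) for v in values]

def encode(ternary):
    """Pack a ternary vector into two bitmasks: pos marks +1 entries, neg marks -1."""
    pos = neg = 0
    for i, t in enumerate(ternary):
        if t == 1:
            pos |= 1 << i
        elif t == -1:
            neg |= 1 << i
    return pos, neg

def ternary_dot(a, b):
    """Dot product of two encoded ternary vectors via AND + popcount.
    Matching signs contribute +1, opposite signs -1, and zeros drop out."""
    a_pos, a_neg = a
    b_pos, b_neg = b
    plus = (a_pos & b_pos) | (a_neg & b_neg)   # positions where the product is +1
    minus = (a_pos & b_neg) | (a_neg & b_pos)  # positions where the product is -1
    return bin(plus).count("1") - bin(minus).count("1")

x = ternarize([0.9, -0.2, -1.3, 0.05, 0.7], threshold=0.3)   # [1, 0, -1, 0, 1]
y = ternarize([0.4, 1.1, -0.8, -0.6, -0.9], threshold=0.3)   # [1, 1, -1, -1, -1]
print(ternary_dot(encode(x), encode(y)))                      # 1 + 1 - 1 = 1
```

On real hardware the bitmasks span machine words and the popcount is a single instruction, which is what makes binary and ternary inner products fast; the paper's contribution is showing that, under mild constraints on the encoding, the number of such bitwise operations can be halved.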


