FP8 is a natural progression for accelerating deep learning training inf...
Quantization techniques can reduce the size of Deep Neural Networks and ...
We show that selecting a fixed precision for all activations in Convolut...
We show that, during inference with Convolutional Neural Networks (CNNs)...
Tartan (TRT), a hardware accelerator for inference with Deep Neural Netw...
Loom (LM), a hardware inference accelerator for Convolutional Neural Net...
Stripes is a Deep Neural Network (DNN) accelerator that uses bit-serial ...
This work investigates how using reduced precision data in Convolutional...