MQBench: Towards Reproducible and Deployable Model Quantization Benchmark

11/05/2021
by   Yuhang Li, et al.

Model quantization has emerged as an indispensable technique to accelerate deep learning inference. While researchers continue to push the frontier of quantization algorithms, existing quantization work is often unreproducible and undeployable, because researchers do not use consistent training pipelines and ignore the requirements of hardware deployment. In this work, we propose the Model Quantization Benchmark (MQBench), a first attempt to evaluate, analyze, and benchmark the reproducibility and deployability of model quantization algorithms. We choose multiple platforms for real-world deployment, including CPU, GPU, ASIC, and DSP, and evaluate a wide range of state-of-the-art quantization algorithms under a unified training pipeline. MQBench acts as a bridge connecting the algorithm and the hardware. We conduct a comprehensive analysis and find a considerable number of insights, some intuitive and some counter-intuitive. By aligning the training settings, we find that existing algorithms have about the same performance on the conventional academic track. For hardware-deployable quantization, however, there is a huge accuracy gap that remains unsettled. Surprisingly, no existing algorithm wins every challenge in MQBench, and we hope this work can inspire future research directions.

research
03/11/2022

Democratizing Contrastive Language-Image Pre-training: A CLIP Benchmark of Data, Model, and Supervision

Contrastive Language-Image Pretraining (CLIP) has emerged as a novel par...
research
12/01/2021

Hardware-friendly Deep Learning by Network Quantization and Binarization

Quantization is emerging as an efficient approach to promote hardware-fr...
research
03/27/2021

Automated Backend-Aware Post-Training Quantization

Quantization is a key technique to reduce the resource requirement and i...
research
03/10/2022

An Empirical Study of Low Precision Quantization for TinyML

Tiny machine learning (tinyML) has emerged during the past few years aim...
research
02/12/2021

Confounding Tradeoffs for Neural Network Quantization

Many neural network quantization techniques have been developed to decre...
research
10/25/2021

Demystifying and Generalizing BinaryConnect

BinaryConnect (BC) and its many variations have become the de facto stan...
research
07/19/2023

Self-Supervised Learning for WiFi CSI-Based Human Activity Recognition: A Systematic Study

Recently, with the advancement of the Internet of Things (IoT), WiFi CSI...
