This work studies post-training parameter quantization in large language...
Pruning is a popular technique for reducing the model size and computati...
A black-box spectral method is introduced for evaluating the adversarial...
Deploying deep learning models on embedded systems for computer vision t...
FPGAs provide a flexible and efficient platform to accelerate rapidly-ch...
Quantization is a promising approach for reducing the inference time and...
Quantization is an effective method for reducing memory footprint and in...