FAQ: Mitigating the Impact of Faults in the Weight Memory of DNN Accelerators through Fault-Aware Quantization

by Muhammad Abdullah Hanif, et al.

Permanent faults induced by imperfections in the manufacturing process of Deep Neural Network (DNN) accelerators are a major concern, as they negatively impact the manufacturing yield of the chip fabrication process. Fault-aware training is the state-of-the-art approach for mitigating such faults; however, it incurs huge retraining overheads, especially when used for large DNNs trained on complex datasets. To address this issue, we propose a novel Fault-Aware Quantization (FAQ) technique for mitigating the effects of stuck-at permanent faults in the on-chip weight memory of DNN accelerators at a negligible overhead compared to fault-aware retraining, while offering comparable accuracy. We propose a lookup table-based algorithm to achieve ultra-low model conversion time. We present an extensive evaluation of the proposed approach using five different DNNs, i.e., ResNet-18, VGG11, VGG16, AlexNet, and MobileNetV2, and three different datasets, i.e., CIFAR-10, CIFAR-100, and ImageNet. The results demonstrate that FAQ helps maintain the baseline accuracy of the DNNs at low and moderate fault rates without involving costly fault-aware training. For example, for ResNet-18 trained on the CIFAR-10 dataset, at a 0.04 fault rate, FAQ offers (on average) an increase of 76.38% in accuracy and, in a second evaluated configuration at the same fault rate, an increase of 70.47%. The results also show that FAQ incurs negligible overheads, i.e., less than 5% of the time required to run 1 epoch of retraining. We additionally demonstrate the efficacy of our technique when used in conjunction with fault-aware retraining and show that using FAQ inside fault-aware retraining enables fast accuracy recovery.
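To make the core idea concrete, the following is a minimal illustrative sketch (not the paper's exact algorithm) of a lookup table-based mapping for stuck-at faults: given per-cell stuck-at-0 and stuck-at-1 bit masks, the hardware forces certain bits of any stored weight code, so we precompute, for every intended 8-bit code, the achievable stored code whose value is numerically closest. All mask values and the 8-bit unsigned code format here are assumptions for illustration.

```python
def forced(code: int, sa0: int, sa1: int) -> int:
    """Value actually stored in a faulty memory cell:
    stuck-at-0 bits are cleared, stuck-at-1 bits are set."""
    return (code & ~sa0 & 0xFF) | sa1

def build_lut(sa0: int, sa1: int) -> list:
    """For each intended 8-bit code, pick the code whose stored
    (forced) value is numerically closest to the intended one."""
    # Set of values that can actually appear in the faulty cell.
    achievable = sorted({forced(c, sa0, sa1) for c in range(256)})
    # Map every intended code to its nearest achievable value.
    return [min(achievable, key=lambda s: abs(s - t)) for t in range(256)]

# Hypothetical fault pattern: bit 0 stuck at 0, bit 7 stuck at 1.
lut = build_lut(sa0=0b00000001, sa1=0b10000000)
```

For this fault pattern, naively storing the code 127 yields 254 after bit forcing (an error of 127), whereas the lookup table maps 127 to the achievable code 128 (an error of 1), which is the kind of large accuracy-preserving correction a fault-aware mapping provides without any retraining.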




Reduce: A Framework for Reducing the Overheads of Fault-Aware Retraining

Fault-aware retraining has emerged as a prominent technique for mitigati...

eFAT: Improving the Effectiveness of Fault-Aware Training for Mitigating Permanent Faults in DNN Hardware Accelerators

Fault-Aware Training (FAT) has emerged as a highly effective technique f...

Yield Loss Reduction and Test of AI and Deep Learning Accelerators

With data-driven analytics becoming mainstream, the global demand for de...

Analyzing and Mitigating the Impact of Permanent Faults on a Systolic Array Based Neural Network Accelerator

Due to their growing popularity and computational cost, deep neural netw...

RescueSNN: Enabling Reliable Executions on Spiking Neural Network Accelerators under Permanent Faults

To maximize the performance and energy efficiency of Spiking Neural Netw...

FlatENN: Train Flat for Enhanced Fault Tolerance of Quantized Deep Neural Networks

Model compression via quantization and sparsity enhancement has gained a...

Pinning Fault Mode Modeling for DWM Shifting

Extreme scaling for purposes of achieving higher density and lower energ...
