RescueSNN: Enabling Reliable Executions on Spiking Neural Network Accelerators under Permanent Faults

To maximize the performance and energy efficiency of Spiking Neural Network (SNN) processing on resource-constrained embedded systems, specialized hardware accelerators/chips are employed. However, these SNN chips may suffer from permanent faults, which can affect the functionality of the weight memory and the behavior of neurons, thereby causing potentially significant accuracy degradation and system malfunction. Such permanent faults may come from manufacturing defects during fabrication and/or from device/transistor damage (e.g., due to wear-out) during run-time operation. Yet the impact of permanent faults in SNN chips and the respective mitigation techniques have not been thoroughly investigated. Toward this, we propose RescueSNN, a novel methodology to mitigate permanent faults in the compute engine of SNN chips without requiring additional retraining, thereby significantly cutting down the design time and retraining costs while maintaining throughput and quality. The key ideas of our RescueSNN methodology are (1) analyzing the characteristics of SNNs under permanent faults; (2) leveraging this analysis to improve SNN fault tolerance through effective fault-aware mapping (FAM); and (3) devising lightweight hardware enhancements to support FAM. Our FAM technique leverages the fault map of the SNN compute engine for (i) minimizing weight corruption when mapping weight bits onto faulty memory cells, and (ii) selectively employing faulty neurons that do not cause significant accuracy degradation, so as to maintain accuracy and throughput while considering the SNN operations and processing dataflow. The experimental results show that our RescueSNN improves accuracy by up to 80% while maintaining the throughput reduction below 25% at a high fault rate (e.g., 0.5 of the potential fault locations), as compared to running SNNs on the faulty chip without mitigation.
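To illustrate idea (i) of FAM, the following is a minimal sketch of how a fault map could steer weight bits away from damaged cells. It is not the mechanism described in the paper: the 8-bit word width, the stuck-at-0 fault model, the use of a per-word bit rotation, and all function names (`corruption_cost`, `best_rotation`, `store`, `load`, `read_back`) are assumptions made for illustration only.

```python
from typing import List

WORD_BITS = 8  # assumed weight precision (hypothetical)
MASK = (1 << WORD_BITS) - 1


def corruption_cost(faulty_cells: List[int], rotation: int) -> int:
    """Worst-case weight error for a given rotation.

    With a left rotation by `rotation`, memory cell c holds weight bit
    (c - rotation) mod WORD_BITS, so a fault in cell c can change the
    weight by 2 ** ((c - rotation) mod WORD_BITS).
    """
    return sum(1 << ((c - rotation) % WORD_BITS) for c in faulty_cells)


def best_rotation(faulty_cells: List[int]) -> int:
    """Pick the rotation that places faulty cells on the least significant bits."""
    return min(range(WORD_BITS), key=lambda r: corruption_cost(faulty_cells, r))


def store(weight: int, rotation: int) -> int:
    """Rotate the weight left so that bit b lands in cell (b + rotation) mod WORD_BITS."""
    return ((weight << rotation) | (weight >> (WORD_BITS - rotation))) & MASK


def load(word: int, rotation: int) -> int:
    """Undo the rotation applied by store()."""
    return ((word >> rotation) | (word << (WORD_BITS - rotation))) & MASK


def read_back(weight: int, faulty_cells: List[int], rotation: int) -> int:
    """Write a weight, apply stuck-at-0 faults (assumed fault model), read it back."""
    word = store(weight, rotation)
    for c in faulty_cells:
        word &= ~(1 << c)  # the faulty cell always reads 0
    return load(word, rotation)


if __name__ == "__main__":
    fault_map = [7]        # the cell that would normally hold the MSB is faulty
    w = 0b10110101         # example weight value
    r = best_rotation(fault_map)
    print("unmapped read:", read_back(w, fault_map, 0))
    print("fault-aware read:", read_back(w, fault_map, r))
```

In this sketch the rotation index would have to be kept per memory word (or per group of words) and undone on every read, which is the kind of small bookkeeping that idea (3), lightweight hardware support for FAM, would have to provide.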




SoftSNN: Low-Cost Fault Tolerance for Spiking Neural Network Accelerators under Soft Errors

Specialized hardware accelerators have been designed and employed to max...

ReSpawn: Energy-Efficient Fault-Tolerance for Spiking Neural Networks considering Unreliable Memories

Spiking neural networks (SNNs) have shown a potential for having low ene...

Exposing Reliability Degradation and Mitigation in Approximate DNNs under Permanent Faults

Approximate computing is known for enhancing deep neural network acceler...

Improving Reliability of Spiking Neural Networks through Fault Aware Threshold Voltage Optimization

Spiking neural networks have made breakthroughs in computer vision by le...

FAQ: Mitigating the Impact of Faults in the Weight Memory of DNN Accelerators through Fault-Aware Quantization

Permanent faults induced due to imperfections in the manufacturing proce...

FlatENN: Train Flat for Enhanced Fault Tolerance of Quantized Deep Neural Networks

Model compression via quantization and sparsity enhancement has gained a...

Yield Loss Reduction and Test of AI and Deep Learning Accelerators

With data-driven analytics becoming mainstream, the global demand for de...
