AnalogNets: ML-HW Co-Design of Noise-robust TinyML Models and Always-On Analog Compute-in-Memory Accelerator

by   Chuteng Zhou, et al.

Always-on TinyML perception tasks in IoT applications require very high energy efficiency. Analog compute-in-memory (CiM) using non-volatile memory (NVM) promises high efficiency and also provides self-contained on-chip model storage. However, analog CiM introduces new practical considerations, including conductance drift, read/write noise, fixed analog-to-digital (ADC) converter gain, etc. These additional constraints must be addressed to achieve models that can be deployed on analog CiM with acceptable accuracy loss. This work describes AnalogNets: TinyML models for the popular always-on applications of keyword spotting (KWS) and visual wake words (VWW). The model architectures are specifically designed for analog CiM, and we detail a comprehensive training methodology, to retain accuracy in the face of analog non-idealities, and low-precision data converters at inference time. We also describe AON-CiM, a programmable, minimal-area phase-change memory (PCM) analog CiM accelerator, with a novel layer-serial approach to remove the cost of complex interconnects associated with a fully-pipelined design. We evaluate the AnalogNets on a calibrated simulator, as well as real hardware, and find that accuracy degradation is limited to 0.8%/1.2% after 24 hours of PCM drift (8-bit) for KWS/VWW. AnalogNets running on the 14nm AON-CiM accelerator demonstrate 8.58/4.37 TOPS/W for KWS/VWW workloads using 8-bit activations, respectively, and increasing to 57.39/25.69 TOPS/W with 4-bit activations.


page 7

page 9

page 16

page 17


A Microprocessor implemented in 65nm CMOS with Configurable and Bit-scalable Accelerator for Programmable In-memory Computing

This paper presents a programmable in-memory-computing processor, demons...

A Charge Domain P-8T SRAM Compute-In-Memory with Low-Cost DAC/ADC Operation for 4-bit Input Processing

This paper presents a low cost PMOS-based 8T (P-8T) SRAM Compute-In-Memo...

Variability-Aware Training and Self-Tuning of Highly Quantized DNNs for Analog PIM

DNNs deployed on analog processing in memory (PIM) architectures are sub...

Fundamental Limits on Energy-Delay-Accuracy of In-memory Architectures in Inference Applications

This paper obtains fundamental limits on the computational precision of ...

Mitigating Adversarial Attack for Compute-in-Memory Accelerator Utilizing On-chip Finetune

Compute-in-memory (CIM) has been proposed to accelerate the convolution ...

Network insensitivity to parameter noise via adversarial regularization

Neuromorphic neural network processors, in the form of compute-in-memory...

Please sign up or login with your details

Forgot password? Click here to reset