Efficient Implementation of Multi-Channel Convolution in Monolithic 3D ReRAM Crossbar
Convolutional neural networks (CNNs) demonstrate promising accuracy in a wide range of applications. Among all layers in CNNs, convolution layers are the most computation-intensive and consume the most energy. As device and fabrication technology matures, 3D resistive random access memory (ReRAM) is receiving substantial attention for accelerating large vector-matrix multiplication and convolution because of its high parallelism and energy efficiency. However, implementing multi-channel convolution naively in 3D ReRAM either produces incorrect results or exploits only part of the parallelism 3D ReRAM offers. In this paper, we propose a 3D ReRAM-based convolution accelerator architecture that efficiently maps multi-channel convolution to monolithic 3D ReRAM. Our design rests on two key principles. First, we exploit the intertwined structure of 3D ReRAM to implement multi-channel convolution using a state-of-the-art convolution algorithm. Second, we propose a new approach that efficiently implements negative weights by separating them from non-negative weights using configurable interconnects. Our evaluation demonstrates that our mapping scheme in 16-layer 3D ReRAM achieves speedups of 5.79X, 927.81X, and 36.8X over a custom 2D ReRAM baseline, a state-of-the-art CPU, and a state-of-the-art GPU, respectively. Our design also reduces energy consumption by 2.12X, 1802.64X, and 114.1X, respectively, compared with the same baselines.
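The idea of separating negative from non-negative weights can be illustrated in software with a minimal sketch. It assumes only the general arithmetic identity that a signed weight matrix can be written as the difference of two non-negative matrices; it does not model the configurable interconnects or the 3D ReRAM mapping proposed in the paper. The function names `split_weights`, `vmm`, and `signed_vmm` are hypothetical and chosen for illustration.

```python
def split_weights(weights):
    """Split a signed weight matrix into two non-negative matrices.

    Returns (positive, negative) such that
    weights[i][j] == positive[i][j] - negative[i][j].
    """
    positive = [[max(w, 0) for w in row] for row in weights]
    negative = [[max(-w, 0) for w in row] for row in weights]
    return positive, negative


def vmm(vector, matrix):
    """Vector-matrix product: one accumulated output per matrix column."""
    cols = len(matrix[0])
    return [
        sum(vector[i] * matrix[i][j] for i in range(len(vector)))
        for j in range(cols)
    ]


def signed_vmm(vector, weights):
    """Compute a signed product from two non-negative products."""
    positive, negative = split_weights(weights)
    pos_out = vmm(vector, positive)  # contribution of non-negative weights
    neg_out = vmm(vector, negative)  # contribution of negative weights (as magnitudes)
    return [p - n for p, n in zip(pos_out, neg_out)]


if __name__ == "__main__":
    x = [1, 2, 3]
    w = [[1, -2], [-3, 4], [5, -6]]
    print(signed_vmm(x, w))  # [10, -12], identical to vmm(x, w)
```

Each of the two partial products uses only non-negative values, which is why this kind of decomposition suits arrays whose cells can store only non-negative quantities; the subtraction is the only step that handles sign.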