A Unified Hardware Architecture for Convolutions and Deconvolutions in CNN
This paper proposes a scalable neural network hardware architecture for image segmentation. Convolution and deconvolution operations share the same computing resources and are handled by a single processing element array. In addition, access to on-chip and off-chip memories is optimized to alleviate the burden introduced by partial sums. As an example, SegNet-Basic has been implemented with the proposed unified architecture on a Xilinx ZC706 FPGA, achieving 151.5 GOPS for convolution and 94.3 GOPS for deconvolution. The unified convolution/deconvolution design is applicable to other CNNs that use deconvolution.
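The abstract does not say how the two operations are mapped onto the shared array. As a rough software illustration of why one multiply-accumulate loop nest can serve both, the sketch below uses a common textbook formulation: a deconvolution (transposed convolution) is rewritten as an ordinary convolution over a zero-inserted, padded input with a flipped kernel. This is an assumption for illustration only and may differ from the paper's actual dataflow; the function names `conv2d_valid`, `deconv2d`, and `deconv2d_reference` are hypothetical.

```python
def conv2d_valid(x, w):
    """Plain 2-D 'valid' sliding-window MAC loop (no kernel flip)."""
    kh, kw = len(w), len(w[0])
    oh, ow = len(x) - kh + 1, len(x[0]) - kw + 1
    out = [[0] * ow for _ in range(oh)]
    for i in range(oh):
        for j in range(ow):
            acc = 0
            for a in range(kh):
                for b in range(kw):
                    acc += x[i + a][j + b] * w[a][b]
            out[i][j] = acc
    return out


def deconv2d(x, w, stride):
    """Transposed convolution that reuses conv2d_valid.

    Assumed formulation: insert (stride - 1) zeros between input pixels,
    pad each border by (kernel size - 1), flip the kernel, then convolve.
    """
    kh, kw = len(w), len(w[0])
    h, wd = len(x), len(x[0])
    ph, pw = kh - 1, kw - 1
    up_h = (h - 1) * stride + 1 + 2 * ph
    up_w = (wd - 1) * stride + 1 + 2 * pw
    up = [[0] * up_w for _ in range(up_h)]
    for i in range(h):
        for j in range(wd):
            up[ph + i * stride][pw + j * stride] = x[i][j]
    flipped = [row[::-1] for row in w[::-1]]
    return conv2d_valid(up, flipped)


def deconv2d_reference(x, w, stride):
    """Direct scatter-add definition, used only to check deconv2d."""
    kh, kw = len(w), len(w[0])
    h, wd = len(x), len(x[0])
    out_h = (h - 1) * stride + kh
    out_w = (wd - 1) * stride + kw
    out = [[0] * out_w for _ in range(out_h)]
    for i in range(h):
        for j in range(wd):
            for a in range(kh):
                for b in range(kw):
                    out[i * stride + a][j * stride + b] += x[i][j] * w[a][b]
    return out
```

In this naive form most of the multiplications in `deconv2d` hit inserted zeros, so a hardware design would presumably skip or reorganize them; the sketch only shows the functional equivalence, not an efficient schedule.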