FADEC: FPGA-based Acceleration of Video Depth Estimation by HW/SW Co-design

12/01/2022
by Nobuho Hashimoto, et al.

3D reconstruction from videos has become increasingly popular for applications such as navigation for autonomous robots and drones, augmented reality (AR), and 3D modeling. This task often combines traditional image/video processing algorithms with deep neural networks (DNNs). Although recent advances in deep learning have improved the accuracy of the task, the large number of computations involved results in low speed and high power consumption. Moreover, while various domain-specific hardware accelerators exist for DNNs, it is not easy to accelerate the entire pipeline of applications that alternate between traditional image/video processing algorithms and DNNs. FPGA-based end-to-end acceleration is therefore required for such complicated applications in low-power embedded environments. This paper proposes a novel FPGA-based accelerator for DeepVideoMVS, a DNN-based depth estimation method for 3D reconstruction. We employ HW/SW co-design to appropriately utilize the heterogeneous components of modern SoC FPGAs, such as the programmable logic (PL) and CPU, according to the inherent characteristics of the method. Because some operations are unsuitable for hardware implementation, we determine which operations to implement in software by analyzing how often each operation is performed and its memory access pattern, and by weighing the ease of hardware implementation against the expected degree of acceleration. The hardware and software implementations are executed in parallel on the PL and CPU to hide their execution latencies. The proposed accelerator was developed on a Xilinx ZCU104 board using NNgen, an open-source high-level synthesis (HLS) tool. Experiments showed that the proposed accelerator runs 60.2 times faster than a software-only implementation on the same FPGA board, with minimal accuracy degradation.
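The latency-hiding idea in the abstract, running the PL (hardware) and CPU (software) partitions of each pipeline stage concurrently so their execution times overlap rather than add up, can be sketched as follows. This is a minimal illustrative model, not the paper's implementation: `run_on_pl` and `run_on_cpu` are hypothetical stand-ins for the NNgen-generated DNN kernels on the PL and the software-mapped operations on the CPU, respectively.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for the DNN kernels offloaded to the
# programmable logic (PL) of the SoC FPGA.
def run_on_pl(frame):
    return {"depth_features": f"pl({frame})"}

# Hypothetical stand-in for the operations kept in software on the
# CPU (e.g. those with irregular memory-access patterns that map
# poorly to hardware).
def run_on_cpu(frame):
    return {"aux": f"cpu({frame})"}

def process_video(frames):
    """Process frames, overlapping PL and CPU work per frame."""
    results = []
    with ThreadPoolExecutor(max_workers=2) as pool:
        for frame in frames:
            # Launch both partitions concurrently: per-frame latency
            # becomes max(pl, cpu) instead of pl + cpu.
            hw = pool.submit(run_on_pl, frame)
            sw = pool.submit(run_on_cpu, frame)
            results.append({**hw.result(), **sw.result()})
    return results
```

In the actual system the two partitions are separate physical resources (PL fabric and ARM cores), so this overlap comes essentially for free once the dependency structure of the method permits it; the thread pool here merely models that concurrency in software.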
