VPFNet: Improving 3D Object Detection with Virtual Point based LiDAR and Stereo Data Fusion

11/29/2021
by Hanqi Zhu, et al.

It is well recognized that fusing the complementary information from depth-aware LiDAR point clouds and semantically rich stereo images benefits 3D object detection. Nevertheless, it is not trivial to explore the inherently unnatural interaction between sparse 3D points and dense 2D pixels. To ease this difficulty, recent proposals generally project the 3D points onto the 2D image plane to sample the image data and then aggregate the data at the points. However, this approach often suffers from the mismatch between the resolution of point clouds and RGB images, leading to sub-optimal performance. Specifically, taking the sparse points as the multi-modal data aggregation locations causes severe information loss for high-resolution images, which in turn undermines the effectiveness of multi-sensor fusion. In this paper, we present VPFNet, a new architecture that aligns and aggregates the point cloud and image data at 'virtual' points. In particular, with a density lying between that of the 3D points and the 2D pixels, the virtual points can bridge the resolution gap between the two sensors and thus preserve more information for processing. We also investigate data augmentation techniques that can be applied to both point clouds and RGB images, as data augmentation has made a non-negligible contribution to 3D object detectors to date. We have conducted extensive experiments on the KITTI dataset and observed good performance compared with state-of-the-art methods. Remarkably, our VPFNet achieves 83.21% moderate 3D AP and 91.86% moderate BEV AP on the KITTI test set, ranking 1st as of May 21st, 2021. The network design also takes computational efficiency into consideration: it runs at 15 FPS on a single NVIDIA RTX 2080Ti GPU. The code will be made available for reproduction and further investigation.
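The project-then-sample fusion scheme described in the abstract can be sketched as follows. This is only an illustrative stand-in, not the paper's actual pipeline: the pinhole projection is standard, but the virtual-point generator here (naive interpolation between neighboring points) and the nearest-neighbor feature sampler are simplifying assumptions made for the example.

```python
import numpy as np

def project_to_image(points_xyz, K):
    """Project 3D camera-frame points onto the image plane (pinhole model)."""
    uvw = points_xyz @ K.T            # (N, 3) homogeneous pixel coordinates
    return uvw[:, :2] / uvw[:, 2:3]   # (N, 2) pixel coordinates (u, v)

def make_virtual_points(points_xyz, factor=4):
    """Densify a sparse point set by interpolating between consecutive points.

    Stand-in for the paper's virtual-point generation: inserts `factor - 1`
    evenly spaced points between each neighboring pair, giving a density
    between the raw LiDAR points and the dense image pixels.
    """
    virtual = []
    for a, b in zip(points_xyz[:-1], points_xyz[1:]):
        for t in np.linspace(0.0, 1.0, factor, endpoint=False):
            virtual.append((1 - t) * a + t * b)
    virtual.append(points_xyz[-1])
    return np.stack(virtual)

def sample_image_features(image, uv):
    """Gather per-point image features with nearest-neighbor sampling."""
    h, w = image.shape[:2]
    u = np.clip(np.round(uv[:, 0]).astype(int), 0, w - 1)
    v = np.clip(np.round(uv[:, 1]).astype(int), 0, h - 1)
    return image[v, u]                # (N, C) image values per point
```

With three LiDAR points and `factor=4`, `make_virtual_points` yields nine points, illustrating how the aggregation locations become denser than the raw cloud while remaining far sparser than the pixel grid; projecting those virtual points and calling `sample_image_features` then attaches image data to each one.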


Related research

- 12/10/2020: R-AGNO-RPN: A LiDAR-Camera Region Deep Network for Resolution-Agnostic Detection
- 01/29/2020: ImVoteNet: Boosting 3D Object Detection in Point Clouds with Image Votes
- 11/08/2021: Frustum Fusion: Pseudo-LiDAR and LiDAR Fusion for 3D Detection
- 03/18/2022: Sparse Fuse Dense: Towards High Quality 3D Detection with Depth Completion
- 11/14/2019: PI-RCNN: An Efficient Multi-sensor 3D Object Detector with Point-based Attentive Cont-conv Fusion Module
- 02/21/2022: LiDAR-guided Stereo Matching with a Spatial Consistency Constraint
- 03/04/2023: Virtual Sparse Convolution for Multimodal 3D Object Detection
