Recursive Cross-View: Use Only 2D Detectors to Achieve 3D Object Detection without 3D Annotations

by Shun Gui, et al.

Heavy reliance on 3D annotations limits the real-world application of 3D object detection. In this paper, we propose a method that requires no 3D annotation yet predicts full-oriented 3D bounding boxes. Our method, called Recursive Cross-View (RCV), uses the three-view principle to transform 3D detection into several 2D detection tasks, which consume only 2D labels. We propose a recursive paradigm in which instance segmentation and 3D bounding box generation by Cross-View are applied recursively until convergence. Specifically, a 2D detector first proposes a frustum, and the recursive paradigm then outputs a full-oriented 3D box, class, and score. To show that our method can be quickly applied to new tasks in real-world scenarios, we conduct three experiments: indoor 3D human detection, full-oriented 3D hand detection, and real-time detection on a real 3D sensor. RCV achieves decent performance in these experiments. Once trained, our method can be viewed as a 3D annotation tool. Consequently, we build two 3D-labeled datasets, namely '3D_HUMAN' and 'D_HAND', based on RCV, which can be used to pre-train other 3D detectors. Furthermore, evaluated on the SUN RGB-D benchmark, our method achieves performance comparable to some fully 3D-supervised learning methods. RCV is the first 3D detection method that does not consume 3D labels and yields full-oriented 3D boxes on point clouds.
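
The frustum proposal step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a pinhole camera with hypothetical intrinsics `fx`, `fy`, `cx`, `cy`, points given in camera coordinates with `z` pointing forward, and a 2D box in pixel coordinates. The function name `crop_frustum` is introduced here for illustration only.

```python
from typing import List, Tuple

Point = Tuple[float, float, float]         # (x, y, z) in camera coordinates, z forward
Box2D = Tuple[float, float, float, float]  # (u_min, v_min, u_max, v_max) in pixels


def crop_frustum(points: List[Point], box: Box2D,
                 fx: float, fy: float, cx: float, cy: float) -> List[Point]:
    """Keep the points whose pinhole projection lands inside the 2D box.

    Assumption: a simple pinhole model without lens distortion.
    """
    u_min, v_min, u_max, v_max = box
    kept = []
    for x, y, z in points:
        if z <= 0:  # at or behind the camera plane, so it cannot be projected
            continue
        u = fx * x / z + cx  # horizontal pixel coordinate
        v = fy * y / z + cy  # vertical pixel coordinate
        if u_min <= u <= u_max and v_min <= v <= v_max:
            kept.append((x, y, z))
    return kept


# Toy data: four points and one 2D detection box around the image centre.
points = [(0.0, 0.0, 2.0), (1.5, 0.0, 2.0), (0.1, -0.1, 4.0), (0.0, 0.0, -1.0)]
box = (300.0, 220.0, 340.0, 260.0)
frustum = crop_frustum(points, box, fx=500.0, fy=500.0, cx=320.0, cy=240.0)
```

In this toy example, the second point projects far to the right of the box and the fourth lies behind the camera, so only the first and third points remain in the frustum. In a pipeline like the one the abstract describes, the retained points would then be passed to the recursive segmentation and box-generation stage.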




H2RBox: Horizontal Box Annotation is All You Need for Oriented Object Detection

Oriented object detection emerges in many applications from aerial image...

Ensembling object detectors for image and video data analysis

In this paper, we propose a method for ensembling the outputs of multipl...

Learning to Predict the 3D Layout of a Scene

While 2D object detection has improved significantly over the past, real...

Towards Toxic and Narcotic Medication Detection with Rotated Object Detector

Recent years have witnessed the advancement of deep learning vision tech...

MAP-Gen: An Automated 3D-Box Annotation Flow with Multimodal Attention Point Generator

Manually annotating 3D point clouds is laborious and costly, limiting th...

UFO^2: A Unified Framework towards Omni-supervised Object Detection

Existing work on object detection often relies on a single form of annot...

Label-Guided Auxiliary Training Improves 3D Object Detector

Detecting 3D objects from point clouds is a practical yet challenging ta...
