DID-M3D: Decoupling Instance Depth for Monocular 3D Object Detection

07/18/2022
by   Liang Peng, et al.

Monocular 3D detection has drawn much attention from the community due to its low cost and simple setup. It takes an RGB image as input and predicts 3D boxes in 3D space. The most challenging sub-task is instance depth estimation. Previous works usually estimate the instance depth directly. However, in this paper we point out that the instance depth on the RGB image is non-intuitive: it couples visual depth clues with instance attribute clues, which makes it hard for the network to learn directly. Therefore, we propose to reformulate the instance depth as the combination of the instance visual surface depth (visual depth) and the instance attribute depth (attribute depth). The visual depth is related to an object's appearance and position in the image. By contrast, the attribute depth relies on the object's inherent attributes, which are invariant to affine transformations of the object in the image. Correspondingly, we decouple the 3D location uncertainty into visual depth uncertainty and attribute depth uncertainty. By combining the different types of depth and their associated uncertainties, we obtain the final instance depth. Furthermore, data augmentation in monocular 3D detection is usually limited by the physical nature of the task, which hinders performance gains. The proposed instance depth disentanglement strategy alleviates this problem. Evaluated on KITTI, our method achieves new state-of-the-art results, and extensive ablation studies validate the effectiveness of each component. The code is released at https://github.com/SPengLiang/DID-M3D.
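To make the decoupling concrete, the sketch below shows one way the two depth terms and their uncertainties could be combined into a single instance depth. It is an illustration under stated assumptions, not the released implementation: the abstract does not give the combination formulas, so the additive depth, the quadrature sum of uncertainties, and the exp(-u) confidence weighting are all assumed here, and the function name `combine_instance_depth` is hypothetical.

```python
import math


def combine_instance_depth(visual_depths, attribute_depths,
                           visual_uncerts, attribute_uncerts):
    """Fuse per-cell (visual, attribute) depth pairs into one instance depth.

    Illustrative only. Each argument is a list with one value per sampled
    cell inside an object's region.
    """
    n = len(visual_depths)
    if not (n and n == len(attribute_depths) == len(visual_uncerts)
            == len(attribute_uncerts)):
        raise ValueError("inputs must be non-empty and of equal length")

    # Assumption: instance depth per cell is the sum of the two components.
    depths = [v + a for v, a in zip(visual_depths, attribute_depths)]

    # Assumption: the two uncertainties are independent and add in quadrature.
    uncerts = [math.hypot(uv, ua)
               for uv, ua in zip(visual_uncerts, attribute_uncerts)]

    # Assumption: a cell's confidence decays exponentially with uncertainty.
    weights = [math.exp(-u) for u in uncerts]
    total = sum(weights)

    # Confidence-weighted average over all cells.
    depth = sum(w * d for w, d in zip(weights, depths)) / total
    return depth, uncerts


# Two cells with equal uncertainty: the result is the plain mean of 21 and 23.
depth_equal, _ = combine_instance_depth([20.0, 22.0], [1.0, 1.0],
                                        [0.5, 0.5], [0.5, 0.5])

# One confident cell and one uncertain cell: the confident cell dominates.
depth_skewed, _ = combine_instance_depth([20.0, 30.0], [1.0, 1.0],
                                         [0.1, 5.0], [0.1, 5.0])
```

Keeping the two terms separate until this final step is what lets each be treated differently upstream. For example, an image-scaling augmentation would change only the visual term and leave the attribute term alone.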

research
07/29/2021

Probabilistic and Geometric Depth: Detecting Objects in Perspective

3D object detection is an important capability needed in various practic...
research
04/06/2021

Objects are Different: Flexible Monocular 3D Object Detection

The precise localization of 3D objects from a single image without depth...
research
09/17/2019

Task-Aware Monocular Depth Estimation for 3D Object Detection

Monocular depth estimation enables 3D perception from a single 2D image,...
research
07/26/2022

Monocular 3D Object Detection with Depth from Motion

Perceiving 3D objects from monocular inputs is crucial for robotic syste...
research
05/27/2020

Center3D: Center-based Monocular 3D Object Detection with Joint Depth Understanding

Localizing objects in 3D space and understanding their associated 3D pro...
research
12/09/2020

ViP-DeepLab: Learning Visual Perception with Depth-aware Video Panoptic Segmentation

In this paper, we present ViP-DeepLab, a unified model attempting to tac...
research
07/21/2022

DEVIANT: Depth EquiVarIAnt NeTwork for Monocular 3D Object Detection

Modern neural networks use building blocks such as convolutions that are...
