Accurate Ground-Truth Depth Image Generation via Overfit Training of Point Cloud Registration using Local Frame Sets

by   Jiwan Kim, et al.

Accurate three-dimensional perception is a fundamental task in several computer vision applications. Recently, commercial RGB-depth (RGB-D) cameras have been widely adopted as single-view depth-sensing devices owing to their efficient depth-sensing abilities. However, the depth quality of most RGB-D sensors remains insufficient owing to the inherent noise from a single-view environment. Recently, several studies have focused on the single-view depth enhancement of RGB-D cameras. Recent research has proposed deep-learning-based approaches that typically train networks using high-quality supervised depth datasets, which indicates that the quality of the ground-truth (GT) depth dataset is a top-most important factor for accurate system; however, such high-quality GT datasets are difficult to obtain. In this study, we developed a novel method for high-quality GT depth generation based on an RGB-D stream dataset. First, we defined consecutive depth frames in a local spatial region as a local frame set. Then, the depth frames were aligned to a certain frame in the local frame set using an unsupervised point cloud registration scheme. The registration parameters were trained based on an overfit-training scheme, which was primarily used to construct a single GT depth image for each frame set. The final GT depth dataset was constructed using several local frame sets, and each local frame set was trained independently. The primary advantage of this study is that a high-quality GT depth dataset can be constructed under various scanning environments using only the RGB-D stream dataset. Moreover, our proposed method can be used as a new benchmark GT dataset for accurate performance evaluations. We evaluated our GT dataset on previously benchmarked GT depth datasets and demonstrated that our method is superior to state-of-the-art depth enhancement frameworks.


page 1

page 3

page 5

page 6

page 7


AutoDepthNet: High Frame Rate Depth Map Reconstruction using Commodity Depth and RGB Cameras

Depth cameras have found applications in diverse fields, such as compute...

UnsupervisedR R: Unsupervised Point Cloud Registration via Differentiable Rendering

Aligning partial views of a scene into a single whole is essential to un...

PointMBF: A Multi-scale Bidirectional Fusion Network for Unsupervised RGB-D Point Cloud Registration

Point cloud registration is a task to estimate the rigid transformation ...

Accurate 3-D Reconstruction with RGB-D Cameras using Depth Map Fusion and Pose Refinement

Depth map fusion is an essential part in both stereo and RGB-D based 3-D...

Depth Completion with RGB Prior

Depth cameras are a prominent perception system for robotics, especially...

A Method to Generate High Precision Mesh Model and RGB-D Datasetfor 6D Pose Estimation Task

Recently, 3D version has been improved greatly due to the development of...

Deep End-to-End Alignment and Refinement for Time-of-Flight RGB-D Module

Recently, it is increasingly popular to equip mobile RGB cameras with Ti...

Please sign up or login with your details

Forgot password? Click here to reset