RAUM-VO: Rotational Adjusted Unsupervised Monocular Visual Odometry

by Claudio Cimarelli, et al.

Unsupervised learning for monocular camera motion and 3D scene understanding has gained popularity over traditional methods that rely on epipolar geometry or non-linear optimization. Notably, deep learning can overcome many issues of monocular vision, such as perceptual aliasing, low-textured areas, scale drift, and degenerate motions. Moreover, unlike supervised learning, it can fully leverage video stream data without the need for depth or motion labels. However, in this work, we note that rotational motion can limit the accuracy of unsupervised pose networks more than the translational component. Therefore, we present RAUM-VO, an approach based on a model-free epipolar constraint for frame-to-frame motion estimation (F2F) that adjusts the rotation during training and online inference. To this end, we match 2D keypoints between consecutive frames using two pre-trained deep networks, SuperPoint and SuperGlue, while training a network for depth and pose estimation with an unsupervised training protocol. Then, we adjust the predicted rotation with the motion estimated by F2F from the 2D matches, initializing the solver with the pose network prediction. Ultimately, RAUM-VO shows a considerable accuracy improvement over other unsupervised pose networks on the KITTI dataset, while being less complex than hybrid or traditional approaches and achieving results comparable to the state of the art.
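To make the rotation adjustment more concrete, the following is a minimal sketch of refining a rotation against the epipolar constraint, starting from an initial guess that stands in for the pose network prediction. It is not the authors' solver, which the abstract does not specify: it assumes the convention X2 = R X1 + t, keypoints already in normalized image coordinates, a fixed translation direction, and a plain Gauss-Newton update on the algebraic residual x2^T [t]x R x1. All function names (`rodrigues`, `epipolar_residuals`, `refine_rotation`, `make_synthetic_matches`) are hypothetical, and the matches are synthetic rather than produced by SuperPoint and SuperGlue.

```python
import numpy as np


def skew(v):
    """Return the 3x3 cross-product matrix [v]x."""
    x, y, z = v
    return np.array([[0.0, -z, y], [z, 0.0, -x], [-y, x, 0.0]])


def rodrigues(w):
    """Convert an axis-angle vector to a rotation matrix."""
    theta = np.linalg.norm(w)
    if theta < 1e-12:
        return np.eye(3) + skew(w)  # first-order approximation near zero
    K = skew(w / theta)
    return np.eye(3) + np.sin(theta) * K + (1.0 - np.cos(theta)) * (K @ K)


def epipolar_residuals(R, t, x1, x2):
    """Algebraic epipolar residuals x2^T [t]x R x1 for each match.

    Assumes X2 = R X1 + t and rows of x1, x2 of the form (x, y, 1)
    in normalized image coordinates. Only the direction of t matters.
    """
    E = skew(t / np.linalg.norm(t)) @ R
    return np.einsum("ni,ij,nj->n", x2, E, x1)


def refine_rotation(R_init, t, x1, x2, iters=10):
    """Gauss-Newton refinement of R with the translation direction fixed.

    R_init plays the role of the pose network prediction (an assumption
    of this sketch). Updates are left-multiplied: R <- exp([d]x) R.
    """
    R = R_init.copy()
    t_unit = t / np.linalg.norm(t)
    for _ in range(iters):
        r = epipolar_residuals(R, t_unit, x1, x2)
        y = x1 @ R.T                  # rows are R x1
        a = np.cross(x2, t_unit)      # rows are x2 x t
        J = np.cross(y, a)            # d r / d d = (R x1) x (x2 x t)
        delta = np.linalg.lstsq(J, -r, rcond=None)[0]
        R = rodrigues(delta) @ R
    return R


def make_synthetic_matches(n=50, seed=0):
    """Create noise-free matches from random 3D points (illustrative only)."""
    rng = np.random.default_rng(seed)
    X1 = np.column_stack([
        rng.uniform(-2, 2, n), rng.uniform(-2, 2, n), rng.uniform(4, 10, n)
    ])
    R_true = rodrigues(np.array([0.01, 0.05, -0.02]))
    t_true = np.array([0.1, 0.0, 1.0])  # mostly forward motion
    X2 = X1 @ R_true.T + t_true
    return X1 / X1[:, 2:3], X2 / X2[:, 2:3], R_true, t_true


if __name__ == "__main__":
    x1, x2, R_true, t_true = make_synthetic_matches()
    # Perturb the true rotation to mimic an imperfect network prediction.
    R_net = rodrigues(np.array([0.02, -0.03, 0.01])) @ R_true
    R_adj = refine_rotation(R_net, t_true, x1, x2)
    print("max residual before:", np.abs(epipolar_residuals(R_net, t_true, x1, x2)).max())
    print("max residual after: ", np.abs(epipolar_residuals(R_adj, t_true, x1, x2)).max())
```

Holding the translation direction fixed keeps the sketch to three unknowns. A practical system would more likely use a robust estimator over real, noisy matches and handle outliers, none of which is shown here.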




UnDeepVO: Monocular Visual Odometry through Unsupervised Deep Learning

We propose a novel monocular visual odometry (VO) system called UnDeepVO...

Deep Virtual Stereo Odometry: Leveraging Deep Depth Prediction for Monocular Direct Sparse Odometry

Monocular visual odometry approaches that purely rely on geometric cues ...

Pose Graph Optimization for Unsupervised Monocular Visual Odometry

Unsupervised Learning based monocular visual odometry (VO) has lately dr...

Unsupervised Learning of Depth and Ego-Motion from Monocular Video Using 3D Geometric Constraints

We present a novel approach for unsupervised learning of depth and ego-m...

The Probabilistic Normal Epipolar Constraint for Frame-To-Frame Rotation Optimization under Uncertain Feature Positions

The estimation of the relative pose of two camera views is a fundamental...

Unsupervised Monocular Depth Prediction for Indoor Continuous Video Streams

This paper studies unsupervised monocular depth prediction problem. Most...

Epipolar Geometry based Learning of Multi-view Depth and Ego-Motion from Monocular Sequences

Deep approaches to predict monocular depth and ego-motion have grown in ...
