Robust Frame-to-Frame Camera Rotation Estimation in Crowded Scenes

by Fabien Delattre, et al.

We present an approach to estimating camera rotation in crowded, real-world scenes from handheld monocular video. While camera rotation estimation is a well-studied problem, no previous methods exhibit both high accuracy and acceptable speed in this setting. Because the setting is not addressed well by other datasets, we provide a new dataset and benchmark, with high-accuracy, rigorously verified ground truth, on 17 video sequences. Methods developed for wide baseline stereo (e.g., 5-point methods) perform poorly on monocular video. On the other hand, methods used in autonomous driving (e.g., SLAM) leverage specific sensor setups, specific motion models, or local optimization strategies (lagging batch processing) and do not generalize well to handheld video. Finally, for dynamic scenes, commonly used robustification techniques like RANSAC require large numbers of iterations, and become prohibitively slow. We introduce a novel generalization of the Hough transform on SO(3) to efficiently and robustly find the camera rotation most compatible with optical flow. Among comparably fast methods, ours reduces error by almost 50% over the next best, and is more accurate than any method, irrespective of speed. This represents a strong new performance point for crowded scenes, an important setting for computer vision. The code and the dataset are available at
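To make the voting idea concrete, here is a minimal sketch of finding the rotation most compatible with optical flow by accumulating votes over a discretized set of candidate rotations. It is not the authors' implementation. It assumes a small-angle (instantaneous) rotation model, normalized image coordinates with focal length 1, one common sign convention for rotational flow, and a brute-force grid in place of the paper's generalized Hough transform on SO(3). The names `rotational_flow` and `vote_rotation`, and the parameters `max_angle`, `bins`, and `tol`, are hypothetical.

```python
import numpy as np


def rotational_flow(points, omega):
    """Flow induced at normalized image points by a small rotation.

    points: (N, 2) array of (x, y) in normalized coordinates (assumed focal length 1).
    omega:  (3,) small rotation (wx, wy, wz) in radians per frame.
    The sign convention is an assumption; other conventions flip some signs.
    """
    x, y = points[:, 0], points[:, 1]
    wx, wy, wz = omega
    u = x * y * wx - (1.0 + x**2) * wy + y * wz
    v = (1.0 + y**2) * wx - x * y * wy - x * wz
    return np.stack([u, v], axis=1)


def vote_rotation(points, flow, max_angle=0.05, bins=21, tol=2e-3):
    """Return the grid rotation that the largest number of flow vectors agree with.

    Every flow vector casts a vote for each candidate rotation whose predicted
    flow lies within `tol` of the observed flow. Flow from moving objects
    scatters its votes, while flow explained by the camera rotation piles up
    in one accumulator cell.
    """
    axis = np.linspace(-max_angle, max_angle, bins)
    wx, wy, wz = np.meshgrid(axis, axis, axis, indexing="ij")
    candidates = np.stack([wx.ravel(), wy.ravel(), wz.ravel()], axis=1)  # (M, 3)

    x, y = points[:, 0], points[:, 1]
    # The flow model is linear in omega: u = a_u . omega, v = a_v . omega.
    a_u = np.stack([x * y, -(1.0 + x**2), y], axis=1)   # (N, 3)
    a_v = np.stack([1.0 + y**2, -x * y, -x], axis=1)    # (N, 3)

    pred_u = a_u @ candidates.T                          # (N, M)
    pred_v = a_v @ candidates.T                          # (N, M)
    err2 = (pred_u - flow[:, :1]) ** 2 + (pred_v - flow[:, 1:]) ** 2
    votes = (err2 <= tol**2).sum(axis=0)                 # accumulator, (M,)
    return candidates[np.argmax(votes)], votes.max()


# Synthetic example: 60% of the flow comes from camera rotation, 40% is clutter.
rng = np.random.default_rng(0)
pts = rng.uniform(-0.5, 0.5, size=(200, 2))
true_omega = np.array([0.01, -0.02, 0.015])
observed = rotational_flow(pts, true_omega)
observed[120:] = rng.uniform(-0.05, 0.05, size=(80, 2))  # simulated moving objects
estimate, n_votes = vote_rotation(pts, observed)
print(estimate, n_votes)
```

Unlike RANSAC, this accumulator needs no repeated random sampling, so its cost does not grow with the outlier ratio. The exhaustive grid used here is only for illustration: a practical method would exploit the fact that each flow vector is compatible with a one-dimensional set of rotations, which is the structure the abstract's Hough-style formulation refers to.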




A Photometrically Calibrated Benchmark For Monocular Visual Odometry

We present a dataset for evaluating the tracking accuracy of monocular v...

Fast, Robust, Continuous Monocular Egomotion Computation

We propose robust methods for estimating camera egomotion in noisy, real...

InterpolationSLAM: A Novel Robust Visual SLAM System in Rotational Motion

In recent years, visual SLAM has achieved great progress and development...

Colonoscopy 3D Video Dataset with Paired Depth from 2D-3D Registration

Screening colonoscopy is an important clinical application for several 3...

ParticleSfM: Exploiting Dense Point Trajectories for Localizing Moving Cameras in the Wild

Estimating the pose of a moving camera from monocular video is a challen...

The Probabilistic Normal Epipolar Constraint for Frame-To-Frame Rotation Optimization under Uncertain Feature Positions

The estimation of the relative pose of two camera views is a fundamental...

Shadow Estimation Method for "The Episolar Constraint: Monocular Shape from Shadow Correspondence"

Recovering shadows is an important step for many vision algorithms. Curr...
