ATOM: Accurate Tracking by Overlap Maximization

by Martin Danelljan, et al.

While recent years have witnessed astonishing improvements in visual tracking robustness, the advancements in tracking accuracy have been severely limited. As the focus has been directed towards the development of powerful classifiers, the problem of accurate target state estimation has been largely overlooked. Instead, the majority of methods resort to simple multi-scale search in order to estimate the target bounding box. We argue that this approach is fundamentally limited as target estimation is a complex task, requiring high-level knowledge about the object. We thus address the problem of target state estimation in tracking. We propose a novel tracking architecture consisting of dedicated target estimation and classification components. Due to the complex nature of target estimation, we propose a component that can be entirely trained offline on large-scale datasets. Our target estimation component is trained to predict the overlap between the target object and an estimated bounding box. By carefully integrating target-specific information in the prediction, our approach achieves previously unseen bounding box accuracy. Furthermore, we integrate a classification component that is trained online to guarantee high discriminative power in the presence of distractors. Our final tracking framework, comprising a unified multi-task architecture, sets a new state-of-the-art on four challenging benchmarks. On the large-scale TrackingNet dataset, our tracker ATOM achieves a relative gain of 15% over the previous best approach, while running at over 30 FPS.
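
To make the notion of overlap concrete, the following is a minimal Python sketch of the intersection-over-union (IoU) between two axis-aligned boxes, assuming IoU is the overlap measure the estimation component learns to predict. The `(x, y, w, h)` box format and the function name `iou` are assumptions made for illustration; this is not the authors' implementation, and it computes the overlap directly from two known boxes rather than predicting it with a network.

```python
from typing import Tuple

# Hypothetical box format for this sketch: (x, y, w, h) with (x, y) the top-left corner.
Box = Tuple[float, float, float, float]


def iou(box_a: Box, box_b: Box) -> float:
    """Return the intersection-over-union of two axis-aligned boxes."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b

    # Width and height of the intersection rectangle (zero if the boxes do not overlap).
    inter_w = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = inter_w * inter_h

    # Union area = sum of both areas minus the shared area.
    union = aw * ah + bw * bh - inter
    if union <= 0.0:
        return 0.0  # degenerate boxes with no area
    return inter / union


if __name__ == "__main__":
    target = (0.0, 0.0, 2.0, 2.0)
    candidate = (1.0, 1.0, 2.0, 2.0)
    print(iou(target, candidate))
```

Under this reading, a value of this kind would serve as the regression target during offline training, and at tracking time the box estimate would be adjusted to maximize the predicted overlap.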




Accurate Bounding-box Regression with Distance-IoU Loss for Visual Tracking

Most existing tracking methods are based on using a classifier and multi...

RPT: Learning Point Set Representation for Siamese Visual Tracking

While remarkable progress has been made in robust visual tracking, accur...

SiamFC++: Towards Robust and Accurate Visual Tracking with Target Estimation Guidelines

Visual tracking problem demands to efficiently perform robust classifica...

Transforming Model Prediction for Tracking

Optimization based tracking methods have been widely successful by integ...

Higher Performance Visual Tracking with Dual-Modal Localization

Visual Object Tracking (VOT) has synchronous needs for both robustness a...

3D Visual Tracking Framework with Deep Learning for Asteroid Exploration

3D visual tracking is significant to deep space exploration programs, wh...

Towards Unified Token Learning for Vision-Language Tracking

In this paper, we present a simple, flexible and effective vision-langua...
