DiffusionTrack: Diffusion Model For Multi-Object Tracking

by Run Luo, et al.

Multi-object tracking (MOT) is a challenging vision task that aims to detect individual objects within a single frame and associate them across multiple frames. Recent MOT approaches can be categorized into two-stage tracking-by-detection (TBD) methods and one-stage joint detection and tracking (JDT) methods. Despite their success, these approaches share common problems: harmful global or local inconsistency, a poor trade-off between robustness and model complexity, and a lack of flexibility across different scenes within the same video. In this paper, we propose a simple but robust framework that formulates object detection and association jointly as a consistent denoising diffusion process from paired noise boxes to paired ground-truth boxes. This novel progressive denoising diffusion strategy substantially augments the tracker's effectiveness, enabling it to discriminate between various objects. During training, paired object boxes diffuse from paired ground-truth boxes to a random distribution, and the model learns detection and tracking simultaneously by reversing this noising process. At inference, the model refines a set of randomly generated paired boxes into detection and tracking results through a flexible one-step or multi-step denoising diffusion process. Extensive experiments on three widely used MOT benchmarks, MOT17, MOT20, and DanceTrack, demonstrate that our approach achieves competitive performance compared to current state-of-the-art methods.
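To make the training-time noising step concrete, here is a minimal sketch of how a pair of ground-truth boxes might be diffused toward Gaussian noise. It is not taken from the paper: the cosine schedule, the normalized (cx, cy, w, h) box format, independent noise per box, and the names `alpha_bar` and `noise_paired_boxes` are all assumptions made for illustration. It uses the standard DDPM forward step x_t = sqrt(ᾱ_t)·x_0 + sqrt(1 − ᾱ_t)·ε.

```python
import math
import random


def alpha_bar(t, num_steps, s=0.008):
    """Cumulative signal level at step t under a cosine schedule.

    The cosine schedule is an assumed choice; the abstract does not
    say which noise schedule the authors use.
    """
    def f(u):
        return math.cos((u / num_steps + s) / (1 + s) * math.pi / 2) ** 2

    return f(t) / f(0)


def noise_paired_boxes(pair, t, num_steps, rng):
    """Forward-diffuse one pair of boxes for the same object.

    `pair` is (box_in_previous_frame, box_in_current_frame); each box
    is a hypothetical normalized (cx, cy, w, h) tuple. Both boxes are
    noised at the same step t so the pair stays at one noise level.
    """
    a = alpha_bar(t, num_steps)
    signal, noise = math.sqrt(a), math.sqrt(1.0 - a)
    return tuple(
        tuple(signal * v + noise * rng.gauss(0.0, 1.0) for v in box)
        for box in pair
    )


if __name__ == "__main__":
    rng = random.Random(0)
    gt_pair = ((0.40, 0.50, 0.10, 0.20), (0.42, 0.51, 0.10, 0.20))
    for step in (0, 500, 1000):
        print(step, noise_paired_boxes(gt_pair, step, 1000, rng))
```

At step 0 the pair is returned unchanged, and at the final step almost no ground-truth signal remains. A denoising model trained to invert this process could then start from random paired boxes at inference and refine them in one or several steps, as the abstract describes.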




DiffusionDet: Diffusion Model for Object Detection

We propose DiffusionDet, a new framework that formulates object detectio...

Diffusion-based 3D Object Detection with Random Boxes

3D object detection is an essential task for achieving autonomous drivin...

ByteTrackV2: 2D and 3D Multi-Object Tracking by Associating Every Detection Box

Multi-object tracking (MOT) aims at estimating bounding boxes and identi...

PSDiff: Diffusion Model for Person Search with Iterative and Collaborative Refinement

Dominant Person Search methods aim to localize and recognize query perso...

Chained-Tracker: Chaining Paired Attentive Regression Results for End-to-End Joint Multiple-Object Detection and Tracking

Existing Multiple-Object Tracking (MOT) methods either follow the tracki...

CamoDiffusion: Camouflaged Object Detection via Conditional Diffusion Models

Camouflaged Object Detection (COD) is a challenging task in computer vis...

BOTT: Box Only Transformer Tracker for 3D Object Tracking

Tracking 3D objects is an important task in autonomous driving. Classica...
