Exploring Point-BEV Fusion for 3D Point Cloud Object Tracking with Transformer

by Zhipeng Luo, et al.

With the prevalence of LiDAR sensors in autonomous driving, 3D object tracking has received increasing attention. In a point cloud sequence, 3D object tracking aims to predict the location and orientation of an object in consecutive frames given an object template. Motivated by the success of transformers, we propose Point Tracking TRansformer (PTTR), which efficiently predicts high-quality 3D tracking results in a coarse-to-fine manner with the help of transformer operations. PTTR consists of three novel designs. 1) Instead of random sampling, we design Relation-Aware Sampling to preserve the points most relevant to the given template during subsampling. 2) We propose a Point Relation Transformer for effective feature aggregation and feature matching between the template and the search region. 3) Based on the coarse tracking results, we employ a novel Prediction Refinement Module to obtain the final refined prediction through local feature pooling. In addition, motivated by the favorable properties of the Bird's-Eye View (BEV) of point clouds in capturing object motion, we further design a more advanced framework named PTTR++, which incorporates both the point-wise view and the BEV representation to exploit their complementary effect in generating high-quality tracking results. PTTR++ substantially boosts tracking performance over PTTR with low computational overhead. Extensive experiments on multiple datasets show that our proposed approaches achieve superior 3D tracking accuracy and efficiency.
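To make the idea behind Relation-Aware Sampling concrete, the following is a minimal sketch rather than the authors' implementation. It assumes that "relevance" to the template can be approximated by the smallest Euclidean feature distance from each search-region point to any template point, and that the k most relevant points are kept. The function names `relation_aware_sample` and `random_sample`, the distance criterion, and the toy features are all illustrative assumptions that the abstract does not specify.

```python
import math
import random


def relation_aware_sample(search_feats, template_feats, k):
    """Keep the k search points whose features lie closest to the template.

    Assumption: relevance is the minimum Euclidean feature distance from a
    search point to any template point (smaller means more relevant).
    Returns the kept indices in ascending order.
    """
    scores = [
        min(math.dist(feat, tmpl) for tmpl in template_feats)
        for feat in search_feats
    ]
    # Rank search points from most to least relevant, then keep the top k.
    ranked = sorted(range(len(search_feats)), key=lambda i: scores[i])
    return sorted(ranked[:k])


def random_sample(num_points, k, seed=0):
    """Baseline for comparison: keep k indices uniformly at random."""
    rng = random.Random(seed)
    return sorted(rng.sample(range(num_points), k))


# Toy 2-D features (hypothetical values, for illustration only).
template_feats = [(0.0, 0.0), (1.0, 0.0)]
search_feats = [(5.0, 5.0), (0.1, 0.0), (0.9, 0.1), (-4.0, 3.0), (1.0, 1.0)]

kept = relation_aware_sample(search_feats, template_feats, k=2)
baseline = random_sample(len(search_feats), k=2)
```

In this toy case the relation-aware variant keeps the two search points whose features nearly coincide with the template, whereas the random baseline may discard them. That difference is the motivation the abstract gives for replacing random sampling.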




