Real-time 3D Single Object Tracking with Transformer

by Jiayao Shan, et al.

LiDAR-based 3D single object tracking is a challenging problem in robotics and autonomous driving. Existing approaches suffer when objects at long distance have very sparse or partially occluded point clouds, which makes the features extracted by the model ambiguous. Ambiguous features make it hard to locate the target object and ultimately lead to poor tracking results. To address this problem, we leverage the Transformer architecture and propose a Point-Track-Transformer (PTT) module for point cloud-based 3D single object tracking. Specifically, the PTT module generates fine-tuned attention features by computing attention weights, which guides the tracker to focus on the important features of the target and improves tracking in complex scenarios. To evaluate the PTT module, we embed it into the dominant method and construct a novel 3D SOT tracker named PTT-Net. In PTT-Net, we embed PTT into both the voting stage and the proposal generation stage. The PTT module in the voting stage models the interactions among point patches and thereby learns context-dependent features. Meanwhile, the PTT module in the proposal generation stage captures the contextual information between object and background. We evaluate PTT-Net on the KITTI and NuScenes datasets. Experimental results demonstrate the effectiveness of the PTT module and the superiority of PTT-Net, which surpasses the baseline by a noticeable margin, with approximately 10% performance improvement in sparse scenarios. Overall, combining the Transformer with the tracking pipeline enables PTT-Net to achieve state-of-the-art performance on both datasets. Additionally, PTT-Net runs in real time at 40 FPS on an NVIDIA 1080Ti GPU. Our code is open-sourced for the research community at
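To make the idea of "computing attention weights" over point features more concrete, the following is a minimal sketch of generic scaled dot-product self-attention with a residual connection, written in NumPy. It is not the authors' PTT module: the function name `self_attention`, the projection matrices `w_q`, `w_k`, `w_v`, the feature sizes, and the random inputs are all assumptions chosen for illustration.

```python
import numpy as np


def self_attention(features, w_q, w_k, w_v):
    """Generic self-attention over per-point features (illustrative only).

    features: (N, C) array, one C-dimensional feature per point.
    w_q, w_k, w_v: (C, C) projection matrices (hypothetical parameters).
    Returns the refined (N, C) features and the (N, N) attention weights.
    """
    q = features @ w_q
    k = features @ w_k
    v = features @ w_v

    # Scaled dot-product scores between every pair of points.
    scores = q @ k.T / np.sqrt(q.shape[-1])

    # Row-wise softmax, shifted by the row maximum for numerical stability.
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)

    # Each point aggregates the values of all points, weighted by attention.
    attended = weights @ v

    # Residual connection keeps the original feature alongside the context.
    return features + attended, weights


# Hypothetical sizes: 64 points with 32-dimensional features.
rng = np.random.default_rng(0)
num_points, channels = 64, 32
features = rng.standard_normal((num_points, channels))
w_q, w_k, w_v = (rng.standard_normal((channels, channels)) * 0.1 for _ in range(3))

refined, weights = self_attention(features, w_q, w_k, w_v)
```

Each row of `weights` sums to one, so every refined feature is the original feature plus a weighted mixture of the projected features of all points. This is one common way for a network to emphasize informative points when the cloud is sparse.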




