3D-SiamRPN: An End-to-End Learning Method for Real-Time 3D Single Object Tracking Using Raw Point Cloud

by Zheng Fang, et al.

3D single object tracking is a key problem for autonomous following robots, which must robustly track and accurately localize the target in order to follow it efficiently. In this paper, we propose a 3D tracking method, called the 3D-SiamRPN Network, that tracks a single target object using raw 3D point cloud data. The proposed network consists of two subnetworks. The first is a feature embedding subnetwork, which performs point cloud feature extraction and fusion. In this subnetwork, we first use PointNet++ to extract point cloud features from the template and search branches. Then, to fuse the features of the two branches and obtain their similarity, we propose two cross-correlation modules, named Pointcloud-wise and Point-wise respectively. The second is a region proposal network (RPN), which produces the final 3D bounding box of the target object from the fused feature output by the cross-correlation modules. In this subnetwork, we use the regression and classification branches of a region proposal subnetwork to obtain proposals and scores, and from these the final 3D bounding box of the target object. Experimental results on the KITTI dataset show that our method is competitive with state-of-the-art methods in both Success and Precision, and runs in real time at 20.8 FPS. Additionally, experimental results on the H3D dataset demonstrate that our method generalizes well and achieves good tracking performance in a new scene without re-training.
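The abstract does not define the two cross-correlation modules, so the following is only an illustrative sketch of the general idea of fusing template and search features by similarity, not the authors' implementation. It assumes cosine similarity as the correlation measure and a simple max over template points as the fusion step; the function names `pointwise_similarity` and `fuse_features`, the feature sizes, and the random inputs standing in for PointNet++ outputs are all hypothetical.

```python
import numpy as np


def pointwise_similarity(template_feats, search_feats, eps=1e-8):
    """Cosine similarity between every search point and every template point.

    template_feats: (M, C) array, one feature vector per template point.
    search_feats:   (N, C) array, one feature vector per search point.
    Returns an (N, M) similarity matrix with values in [-1, 1].
    """
    t_norm = np.linalg.norm(template_feats, axis=1, keepdims=True) + eps
    s_norm = np.linalg.norm(search_feats, axis=1, keepdims=True) + eps
    return (search_feats / s_norm) @ (template_feats / t_norm).T


def fuse_features(template_feats, search_feats):
    """Append each search point's best template similarity to its feature.

    Assumption: max-pooling over template points is used here only for
    illustration; the paper's fusion may differ.
    Returns an (N, C + 1) array.
    """
    sim = pointwise_similarity(template_feats, search_feats)  # (N, M)
    best = sim.max(axis=1, keepdims=True)                     # (N, 1)
    return np.concatenate([search_feats, best], axis=1)


# Random features stand in for the outputs of a point cloud backbone.
rng = np.random.default_rng(0)
template = rng.normal(size=(64, 32))   # 64 template points, 32-dim features
search = rng.normal(size=(128, 32))    # 128 search points, 32-dim features
fused = fuse_features(template, search)
print(fused.shape)
```

In a Siamese tracker of this kind, a fused per-point feature like `fused` would then be passed to the proposal stage, where classification and regression branches score and refine candidate 3D boxes.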




3D Object Tracking with Transformer

Feature fusion and similarity computation are two core problems in 3D ob...

FVNet: 3D Front-View Proposal Generation for Real-Time Object Detection from Point Clouds

3D object detection from raw and sparse point clouds has been far less t...

VPIT: Real-time Embedded Single Object 3D Tracking Using Voxel Pseudo Images

In this paper, we propose a novel voxel-based 3D single object tracking ...

A Real-Time Cross-modality Correlation Filtering Method for Referring Expression Comprehension

Referring expression comprehension aims to localize the object instance ...

End-to-end feature fusion siamese network for adaptive visual tracking

According to observations, different visual objects have different salie...

Marking anything: application of point cloud in extracting video target features

Extracting retrievable features from video is of great significance for ...

3D IoU-Net: IoU Guided 3D Object Detector for Point Clouds

Most existing point cloud based 3D object detectors focus on the tasks o...
