Voxelized 3D Feature Aggregation for Multiview Detection

12/07/2021
by Jiahao Ma, et al.

Multi-view detection incorporates multiple camera views to alleviate occlusion in crowded scenes, and state-of-the-art approaches adopt homography transformations to project multi-view features onto the ground plane. However, we find that these 2D transformations do not take the object's height into account. As a result, features along the vertical direction of the same object are likely not projected onto the same ground-plane point, which leads to impure ground-plane features. To solve this problem, we propose VFA, voxelized 3D feature aggregation, for feature transformation and aggregation in multi-view detection. Specifically, we voxelize the 3D space, project the voxels onto each camera view, and associate 2D features with these projected voxels. This allows us to identify and then aggregate 2D features along the same vertical line, alleviating projection distortions to a large extent. Additionally, because different kinds of objects (e.g., humans vs. cattle) have different shapes on the ground plane, we introduce an oriented Gaussian encoding to match such shapes, leading to increased accuracy and efficiency. We perform experiments on multiview 2D detection and multiview 3D detection problems. Results on four datasets (including a newly introduced MultiviewC dataset) show that our system is very competitive with state-of-the-art approaches. MultiviewC is released at https://github.com/Robert-Mar/VFA.
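The voxelize, project, and aggregate steps described in the abstract can be illustrated with a small sketch. This is not the authors' implementation. It assumes a pinhole camera given as a 3x4 projection matrix, scalar features, nearest-neighbour sampling, and a plain mean over the vertical column. The function names (`project_point`, `sample_feature`, `aggregate_column`) and the toy camera `P` are hypothetical and chosen only for illustration.

```python
from typing import List, Optional, Sequence, Tuple

Matrix = Sequence[Sequence[float]]


def project_point(P: Matrix, xyz: Tuple[float, float, float]) -> Optional[Tuple[float, float]]:
    """Project a 3D world point with a 3x4 camera matrix.

    Returns pixel coordinates (u, v), or None if the point is not in front of the camera.
    """
    h = (xyz[0], xyz[1], xyz[2], 1.0)  # homogeneous coordinates
    u, v, w = (sum(row[i] * h[i] for i in range(4)) for row in P)
    if w <= 0:
        return None
    return u / w, v / w


def sample_feature(fmap: List[List[float]], u: float, v: float) -> Optional[float]:
    """Nearest-neighbour lookup in a 2D feature map; None if (u, v) falls outside it."""
    col, row = int(round(u)), int(round(v))
    if 0 <= row < len(fmap) and 0 <= col < len(fmap[0]):
        return fmap[row][col]
    return None


def aggregate_column(cameras: Sequence[Matrix], fmaps: Sequence[List[List[float]]],
                     x: float, y: float, heights: Sequence[float]) -> float:
    """Average the features seen by all views at the voxels stacked above ground point (x, y).

    Assumption: a simple mean stands in for whatever aggregation the paper uses.
    """
    samples = []
    for P, fmap in zip(cameras, fmaps):
        for z in heights:  # voxel centres along the same vertical line
            uv = project_point(P, (x, y, z))
            if uv is None:
                continue
            f = sample_feature(fmap, *uv)
            if f is not None:
                samples.append(f)
    return sum(samples) / len(samples) if samples else 0.0


# Toy camera (hypothetical): depth w = y + 1, u = 2x/w + 2, v = 4 - 2z/w.
P = [[2.0, 2.0, 0.0, 2.0],
     [0.0, 4.0, -2.0, 4.0],
     [0.0, 1.0, 0.0, 1.0]]

# 5x5 feature map with a "standing object" occupying column 2, rows 2 to 4.
fmap = [[0.0] * 5 for _ in range(5)]
for r in (2, 3, 4):
    fmap[r][2] = 1.0

heights = [0.0, 1.0, 2.0]
on_object = aggregate_column([P], [fmap], 0.0, 1.0, heights)
off_object = aggregate_column([P], [fmap], 1.0, 1.0, heights)
```

In this toy setup the three voxels above ground point (0, 1) project to the same image column at different rows, so all of the object's features land on one ground-plane cell. A single ground-plane homography would only map the z = 0 pixel to that cell.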


