Cross-Modal Collaborative Representation Learning and a Large-Scale RGBT Benchmark for Crowd Counting

12/08/2020
by Lingbo Liu, et al.
Crowd counting is a fundamental yet challenging problem that requires rich information to generate pixel-wise crowd density maps. However, most previous methods use only the limited information in RGB images and may fail to detect pedestrians in unconstrained environments. In this work, we find that incorporating optical and thermal information can greatly help to recognize pedestrians. To promote future research in this field, we introduce a large-scale RGBT Crowd Counting (RGBT-CC) benchmark, which contains 2,030 pairs of RGB-thermal images with 138,389 annotated people. Furthermore, to facilitate multimodal crowd counting, we propose a cross-modal collaborative representation learning framework, which consists of multiple modality-specific branches, a modality-shared branch, and an Information Aggregation-Distribution Module (IADM) to fully capture the complementary information of different modalities. Specifically, our IADM incorporates two collaborative information transfer components to dynamically enhance the modality-shared and modality-specific representations with a dual information propagation mechanism. Extensive experiments conducted on the RGBT-CC benchmark demonstrate the effectiveness of our framework for RGBT crowd counting. Moreover, the proposed approach is universal for multimodal crowd counting and also achieves superior performance on the ShanghaiTechRGBD dataset.
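
For readers unfamiliar with the pixel-wise crowd density maps mentioned in the abstract, the following is a minimal sketch of the common convention in which each annotated person contributes one unit of mass, so that summing the map yields the crowd count. It is an illustration only and is not taken from the paper: the function name `density_map`, the fixed Gaussian width `sigma`, and the sample points are all assumptions.

```python
import numpy as np


def density_map(points, height, width, sigma=4.0):
    """Build a density map from (x, y) head annotations.

    Hypothetical helper: each point adds a Gaussian that is renormalized
    on the image grid, so every person contributes a total mass of 1.
    """
    ys, xs = np.mgrid[0:height, 0:width]
    density = np.zeros((height, width), dtype=np.float64)
    for x, y in points:
        kernel = np.exp(-((xs - x) ** 2 + (ys - y) ** 2) / (2.0 * sigma ** 2))
        density += kernel / kernel.sum()  # unit mass per annotated person
    return density


# Made-up annotations for a 48x64 image (assumed values, not dataset samples).
points = [(10.0, 12.0), (30.5, 8.0), (31.0, 40.0)]
dmap = density_map(points, height=48, width=64)
estimated_count = dmap.sum()  # integrates to the number of annotated people
```

Under this convention a counting model regresses the map itself, and the predicted count is read off as the sum over all pixels. The figures quoted in the abstract (138,389 annotated people across 2,030 image pairs) work out to roughly 68 people per pair.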

research
10/19/2022

Spatio-channel Attention Blocks for Cross-modal Crowd Counting

Crowd counting research has made significant advancements in real-world ...
research
07/31/2021

Unsupervised Cross-Modal Distillation for Thermal Infrared Tracking

The target representation learned by convolutional neural networks plays...
research
11/30/2021

Aerial Images Meet Crowdsourced Trajectories: A New Approach to Robust Road Extraction

Land remote sensing analysis is a crucial research in earth science. In ...
research
05/14/2020

Ambient Sound Helps: Audiovisual Crowd Counting in Extreme Conditions

Visual crowd counting has been recently studied as a way to enable peopl...
research
01/08/2023

RGB-T Multi-Modal Crowd Counting Based on Transformer

Crowd counting aims to estimate the number of persons in a scene. Most s...
research
03/14/2023

PiMAE: Point Cloud and Image Interactive Masked Autoencoders for 3D Object Detection

Masked Autoencoders learn strong visual representations and achieve stat...
research
08/09/2020

SOFA-Net: Second-Order and First-order Attention Network for Crowd Counting

Automated crowd counting from images/videos has attracted more attention...
