Detecting stereotypes and biases in Large Language Models (LLMs) can enh...
Transformer is beneficial for image denoising tasks since it can model l...
LiDAR-camera fusion methods have shown impressive performance in 3D obje...
Multi-modal 3D object detection has been an active research topic in aut...
Scene text erasing seeks to erase text contents from scene images and cu...
Nowadays, with the explosive growth of multimodal reviews on social medi...
Document layout analysis (DLA) aims to decompose document images int...
Document layout analysis (DLA) plays an important role in information ex...
Document layout analysis (DLA) aims to divide a document image into diff...
Human-in-the-loop aims to train an accurate prediction model with minimu...
Document layout analysis (DLA) aims to split the document image into...
Texts from scene images typically consist of several characters and exhi...
Crowd counting aims to count the number of instantaneous people in a cro...
Crowd counting, i.e., estimating the number of pedestrians in crowd images, i...