This paper presents IP-SLT, a simple yet effective framework for sign
la...
In the era of Large Language Models (LLMs), tremendous strides have been...
In fisheye images, rich distinct distortion patterns are regularly
distr...
Visual storytelling aims to generate a narrative based on a sequence of
...
In 3D human action recognition, limited supervised data makes it challen...
Recent progress in weakly supervised object detection is characterized by a
c...
Reconstructing interacting hands from monocular RGB data is a challengin...
Existing face forgery detection models try to discriminate fake images b...
Image compression aims to reduce the information redundancy in images. M...
Recent approaches have utilized self-supervised auxiliary tasks as
repre...
The Segment Anything Model (SAM) has achieved great success in the field of
...
In a passage retrieval system, the initial passage retrieval results may b...
Hand gestures play a crucial role in the expression of sign lang...
In recent years, tremendous efforts have been made on document image
rec...
Recent research on unsupervised person re-identification (reID) has
d...
We propose a novel framework to reconstruct accurate appearance and geom...
Diffusion models have shown remarkable success in visual synthesis, but ...
We study unsupervised domain adaptation (UDA) for semantic segmentation....
In this work, we aim to leverage the BERT pre-training succe...
In cooperative multi-agent tasks, parameter sharing among agents is a co...
Contour-based instance segmentation has been actively studied, thanks to...
In this work, we focus on a new task, i.e., hand-object interact...
In this work, we focus on text-guided image generation and propo...
We present SinDiffusion, leveraging denoising diffusion models to captur...
Card game AI has long been a hot topic in research on artificial
i...
Temporal language grounding (TLG) aims to localize a video segment in an...
Document images captured by mobile devices are usually degraded by
uncon...
In document image rectification, there exist rich geometric constraints
...
In 3D action recognition, there exists rich complementary information be...
Low-light video enhancement (LLVE) is an important yet challenging task ...
Molecular representation learning has attracted much attention recently....
Denoising Diffusion Probabilistic Models (DDPMs) have achieved remarkabl...
In this work, we explore neat yet effective Transformer-based frameworks...
The increased integration of renewable energy poses a slew of technical
...
Actor-critic Reinforcement Learning (RL) algorithms have achieved impres...
In this work, we focus on multi-target active object tracking (A...
Cooperative multi-agent reinforcement learning (MARL) has made prominent...
Unsupervised domain adaptation (UDA) is an important topic in the comput...
Recent years have witnessed the great breakthrough of deep reinforcement...
Color constancy aims to restore the constant colors of a scene under
dif...
Multi-agent reinforcement learning is difficult to apply in practic...
Due to the partial observability and communication constraints in many
m...
Learning a generalized prior for natural image restoration is an import...
Recently, masked image modeling (MIM) has become a promising direction f...
Active Multi-Object Tracking (AMOT) is a task where cameras are controll...
In cooperative multi-agent tasks, a team of agents jointly interact with...
In cooperative multi-agent systems, agents jointly take actions and rece...
Molecular conformation generation aims to generate three-dimensional
coo...
Existing unsupervised person re-identification methods only rely on visu...
Compared to flatbed scanners, portable smartphones are much more conveni...