The black-box nature of deep reinforcement learning (RL) hinders it fr...
Offline reinforcement learning (RL) optimizes the policy on a previously...
Transformers have shown superior performance on various vision tasks. Th...
Fine-grained image recognition is a longstanding computer vision challen...
Dynamic computation has emerged as a promising avenue to enhance the inf...
Over the past decade, deep learning models have exhibited considerable a...
The recent surge in research interest in applying large language models ...
Physics-informed neural networks (PINNs) are known to suffer from optimi...
The quadratic computation complexity of self-attention has been a persis...
Graph convolutional networks have been widely used in skeleton-based act...
Early exiting has become a promising approach to improving the inference...
Image recognition and generation have long been developed independently ...
Offline reinforcement learning (RL) is challenged by the distributional ...
Training practical agents usually involves offline and online reinforceme...
The captivating realm of Minecraft has attracted substantial research in...
Text-to-image (T2I) research has grown explosively in the past year, owi...
The self-attention mechanism has been a key factor in the recent progress of...
Recently, CLIP-guided image synthesis has shown appealing performance on...
Rotated object detection aims to identify and locate objects in images w...
Recent advancements in vision-language pre-training (e.g. CLIP) have sho...
Attention-based neural networks, such as Transformers, have become ubiqu...
Large deep learning models have achieved remarkable success in many scen...
We present a novel bird's-eye-view (BEV) detector with perspective super...
To effectively exploit the potential of large-scale models, various pre-...
The superior performance of modern deep networks usually comes at the pr...
Text-video retrieval is an important multi-modal learning task, where th...
Recent years have witnessed the fast development of large-scale pre-trai...
Unsupervised reinforcement learning aims at learning a generalist policy...
Knowledge distillation is an effective approach to learn compact models ...
Spatial-wise dynamic convolution has become a promising approach to impr...
Recent research has revealed that reducing the temporal and spatial redu...
Recently, Neural Radiance Fields (NeRF) has shown promising performance...
Early exiting is an effective paradigm for improving the inference effic...
Deep reinforcement learning (RL) algorithms suffer severe performance de...
Despite the popularity of Model Compression and Multitask Learning, how ...
Self-supervised learning (SSL) has delivered superior performance on a v...
While multitask representation learning has become a popular approach in...
Domain adaptive semantic segmentation attempts to make satisfactory dens...
Visual grounding, i.e., localizing objects in images according to natura...
Traditional knowledge distillation transfers "dark knowledge" of a pre-t...
Unsupervised domain adaptation (UDA) aims to adapt models learned from a w...
Spatial redundancy widely exists in visual recognition tasks, i.e., disc...
Transformers have recently shown superior performance on various vision...
Recent works have shown that the computational efficiency of video recog...
Self-supervised learning has shown its great potential to extract powerf...
Loss functions play an important role in training deep-network-based obj...
Assessing the performance of Generative Adversarial Networks (GANs) has ...
Deep reinforcement learning (RL) agents are becoming increasingly profic...
Convolution and self-attention are two powerful techniques for represent...