Filter pruning simultaneously accelerates the computation and reduces th...
Multimodal Large Language Model (MLLM) relies on the powerful LLM to per...
In this paper, we study teacher-student learning from the perspective of...
Modeling sparse and dense image matching within a unified functional cor...
This paper focuses on the limitations of current over-parameterized shad...
Open-vocabulary object detection (OVD) aims to scale up vocabulary size ...
Vision transformers (ViTs) are changing the landscape of object detectio...
U-Nets have achieved tremendous success in medical image segmentation. N...
We attempt to reduce the computational costs in vision transformers (ViT...
Pixel synthesis is a promising research paradigm for image generation, w...
In this paper, we propose a simple yet universal network termed SeqTR fo...
Light-weight super-resolution (SR) models have received considerable att...
Vision Transformers (ViT) have made many breakthroughs in computer visio...
Semi-supervised object detection (SSOD) has achieved substantial progres...
Weakly supervised object localization (WSOL) aims to learn object locali...
While post-training quantization receives popularity mostly due to its e...