In this paper, we present VideoGen, a text-to-video generation approach,...
In this paper, we study Text-to-3D content generation leveraging 2D diff...
Despite recent advances in syncing lip movements with any audio waves, c...
Existing methods of multi-person video 3D human Pose and Shape Estimatio...
In the field of skeleton-based action recognition, current top-performin...
Current domain adaptation methods for face anti-spoofing leverage labele...
DETR is a novel end-to-end transformer architecture object detector, whi...
We present a strong object detector with encoder-decoder pretraining and...
Recently, transformer-based networks have shown impressive results in se...
This paper proposes a novel Unified Feature Optimization (UFO) paradigm ...
Freezing the pre-trained backbone has become a standard paradigm to avoi...
Learning discriminative representation using large-scale face datasets i...
Many existing face anti-spoofing (FAS) methods focus on modeling the dec...