Large pre-trained models have had a significant impact on computer visio...
Single domain generalization aims to learn a model from a single trainin...
In real teaching scenarios, an excellent teacher always teaches what he ...
Few-shot action recognition in videos is challenging for its lack of sup...
Domain generalization aims to learn a model that can generalize well on ...
Language-driven action localization in videos is a challenging task that...
Entity-aware image captioning aims to describe named entities and events...
How to model fine-grained spatial-temporal dynamics in videos has been a...
Video captioning has shown impressive progress in recent years. One key ...
Exploiting relationships among objects has achieved remarkable progress ...
Partial domain adaptation aims to transfer knowledge from a label-rich s...
Existing deep learning methods for video recognition usually require a la...