Video activity localization aims at understanding the semantic content i...
Few-shot (FS) and zero-shot (ZS) learning are two different approaches f...
This paper deals with the problem of localizing objects in image and vid...
The recently released Ego4D dataset and benchmark significantly scales a...
Untrimmed video understanding such as temporal action detection (TAD) of...
Long-form video understanding requires designing approaches that are abl...
Temporal action detection (TAD) is an important yet challenging task in ...
Temporal language grounding in videos aims to localize the temporal span...
Temporal action localization (TAL) is a fundamental yet challenging task...
Many video analysis tasks require temporal localization, thus detection o...
Grounding language queries in videos aims at identifying the time interv...
Temporal action detection is a fundamental yet challenging task in video...
We study the problem of object detection from a novel perspective in whi...