Cross-category Video Highlight Detection via Set-based Learning

by Minghao Xu, et al.

Autonomous highlight detection is crucial for making video browsing on social media platforms more efficient. When pursuing this goal in a data-driven way, one often faces the situation where highlight annotations are not available for the target video category used in practice, while supervision on another video category (the source video category) is available. In such a situation, one can derive an effective highlight detector for the target video category by transferring the highlight knowledge acquired from the source video category to the target one. We call this problem cross-category video highlight detection, which has rarely been studied in previous works. To tackle this practical problem, we propose a Dual-Learner-based Video Highlight Detection (DL-VHD) framework. Under this framework, we first design a Set-based Learning module (SL-module) that improves on conventional pair-based learning by assessing the highlight extent of a video segment under a broader context. Building on this learning scheme, we introduce two different learners to acquire the basic distinction of target category videos and the characteristics of highlight moments in the source video category, respectively. These two types of highlight knowledge are further consolidated via knowledge distillation. Extensive experiments on three benchmark datasets demonstrate the superiority of the proposed SL-module, and the DL-VHD method outperforms five typical Unsupervised Domain Adaptation (UDA) algorithms on various cross-category highlight detection tasks. Our code is available at .
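
The contrast between pair-based and set-based learning can be made concrete with a small sketch. The code below is only an illustration under stated assumptions and is not the SL-module from the paper: the function names, the hinge margin, and the softmax cross-entropy used for the set-based objective are all hypothetical choices made here. It shows how a pair-based loss compares one highlight segment against one non-highlight segment, whereas a set-based loss scores each segment relative to every other segment in the set.

```python
import math


def pairwise_hinge_loss(highlight_score, non_highlight_score, margin=1.0):
    """Pair-based ranking: the highlight should outscore the non-highlight by `margin`.

    Hypothetical helper for illustration; the margin value is an assumption.
    """
    return max(0.0, margin - highlight_score + non_highlight_score)


def set_based_loss(scores, labels):
    """Set-based objective (an assumed form, not the paper's definition).

    Scores of all segments in the set are normalized jointly with a softmax,
    so each segment is judged in the context of the whole set. The loss is the
    cross-entropy between that distribution and a uniform distribution over
    the segments labeled as highlights (label 1).
    """
    n_pos = sum(labels)
    if n_pos == 0:
        raise ValueError("the set must contain at least one highlight segment")
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    probs = [e / total for e in exps]
    return -sum(math.log(p) / n_pos for p, y in zip(probs, labels) if y == 1)


# Five segments from one video, two of them annotated as highlights.
labels = [1, 0, 0, 1, 0]
good_scores = [2.0, -1.0, -0.5, 1.5, -2.0]  # highlights ranked on top
bad_scores = [-1.0, 2.0, 1.5, -0.5, 0.0]    # highlights ranked low

good_loss = set_based_loss(good_scores, labels)
bad_loss = set_based_loss(bad_scores, labels)
```

In this toy setup the set-based loss is lower when the highlight segments score highest within the set, which is the behavior one would expect from any objective that assesses a segment under a broader context than a single pair.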


