VRAG: Region Attention Graphs for Content-Based Video Retrieval

05/18/2022
by Kennard Ng, et al.

Content-based Video Retrieval (CBVR) is used on media-sharing platforms for applications such as video recommendation and filtering. To manage databases that scale to billions of videos, video-level approaches that use fixed-size embeddings are preferred due to their efficiency. In this paper, we introduce Video Region Attention Graph Networks (VRAG), which improve the state of the art among video-level methods. We represent videos at a finer granularity via region-level features and encode spatio-temporal dynamics through region-level relations. VRAG captures the relationships between regions based on their semantic content via self-attention and the permutation-invariant aggregation of graph convolution. In addition, we show that the performance gap between video-level and frame-level methods can be reduced by segmenting videos into shots and using shot embeddings for video retrieval. We evaluate VRAG on several video retrieval tasks and achieve a new state of the art for video-level retrieval. Furthermore, our shot-level VRAG achieves higher retrieval precision than existing video-level methods and approaches the performance of frame-level methods at faster evaluation speeds. Finally, our code will be made publicly available.
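The abstract gives the idea only at a high level, so the following is a minimal NumPy sketch of that idea and not the authors' implementation. It assumes a single attention head whose weights serve as a soft adjacency matrix over regions, one graph-convolution-style update, and mean pooling as the permutation-invariant aggregation. The function names, the projection matrices `w_q`, `w_k`, `w_v`, and the max-over-shot-pairs similarity are all hypothetical choices made for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def region_attention_embedding(regions, w_q, w_k, w_v):
    """Map (N, D) region features of one video or shot to a fixed-size vector.

    Assumption: one attention head, one propagation step, mean pooling.
    """
    q = regions @ w_q
    k = regions @ w_k
    # Attention weights between every pair of regions act as a soft
    # adjacency matrix derived from semantic content.
    adj = softmax(q @ k.T / np.sqrt(q.shape[1]), axis=-1)  # (N, N)
    # Graph-convolution-style update: each region aggregates its neighbours.
    h = np.maximum(adj @ (regions @ w_v), 0.0)  # (N, D_out)
    # Mean pooling does not depend on the order of the regions.
    emb = h.mean(axis=0)
    return emb / (np.linalg.norm(emb) + 1e-12)


def video_similarity(query_shots, db_shots):
    """Similarity of two videos given (S, D_out) L2-normalised shot embeddings.

    Assumption: score a video pair by its best-matching pair of shots.
    """
    return float((query_shots @ db_shots.T).max())
```

Because the pooled embedding has a fixed size regardless of how many regions or frames a video has, a database of such vectors can be searched with plain dot products, which is the efficiency argument the abstract makes for video-level methods. Keeping one embedding per shot instead of one per video trades some of that efficiency for finer granularity.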


