Exploiting Long-Term Dependencies for Generating Dynamic Scene Graphs

12/18/2021
by   Shengyu Feng, et al.
0

Structured video representation in the form of dynamic scene graphs is an effective tool for several video understanding tasks. Compared to the task of scene graph generation from images, dynamic scene graph generation is more challenging due to the temporal dynamics of the scene and the inherent temporal fluctuations of predictions. We show that capturing long-term dependencies is the key to effective generation of dynamic scene graphs. We present the detect-track-recognize paradigm by constructing consistent long-term object tracklets from a video, followed by transformers to capture the dynamics of objects and visual relations. Experimental results demonstrate that our Dynamic Scene Graph Detection Transformer (DSG-DETR) outperforms state-of-the-art methods by a significant margin on the benchmark dataset Action Genome. We also perform ablation studies and validate the effectiveness of each component of the proposed approach.

READ FULL TEXT

page 1

page 8

research
07/26/2021

Spatial-Temporal Transformer for Dynamic Scene Graph Generation

Dynamic scene graph generation aims at generating a scene graph of the g...
research
05/15/2023

Cross-Modality Time-Variant Relation Learning for Generating Dynamic Scene Graphs

Dynamic scene graphs generated from video clips could help enhance the s...
research
11/30/2022

Iterative Scene Graph Generation with Generative Transformers

Scene graphs provide a rich, structured representation of a scene by enc...
research
04/03/2023

Unbiased Scene Graph Generation in Videos

The task of dynamic scene graph generation (SGG) from videos is complica...
research
12/19/2016

Exploring Structure for Long-Term Tracking of Multiple Objects in Sports Videos

In this paper, we propose a novel approach for exploiting structural rel...
research
12/20/2017

Lost in Time: Temporal Analytics for Long-Term Video Surveillance

Video surveillance is a well researched area of study with substantial w...
research
08/05/2020

Learning Long-term Visual Dynamics with Region Proposal Interaction Networks

Learning long-term dynamics models is the key to understanding physical ...

Please sign up or login with your details

Forgot password? Click here to reset