Dynamic Regions Graph Neural Networks for Spatio-Temporal Reasoning

by   Iulia Duta, et al.

Graph Neural Networks are perfectly suited to capture latent interactions occurring in the spatio-temporal domain. But when an explicit structure is not available, as in the visual domain, it is not obvious what atomic elements should be represented as nodes. They should depend on the context and the kinds of relations that we are interested in. We are focusing on modeling relations between instances by proposing a method that takes advantage of the locality assumption to create nodes that are clearly localised in space. Current works are using external object detectors or fixed regions to extract features corresponding to graph nodes, while we propose a module for generating the regions associated with each node dynamically, without explicit object-level supervision. Conditioned on the input, for each node we predict the location and size of a region and use them to pool node features using a differentiable mechanism. Constructing these localised, adaptive nodes makes our model biased towards object-centric representations and we show that it improves the modeling of visual interactions. By relying on a few localized nodes, our method learns to focus on salient regions leading to a more explainable model. Our model achieves superior results on video classification tasks involving instance interactions.


page 1

page 3

page 5

page 7


Multi-Task Edge Prediction in Temporally-Dynamic Video Graphs

Graph neural networks have shown to learn effective node representations...

Neural Message Passing on Hybrid Spatio-Temporal Visual and Symbolic Graphs for Video Understanding

Many problems in video understanding require labeling multiple activitie...

(2.5+1)D Spatio-Temporal Scene Graphs for Video Question Answering

Spatio-temporal scene-graph approaches to video-based reasoning tasks su...

Dynamic Graph Node Classification via Time Augmentation

Node classification for graph-structured data aims to classify nodes who...

Unified Graph Structured Models for Video Understanding

Accurate video understanding involves reasoning about the relationships ...

Graph Neural Processes for Spatio-Temporal Extrapolation

We study the task of spatio-temporal extrapolation that generates data a...

Discovering Gateway Ports in Maritime Using Temporal Graph Neural Network Port Classification

Vessel navigation is influenced by various factors, such as dynamic envi...

Please sign up or login with your details

Forgot password? Click here to reset