Where were my keys? – Aggregating Spatial-Temporal Instances of Objects for Efficient Retrieval over Long Periods of Time

10/25/2021
by   Ifrah Idrees, et al.

Robots equipped with situational awareness can help humans efficiently find their lost objects by leveraging spatial and temporal structure. Existing approaches to video and image retrieval do not take into account the unique constraints imposed by a moving camera with a partial view of the environment. We present a Detection-based 3-level hierarchical Association approach, D3A, to create an efficient, queryable spatial-temporal representation of unique object instances in an environment. D3A performs online incremental and hierarchical learning to identify keyframes that best represent the unique objects in the environment. These keyframes are learned based on both spatial and temporal features and, once identified, their corresponding spatial-temporal information is organized in a key-value database. D3A allows for a variety of query patterns, such as querying for objects with/without the following: 1) specific attributes, 2) spatial relationships with other objects, and 3) time slices. For a given set of 150 queries, D3A returns a small set of candidate keyframes (which occupy only 0.17% [...]) in 11.7 ms. This is 47x faster and 33% [...] naively stores the object matches (detections) in the database without associating spatial-temporal information.
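
To make the three query patterns concrete, here is a minimal sketch of a keyframe store keyed by object label and filtered by attributes, a nearby object, and a time slice. It is not D3A's implementation: the names `Keyframe`, `KeyframeStore`, and `query`, the field layout, and the sample data are all assumptions made for illustration, and the "without" (negated) variants of each query are left out for brevity.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Keyframe:
    """Hypothetical record for one keyframe of one object instance."""
    frame_id: int
    timestamp: float                  # seconds since the start of the recording
    label: str                        # detected object class, e.g. "keys"
    attributes: frozenset = frozenset()
    near: frozenset = frozenset()     # labels of objects detected close by


class KeyframeStore:
    """Toy key-value store: object label -> list of keyframes (an assumption)."""

    def __init__(self):
        self._by_label = defaultdict(list)

    def add(self, kf):
        self._by_label[kf.label].append(kf)

    def query(self, label, attributes=(), near=None, time_slice=None):
        """Return matching keyframes, most recent first."""
        results = []
        for kf in self._by_label.get(label, []):
            # 1) every requested attribute must be present
            if not set(attributes) <= kf.attributes:
                continue
            # 2) spatial relationship: the other object must be nearby
            if near is not None and near not in kf.near:
                continue
            # 3) time slice: timestamp must fall inside [start, end]
            if time_slice is not None:
                start, end = time_slice
                if not (start <= kf.timestamp <= end):
                    continue
            results.append(kf)
        return sorted(results, key=lambda kf: kf.timestamp, reverse=True)


# Made-up sample data, for illustration only.
store = KeyframeStore()
store.add(Keyframe(12, 30.0, "keys", frozenset({"silver"}), frozenset({"table"})))
store.add(Keyframe(87, 410.0, "keys", frozenset({"silver"}), frozenset({"sofa"})))
store.add(Keyframe(140, 900.0, "mug", frozenset({"red"}), frozenset({"table"})))

latest_keys = store.query("keys")
keys_near_table = store.query("keys", near="table")
keys_in_window = store.query("keys", time_slice=(400, 500))
```

Indexing by label keeps each lookup proportional to the number of keyframes stored for that object class rather than to the total number of detections, which is the general reason a small keyframe set is cheaper to query than a store of raw detections.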


