Coding local and global binary visual features extracted from video sequences

by Luca Baroffio, et al.

Binary local features represent an effective alternative to real-valued descriptors, leading to comparable results for many visual analysis tasks while requiring significantly less computation and memory. When dealing with large collections, a more compact representation based on global features is often preferred, which can be obtained from local features by means of, e.g., the Bag-of-Visual-Words (BoVW) model. Several applications, including visual sensor networks and mobile augmented reality, require visual features to be transmitted over a bandwidth-limited network, thus calling for coding techniques that reduce the required bit budget while attaining a target level of efficiency. In this paper, we investigate a coding scheme tailored to both local and global binary features, which exploits both spatial and temporal redundancy by means of intra- and inter-frame coding. In this respect, the proposed coding scheme can be conveniently adopted to support the Analyze-Then-Compress (ATC) paradigm, in which visual features are extracted from the acquired content, encoded at remote nodes, and finally transmitted to a central controller that performs visual analysis. This is in contrast with the traditional Compress-Then-Analyze (CTA) paradigm, in which visual content is acquired at a node, compressed, and then sent to a central unit for further processing. We experimentally compare ATC and CTA by means of rate-efficiency curves in the context of two different visual analysis tasks: homography estimation and content-based retrieval. Our results show that the novel ATC paradigm based on the proposed coding primitives can be competitive with CTA, especially in bandwidth-limited scenarios.
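To make the intra- versus inter-frame idea concrete, the following is a minimal Python sketch and not the coding scheme proposed in the paper. It assumes descriptors are stored as integers of `D` bits, that the reference for inter coding is the nearest descriptor of the previous frame in Hamming distance, and that the cost of an XOR residual can be roughly estimated from the binary entropy of its bit density. The function names, the value of `D`, and the cost model are all hypothetical choices made for illustration; the cost of signalling the mode itself is ignored.

```python
import math

D = 32  # illustrative descriptor length in bits (an assumption, not from the paper)


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two binary descriptors."""
    return bin(a ^ b).count("1")


def binary_entropy(p: float) -> float:
    """Entropy in bits of a Bernoulli(p) source; 0 at the endpoints."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)


def encode_frame(current, previous):
    """Choose intra or inter mode per descriptor from a rough bit-cost estimate.

    Returns a list of (mode, reference_index, payload) tuples, where payload
    is the raw descriptor (intra) or the XOR residual (inter).
    """
    index_bits = math.ceil(math.log2(len(previous))) if len(previous) > 1 else 0
    symbols = []
    for d in current:
        intra_cost = D  # assumed: raw bits, no entropy model for intra mode
        if previous:
            # Nearest previous-frame descriptor in Hamming distance.
            ref = min(range(len(previous)), key=lambda i: hamming(d, previous[i]))
            residual = d ^ previous[ref]
            weight = bin(residual).count("1")
            # Assumed cost: reference index plus entropy-coded residual bits.
            inter_cost = index_bits + D * binary_entropy(weight / D)
            if inter_cost < intra_cost:
                symbols.append(("inter", ref, residual))
                continue
        symbols.append(("intra", None, d))
    return symbols


def decode_frame(symbols, previous):
    """Invert encode_frame given the same previous-frame descriptors."""
    return [
        payload if mode == "intra" else previous[ref] ^ payload
        for mode, ref, payload in symbols
    ]
```

In this sketch, a descriptor that differs from its reference in only a few bits yields a sparse residual and is coded in inter mode, while a descriptor with no close match in the previous frame falls back to intra mode. Decoding recovers the descriptors exactly, since XOR with the reference is its own inverse.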




Hybrid coding of visual content and local image features

Distributed visual analysis applications, such as mobile visual search o...

Fast keypoint detection in video sequences

A number of computer vision tasks exploit a succinct representation of t...

Multi-View Task-Driven Recognition in Visual Sensor Networks

Nowadays, distributed smart cameras are deployed for a wide set of tasks...

SuperGF: Unifying Local and Global Features for Visual Localization

Advanced visual localization techniques encompass image retrieval challe...

Keypoint Encoding for Improved Feature Extraction from Compressed Video at Low Bitrates

In many mobile visual analysis applications, compressed video is transmi...

Late Fusion of Local Indexing and Deep Feature Scores for Fast Image-to-Video Search on Large-Scale Databases

Low cost visual representation and fast query-by-example content search ...

Discrete Multi-modal Hashing with Canonical Views for Robust Mobile Landmark Search

Mobile landmark search (MLS) recently receives increasing attention for ...
