Auto-pooling: Learning to Improve Invariance of Image Features from Image Sequences

01/15/2013
by   Sainbayar Sukhbaatar, et al.

Learning invariant representations from images is one of the hardest challenges facing computer vision. Spatial pooling is widely used to create invariance to spatial shifts, but it is restricted to convolutional models. In this paper, we propose a novel pooling method that learns a soft clustering of features from image sequences. It is trained to improve the temporal coherence of features while keeping information loss to a minimum. Our method does not use spatial information, so it can also be used with non-convolutional models. Experiments on images extracted from natural videos show that our method clusters similar features together. When trained on convolutional features, auto-pooling outperformed traditional spatial pooling on an image classification task, even though it does not use the spatial topology of the features.
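The abstract does not state the training objective, so the following is only a minimal sketch of the general idea, not the paper's formulation. It assumes a soft assignment of each feature to one of K pooling units (a softmax over hypothetical logits `W`), a squared-error reconstruction term as a stand-in for "information loss", and a squared difference between pooled features of consecutive frames as a stand-in for "temporal coherence". The names `auto_pooling_loss`, `W`, and `lam` are illustrative assumptions.

```python
import numpy as np

def softmax(z, axis):
    """Numerically stable softmax along the given axis."""
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def soft_assignments(W):
    """Turn unconstrained logits W (K, D) into a soft clustering matrix.

    Each column sums to 1, so every input feature is softly assigned
    to the K pooling units. (Assumed parameterization.)
    """
    return softmax(W, axis=0)

def auto_pooling_loss(W, feats, lam=1.0):
    """Hypothetical objective: information loss + lam * temporal incoherence.

    feats: (T, D) array of feature vectors from T consecutive frames.
    W:     (K, D) array of assignment logits.
    """
    P = soft_assignments(W)          # (K, D) soft clustering of features
    pooled = feats @ P.T             # (T, K) pooled features per frame
    recon = pooled @ P               # (T, D) tied-weight linear decoding (assumption)
    info_loss = np.mean((feats - recon) ** 2)
    # Pooled features of neighbouring frames should change slowly.
    incoherence = np.mean((pooled[1:] - pooled[:-1]) ** 2)
    return info_loss + lam * incoherence

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feats = rng.random((10, 6))      # 10 frames, 6 features (toy data)
    W = rng.normal(size=(3, 6))      # 3 pooling units
    print(auto_pooling_loss(W, feats, lam=0.5))
```

Because the sketch never refers to feature positions, the same loss applies to features from non-convolutional models, which is the property the abstract emphasizes. In practice `W` would be optimized by gradient descent; that step is omitted here.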


