Input Dropout for Spatially Aligned Modalities

02/07/2020
by Sébastien de Blois, et al.

Computer vision datasets containing multiple modalities such as color, depth, and thermal properties are now commonly accessible and useful for solving a wide array of challenging tasks. However, deploying multi-sensor heads is not possible in many scenarios. As such, many practical solutions tend to be based on simpler sensors, mostly for reasons of cost, simplicity, and robustness. In this work, we propose a training methodology that takes advantage of the additional modalities available in datasets, even if they are not available at test time. Assuming that the modalities have a strong spatial correlation, we propose Input Dropout, a simple technique that consists of stochastically hiding one or more input modalities at training time, while using only the canonical (e.g. RGB) modalities at test time. We demonstrate that Input Dropout trivially combines with existing deep convolutional architectures and improves their performance on a wide range of computer vision tasks such as dehazing, 6-DOF object tracking, pedestrian detection, and object classification.
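The idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes that the modalities are stacked along the channel axis, that "hiding" a modality means replacing it with zeros, and that only the additional modality is ever hidden. The function name `input_dropout` and the parameters `extra_channels` and `p_hide` are hypothetical, chosen for illustration only.

```python
import numpy as np


def input_dropout(rgb, extra=None, extra_channels=1, p_hide=0.5, rng=None):
    """Build a network input from spatially aligned modalities.

    Assumptions (not taken from the paper): hiding means zero-filling,
    and modalities are concatenated along the last (channel) axis.

    rgb:   array of shape (H, W, 3), the canonical modality.
    extra: array of shape (H, W, C) with the additional modality
           (e.g. depth), or None when it is unavailable (test time).
    """
    height, width = rgb.shape[:2]
    if extra is None:
        # Test time: the additional modality does not exist, so feed zeros.
        extra = np.zeros((height, width, extra_channels), dtype=rgb.dtype)
    else:
        # Training time: hide the additional modality with probability p_hide.
        if rng is None:
            rng = np.random.default_rng()
        if rng.random() < p_hide:
            extra = np.zeros_like(extra)
    return np.concatenate([rgb, extra], axis=-1)


# Usage: the network always sees a 4-channel input, with or without depth.
rgb = np.ones((4, 4, 3), dtype=np.float32)
depth = np.full((4, 4, 1), 2.0, dtype=np.float32)
train_input = input_dropout(rgb, depth, p_hide=0.5, rng=np.random.default_rng(0))
test_input = input_dropout(rgb)  # RGB only at test time
```

Because the input shape is the same in both cases, a sketch like this can sit in front of an existing convolutional network that accepts the stacked channels, which is one way to read the abstract's claim that the technique combines easily with existing architectures.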

research
10/19/2018

Learning with privileged information via adversarial discriminative modality distillation

Heterogeneous data modalities can provide complementary cues for several...
research
09/28/2020

Quantal synaptic dilution enhances sparse encoding and dropout regularisation in deep networks

Dropout is a technique that silences the activity of units stochasticall...
research
08/31/2023

RGB-T Tracking via Multi-Modal Mutual Prompt Learning

Object tracking based on the fusion of visible and thermal images, know...
research
09/20/2022

Frequency Dropout: Feature-Level Regularization via Randomized Filtering

Deep convolutional neural networks have shown remarkable performance on ...
research
02/23/2018

No Blind Spots: Full-Surround Multi-Object Tracking for Autonomous Vehicles using Cameras & LiDARs

Online multi-object tracking (MOT) is extremely important for high-level...
research
10/18/2021

SCENIC: A JAX Library for Computer Vision Research and Beyond

Scenic is an open-source JAX library with a focus on Transformer-based m...
research
04/29/2023

Modality-invariant Visual Odometry for Embodied Vision

Effectively localizing an agent in a realistic, noisy setting is crucial...
