Towards Good Practices for Missing Modality Robust Action Recognition

11/25/2022
by   Sangmin Woo, et al.

Standard multi-modal models assume that the same modalities are available in both the training and inference stages. In practice, however, the environment in which a multi-modal model operates may not satisfy this assumption, and performance degrades drastically if any modality is missing at inference. We ask: how can we train a model that is robust to missing modalities? This paper seeks a set of good practices for multi-modal action recognition, with a particular interest in circumstances where some modalities are unavailable at inference time. First, we study how to effectively regularize the model during training (e.g., data augmentation). Second, we investigate fusion methods for robustness to missing modalities: we find that transformer-based fusion is more robust to missing modalities than summation or concatenation. Third, we propose a simple modular network, ActionMAE, which learns missing modality predictive coding by randomly dropping modality features and trying to reconstruct them from the remaining modality features. Coupling these good practices, we build a model that is not only effective in multi-modal action recognition but also robust to missing modalities. Our model achieves state-of-the-art results on multiple benchmarks and maintains competitive performance even in missing-modality scenarios. Code is available at https://github.com/sangminwoo/ActionMAE.
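The drop-and-reconstruct idea in the abstract can be illustrated with a minimal sketch. This is not the authors' ActionMAE implementation: the modality names (`rgb`, `depth`, `ir`), the feature values, and all function names are hypothetical, and the learned reconstruction module is replaced by a plain element-wise mean of the remaining features so that the example runs with the standard library alone. It only shows the shape of the training signal: drop one modality feature at random, predict it from the others, and score the prediction against the original.

```python
import random


def drop_random_modality(features, rng):
    """Pick one modality at random; return the kept features and the dropped name."""
    dropped = rng.choice(sorted(features))
    kept = {name: vec for name, vec in features.items() if name != dropped}
    return kept, dropped


def mean_reconstruct(kept):
    """Stand-in reconstructor (an assumption, not a learned module):
    element-wise mean of the remaining modality features."""
    vectors = list(kept.values())
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def mse(a, b):
    """Mean squared error between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def reconstruction_loss(features, rng):
    """Drop one modality, predict it from the rest, and score the prediction."""
    kept, dropped = drop_random_modality(features, rng)
    predicted = mean_reconstruct(kept)
    return dropped, mse(predicted, features[dropped])


# Hypothetical per-modality feature vectors for a single clip.
features = {
    "rgb": [1.0, 2.0, 3.0],
    "depth": [1.0, 2.0, 5.0],
    "ir": [3.0, 2.0, 1.0],
}
dropped, loss = reconstruction_loss(features, random.Random(0))
print(f"dropped={dropped} loss={loss:.3f}")
```

In a trained model the mean would be replaced by a learned network, and the reconstruction loss would typically be combined with the action classification loss; how the two are weighted is not stated in the abstract.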

research
04/21/2023

Missing Modality Robustness in Semi-Supervised Multi-Modal Semantic Segmentation

Using multiple spatial modalities has been proven helpful in improving s...
research
05/12/2023

MMG-Ego4D: Multi-Modal Generalization in Egocentric Action Recognition

In this paper, we study a novel problem in egocentric action recognition...
research
07/20/2023

MSQNet: Actor-agnostic Action Recognition with Multi-modal Query

Existing action recognition methods are typically actor-specific due to ...
research
08/09/2018

Overcoming Missing and Incomplete Modalities with Generative Adversarial Networks for Building Footprint Segmentation

The integration of information acquired with different modalities, spati...
research
05/13/2021

Robust Dynamic Multi-Modal Data Fusion: A Model Uncertainty Perspective

This paper is concerned with multi-modal data fusion (MMDF) under unexpe...
research
02/25/2022

On Modality Bias Recognition and Reduction

Making each modality in multi-modal data contribute is of vital importan...
research
12/31/2014

ModDrop: adaptive multi-modal gesture recognition

We present a method for gesture detection and localisation based on mult...
