Low to High Dimensional Modality Hallucination using Aggregated Fields of View

by Kausic Gunasekar et al.

Real-world robotic systems deal with data from a multitude of modalities, especially for tasks such as navigation and recognition. The performance of those systems can degrade drastically when one or more modalities become inaccessible due to factors such as sensor malfunction or adverse environments. Here, we argue that modality hallucination is an effective way to ensure consistent modality availability and thereby reduce unfavorable consequences. While hallucinating from a modality with richer information, e.g., RGB to depth, has been researched extensively, we investigate the more challenging low-to-high modality hallucination, which has interesting use cases in robotics and autonomous systems. We present a novel hallucination architecture that aggregates information from multiple fields of view of the local neighborhood to recover, from the extant modality, the information lost with the missing one. The architecture learns a non-linear mapping between the data modalities, and the learned mapping is used to augment the extant modality and mitigate the risk that modality loss poses to the system in adverse scenarios. We also conduct extensive classification and segmentation experiments on the UWRGBD and NYUD datasets and demonstrate that hallucination allays the negative effects of modality loss. Implementation and models: https://github.com/kausic94/Hallucination
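The paper's exact architecture is not reproduced here, but the core idea of aggregating multiple fields of view can be sketched in plain NumPy. The snippet below is a minimal, hypothetical illustration: for each pixel of a single-channel (e.g., depth) map it averages square neighborhoods of increasing radius, then applies a per-pixel linear map to three output channels standing in for hallucinated RGB. The function name `fov_features`, the radii, and the random weights are all assumptions for illustration; in the actual system the mapping is a learned non-linear network.

```python
import numpy as np

def fov_features(depth, radii=(1, 2, 4)):
    """Aggregate multiple fields of view around every pixel.

    For each radius r, computes the mean over the (2r+1)x(2r+1)
    window centred on each pixel (edge-padded), yielding an
    (H, W, len(radii)) feature tensor.
    """
    H, W = depth.shape
    feats = []
    for r in radii:
        padded = np.pad(depth, r, mode="edge")
        win = np.zeros((H, W))
        # Sum all shifted copies of the map inside the window.
        for dy in range(-r, r + 1):
            for dx in range(-r, r + 1):
                win += padded[r + dy : r + dy + H, r + dx : r + dx + W]
        feats.append(win / (2 * r + 1) ** 2)
    return np.stack(feats, axis=-1)

# Toy "hallucination": a per-pixel linear map from the aggregated
# depth features to 3 channels (weights would be learned in practice).
rng = np.random.default_rng(0)
depth = rng.random((32, 32))
F = fov_features(depth)                      # (32, 32, 3) features
W_map = rng.standard_normal((F.shape[-1], 3))
rgb_hat = F @ W_map                          # (32, 32, 3) hallucinated output
```

A learned version would replace the random `W_map` with a network trained to minimize reconstruction error between `rgb_hat` and the ground-truth modality while it is still available.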




