Deep-HOSeq: Deep Higher Order Sequence Fusion for Multimodal Sentiment Analysis

by Sunny Verma, et al.

Multimodal sentiment analysis utilizes multiple heterogeneous modalities for sentiment classification. Recent multimodal fusion schemes customize LSTMs to discover intra-modal dynamics and design sophisticated attention mechanisms to discover inter-modal dynamics from multimodal sequences. Although powerful, these schemes rely entirely on attention mechanisms, which is problematic because of two major drawbacks: 1) deceptive attention masks and 2) training dynamics. Moreover, strenuous effort is required to optimize the hyperparameters of these consolidated architectures, in particular their custom-designed LSTMs constrained by attention schemes. In this research, we first propose a common network that discovers both intra-modal and inter-modal dynamics by utilizing basic LSTMs and tensor-based convolution networks. We then propose unique networks that encapsulate temporal granularity among the modalities, which is essential when extracting information from asynchronous sequences. Finally, we integrate these two kinds of information via a fusion layer and call our novel multimodal fusion scheme Deep-HOSeq (Deep network with higher order Common and Unique Sequence information). The proposed Deep-HOSeq efficiently discovers all-important information from multimodal sequences, and the effectiveness of utilizing both types of information is empirically demonstrated on the CMU-MOSEI and CMU-MOSI benchmark datasets. The source code of our proposed Deep-HOSeq is available at–ICDM-2020.
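To make the phrase "higher order" concrete, the sketch below shows one common way inter-modal interactions can be captured: a three-way outer product of per-modality feature vectors, such as the final hidden states of three separate LSTMs. This is an illustrative assumption only. The abstract does not specify the layers used in Deep-HOSeq, and the function name `outer_product_fusion`, the vector sizes, and the values are all hypothetical.

```python
from typing import List

Vector = List[float]
Tensor3 = List[List[List[float]]]


def outer_product_fusion(text_h: Vector, audio_h: Vector, video_h: Vector) -> Tensor3:
    """Return the 3-way outer product of three modality vectors.

    Entry [i][j][k] is text_h[i] * audio_h[j] * video_h[k], so every
    cross-modal feature combination gets its own cell. A convolution
    over this tensor could then pick up inter-modal patterns
    (assumption: the abstract does not describe the actual layers).
    """
    return [[[t * a * v for v in video_h] for a in audio_h] for t in text_h]


# Hypothetical per-modality features, standing in for LSTM hidden states.
text_h = [0.5, -1.0]
audio_h = [2.0, 0.0, 1.0]
video_h = [1.0, 3.0]

fused = outer_product_fusion(text_h, audio_h, video_h)
print(len(fused), len(fused[0]), len(fused[0][0]))  # tensor dimensions
```

The fused tensor grows multiplicatively with the feature sizes, which is one reason such schemes typically keep the per-modality vectors small before taking the product.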




Bi-Bimodal Modality Fusion for Correlation-Controlled Multimodal Sentiment Analysis

Multimodal sentiment analysis aims to extract and integrate semantic inf...

DM^2S^2: Deep Multi-Modal Sequence Sets with Hierarchical Modality Attention

There is increasing interest in the use of multimodal data in various we...

A Self-Adjusting Fusion Representation Learning Model for Unaligned Text-Audio Sequences

Inter-modal interaction plays an indispensable role in multimodal sentim...

Multimodal Sentiment Analysis using Hierarchical Fusion with Context Modeling

Multimodal sentiment analysis is a very actively growing field of resear...

Analyzing Unaligned Multimodal Sequence via Graph Convolution and Graph Pooling Fusion

In this paper, we study the task of multimodal sequence analysis which a...

Deep Multimodal Fusion by Channel Exchanging

Deep multimodal fusion by using multiple sources of data for classificat...

Improving Multimodal Fusion with Hierarchical Mutual Information Maximization for Multimodal Sentiment Analysis

In multimodal sentiment analysis (MSA), the performance of a model highl...
