"We care": Improving Code Mixed Speech Emotion Recognition in Customer-Care Conversations

by N V S Abhishek, et al.

Speech Emotion Recognition (SER) is the task of identifying the emotion expressed in a spoken utterance. Emotion recognition is essential for building robust conversational agents in domains such as law, healthcare, education, and customer support. Most published SER studies use datasets created by employing professional actors in a noise-free environment. In natural settings such as a customer care conversation, the audio is often noisy, and speakers regularly switch between languages as they see fit. We have worked in collaboration with a leading unicorn in the Conversational AI sector to develop the Natural Speech Emotion Dataset (NSED). NSED is a natural code-mixed speech emotion dataset in which each utterance in a conversation is annotated with emotion, sentiment, and valence, arousal, and dominance (VAD) values. In this paper, we show that by incorporating word-level VAD values we improve on the task of SER by 2 [...] High accuracy for negative emotion recognition is essential because customers expressing negative opinions or views need to be pacified with urgency, lest complaints and dissatisfaction snowball and get out of hand. Speedy escalation of negative opinions is crucial for business interests. Our study can then be used to develop conversational agents that are more polite and empathetic in such situations.
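The abstract does not say how word-level VAD values enter the model, so the following is only a minimal sketch of one plausible approach: look up each transcript token in a VAD lexicon, average the scores over the utterance, and append the result to an acoustic feature vector. The lexicon entries, the neutral default, and the function names `utterance_vad` and `fuse_features` are all hypothetical and are not taken from the paper or from any published lexicon.

```python
from typing import Dict, List, Sequence, Tuple

# (valence, arousal, dominance), each assumed to lie in [0, 1].
VAD = Tuple[float, float, float]

# Hypothetical toy lexicon with made-up scores, for illustration only.
# "nahi" stands in for a romanized Hindi token in a code-mixed utterance.
TOY_VAD_LEXICON: Dict[str, VAD] = {
    "angry": (0.1, 0.9, 0.6),
    "thanks": (0.9, 0.3, 0.5),
    "nahi": (0.3, 0.5, 0.4),
}

# Assumed fallback for out-of-lexicon words and empty utterances.
NEUTRAL_VAD: VAD = (0.5, 0.5, 0.5)


def utterance_vad(
    tokens: Sequence[str],
    lexicon: Dict[str, VAD],
    default: VAD = NEUTRAL_VAD,
) -> VAD:
    """Average word-level VAD scores over the tokens of one utterance."""
    if not tokens:
        return default
    scores = [lexicon.get(token.lower(), default) for token in tokens]
    count = len(scores)
    return tuple(sum(score[i] for score in scores) / count for i in range(3))


def fuse_features(acoustic: Sequence[float], vad: VAD) -> List[float]:
    """Concatenate acoustic features with the utterance-level VAD vector."""
    return list(acoustic) + list(vad)


if __name__ == "__main__":
    vad = utterance_vad(["Angry", "thanks"], TOY_VAD_LEXICON)
    # The two-element acoustic vector is a placeholder for real features.
    print(fuse_features([0.25, 0.75], vad))
```

Averaging is the simplest pooling choice; a model could equally consume the per-word VAD sequence directly, and nothing in the abstract indicates which option the authors used.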




