EEG-Derived Voice Signature for Attended Speaker Detection

by Hongxu Zhu, et al.

Objective: Conventional EEG-based auditory attention detection (AAD) compares the time-varying speech stimuli with the elicited EEG signals. To obtain reliable correlation values, these methods need a long decision window, which results in a long detection latency. Humans, however, have a remarkable ability to recognize and follow a known speaker regardless of the spoken content. In this paper, we seek to detect the attended speaker among pre-enrolled speakers from the elicited EEG signals, which avoids relying on the speech stimuli for AAD at run-time. To this end, we propose a novel EEG-based attended speaker detection (E-ASD) task.

Methods: We encode a speaker's voice as a fixed-dimensional vector, known as a speaker embedding, and project it to an audio-derived voice signature that characterizes the speaker's unique voice regardless of the spoken content. We hypothesize that such a voice signature also exists in the listener's brain and can be decoded from the elicited EEG signals; we refer to it as the EEG-derived voice signature. By comparing the audio-derived and EEG-derived voice signatures, we detect the attended speaker in the listening brain.

Results: Experiments show that E-ASD can effectively detect the attended speaker from 0.5 s EEG decision windows, achieving 99.78% AAD accuracy, 99.94% AUC, and 0.27% EER.

Conclusion: It is possible to derive the attended speaker's voice signature from the EEG signals and use it to detect the attended speaker in a listening brain.

Significance: We present the first proof of concept for detecting the attended speaker from the elicited EEG signals in a cocktail party environment. The successful implementation of E-ASD marks a non-trivial but crucial step towards smart hearing aids.
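The comparison step described under Methods can be illustrated with a minimal sketch. The abstract does not state which similarity measure or decision rule the authors use, so cosine similarity and an argmax over the enrolled speakers are assumptions here. The function names and toy vectors are hypothetical, and the models that would produce the signatures from audio and EEG are omitted entirely.

```python
import math
from typing import Dict, List, Tuple


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine similarity between two equal-length vectors (assumed metric)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as matching nothing
    return dot / (norm_a * norm_b)


def detect_attended_speaker(
    eeg_signature: List[float],
    enrolled: Dict[str, List[float]],
) -> Tuple[str, Dict[str, float]]:
    """Score the EEG-derived signature against every enrolled speaker's
    audio-derived signature and return the best match with all scores."""
    scores = {
        name: cosine_similarity(eeg_signature, audio_signature)
        for name, audio_signature in enrolled.items()
    }
    best = max(scores, key=scores.get)
    return best, scores


# Toy, hand-written signatures (hypothetical values, not real data).
enrolled = {
    "speaker_a": [0.9, 0.1, 0.0],
    "speaker_b": [0.0, 0.2, 0.95],
}
eeg_signature = [0.8, 0.2, 0.1]  # would come from one EEG decision window

attended, scores = detect_attended_speaker(eeg_signature, enrolled)
print(attended, scores)
```

Because the audio-derived signatures are computed once at enrollment, a decision in this sketch needs only the current EEG window, which is consistent with the abstract's point that the speech stimuli are not required at run-time.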




Towards Voice Reconstruction from EEG during Imagined Speech

Translating imagined speech from human brain activity into voice is a ch...

Low-latency auditory spatial attention detection based on spectro-spatial features from EEG

Detecting auditory attention based on brain signals enables many everyda...

EEG-informed attended speaker extraction from recorded speech mixtures with application in neuro-steered hearing prostheses

OBJECTIVE: We aim to extract and denoise the attended speaker in a noisy...

DGSD: Dynamical Graph Self-Distillation for EEG-Based Auditory Spatial Attention Detection

Auditory Attention Detection (AAD) aims to detect target speaker from br...

Using Deepfake Technologies for Word Emphasis Detection

In this work, we consider the task of automated emphasis detection for s...

Improving auditory attention decoding performance of linear and non-linear methods using state-space model

Identifying the target speaker in hearing aid applications is crucial to...

Analysis of tagging latency when comparing event-related potentials

Event-related potentials (ERPs) are very small voltage produced by the b...
