BC-VAD: A Robust Bone Conduction Voice Activity Detection

12/06/2022
by Niccolo' Polvani, et al.

Voice Activity Detection (VAD) is a fundamental module in many audio applications. Recent state-of-the-art VAD systems are often based on neural networks, but matching the performance of larger models requires a computational budget that usually exceeds the capabilities of a small battery-operated device. In this work, we rely on the input from a bone conduction microphone (BCM) to design an efficient VAD (BC-VAD) that is robust against residual non-stationary noise originating from the environment or from speakers not wearing the BCM. We first show that a larger VAD system (58k parameters) achieves state-of-the-art results on a publicly available benchmark but fails when run on bone conduction signals. We then compare its variant BC-VAD (5k parameters, trained on BC data) with a baseline designed specifically for a BCM, and show that the proposed method achieves better performance across various metrics while meeting the real-time processing requirement of a microcontroller.
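
To make the size gap between the two models concrete, the sketch below estimates the memory needed to store the weights alone. It uses the rounded parameter counts from the abstract (58k and 5k); the storage formats (float32 and int8) and the helper name `param_memory_bytes` are assumptions made for illustration, since the abstract does not state how the weights are stored, and activations and program code are not counted.

```python
def param_memory_bytes(num_params: int, bytes_per_param: int) -> int:
    """Storage needed for the weights alone (no activations, no code)."""
    return num_params * bytes_per_param


# Rounded parameter counts taken from the abstract.
MODELS = {"larger VAD": 58_000, "BC-VAD": 5_000}

# Assumed storage formats; the abstract does not specify the precision used.
PRECISIONS = {"float32": 4, "int8": 1}

for name, num_params in MODELS.items():
    for fmt, size in PRECISIONS.items():
        kib = param_memory_bytes(num_params, size) / 1024
        print(f"{name:>10} {fmt:>7}: {kib:6.1f} KiB")
```

Under these assumptions the weight storage scales linearly with the parameter count, so the 5k-parameter model needs roughly one twelfth of the weight memory of the 58k-parameter model at the same precision.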


research
10/28/2022

Target-Speaker Voice Activity Detection via Sequence-to-Sequence Prediction

Target-speaker voice activity detection is currently a promising approac...
research
09/05/2023

In-Ear-Voice: Towards Milli-Watt Audio Enhancement With Bone-Conduction Microphones for In-Ear Sensing Platforms

The recent ubiquitous adoption of remote conferencing has been accompani...
research
01/17/2023

The Newsbridge -Telecom SudParis VoxCeleb Speaker Recognition Challenge 2022 System Description

We describe the system used by our team for the VoxCeleb Speaker Recogni...
research
06/25/2021

Voice Activity Detection for Transient Noisy Environment Based on Diffusion Nets

We address voice activity detection in acoustic environments of transien...
research
12/04/2017

Precision Scaling of Neural Networks for Efficient Audio Processing

While deep neural networks have shown powerful performance in many audio...
research
12/09/2021

X-Vector based voice activity detection for multi-genre broadcast speech-to-text

Voice Activity Detection (VAD) is a fundamental preprocessing step in au...
research
10/26/2020

MarbleNet: Deep 1D Time-Channel Separable Convolutional Neural Network for Voice Activity Detection

We present MarbleNet, an end-to-end neural network for Voice Activity De...
