A Study On Data Augmentation In Voice Anti-Spoofing

by Ariel Cohen, et al.

In this paper, we perform an in-depth study of how data augmentation techniques improve the detection of synthetic or spoofed audio. Specifically, we propose methods to deal with channel variability, different audio compressions, different bandwidths, and unseen spoofing attacks, all of which have been shown to significantly degrade the performance of audio-based systems and anti-spoofing systems. Our results are based on the ASVspoof 2021 challenge, in the Logical Access (LA) and Deep Fake (DF) categories. Our study is data-centric: the models are fixed, and we significantly improve the results by making changes in the data. We introduce two forms of data augmentation: compression augmentation for the DF part, and compression channel augmentation for the LA part. In addition, we introduce SpecAverage, a new type of online data augmentation in which the audio features are masked with their average value in order to improve generalization. Furthermore, we introduce a log-spectrogram feature design that improves the results. Our best single system and fusion scheme both achieve state-of-the-art performance in the DF category, with an EER of 15.46%. For the LA task, our techniques reduced the best baseline EER by 50%. Our techniques for dealing with spoofed data from a wide variety of distributions can be replicated and can help anti-spoofing and speech-based systems enhance their results.
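The SpecAverage idea described in the abstract (masking parts of the audio features with their average value) can be illustrated with a minimal NumPy sketch. The function name `spec_average`, the band widths, the use of one frequency band and one time band, and the choice of the global mean as the fill value are all assumptions made for illustration; the abstract does not specify these details, so this should not be read as the authors' implementation.

```python
import numpy as np


def spec_average(features, max_freq_width=10, max_time_width=20, rng=None):
    """Mask one random frequency band and one random time band with the mean.

    features: 2-D array shaped (freq_bins, time_frames), e.g. a log spectrogram.
    The widths and the single-band layout are illustrative assumptions,
    not settings taken from the paper.
    """
    rng = np.random.default_rng() if rng is None else rng
    out = np.array(features, dtype=float)  # copy, so the input stays untouched
    fill = out.mean()                      # average over the whole feature map
    n_freq, n_time = out.shape

    # Frequency band: rows [f0, f0 + f_width) are replaced by the mean.
    f_width = int(rng.integers(0, min(max_freq_width, n_freq) + 1))
    f0 = int(rng.integers(0, n_freq - f_width + 1))
    out[f0:f0 + f_width, :] = fill

    # Time band: columns [t0, t0 + t_width) are replaced by the mean.
    t_width = int(rng.integers(0, min(max_time_width, n_time) + 1))
    t0 = int(rng.integers(0, n_time - t_width + 1))
    out[:, t0:t0 + t_width] = fill

    return out
```

Because the augmentation is online, a fresh mask would be drawn each time a training example is loaded, for instance by calling `spec_average(batch_item)` inside a data-loading routine. Filling with the mean instead of zero keeps the masked region at a typical feature level, which is the distinction the abstract draws for SpecAverage.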




