A Whisper transformer for audio captioning trained with synthetic captions and transfer learning

05/15/2023
by Marek Kadlčík, et al.

The field of audio captioning has advanced significantly in recent years, driven by the availability of large-scale audio datasets and progress in deep learning techniques. In this technical report, we present our approach to audio captioning, which builds on a pretrained speech-to-text Whisper model and pretraining on synthetic captions. We describe our training procedures and report the results of experiments that vary model size, dataset mixture, and other hyperparameters. Our findings demonstrate the impact of different training strategies on the performance of the audio captioning model. Our code and trained models are publicly available on GitHub and Hugging Face Hub.

research
09/18/2023

Synth-AC: Enhancing Audio Captioning with Synthetic Supervision

Data-driven approaches hold promise for audio captioning. However, the d...
research
09/18/2023

RECAP: Retrieval-Augmented Audio Captioning

We present RECAP (REtrieval-Augmented Audio CAPtioning), a novel and eff...
research
04/01/2022

Learning Audio-Video Modalities from Image Captions

A major challenge in text-video and text-audio retrieval is the lack of ...
research
10/21/2019

Clotho: An Audio Captioning Dataset

Audio captioning is the novel task of general audio content description ...
research
03/30/2023

WavCaps: A ChatGPT-Assisted Weakly-Labelled Audio Captioning Dataset for Audio-Language Multimodal Research

The advancement of audio-language (AL) multimodal learning tasks has bee...
research
09/19/2019

Large-scale representation learning from visually grounded untranscribed speech

Systems that can associate images with their spoken audio captions are a...
research
07/31/2023

LP-MusicCaps: LLM-Based Pseudo Music Captioning

Automatic music captioning, which generates natural language description...