TorchAudio: Building Blocks for Audio and Speech Processing

10/28/2021
by Yao-Yuan Yang, et al.

This document describes version 0.10 of torchaudio, a library of building blocks for machine learning applications in the audio and speech processing domain. torchaudio aims to accelerate the development and deployment of such applications for researchers and engineers by providing off-the-shelf building blocks that are GPU-compatible, automatically differentiable, and production-ready. It can be installed from the Python Package Index (PyPI), and the source code is publicly available under a BSD-2-Clause License (as of September 2021) at https://github.com/pytorch/audio. We give an overview of the design principles and functionalities of torchaudio and benchmark our implementations of several audio and speech operations and models. The benchmarks verify that these implementations are valid and perform similarly to other publicly available implementations.


