ESPnet-SLU: Advancing Spoken Language Understanding through ESPnet

11/29/2021
by   Siddhant Arora, et al.

As Automatic Speech Recognition (ASR) systems improve, there is increasing interest in using ASR output for downstream Natural Language Processing (NLP) tasks. However, few open-source toolkits can generate reproducible results across different Spoken Language Understanding (SLU) benchmarks, so there is a need for an open-source standard that makes it faster to get started in SLU research. We present ESPnet-SLU, which is designed for quick development of spoken language understanding in a single framework. ESPnet-SLU is a project inside the end-to-end speech processing toolkit ESPnet, a widely used open-source standard for speech processing tasks such as ASR, Text-to-Speech (TTS), and Speech Translation (ST). We enhance the toolkit to provide implementations for various SLU benchmarks that enable researchers to seamlessly mix and match different ASR and Natural Language Understanding (NLU) models. We also provide pretrained models with intensively tuned hyperparameters that can match or even outperform current state-of-the-art performance. The toolkit is publicly available at https://github.com/espnet/espnet.

research
04/21/2020

ESPnet-ST: All-in-One Speech Translation Toolkit

We present ESPnet-ST, which is designed for the quick development of spe...
research
04/10/2023

ESPnet-ST-v2: Multipurpose Spoken Language Translation Toolkit

ESPnet-ST-v2 is a revamp of the open-source ESPnet-ST toolkit necessitat...
research
04/04/2021

Timers and Such: A Practical Benchmark for Spoken Language Understanding with Numbers

This paper introduces Timers and Such, a new open source dataset of spok...
research
05/17/2023

OpenSLU: A Unified, Modularized, and Extensible Toolkit for Spoken Language Understanding

Spoken Language Understanding (SLU) is one of the core components of a t...
research
05/02/2023

A Study on the Integration of Pipeline and E2E SLU systems for Spoken Semantic Parsing toward STOP Quality Challenge

Recently there have been efforts to introduce new benchmark tasks for sp...
research
04/13/2019

M2H-GAN: A GAN-based Mapping from Machine to Human Transcripts for Speech Understanding

Deep learning is at the core of recent spoken language understanding (SL...
research
02/14/2023

TRESTLE: Toolkit for Reproducible Execution of Speech, Text and Language Experiments

The evidence is growing that machine and deep learning methods can learn...
