Self-supervised learning (SSL) is at the origin of unprecedented improve...
This paper presents NAVER LABS Europe's systems for Tamasheq-French and
...
This paper describes the ON-TRAC Consortium translation systems develope...
Self-supervised models for speech processing emerged recently as popular...
In this paper we present two datasets for Tamasheq, a developing languag...
When documenting oral-languages, Unsupervised Word Segmentation (UWS) fr...
Self-Supervised Learning (SSL) using huge unlabeled data has been
succes...
For endangered languages, data collection campaigns have to accommodate ...
This paper describes the ON-TRAC Consortium translation systems develope...
For language documentation initiatives, transcription is an expensive
re...
The CMU Wilderness Multilingual Speech Dataset is a newly published
mult...
Since Bahdanau et al. [1] first introduced attention for neural machine
...
This paper presents an extension to a very low-resource parallel corpus
...
We present a first attempt to perform attentional word segmentation dire...
Word discovery is the task of extracting words from unsegmented text. In...