research ∙ 10/10/2022
YFACC: A Yorùbá speech-image dataset for cross-lingual keyword localisation through visual grounding
Visually grounded speech (VGS) models are trained on images paired with ...
research ∙ 02/02/2022
Keyword localisation in untranscribed speech using visually grounded speech models
Keyword localisation is the task of finding where in a speech utterance ...
research ∙ 06/16/2021
Attention-Based Keyword Localisation in Speech using Visual Grounding
Visually grounded speech models learn from images paired with spoken cap...
research ∙ 12/14/2020