Reconstructing faces from voices

05/25/2019
by   Yandong Wen, et al.

Voice profiling aims to infer various human parameters, such as gender and age, from a person's speech. In this paper, we address a challenging subtask of voice profiling: reconstructing someone's face from their voice. The task is designed to answer the question: given an audio clip spoken by an unseen person, can we picture a face that shares as many identity-related elements, or associations, as possible with the speaker? To address this problem, we propose a simple but effective computational framework based on generative adversarial networks (GANs). The network learns to generate faces from voices by matching the identities of the generated faces to those of the speakers on a training set. We evaluate the network on a closely related task, cross-modal matching. The results show that our model generates faces that match several biometric characteristics of the speaker and yields matching accuracies that are much better than chance.
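To make the "better than chance" claim concrete, here is a minimal sketch of how a cross-modal matching accuracy could be computed. The abstract does not describe the evaluation protocol, so everything here is an assumption: a two-candidate setup (chance level 0.5), cosine similarity as the score, and the hypothetical names `cosine_similarity` and `matching_accuracy`. The embeddings are taken as given; how they are extracted from voices or generated faces is outside this sketch.

```python
import math
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def matching_accuracy(trials: List[Tuple[Vector, Vector, Vector]]) -> float:
    """Fraction of trials in which the probe is closer to the true face.

    Each trial is (probe, true_face, impostor_face), where the probe is an
    embedding derived from the voice (assumed here, e.g. an embedding of the
    generated face). With two candidates per trial, chance accuracy is 0.5.
    """
    if not trials:
        raise ValueError("at least one trial is required")
    correct = sum(
        1
        for probe, true_face, impostor in trials
        if cosine_similarity(probe, true_face) > cosine_similarity(probe, impostor)
    )
    return correct / len(trials)


# Toy data only: the first trial is matched correctly, the second is not.
toy_trials = [
    ([1.0, 0.0], [0.9, 0.1], [0.0, 1.0]),
    ([0.0, 1.0], [1.0, 0.0], [0.1, 0.9]),
]
accuracy = matching_accuracy(toy_trials)
```

Under these assumptions, a model whose accuracy on held-out speakers is well above 0.5 would be matching faces to voices better than random guessing.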

research
04/01/2018

Seeing Voices and Hearing Faces: Cross-modal biometric matching

We introduce a seemingly impossible task: given only an audio clip of so...
research
04/28/2020

Cross-modal Speaker Verification and Recognition: A Multilingual Perspective

Recent years have seen a surge in finding association between faces and ...
research
05/23/2019

Speech2Face: Learning the Face Behind a Voice

How much can we infer about a person's looks from the way they speak? In...
research
09/01/2021

FaVoA: Face-Voice Association Favours Ambiguous Speaker Detection

The strong relation between face and voice can aid active speaker detect...
research
05/15/2018

On Learning Associations of Faces and Voices

In this paper, we study the associations between human faces and voices....
research
10/05/2021

Voice Aging with Audio-Visual Style Transfer

Face aging techniques have used generative adversarial networks (GANs) a...
research
04/21/2021

Voice2Mesh: Cross-Modal 3D Face Model Generation from Voices

This work focuses on the analysis that whether 3D face models can be lea...
