Recent advances in deep neural networks have achieved unprecedented succ...
Audio-visual speech recognition has received a lot of attention due to i...
Cross-lingual self-supervised learning has been a growing research topic...
We present RAVEn, a self-supervised multi-modal approach to jointly lear...
Video-to-speech synthesis (also known as lip-to-speech) refers to the
tr...
One of the most pressing challenges for the detection of face-manipulate...
Although current deep learning-based face forgery detectors achieve
impr...
There has recently been increasing interest, both theoretical and practi...