Video saliency prediction and detection are thriving research domains th...
Self-supervised sound source localization is usually challenged by the m...
DeepFake-based digital facial forgery is threatening public media securi...
Incorporating the audio stream enables Video Saliency Prediction (VSP) t...
DeepFake-based digital facial forgery is threatening the public media se...
Talking face generation with great practical significance has attracted ...
Multi-modal based speech separation has exhibited a specific advantage o...
Active speaker detection and speech enhancement have become two increasi...
The target representation learned by convolutional neural networks plays...
Current massive datasets demand light-weight access for analysis. Discre...