Zero-shot text-to-speech aims at synthesizing voices with unseen speech ...
Cross-lingual timbre and style generalizable text-to-speech (TTS) aims t...
Scaling text-to-speech to a large and wild dataset has been proven to be...
Direct speech-to-speech translation (S2ST) has gradually become popular ...
As a key component of automated speech recognition (ASR) and the front-e...
Solving long-tail large vocabulary object detection with deep learning b...