Kaizhi Qian

research

∙ 03/29/2023

Physics-Driven Diffusion Models for Impact Sound Synthesis from Videos

Modeling sounds emitted from physical object interactions is critical fo...

Kun Su, et al.

research

∙ 10/04/2021

On the Interplay Between Sparsity, Naturalness, Intelligibility, and Prosody in Speech Synthesis

Are end-to-end text-to-speech (TTS) models over-parametrized? To what ex...

Cheng-I Jeff Lai, et al.

research

∙ 06/16/2021

Global Rhythm Style Transfer Without Text Transcriptions

Prosody plays an important role in characterizing the style of a speaker...

Kaizhi Qian, et al.

research

∙ 06/10/2021

PARP: Prune, Adjust and Re-Prune for Self-Supervised Speech Recognition

Recent work on speech self-supervised learning (speech SSL) demonstrated...

Cheng-I Jeff Lai, et al.

research

∙ 11/21/2020

Deep Network Perceptual Losses for Speech Denoising

Contemporary speech enhancement predominantly relies on audio transforms...

Mark R. Saddler, et al.

research

∙ 04/23/2020

Unsupervised Speech Decomposition via Triple Information Bottleneck

Speech information can be roughly decomposed into four components: langu...

Kaizhi Qian, et al.

research

∙ 04/15/2020

F0-consistent many-to-many non-parallel voice conversion via conditional autoencoder

Non-parallel many-to-many voice conversion remains an interesting but ch...

Kaizhi Qian, et al.

research

∙ 10/01/2019

An Efficient and Margin-Approaching Zero-Confidence Adversarial Attack

There are two major paradigms of white-box adversarial attacks that atte...

Yang Zhang, et al.

research

∙ 05/14/2019

Zero-Shot Voice Style Transfer with Only Autoencoder Loss

Non-parallel many-to-many voice conversion, as well as zero-shot voice c...

Kaizhi Qian, et al.

research

∙ 05/14/2019

AUTOVC: Zero-Shot Voice Style Transfer with Only Autoencoder Loss

Non-parallel many-to-many voice conversion, as well as zero-shot voice c...

Kaizhi Qian, et al.

research

∙ 02/15/2018

Deep Learning Based Speech Beamforming

Multi-channel speech enhancement with ad-hoc sensors has been a challeng...

Kaizhi Qian, et al.

