Text-based speech editing (TSE) techniques are designed to enable users ...
Prosodic phrasing is crucial to the naturalness and intelligibility of
e...
This research aims to develop an Explainable Artificial Intelligence (XAI)
...
Insufficient data is a long-standing challenge for Brain-Computer Interf...
Vision-language navigation (VLN), which requires an agent to navigate 3D
...
As a unifying concept in economics, game theory, and operations research...
Accurate citation count prediction of newly published papers could help
...
Audio Deepfake Detection (ADD) aims to detect the fake audio generated b...
Air pollution has been a global concern for many years. Vehicular
crow...
Personalized news recommender systems help users quickly find content of...
We present a lightweight neural PDE representation to discover the hid...
This paper introduces innovative data-driven techniques for estimating t...
Federated learning (FL) enables multiple data owners to build machine
le...
Creating an essay based on a few given topics is a challenging NLP task....
We introduce a graph polynomial that distinguishes tree structures to
re...
One-shot coreset selection aims to select a subset of the training data,...
Accented text-to-speech (TTS) synthesis seeks to generate speech with an...
Conversational Text-to-Speech (TTS) aims to synthesize an utterance with ...
Multimodal emotion recognition leverages complementary information acros...
Cyrillic and Traditional Mongolian are the two main members of the Mongo...
This paper introduces a high-quality open-source text-to-speech (TTS)
sy...
Spiking neural networks (SNNs) mimic brain computational strategies, and...
Accented text-to-speech (TTS) synthesis seeks to generate speech with an...
The harsh environment imposes a unique set of challenges on networking
s...
Multivariate long sequence time-series forecasting (M-LSTF) is a practic...
Emotion classification of speech and assessment of the emotion strength ...
Sparsely activated transformers, such as Mixture of Experts (MoE), have
...
This paper reviews the challenge on constrained high dynamic range (HDR)...
Transformers achieve state-of-the-art performance for natural language
p...
Accurate and unbiased examinations of skin lesions are critical for earl...
Conversational Causal Emotion Entailment aims to detect causal utterance...
Object detection is an algorithm that recognizes and locates the objects...
Low-dose computed tomography (LDCT) denoising is an important problem in...
A multi-robot system (MRS) is a group of coordinated robots designed to
...
With its powerful capability to deal with graph data widely found in
pra...
Recently, unmanned aerial vehicles (UAVs) have been widely deployed in var...
Domain adaptive object detection (DAOD) aims to improve the generalizati...
Unlike the Single Image Super-Resolution (SISR) task, the key for...
Given taxi-ride counts information between departure and destination
loc...
In this paper, we formulate a novel task to synthesize speech in sync wi...
Recently, emotional speech synthesis has achieved remarkable performance...
Emotional voice conversion (VC) aims to convert a neutral voice to an
em...
Transformer, as a strong and flexible architecture for modelling long-ra...
To improve the generalization of detectors, for domain adaptive object
d...
Manifold ranking has been successfully applied in query-oriented
multi-d...
Making predictions in a robust way is not easy for nonlinear systems. In...
In this paper, we first provide a review of the state-of-the-art emotion...
We provide a new non-asymptotic analysis of distributed TD(0) with linea...
Video inpainting aims to fill the given spatiotemporal holes with realis...
Human multi-robot system (MRS) collaboration is demonstrating potential...