Zhiyong Wu

AutoStyle-TTS: Retrieval-Augmented Generation based Automatic Style Matching Text-to-Speech Synthesis

Apr 14, 2025

Genius: A Generalizable and Purely Unsupervised Self-Training Framework For Advanced Reasoning

Apr 11, 2025

UniSep: Universal Target Audio Separation with Language Models at Scale

Mar 31, 2025

$φ$-Decoding: Adaptive Foresight Sampling for Balanced Inference-Time Exploration and Exploitation

Mar 17, 2025

DiffCSS: Diverse and Expressive Conversational Speech Synthesis with Diffusion Models

Feb 27, 2025

Implicit Search via Discrete Diffusion: A Study on Chess

Feb 27, 2025

Singing Voice Conversion with Accompaniment Using Self-Supervised Representation-Based Melody Features

Feb 07, 2025

Leveraging Chain of Thought towards Empathetic Spoken Dialogue without Corresponding Question-Answering Data

Jan 19, 2025

Learning Discriminative Features from Spectrograms Using Center Loss for Speech Emotion Recognition

Jan 02, 2025

Disambiguation of Chinese Polyphones in an End-to-End Framework with Semantic Features Extracted by Pre-trained BERT

Jan 02, 2025