Qingyang Hong

UniVoice: Unifying Autoregressive ASR and Flow-Matching based TTS with Large Language Models

Oct 06, 2025

MeanFlowSE: one-step generative speech enhancement via conditional mean flow

Sep 18, 2025

Pseudo Labels-based Neural Speech Enhancement for the AVSR Task in the MISP-Meeting Challenge

May 30, 2025

Discl-VC: Disentangled Discrete Tokens and In-Context Learning for Controllable Zero-Shot Voice Conversion

May 30, 2025

DS-Codec: Dual-Stage Training with Mirror-to-NonMirror Architecture Switching for Speech Codec

May 30, 2025

SuPseudo: A Pseudo-supervised Learning Method for Neural Speech Enhancement in Far-field Speech Recognition

May 30, 2025

SlimSpeech: Lightweight and Efficient Text-to-Speech with Slim Rectified Flow

Apr 10, 2025

Dynamic Language Group-Based MoE: Enhancing Efficiency and Flexibility for Code-Switching Speech Recognition

Jul 26, 2024

LAFMA: A Latent Flow Matching Model for Text-to-Audio Generation

Jun 12, 2024

MM-TTS: Multi-modal Prompt based Style Transfer for Expressive Text-to-Speech Synthesis

Dec 28, 2023