Yuchen Hu

GenSE: Generative Speech Enhancement via Language Models using Hierarchical Modeling

Feb 05, 2025
Via arXiv

A Bias-Correction Decentralized Stochastic Gradient Algorithm with Momentum Acceleration

Jan 31, 2025
Via arXiv

Audio Large Language Models Can Be Descriptive Speech Quality Evaluators

Jan 27, 2025
Via arXiv

An Investigation on the Potential of KAN in Speech Enhancement

Dec 23, 2024
Via arXiv

Large Language Model Based Generative Error Correction: A Challenge and Baselines for Speech Recognition, Speaker Tagging, and Emotion Recognition

Sep 17, 2024
Via arXiv

SSR-Speech: Towards Stable, Safe and Robust Zero-shot Text-based Speech Editing and Synthesis

Sep 11, 2024
Via arXiv

MaxMind: A Memory Loop Network to Enhance Software Productivity based on Large Language Models

Aug 07, 2024
Via arXiv

Robust Zero-Shot Text-to-Speech Synthesis with Reverse Inference Optimization

Jul 02, 2024
Via arXiv

Enhancing Zero-shot Text-to-Speech Synthesis with Human Feedback

Jun 02, 2024
Via arXiv

Self-Taught Recognizer: Toward Unsupervised Adaptation for Speech Foundation Models

May 23, 2024
Via arXiv