Meng Yu

Audio-Thinker: Guiding Audio Language Model When and How to Think via Reinforcement Learning

Aug 12, 2025, via arXiv

FNSE-SBGAN: Far-field Speech Enhancement with Schrodinger Bridge and Generative Adversarial Networks

Mar 17, 2025, via arXiv

LLM-Enhanced Dialogue Management for Full-Duplex Spoken Dialogue Systems

Feb 19, 2025, via arXiv

Open-RGBT: Open-vocabulary RGB-T Zero-shot Semantic Segmentation in Open-world Environments

Oct 09, 2024, via arXiv

Restorative Speech Enhancement: A Progressive Approach Using SE and Codec Modules

Oct 02, 2024, via arXiv

Neural Ambisonic Encoding For Multi-Speaker Scenarios Using A Circular Microphone Array

Sep 11, 2024, via arXiv

SSR-Speech: Towards Stable, Safe and Robust Zero-shot Text-based Speech Editing and Synthesis

Sep 11, 2024, via arXiv

SMRU: Split-and-Merge Recurrent-based UNet for Acoustic Echo Cancellation and Noise Suppression

Jun 17, 2024, via arXiv

Multi-Channel Multi-Speaker ASR Using Target Speaker's Solo Segment

Jun 17, 2024, via arXiv

Joint Conditional Diffusion Model for Image Restoration with Mixed Degradations

Apr 11, 2024, via arXiv