Hung-yi Lee

Full-Duplex-Bench-v2: A Multi-Turn Evaluation Framework for Duplex Dialogue Systems with an Automated Examiner

Oct 09, 2025
Via arXiv

Pseudo2Real: Task Arithmetic for Pseudo-Label Correction in Automatic Speech Recognition

Oct 09, 2025
Via arXiv

SHANKS: Simultaneous Hearing and Thinking for Spoken Language Models

Oct 08, 2025
Via arXiv

Hearing the Order: Investigating Selection Bias in Large Audio-Language Models

Oct 01, 2025
Via arXiv

When Silence Matters: The Impact of Irrelevant Audio on Text Reasoning in Large Audio-Language Models

Oct 01, 2025
Via arXiv

How Does Instrumental Music Help SingFake Detection?

Sep 18, 2025
Via arXiv

Do You Hear What I Mean? Quantifying the Instruction-Perception Gap in Instruction-Guided Expressive Text-To-Speech Systems

Sep 18, 2025
Via arXiv

The ML-SUPERB 2.0 Challenge: Towards Inclusive ASR Benchmarking for All Language Varieties

Sep 08, 2025
Via arXiv

DeSTA2.5-Audio: Toward General-Purpose Large Audio Language Model with Self-Generated Cross-Modal Alignment

Jul 03, 2025
Via arXiv

An Exploration of Mamba for Speech Self-Supervised Models

Jun 14, 2025
Via arXiv