Hung-yi Lee

Building a Taiwanese Mandarin Spoken Language Model: A First Attempt

Nov 11, 2024

Dynamic-SUPERB Phase-2: A Collaboratively Expanding Benchmark for Measuring the Capabilities of Spoken Language Models with 180 Tasks

Nov 08, 2024

Align-SLM: Textless Spoken Language Models with Reinforcement Learning from AI Feedback

Nov 04, 2024

Can Large Audio-Language Models Truly Hear? Tackling Hallucinations with Multi-Task Assessment and Stepwise Audio Reasoning

Oct 21, 2024

Meta-DiffuB: A Contextualized Sequence-to-Sequence Text Diffusion Model with Meta-Exploration

Oct 17, 2024

Developing Instruction-Following Speech Language Model Without Speech Instruction-Tuning Data

Sep 30, 2024

Unveiling Narrative Reasoning Limits of Large Language Models with Trope in Movie Synopses

Sep 22, 2024

Codec-SUPERB @ SLT 2024: A lightweight benchmark for neural audio codec models

Sep 21, 2024

Improving Speech Emotion Recognition in Under-Resourced Languages via Speech-to-Speech Translation with Bootstrapping Data Selection

Sep 17, 2024

Meta-Whisper: Speech-Based Meta-ICL for ASR on Low-Resource Languages

Sep 16, 2024