Hung-yi Lee

Full-Duplex-Bench: A Benchmark to Evaluate Full-duplex Spoken Dialogue Models on Turn-taking Capabilities

Mar 06, 2025

TRACT: Regression-Aware Fine-tuning Meets Chain-of-Thought Reasoning for LLM-as-a-Judge

Mar 06, 2025

Answer, Refuse, or Guess? Investigating Risk-Aware Decision Making in Language Models

Mar 03, 2025

Transferring Textual Preferences to Vision-Language Understanding through Model Merging

Feb 19, 2025

Speech-FT: A Fine-tuning Strategy for Enhancing Speech Representation Models Without Compromising Generalization Ability

Feb 18, 2025

Gender Bias in Instruction-Guided Speech Synthesis Models

Feb 08, 2025

BreezyVoice: Adapting TTS for Taiwanese Mandarin with Enhanced Polyphone Disambiguation -- Challenges and Insights

Jan 29, 2025

Clear Minds Think Alike: What Makes LLM Fine-tuning Robust? A Study of Token Perplexity

Jan 24, 2025

CodecFake-Omni: A Large-Scale Codec-based Deepfake Speech Dataset

Jan 14, 2025

Spectral-Aware Low-Rank Adaptation for Speaker Verification

Jan 07, 2025