Chao-Han Huck Yang

Audio Large Language Models Can Be Descriptive Speech Quality Evaluators

Jan 27, 2025

Variational Bayesian Adaptive Learning of Deep Latent Variables for Acoustic Knowledge Transfer

Jan 26, 2025

Detecting the Undetectable: Assessing the Efficacy of Current Spoof Detection Methods Against Seamless Speech Edits

Jan 07, 2025

NeKo: Toward Post Recognition Generative Correction Large Language Models with Task-Oriented Experts

Nov 08, 2024

Dynamic-SUPERB Phase-2: A Collaboratively Expanding Benchmark for Measuring the Capabilities of Spoken Language Models with 180 Tasks

Nov 08, 2024

Towards Neural Scaling Laws for Time Series Foundation Models

Oct 16, 2024

FastAdaSP: Multitask-Adapted Efficient Inference for Large Speech Language Model

Oct 03, 2024

Developing Instruction-Following Speech Language Model Without Speech Instruction-Tuning Data

Sep 30, 2024

Large Language Model Based Generative Error Correction: A Challenge and Baselines for Speech Recognition, Speaker Tagging, and Emotion Recognition

Sep 17, 2024

Chain-of-Thought Prompting for Speech Translation

Sep 17, 2024