Chao-Han Huck Yang

Dynamic-SUPERB Phase-2: A Collaboratively Expanding Benchmark for Measuring the Capabilities of Spoken Language Models with 180 Tasks

Nov 08, 2024

Towards Neural Scaling Laws for Time Series Foundation Models

Oct 16, 2024

FastAdaSP: Multitask-Adapted Efficient Inference for Large Speech Language Model

Oct 03, 2024

Developing Instruction-Following Speech Language Model Without Speech Instruction-Tuning Data

Sep 30, 2024

Chain-of-Thought Prompting for Speech Translation

Sep 17, 2024

Large Language Model Based Generative Error Correction: A Challenge and Baselines for Speech Recognition, Speaker Tagging, and Emotion Recognition

Sep 17, 2024

Benchmarking Japanese Speech Recognition on ASR-LLM Setups with Multi-Pass Augmented Generative Error Correction

Aug 29, 2024

Evolutionary Prompt Design for LLM-Based Post-ASR Error Correction

Jul 23, 2024

From Descriptive Richness to Bias: Unveiling the Dark Side of Generative Image Caption Enrichment

Jun 20, 2024

Self-Taught Recognizer: Toward Unsupervised Adaptation for Speech Foundation Models

May 23, 2024