Liang-Hsuan Tseng

TASTE-Streaming: Towards Streamable Text-Aligned Speech Tokenization and Embedding for Spoken Language Modeling

Mar 12, 2026

On the Fallacy of Global Token Perplexity in Spoken Language Model Evaluation

Jan 09, 2026

TASTE: Text-Aligned Speech Tokenization and Embedding for Spoken Language Modeling

Apr 09, 2025

Building a Taiwanese Mandarin Spoken Language Model: A First Attempt

Nov 11, 2024

Leave No Knowledge Behind During Knowledge Distillation: Towards Practical and Effective Knowledge Distillation for Code-Switching ASR Using Realistic Data

Jul 15, 2024

REBORN: Reinforcement-Learned Boundary Segmentation with Iterative Training for Unsupervised ASR

Feb 06, 2024

Improving Cascaded Unsupervised Speech Translation with Denoising Back-translation

May 12, 2023

Introducing Semantics into Speech Encoders

Nov 15, 2022

Improving generalizability of distilled self-supervised speech processing models under distorted settings

Oct 20, 2022

Mandarin-English Code-switching Speech Recognition with Self-supervised Speech Representation Models

Oct 07, 2021