Heng-Jui Chang

DC-Spin: A Speaker-invariant Speech Tokenizer for Spoken Language Models

Oct 31, 2024
Via arXiv

A Large-Scale Evaluation of Speech Foundation Models

Apr 15, 2024
Via arXiv

SpeechCLIP+: Self-supervised multi-task representation learning for speech via CLIP and speech-image data

Feb 10, 2024
Via arXiv

R-Spin: Efficient Speaker and Noise-invariant Representation Learning with Acoustic Pieces

Nov 15, 2023
Via arXiv

CoLLD: Contrastive Layer-to-layer Distillation for Compressing Multilingual Pre-trained Speech Encoders

Sep 14, 2023
Via arXiv

Self-supervised Fine-tuning for Improved Content Representations by Speaker-invariant Clustering

May 18, 2023
Via arXiv

DinoSR: Self-Distillation and Online Clustering for Self-supervised Speech Representation Learning

May 17, 2023
Via arXiv

M-SpeechCLIP: Leveraging Large-Scale, Pre-Trained Models for Multilingual Speech to Image Retrieval

Nov 02, 2022
Via arXiv

SpeechCLIP: Integrating Speech with Pre-Trained Vision and Language Model

Oct 03, 2022
Via arXiv

SUPERB-SG: Enhanced Speech processing Universal PERformance Benchmark for Semantic and Generative Capabilities

Mar 14, 2022
Via arXiv