Picture for Chung-Ming Chien

Chung-Ming Chien

Dynamic-SUPERB Phase-2: A Collaboratively Expanding Benchmark for Measuring the Capabilities of Spoken Language Models with 180 Tasks

Add code
Nov 08, 2024
Viaarxiv icon

On the Evaluation of Speech Foundation Models for Spoken Language Understanding

Add code
Jun 14, 2024
Viaarxiv icon

Learning Fine-Grained Controllability on Speech Generation via Efficient Fine-Tuning

Add code
Jun 10, 2024
Viaarxiv icon

Toward Joint Language Modeling for Speech Units and Text

Add code
Oct 12, 2023
Figure 1 for Toward Joint Language Modeling for Speech Units and Text
Figure 2 for Toward Joint Language Modeling for Speech Units and Text
Figure 3 for Toward Joint Language Modeling for Speech Units and Text
Figure 4 for Toward Joint Language Modeling for Speech Units and Text
Viaarxiv icon

Few-Shot Spoken Language Understanding via Joint Speech-Text Models

Add code
Oct 09, 2023
Viaarxiv icon

AV2Wav: Diffusion-Based Re-synthesis from Continuous Self-supervised Features for Audio-Visual Speech Enhancement

Add code
Sep 14, 2023
Viaarxiv icon

What do self-supervised speech models know about words?

Add code
Jun 30, 2023
Viaarxiv icon

Voice Filter: Few-shot text-to-speech speaker adaptation using voice conversion as a post-processing module

Add code
Feb 16, 2022
Figure 1 for Voice Filter: Few-shot text-to-speech speaker adaptation using voice conversion as a post-processing module
Figure 2 for Voice Filter: Few-shot text-to-speech speaker adaptation using voice conversion as a post-processing module
Figure 3 for Voice Filter: Few-shot text-to-speech speaker adaptation using voice conversion as a post-processing module
Figure 4 for Voice Filter: Few-shot text-to-speech speaker adaptation using voice conversion as a post-processing module
Viaarxiv icon

S2VC: A Framework for Any-to-Any Voice Conversion with Self-Supervised Pretrained Representations

Add code
Apr 07, 2021
Figure 1 for S2VC: A Framework for Any-to-Any Voice Conversion with Self-Supervised Pretrained Representations
Figure 2 for S2VC: A Framework for Any-to-Any Voice Conversion with Self-Supervised Pretrained Representations
Figure 3 for S2VC: A Framework for Any-to-Any Voice Conversion with Self-Supervised Pretrained Representations
Figure 4 for S2VC: A Framework for Any-to-Any Voice Conversion with Self-Supervised Pretrained Representations
Viaarxiv icon

Investigating on Incorporating Pretrained and Learnable Speaker Representations for Multi-Speaker Multi-Style Text-to-Speech

Add code
Mar 20, 2021
Figure 1 for Investigating on Incorporating Pretrained and Learnable Speaker Representations for Multi-Speaker Multi-Style Text-to-Speech
Figure 2 for Investigating on Incorporating Pretrained and Learnable Speaker Representations for Multi-Speaker Multi-Style Text-to-Speech
Figure 3 for Investigating on Incorporating Pretrained and Learnable Speaker Representations for Multi-Speaker Multi-Style Text-to-Speech
Figure 4 for Investigating on Incorporating Pretrained and Learnable Speaker Representations for Multi-Speaker Multi-Style Text-to-Speech
Viaarxiv icon