Wei-Ping Huang

Building a Taiwanese Mandarin Spoken Language Model: A First Attempt

Nov 11, 2024

Speech-Copilot: Leveraging Large Language Models for Speech Processing via Task Decomposition, Modularization, and Program Generation

Jul 13, 2024

Continual Test-time Adaptation for End-to-end Speech Recognition on Noisy Speech

Jun 16, 2024

Understanding Sounds, Missing the Questions: The Challenge of Object Hallucination in Large Audio-Language Models

Jun 12, 2024

Maximizing Data Efficiency for Cross-Lingual TTS Adaptation by Self-Supervised Representation Mixing and Embedding Initialization

Jan 23, 2024

Findings of the 2023 ML-SUPERB Challenge: Pre-Training and Evaluation over More Languages and Beyond

Oct 09, 2023

Why We Should Report the Details in Subjective Evaluation of TTS More Rigorously

Jun 03, 2023

On the Utility of Self-supervised Models for Prosody-related Tasks

Oct 13, 2022

Few-Shot Cross-Lingual TTS Using Transferable Phoneme Embedding

Jun 27, 2022