Picture for Haibin Wu

Haibin Wu

TS3-Codec: Transformer-Based Simple Streaming Single Codec

Add code
Nov 27, 2024
Viaarxiv icon

Dynamic-SUPERB Phase-2: A Collaboratively Expanding Benchmark for Measuring the Capabilities of Spoken Language Models with 180 Tasks

Add code
Nov 08, 2024
Viaarxiv icon

ESPnet-Codec: Comprehensive Training and Evaluation of Neural Codecs for Audio, Music, and Speech

Add code
Sep 24, 2024
Viaarxiv icon

Codec-SUPERB @ SLT 2024: A lightweight benchmark for neural audio codec models

Add code
Sep 21, 2024
Figure 1 for Codec-SUPERB @ SLT 2024: A lightweight benchmark for neural audio codec models
Figure 2 for Codec-SUPERB @ SLT 2024: A lightweight benchmark for neural audio codec models
Figure 3 for Codec-SUPERB @ SLT 2024: A lightweight benchmark for neural audio codec models
Figure 4 for Codec-SUPERB @ SLT 2024: A lightweight benchmark for neural audio codec models
Viaarxiv icon

Leveraging Joint Spectral and Spatial Learning with MAMBA for Multichannel Speech Enhancement

Add code
Sep 16, 2024
Figure 1 for Leveraging Joint Spectral and Spatial Learning with MAMBA for Multichannel Speech Enhancement
Figure 2 for Leveraging Joint Spectral and Spatial Learning with MAMBA for Multichannel Speech Enhancement
Figure 3 for Leveraging Joint Spectral and Spatial Learning with MAMBA for Multichannel Speech Enhancement
Figure 4 for Leveraging Joint Spectral and Spatial Learning with MAMBA for Multichannel Speech Enhancement
Viaarxiv icon

Ultra-Low Latency Speech Enhancement - A Comprehensive Study

Add code
Sep 16, 2024
Viaarxiv icon

Stimulus Modality Matters: Impact of Perceptual Evaluations from Different Modalities on Speech Emotion Recognition System Performance

Add code
Sep 16, 2024
Viaarxiv icon

DFADD: The Diffusion and Flow-Matching Based Audio Deepfake Dataset

Add code
Sep 13, 2024
Figure 1 for DFADD: The Diffusion and Flow-Matching Based Audio Deepfake Dataset
Figure 2 for DFADD: The Diffusion and Flow-Matching Based Audio Deepfake Dataset
Figure 3 for DFADD: The Diffusion and Flow-Matching Based Audio Deepfake Dataset
Figure 4 for DFADD: The Diffusion and Flow-Matching Based Audio Deepfake Dataset
Viaarxiv icon

SpeechPrompt: Prompting Speech Language Models for Speech Processing Tasks

Add code
Aug 23, 2024
Figure 1 for SpeechPrompt: Prompting Speech Language Models for Speech Processing Tasks
Figure 2 for SpeechPrompt: Prompting Speech Language Models for Speech Processing Tasks
Figure 3 for SpeechPrompt: Prompting Speech Language Models for Speech Processing Tasks
Figure 4 for SpeechPrompt: Prompting Speech Language Models for Speech Processing Tasks
Viaarxiv icon

EMO-Codec: An In-Depth Look at Emotion Preservation capacity of Legacy and Neural Codec Models With Subjective and Objective Evaluations

Add code
Jul 30, 2024
Viaarxiv icon