Picture for Jiatong Shi

Jiatong Shi

Dynamic-SUPERB Phase-2: A Collaboratively Expanding Benchmark for Measuring the Capabilities of Spoken Language Models with 180 Tasks

Add code
Nov 08, 2024
Viaarxiv icon

Findings of the IWSLT 2024 Evaluation Campaign

Add code
Nov 07, 2024
Viaarxiv icon

Exploiting Longitudinal Speech Sessions via Voice Assistant Systems for Early Detection of Cognitive Decline

Add code
Oct 16, 2024
Viaarxiv icon

ESPnet-Codec: Comprehensive Training and Evaluation of Neural Codecs for Audio, Music, and Speech

Add code
Sep 24, 2024
Viaarxiv icon

ESPnet-EZ: Python-only ESPnet for Easy Fine-tuning and Integration

Add code
Sep 14, 2024
Viaarxiv icon

Muskits-ESPnet: A Comprehensive Toolkit for Singing Voice Synthesis in New Paradigm

Add code
Sep 11, 2024
Viaarxiv icon

SVDD 2024: The Inaugural Singing Voice Deepfake Detection Challenge

Add code
Aug 28, 2024
Viaarxiv icon

Self-supervised Speech Representations Still Struggle with African American Vernacular English

Add code
Aug 26, 2024
Viaarxiv icon

Towards Robust Speech Representation Learning for Thousands of Languages

Add code
Jul 02, 2024
Viaarxiv icon

SingOMD: Singing Oriented Multi-resolution Discrete Representation Construction from Speech Models

Add code
Jun 20, 2024
Viaarxiv icon