Shilei Zhang

DiffStyleTTS: Diffusion-based Hierarchical Prosody Modeling for Text-to-Speech with Diverse and Controllable Styles

Dec 04, 2024
Via arXiv

VoxBlink2: A 100K+ Speaker Recognition Corpus and the Open-Set Speaker-Identification Benchmark

Jul 16, 2024
Via arXiv

On Calibration of Speech Classification Models: Insights from Energy-Based Model Investigations

Jun 26, 2024
Via arXiv

Exploring Energy-Based Models for Out-of-Distribution Detection in Dialect Identification

Jun 26, 2024
Via arXiv

CEC: A Noisy Label Detection Method for Speaker Recognition

Jun 19, 2024
Via arXiv

PolySpeech: Exploring Unified Multitask Speech Models for Competitiveness with Single-task Models

Jun 12, 2024
Via arXiv

Plugin Speech Enhancement: A Universal Speech Enhancement Framework Inspired by Dynamic Neural Network

Feb 20, 2024
Via arXiv

Cascaded Multi-task Adaptive Learning Based on Neural Architecture Search

Oct 23, 2023
Via arXiv

GenDistiller: Distilling Pre-trained Language Models based on Generative Models

Oct 20, 2023
Via arXiv

MFAS: Emotion Recognition through Multiple Perspectives Fusion Architecture Search Emulating Human Cognition

Jun 12, 2023
Via arXiv