Zeyu Jin

Deep Audio Watermarks are Shallow: Limitations of Post-Hoc Watermarking Techniques for Speech

Apr 15, 2025

DiTSE: High-Fidelity Generative Speech Enhancement via Latent Diffusion Transformers

Apr 13, 2025

SpeakEasy: Enhancing Text-to-Speech Interactions for Expressive Content Creation

Apr 07, 2025

Everyone-Can-Sing: Zero-Shot Singing Voice Synthesis and Conversion with Speech Reference

Jan 23, 2025

Code Drift: Towards Idempotent Neural Audio Codecs

Oct 14, 2024

DMDSpeech: Distilled Diffusion Model Surpassing The Teacher in Zero-shot Speech Synthesis via Direct Metric Optimization

Oct 14, 2024

VoxInstruct: Expressive Human Instruction-to-Speech Generation with Unified Multilingual Codec Language Modelling

Aug 28, 2024

Improving Generalization of Speech Separation in Real-World Scenarios: Strategies in Simulation, Optimization, and Evaluation

Aug 28, 2024

VDGD: Mitigating LVLM Hallucinations in Cognitive Prompts by Bridging the Visual Perception Gap

May 24, 2024

A Closer Look at the Limitations of Instruction Tuning

Feb 03, 2024