Jiarui Hai

AVMeme Exam: A Multimodal Multilingual Multicultural Benchmark for LLMs' Contextual and Cultural Knowledge and Thinking

Jan 25, 2026
Via arXiv

Summary of The Inaugural Music Source Restoration Challenge

Jan 07, 2026
Via arXiv

SoloSpeech: Enhancing Intelligibility and Quality in Target Speech Extraction through a Cascaded Generative Pipeline

May 25, 2025
Via arXiv

EzAudio: Enhancing Text-to-Audio Generation with Efficient Diffusion Transformer

Sep 17, 2024
Via arXiv

SoloAudio: Target Sound Extraction with Language-oriented Audio Diffusion Transformer

Sep 12, 2024
Via arXiv

SSR-Speech: Towards Stable, Safe and Robust Zero-shot Text-based Speech Editing and Synthesis

Sep 11, 2024
Via arXiv

DreamVoice: Text-Guided Voice Conversion

Jun 24, 2024
Via arXiv

Noise-robust Speech Separation with Fast Generative Correction

Jun 11, 2024
Via arXiv

Investigating Self-Supervised Deep Representations for EEG-based Auditory Attention Decoding

Nov 07, 2023
Via arXiv

DPM-TSE: A Diffusion Probabilistic Model for Target Sound Extraction

Oct 10, 2023
Via arXiv