Zeyu Xie

DiveSound: LLM-Assisted Automatic Taxonomy Construction for Diverse Audio Generation

Jul 18, 2024

PicoAudio: Enabling Precise Timestamp and Frequency Controllability of Audio Events in Text-to-audio Generation

Jul 03, 2024

AudioTime: A Temporally-aligned Audio-text Benchmark Dataset

Jul 03, 2024

FakeSound: Deepfake General Audio Detection

Jun 12, 2024

A Detailed Audio-Text Data Simulation Pipeline using Single-Event Sounds

Mar 07, 2024

Enhancing Audio Generation Diversity with Visual Information

Mar 02, 2024

Phonetic and Lexical Discovery of a Canine Language using HuBERT

Feb 25, 2024

Improving Audio Caption Fluency with Automatic Error Correction

Jun 16, 2023

Enhance Temporal Relations in Audio Captioning with Sound Event Detection

Jun 02, 2023

BLAT: Bootstrapping Language-Audio Pre-training based on AudioSet Tag-guided Synthetic Data

Mar 14, 2023