Xuenan Xu

Unified Pathological Speech Analysis with Prompt Tuning

Nov 05, 2024
Via arXiv

SLAM-AAC: Enhancing Audio Captioning with Paraphrasing Augmentation and CLAP-Refine through LLMs

Oct 12, 2024
Via arXiv

DRCap: Decoding CLAP Latents with Retrieval-augmented Generation for Zero-shot Audio Captioning

Oct 12, 2024
Via arXiv

Efficient Audio Captioning with Encoder-Level Knowledge Distillation

Jul 19, 2024
Via arXiv

Enhancing Zero-shot Audio Classification using Sound Attribute Knowledge from Large Language Models

Jul 19, 2024
Via arXiv

DiveSound: LLM-Assisted Automatic Taxonomy Construction for Diverse Audio Generation

Jul 18, 2024
Via arXiv

PicoAudio: Enabling Precise Timestamp and Frequency Controllability of Audio Events in Text-to-audio Generation

Jul 03, 2024
Via arXiv

AudioTime: A Temporally-aligned Audio-text Benchmark Dataset

Jul 03, 2024
Via arXiv

FakeSound: Deepfake General Audio Detection

Jun 12, 2024
Via arXiv

Zero-Shot Audio Captioning Using Soft and Hard Prompts

Jun 10, 2024
Via arXiv