Zhifang Guo

Analyzing and Mitigating Inconsistency in Discrete Audio Tokens for Neural Codec Language Models

Sep 28, 2024
Via arXiv

Advancing Multi-grained Alignment for Contrastive Language-Audio Pre-training

Aug 15, 2024
Via arXiv

Qwen2-Audio Technical Report

Jul 15, 2024
Via arXiv

PromptTTS 2: Describing and Generating Voices with Text Prompt

Sep 05, 2023
Via arXiv

Audio Generation with Multiple Conditional Diffusion Model

Aug 23, 2023
Via arXiv

Furnishing Sound Event Detection with Language Model Abilities

Aug 22, 2023
Via arXiv

PromptTTS: Controllable Text-to-Speech with Text Descriptions

Nov 22, 2022
Via arXiv

A Hybrid System of Sound Event Detection Transformer and Frame-wise Model for DCASE 2022 Task 4

Oct 18, 2022
Via arXiv