Chenliang Xu

Scaling Concept With Text-Guided Diffusion Models

Oct 31, 2024
Via arXiv

Will the Inclusion of Generated Data Amplify Bias Across Generations in Future Image Classification Models?

Oct 14, 2024
Via arXiv

MMCOMPOSITION: Revisiting the Compositionality of Pre-trained Vision-Language Models

Oct 13, 2024
Via arXiv

Language-Guided Joint Audio-Visual Editing via One-Shot Adaptation

Oct 09, 2024
Via arXiv

Quadratic Is Not What You Need For Multimodal Large Language Models

Oct 08, 2024
Via arXiv

EAGLE: Egocentric AGgregated Language-video Engine

Sep 26, 2024
Via arXiv

CaRDiff: Video Salient Object Ranking Chain of Thought Reasoning for Saliency Prediction with Diffusion

Aug 21, 2024
Via arXiv

Modeling and Driving Human Body Soundfields through Acoustic Primitives

Jul 18, 2024
Via arXiv

Diversifying the Expert Knowledge for Task-Agnostic Pruning in Sparse Mixture-of-Experts

Jul 12, 2024
Via arXiv

Do More Details Always Introduce More Hallucinations in LVLM-based Image Captioning?

Jun 18, 2024
Via arXiv