Qidong Huang

MMRC: A Large-Scale Benchmark for Understanding Multimodal Large Language Model in Real-World Conversation

Feb 17, 2025
Via arXiv

Light-A-Video: Training-free Video Relighting via Progressive Light Fusion

Feb 12, 2025
Via arXiv

PyramidDrop: Accelerating Your Large Vision-Language Models via Pyramid Visual Redundancy Reduction

Oct 22, 2024
Via arXiv

Deciphering Cross-Modal Alignment in Large Vision-Language Models with Modality Integration Rate

Oct 09, 2024
Via arXiv

SimAC: A Simple Anti-Customization Method against Text-to-Image Synthesis of Diffusion Models

Dec 13, 2023
Via arXiv

OPERA: Alleviating Hallucination in Multi-Modal Large Language Models via Over-Trust Penalty and Retrospection-Allocation

Nov 29, 2023
Via arXiv

Improving Adversarial Robustness of Masked Autoencoders via Test-time Frequency-domain Prompting

Aug 22, 2023
Via arXiv

Diversity-Aware Meta Visual Prompting

Mar 14, 2023
Via arXiv

Towards Precise Flood Prediction via Hierarchical Terrain Attention and Multi-Scale Rainfall Guidance

Dec 04, 2022
Via arXiv

Ada3Diff: Defending against 3D Adversarial Point Clouds via Adaptive Diffusion

Nov 29, 2022
Via arXiv