Yunlong Tang

The Sword of Damocles in ViTs: Computational Redundancy Amplifies Adversarial Transferability

Apr 15, 2025

The Tenth NTIRE 2025 Efficient Super-Resolution Challenge Report

Apr 14, 2025

Caption Anything in Video: Fine-grained Object-centric Captioning via Spatiotemporal Multimodal Prompting

Apr 09, 2025

Why Reasoning Matters? A Survey of Advancements in Multimodal Reasoning (v1)

Apr 04, 2025

FreSca: Unveiling the Scaling Space in Diffusion Models

Apr 02, 2025

VERIFY: A Benchmark of Visual Explanation and Reasoning for Investigating Multimodal Reasoning Fidelity

Mar 14, 2025

Generative AI for Cel-Animation: A Survey

Jan 08, 2025

Unveiling Visual Perception in Language Models: An Attention Head Analysis Approach

Dec 24, 2024

VidComposition: Can MLLMs Analyze Compositions in Compiled Videos?

Nov 19, 2024

Scaling Concept With Text-Guided Diffusion Models

Oct 31, 2024