Zeliang Zhang

The Sword of Damocles in ViTs: Computational Redundancy Amplifies Adversarial Transferability

Apr 15, 2025
Via arXiv

Caption Anything in Video: Fine-grained Object-centric Captioning via Spatiotemporal Multimodal Prompting

Apr 09, 2025
Via arXiv

Forward Learning with Differential Privacy

Apr 01, 2025
Via arXiv

Rethinking Audio-Visual Adversarial Vulnerability from Temporal and Modality Perspectives

Feb 19, 2025
Via arXiv

Generative AI for Cel-Animation: A Survey

Jan 08, 2025
Via arXiv

VidComposition: Can MLLMs Analyze Compositions in Compiled Videos?

Nov 19, 2024
Via arXiv

Will the Inclusion of Generated Data Amplify Bias Across Generations in Future Image Classification Models?

Oct 14, 2024
Via arXiv

Understanding Model Ensemble in Transferable Adversarial Attack

Oct 09, 2024
Via arXiv

FLOPS: Forward Learning with OPtimal Sampling

Oct 08, 2024
Via arXiv

Quadratic Is Not What You Need For Multimodal Large Language Models

Oct 08, 2024
Via arXiv