Picture for Yuhui Zhang

Yuhui Zhang

AttentionDrag: Exploiting Latent Correlation Knowledge in Pre-trained Diffusion Models for Image Editing

Add code
Jun 16, 2025
Viaarxiv icon

Can Large Language Models Match the Conclusions of Systematic Reviews?

Add code
May 28, 2025
Viaarxiv icon

NegVQA: Can Vision Language Models Understand Negation?

Add code
May 28, 2025
Viaarxiv icon

TULiP: Test-time Uncertainty Estimation via Linearization and Weight Perturbation

Add code
May 22, 2025
Viaarxiv icon

A 2D Semantic-Aware Position Encoding for Vision Transformers

Add code
May 14, 2025
Viaarxiv icon

Comet: Accelerating Private Inference for Large Language Model by Predicting Activation Sparsity

Add code
May 12, 2025
Viaarxiv icon

MicroVQA: A Multimodal Reasoning Benchmark for Microscopy-Based Scientific Research

Add code
Mar 17, 2025
Viaarxiv icon

Video Action Differencing

Add code
Mar 10, 2025
Viaarxiv icon

EquiBench: Benchmarking Code Reasoning Capabilities of Large Language Models via Equivalence Checking

Add code
Feb 18, 2025
Viaarxiv icon

Temporal Preference Optimization for Long-Form Video Understanding

Add code
Jan 23, 2025
Figure 1 for Temporal Preference Optimization for Long-Form Video Understanding
Figure 2 for Temporal Preference Optimization for Long-Form Video Understanding
Figure 3 for Temporal Preference Optimization for Long-Form Video Understanding
Figure 4 for Temporal Preference Optimization for Long-Form Video Understanding
Viaarxiv icon