Yujun Cai

CamReasoner: Reinforcing Camera Movement Understanding via Structured Spatial Reasoning

Jan 30, 2026
Via arXiv

Noise as a Probe: Membership Inference Attacks on Diffusion Models Leveraging Initial Noise

Jan 29, 2026
Via arXiv

Self-Manager: Parallel Agent Loop for Long-form Deep Research

Jan 25, 2026
Via arXiv

OptiSQL: Executable SQL Generation from Optical Tokens

Jan 21, 2026
Via arXiv

Learning to Generate Cross-Task Unexploitable Examples

Dec 15, 2025
Via arXiv

Spatial Blind Spot: Auditory Motion Perception Deficits in Audio LLMs

Nov 17, 2025
Via arXiv

PAS: A Training-Free Stabilizer for Temporal Encoding in Video LLMs

Nov 14, 2025
Via arXiv

A Survey of Vibe Coding with Large Language Models

Oct 14, 2025
Via arXiv

Detecting and Mitigating Insertion Hallucination in Video-to-Audio Generation

Oct 09, 2025
Via arXiv

ContextNav: Towards Agentic Multimodal In-Context Learning

Oct 06, 2025
Via arXiv