Ruichuan An

How Well Do Models Follow Visual Instructions? VIBE: A Systematic Benchmark for Visual Instruction-Driven Image Editing

Feb 02, 2026

Research on World Models Is Not Merely Injecting World Knowledge into Specific Tasks

Feb 02, 2026

Thinking in Frames: How Visual Context and Test-Time Scaling Empower Video Reasoning

Jan 28, 2026

CoF-T2I: Video Models as Pure Visual Reasoners for Text-to-Image Generation

Jan 15, 2026

DataFlow: An LLM-Driven Framework for Unified Data Preparation and Workflow Automation in the Era of Data-Centric AI

Dec 18, 2025

GRAN-TED: Generating Robust, Aligned, and Nuanced Text Embedding for Diffusion Models

Dec 17, 2025

Are Video Models Ready as Zero-Shot Reasoners? An Empirical Study with the MME-CoF Benchmark

Oct 30, 2025

Jarvis: Towards Personalized AI Assistant via Personal KV-Cache Retrieval

Oct 26, 2025

MorphoBench: A Benchmark with Difficulty Adaptive to Model Reasoning

Oct 16, 2025

WoW: Towards a World omniscient World model Through Embodied Interaction

Sep 26, 2025