Juncheng Li

Benchmarking Multimodal CoT Reward Model Stepwise by Visual Program

Apr 09, 2025

Boosting Virtual Agent Learning and Reasoning: A Step-wise, Multi-dimensional, and Generalist Reward Model with Benchmark

Mar 24, 2025

Learning 4D Panoptic Scene Graph Generation from Rich 2D Visual Scene

Mar 19, 2025

SOYO: A Tuning-Free Approach for Video Style Morphing via Style-Adaptive Interpolation in Diffusion Models

Mar 10, 2025

Chart-HQA: A Benchmark for Hypothetical Question Answering in Charts

Mar 07, 2025

The Best of Both Worlds: Integrating Language Models and Diffusion Models for Video Generation

Mar 06, 2025

AEIA-MN: Evaluating the Robustness of Multimodal LLM-Powered Mobile Agents Against Active Environmental Injection Attacks

Feb 18, 2025

MAKIMA: Tuning-free Multi-Attribute Open-domain Video Editing via Mask-Guided Attention Modulation

Dec 28, 2024

Boosting Private Domain Understanding of Efficient MLLMs: A Tuning-free, Adaptive, Universal Prompt Optimization Framework

Dec 27, 2024

Iris: Breaking GUI Complexity with Adaptive Focus and Self-Refining

Dec 13, 2024