Jiaya Jia

Scaf-GRPO: Scaffolded Group Relative Policy Optimization for Enhancing LLM Reasoning

Oct 22, 2025

From Noisy Traces to Stable Gradients: Bias-Variance Optimized Preference Optimization for Aligning Large Reasoning Models

Oct 06, 2025

DreamVE: Unified Instruction-based Image and Video Editing

Aug 08, 2025

VisionThink: Smart and Efficient Vision Language Model via Reinforcement Learning

Jul 17, 2025

DynamicBench: Evaluating Real-Time Report Generation in Large Language Models

Jun 26, 2025

TGDPO: Harnessing Token-Level Reward Guidance for Enhancing Direct Preference Optimization

Jun 17, 2025

Logits-Based Finetuning

May 30, 2025

RTime-QA: A Benchmark for Atomic Temporal Event Understanding in Large Multi-modal Models

May 25, 2025

ARPO: End-to-End Policy Optimization for GUI Agents with Experience Replay

May 22, 2025

Training-Free Efficient Video Generation via Dynamic Token Carving

May 22, 2025