Zheng Ge

Perception-R1: Pioneering Perception Policy with Reinforcement Learning

Apr 10, 2025

Perception in Reflection

Apr 09, 2025

M-DocSum: Do LVLMs Genuinely Comprehend Interleaved Image-Text in Document Summarization?

Mar 27, 2025

Step-Video-TI2V Technical Report: A State-of-the-Art Text-Driven Image-to-Video Generation Model

Mar 14, 2025

Step-Audio: Unified Understanding and Generation in Intelligent Speech Interaction

Feb 18, 2025

Unhackable Temporal Rewarding for Scalable Video MLLMs

Feb 17, 2025

PerPO: Perceptual Preference Optimization via Discriminative Rewarding

Feb 05, 2025

Taming Teacher Forcing for Masked Autoregressive Video Generation

Jan 21, 2025

Slow Perception: Let's Perceive Geometric Figures Step-by-step

Dec 30, 2024

Reconstructive Visual Instruction Tuning

Oct 12, 2024