Picture for Zheng Ge

Zheng Ge

Step-Audio: Unified Understanding and Generation in Intelligent Speech Interaction

Add code
Feb 18, 2025
Viaarxiv icon

Unhackable Temporal Rewarding for Scalable Video MLLMs

Add code
Feb 17, 2025
Viaarxiv icon

PerPO: Perceptual Preference Optimization via Discriminative Rewarding

Add code
Feb 05, 2025
Viaarxiv icon

Taming Teacher Forcing for Masked Autoregressive Video Generation

Add code
Jan 21, 2025
Viaarxiv icon

Slow Perception: Let's Perceive Geometric Figures Step-by-step

Add code
Dec 30, 2024
Figure 1 for Slow Perception: Let's Perceive Geometric Figures Step-by-step
Figure 2 for Slow Perception: Let's Perceive Geometric Figures Step-by-step
Figure 3 for Slow Perception: Let's Perceive Geometric Figures Step-by-step
Figure 4 for Slow Perception: Let's Perceive Geometric Figures Step-by-step
Viaarxiv icon

Reconstructive Visual Instruction Tuning

Add code
Oct 12, 2024
Viaarxiv icon

General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model

Add code
Sep 03, 2024
Viaarxiv icon

DreamBench++: A Human-Aligned Benchmark for Personalized Image Generation

Add code
Jun 24, 2024
Viaarxiv icon

Focus Anywhere for Fine-grained Multi-page Document Understanding

Add code
May 23, 2024
Viaarxiv icon

Self-Supervised Visual Preference Alignment

Add code
Apr 16, 2024
Viaarxiv icon