Kaipeng Zhang

MDK12-Bench: A Multi-Discipline Benchmark for Evaluating Reasoning in Multimodal Large Language Models

Apr 08, 2025
Via arXiv

LeX-Art: Rethinking Text Generation via Scalable High-Quality Data Synthesis

Mar 27, 2025
Via arXiv

Improving Autoregressive Image Generation through Coarse-to-Fine Token Prediction

Mar 20, 2025
Via arXiv

CLS-RL: Image Classification with Rule-Based Reinforcement Learning

Mar 20, 2025
Via arXiv

PEBench: A Fictitious Dataset to Benchmark Machine Unlearning for Multimodal Large Language Models

Mar 16, 2025
Via arXiv

MPBench: A Comprehensive Multimodal Reasoning Benchmark for Process Errors Identification

Mar 16, 2025
Via arXiv

MM-Eureka: Exploring Visual Aha Moment with Rule-based Large-scale Reinforcement Learning

Mar 10, 2025
Via arXiv

ARMOR v0.1: Empowering Autoregressive Multimodal Understanding Model with Interleaved Multimodal Generation via Asymmetric Synergy

Mar 09, 2025
Via arXiv

ProJudge: A Multi-Modal Multi-Discipline Benchmark and Instruction-Tuning Dataset for MLLM-based Process Judges

Mar 09, 2025
Via arXiv

Enhance-A-Video: Better Generated Video for Free

Feb 11, 2025
Via arXiv