Zihui Cheng

STEP3-VL-10B Technical Report

Jan 15, 2026
Via arXiv

Visual Thoughts: A Unified Perspective of Understanding Multimodal Chain-of-Thought

May 21, 2025
Via arXiv

CoMT: A Novel Benchmark for Chain of Multi-modal Thought on Large Vision-Language Models

Dec 17, 2024
Via arXiv