Zhongyuan Wang

Virgo: A Preliminary Exploration on Reproducing o1-like MLLM

Jan 03, 2025
Via arXiv

FriendsQA: A New Large-Scale Deep Video Understanding Dataset with Fine-grained Topic Categorization for Story Videos

Dec 22, 2024
Via arXiv

Imitate, Explore, and Self-Improve: A Reproduction Report on Slow-thinking Reasoning Systems

Dec 12, 2024
Via arXiv

Uni-NaVid: A Video-based Vision-Language-Action Model for Unifying Embodied Navigation Tasks

Dec 09, 2024
Via arXiv

Code-as-Monitor: Constraint-aware Visual Programming for Reactive and Proactive Robotic Failure Detection

Dec 05, 2024
Via arXiv

Lift3D Foundation Policy: Lifting 2D Large-Scale Pretrained Models for Robust 3D Robotic Manipulation

Nov 27, 2024
Via arXiv

Stacking Brick by Brick: Aligned Feature Isolation for Incremental Face Forgery Detection

Nov 19, 2024
Via arXiv

Technical Report: Enhancing LLM Reasoning with Reward-guided Tree Search

Nov 18, 2024
Via arXiv

EasyRAG: Efficient Retrieval-Augmented Generation Framework for Automated Network Operations

Oct 15, 2024
Via arXiv

Emu3: Next-Token Prediction is All You Need

Sep 27, 2024
Via arXiv