Picture for Renrui Zhang

Renrui Zhang

3DAxisPrompt: Promoting the 3D Grounding and Reasoning in GPT-4o

Add code
Mar 17, 2025
Viaarxiv icon

Concept-as-Tree: Synthetic Data is All You Need for VLM Personalization

Add code
Mar 17, 2025
Viaarxiv icon

HybridVLA: Collaborative Diffusion and Autoregression in a Unified Vision-Language-Action Model

Add code
Mar 13, 2025
Viaarxiv icon

PiSA: A Self-Augmented Data Engine and Training Strategy for 3D Understanding with Large Models

Add code
Mar 13, 2025
Viaarxiv icon

MME-CoT: Benchmarking Chain-of-Thought in Large Multimodal Models for Reasoning Quality, Robustness, and Efficiency

Add code
Feb 13, 2025
Figure 1 for MME-CoT: Benchmarking Chain-of-Thought in Large Multimodal Models for Reasoning Quality, Robustness, and Efficiency
Figure 2 for MME-CoT: Benchmarking Chain-of-Thought in Large Multimodal Models for Reasoning Quality, Robustness, and Efficiency
Figure 3 for MME-CoT: Benchmarking Chain-of-Thought in Large Multimodal Models for Reasoning Quality, Robustness, and Efficiency
Figure 4 for MME-CoT: Benchmarking Chain-of-Thought in Large Multimodal Models for Reasoning Quality, Robustness, and Efficiency
Viaarxiv icon

IMAGINE-E: Image Generation Intelligence Evaluation of State-of-the-art Text-to-Image Models

Add code
Jan 23, 2025
Figure 1 for IMAGINE-E: Image Generation Intelligence Evaluation of State-of-the-art Text-to-Image Models
Figure 2 for IMAGINE-E: Image Generation Intelligence Evaluation of State-of-the-art Text-to-Image Models
Figure 3 for IMAGINE-E: Image Generation Intelligence Evaluation of State-of-the-art Text-to-Image Models
Figure 4 for IMAGINE-E: Image Generation Intelligence Evaluation of State-of-the-art Text-to-Image Models
Viaarxiv icon

Can We Generate Images with CoT? Let's Verify and Reinforce Image Generation Step by Step

Add code
Jan 23, 2025
Figure 1 for Can We Generate Images with CoT? Let's Verify and Reinforce Image Generation Step by Step
Figure 2 for Can We Generate Images with CoT? Let's Verify and Reinforce Image Generation Step by Step
Figure 3 for Can We Generate Images with CoT? Let's Verify and Reinforce Image Generation Step by Step
Figure 4 for Can We Generate Images with CoT? Let's Verify and Reinforce Image Generation Step by Step
Viaarxiv icon

TAR3D: Creating High-Quality 3D Assets via Next-Part Prediction

Add code
Dec 22, 2024
Figure 1 for TAR3D: Creating High-Quality 3D Assets via Next-Part Prediction
Figure 2 for TAR3D: Creating High-Quality 3D Assets via Next-Part Prediction
Figure 3 for TAR3D: Creating High-Quality 3D Assets via Next-Part Prediction
Figure 4 for TAR3D: Creating High-Quality 3D Assets via Next-Part Prediction
Viaarxiv icon

Chimera: Improving Generalist Model with Domain-Specific Experts

Add code
Dec 08, 2024
Viaarxiv icon

Lift3D Foundation Policy: Lifting 2D Large-Scale Pretrained Models for Robust 3D Robotic Manipulation

Add code
Nov 27, 2024
Figure 1 for Lift3D Foundation Policy: Lifting 2D Large-Scale Pretrained Models for Robust 3D Robotic Manipulation
Figure 2 for Lift3D Foundation Policy: Lifting 2D Large-Scale Pretrained Models for Robust 3D Robotic Manipulation
Figure 3 for Lift3D Foundation Policy: Lifting 2D Large-Scale Pretrained Models for Robust 3D Robotic Manipulation
Figure 4 for Lift3D Foundation Policy: Lifting 2D Large-Scale Pretrained Models for Robust 3D Robotic Manipulation
Viaarxiv icon