Yiwen Tang

Research on World Models Is Not Merely Injecting World Knowledge into Specific Tasks

Feb 02, 2026

Are We Ready for RL in Text-to-3D Generation? A Progressive Investigation

Dec 11, 2025

REVISION: Reflective Intent Mining and Online Reasoning Auxiliary for E-commerce Visual Search System Optimization

Oct 26, 2025

Hume: Introducing System-2 Thinking in Visual-Language-Action Model

May 29, 2025

EvoMoE: Expert Evolution in Mixture of Experts for Multimodal Large Language Models

May 28, 2025

AutoMat: Enabling Automated Crystal Structure Reconstruction from Microscopy via Agentic Tool Use

May 19, 2025

AerialVG: A Challenging Benchmark for Aerial Visual Grounding by Exploring Positional Relations

Apr 10, 2025

OpenFly: A Versatile Toolchain and Large-scale Benchmark for Aerial Vision-Language Navigation

Feb 25, 2025

Exploring the Potential of Encoder-free Architectures in 3D LMMs

Feb 13, 2025

FreeGaussian: Guidance-free Controllable 3D Gaussian Splats with Flow Derivatives

Oct 29, 2024