Lu Sheng

Code-as-Monitor: Constraint-aware Visual Programming for Reactive and Proactive Robotic Failure Detection

Dec 05, 2024

MIDI: Multi-Instance Diffusion for Single Image to 3D Scene Generation

Dec 04, 2024

MV-Adapter: Multi-view Consistent Image Generation Made Easy

Dec 04, 2024

A Topic-level Self-Correctional Approach to Mitigate Hallucinations in MLLMs

Nov 26, 2024

WorldSimBench: Towards Video Generation Models as World Simulators

Oct 23, 2024

Ouroboros3D: Image-to-3D Generation via 3D-aware Recursive Diffusion

Jun 05, 2024

From Parts to Whole: A Unified Reference Framework for Controllable Human Image Generation

Apr 23, 2024

Self-Supervised Monocular Depth Estimation in the Dark: Towards Data Distribution Compensation

Apr 22, 2024

RH20T-P: A Primitive-Level Robotic Dataset Towards Composable Generalization Agents

Mar 28, 2024

Assessment of Multimodal Large Language Models in Alignment with Human Values

Mar 26, 2024