Wenqi Shao

Text2World: Benchmarking Large Language Models for Symbolic World Model Generation

Feb 18, 2025
Via arXiv

Enhance-A-Video: Better Generated Video for Free

Feb 11, 2025
Via arXiv

SAMRefiner: Taming Segment Anything Model for Universal Mask Refinement

Feb 10, 2025
Via arXiv

Prompt-A-Video: Prompt Your Video Diffusion Model via Preference-Aligned LLM

Dec 19, 2024
Via arXiv

DexHandDiff: Interaction-aware Diffusion Planning for Adaptive Dexterous Manipulation

Dec 11, 2024
Via arXiv

Expanding Performance Boundaries of Open-Source Multimodal Models with Model, Data, and Test-Time Scaling

Dec 06, 2024
Via arXiv

TREND: Unsupervised 3D Representation Learning via Temporal Forecasting for LiDAR Perception

Dec 04, 2024
Via arXiv

CLAP: Unsupervised 3D Representation Learning for Fusion 3D Perception via Curvature Sampling and Prototype Learning

Dec 04, 2024
Via arXiv

GATE OpenING: A Comprehensive Benchmark for Judging Open-ended Interleaved Image-Text Generation

Dec 01, 2024
Via arXiv

DexDiffuser: Interaction-aware Diffusion Planning for Adaptive Dexterous Manipulation

Nov 27, 2024
Via arXiv