Wenhao Chai

EMMOE: A Comprehensive Benchmark for Embodied Mobile Manipulation in Open Environments

Mar 11, 2025
Via arXiv

DiffPO: Diffusion-styled Preference Optimization for Efficient Inference-Time Alignment of Large Language Models

Mar 06, 2025
Via arXiv

Multimodal Representation Alignment for Image Generation: Text-Image Interleaved Control Is Easier Than You Think

Feb 27, 2025
Via arXiv

Pointmap Association and Piecewise-Plane Constraint for Consistent and Compact 3D Gaussian Segmentation Field

Feb 22, 2025
Via arXiv

PackDiT: Joint Human Motion and Text Generation via Mutual Prompting

Jan 27, 2025
Via arXiv

SAMURAI: Adapting Segment Anything Model for Zero-Shot Visual Tracking with Motion-Aware Memory

Nov 18, 2024
Via arXiv

LLaVA-Ultra: Large Chinese Language and Vision Assistant for Ultrasound

Oct 19, 2024
Via arXiv

PAD: Personalized Alignment at Decoding-Time

Oct 14, 2024
Via arXiv

Ego3DT: Tracking Every 3D Object in Ego-centric Videos

Oct 11, 2024
Via arXiv

AuroraCap: Efficient, Performant Video Detailed Captioning and a New Benchmark

Oct 04, 2024
Via arXiv