Wenhu Chen

ACECODER: Acing Coder RL via Automated Test-Case Synthesis

Feb 03, 2025

PixelWorld: Towards Perceiving Everything as Pixels

Jan 31, 2025

Critique Fine-Tuning: Learning to Critique is More Effective than Learning to Imitate

Jan 30, 2025

Aligning Instruction Tuning with Pre-training

Jan 16, 2025

VISA: Retrieval Augmented Generation with Visual Source Attribution

Dec 19, 2024

MAmmoTH-VL: Eliciting Multimodal Reasoning with Instruction Tuning at Scale

Dec 06, 2024

VISTA: Enhancing Long-Duration and High-Resolution Video Understanding by Video Spatiotemporal Augmentation

Dec 01, 2024

OmniEdit: Building Image Editing Generalist Models Through Specialist Supervision

Nov 11, 2024

Harnessing Webpage UIs for Text-Rich Visual Understanding

Oct 17, 2024

MEGA-Bench: Scaling Multimodal Evaluation to over 500 Real-World Tasks

Oct 14, 2024