Wenhu Chen

Vamba: Understanding Hour-Long Videos with Hybrid Mamba-Transformers

Mar 14, 2025

VisualWebInstruct: Scaling up Multimodal Instruction Data through Web Search

Mar 13, 2025

YuE: Scaling Open Foundation Models for Long-Form Music Generation

Mar 11, 2025

TheoremExplainAgent: Towards Multimodal Explanations for LLM Theorem Understanding

Feb 26, 2025

ACECODER: Acing Coder RL via Automated Test-Case Synthesis

Feb 03, 2025

PixelWorld: Towards Perceiving Everything as Pixels

Jan 31, 2025

Critique Fine-Tuning: Learning to Critique is More Effective than Learning to Imitate

Jan 30, 2025

Aligning Instruction Tuning with Pre-training

Jan 16, 2025

VISA: Retrieval Augmented Generation with Visual Source Attribution

Dec 19, 2024

MAmmoTH-VL: Eliciting Multimodal Reasoning with Instruction Tuning at Scale

Dec 06, 2024