Dacheng Li

WorldModelBench: Judging Video Generation Models As World Models

Feb 28, 2025

S*: Test Time Scaling for Code Generation

Feb 20, 2025

The Danger of Overthinking: Examining the Reasoning-Action Dilemma in Agentic Tasks

Feb 12, 2025

LLMs Can Easily Learn to Reason from Demonstrations: Structure, not content, is what matters!

Feb 11, 2025

Efficient-vDiT: Efficient Video Diffusion Transformers With Attention Tile

Feb 10, 2025

Sparse VideoGen: Accelerating Video Diffusion Transformers with Spatial-Temporal Sparsity

Feb 03, 2025

Locality-aware Fair Scheduling in LLM Serving

Jan 24, 2025

NVILA: Efficient Frontier Visual Language Models

Dec 05, 2024

VILA-U: a Unified Foundation Model Integrating Visual Understanding and Generation

Sep 06, 2024

LongVILA: Scaling Long-Context Visual Language Models for Long Videos

Aug 21, 2024