Yexin Liu

Model Reveals What to Cache: Profiling-Based Feature Reuse for Video Diffusion Models

Apr 04, 2025
Via arXiv

VideoGen-of-Thought: Step-by-step generating multi-shot video with minimal manual intervention

Mar 20, 2025
Via arXiv

ScalingNoise: Scaling Inference-Time Search for Generating Infinite Videos

Mar 20, 2025
Via arXiv

Temporal Regularization Makes Your Video Generator Stronger

Mar 19, 2025
Via arXiv

LightGen: Efficient Image Generation through Knowledge Distillation and Direct Preference Optimization

Mar 11, 2025
Via arXiv

MMRC: A Large-Scale Benchmark for Understanding Multimodal Large Language Model in Real-World Conversation

Feb 17, 2025
Via arXiv

VideoGen-of-Thought: A Collaborative Framework for Multi-Shot Video Generation

Dec 03, 2024
Via arXiv

Seeing Clearly, Answering Incorrectly: A Multimodal Robustness Benchmark for Evaluating MLLMs on Leading Questions

Jun 15, 2024
Via arXiv

EyeFound: A Multimodal Generalist Foundation Model for Ophthalmic Imaging

May 22, 2024
Via arXiv

Efficient Multimodal Large Language Models: A Survey

May 17, 2024
Via arXiv