Yexin Liu

VideoGen-of-Thought: Step-by-step generating multi-shot video with minimal manual intervention

Mar 20, 2025
Via arXiv

ScalingNoise: Scaling Inference-Time Search for Generating Infinite Videos

Mar 20, 2025
Via arXiv

Temporal Regularization Makes Your Video Generator Stronger

Mar 19, 2025
Via arXiv

LightGen: Efficient Image Generation through Knowledge Distillation and Direct Preference Optimization

Mar 11, 2025
Via arXiv

MMRC: A Large-Scale Benchmark for Understanding Multimodal Large Language Model in Real-World Conversation

Feb 17, 2025
Via arXiv

VideoGen-of-Thought: A Collaborative Framework for Multi-Shot Video Generation

Dec 03, 2024
Via arXiv

Seeing Clearly, Answering Incorrectly: A Multimodal Robustness Benchmark for Evaluating MLLMs on Leading Questions

Jun 15, 2024
Via arXiv

EyeFound: A Multimodal Generalist Foundation Model for Ophthalmic Imaging

May 22, 2024
Via arXiv

Efficient Multimodal Large Language Models: A Survey

May 17, 2024
Via arXiv

Evaluating large language models in medical applications: a survey

May 13, 2024
Via arXiv