Weizhi Wang

Open-Qwen2VL: Compute-Efficient Pre-Training of Fully-Open Multimodal LLMs on Academic Resources

Apr 02, 2025

Adaptive Layer-skipping in Pre-trained LLMs

Mar 31, 2025

Tora: Trajectory-oriented Diffusion Transformer for Video Generation

Jul 31, 2024

Large Language Model based Situational Dialogues for Second Language Learning

Mar 29, 2024

MM-Diff: High-Fidelity Image Personalization via Multi-Modal Condition Integration

Mar 22, 2024

EffiVED: Efficient Video Editing via Text-instruction Diffusion Models

Mar 18, 2024

Finetuned Multimodal Language Models Are High-Quality Image-Text Data Filters

Mar 05, 2024

AnimateAnything: Fine-Grained Open Domain Image Animation with Motion Guidance

Dec 04, 2023

GPT-4V as a Generalist Evaluator for Vision-Language Tasks

Nov 02, 2023

Augmenting Language Models with Long-Term Memory

Jun 12, 2023