Zhengfeng Lai

STIV: Scalable Text and Image Conditioned Video Generation

Dec 10, 2024
Via arXiv

Contrastive Localized Language-Image Pre-Training

Oct 03, 2024
Via arXiv

Revisit Large-Scale Image-Caption Data in Pre-training Multimodal Foundation Models

Oct 03, 2024
Via arXiv

MM1.5: Methods, Analysis & Insights from Multimodal LLM Fine-tuning

Sep 30, 2024
Via arXiv

SlowFast-LLaVA: A Strong Training-Free Baseline for Video Large Language Models

Jul 22, 2024
Via arXiv

Empowering Source-Free Domain Adaptation with MLLM-driven Curriculum Learning

May 28, 2024
Via arXiv

MobilityGPT: Enhanced Human Mobility Modeling with a GPT model

Feb 05, 2024
Via arXiv

From Scarcity to Efficiency: Improving CLIP Training via Visual-enriched Captions

Oct 11, 2023
Via arXiv