Kecheng Zheng

Benchmarking Large Vision-Language Models via Directed Scene Graph for Comprehensive Image Captioning

Dec 12, 2024

Learning Visual Generative Priors without Text

Dec 10, 2024

MotionStone: Decoupled Motion Intensity Modulation with Diffusion Transformer for Image-to-Video Generation

Dec 08, 2024

Mimir: Improving Video Diffusion Models for Precise Text Understanding

Dec 04, 2024

MoTe: Learning Motion-Text Diffusion Model for Multiple Generation Tasks

Nov 29, 2024

Framer: Interactive Frame Interpolation

Oct 24, 2024

Animate-X: Universal Character Image Animation with Enhanced Motion Representation

Oct 14, 2024

LoTLIP: Improving Language-Image Pre-training for Long Text Understanding

Oct 07, 2024

MePT: Multi-Representation Guided Prompt Tuning for Vision-Language Model

Aug 19, 2024

Paying More Attention to Image: A Training-Free Method for Alleviating Hallucination in LVLMs

Jul 31, 2024