Kecheng Zheng

Benchmarking Large Vision-Language Models via Directed Scene Graph for Comprehensive Image Captioning

Dec 12, 2024
Via arXiv

Learning Visual Generative Priors without Text

Dec 10, 2024
Via arXiv

MotionStone: Decoupled Motion Intensity Modulation with Diffusion Transformer for Image-to-Video Generation

Dec 08, 2024
Via arXiv

Mimir: Improving Video Diffusion Models for Precise Text Understanding

Dec 04, 2024
Via arXiv

MoTe: Learning Motion-Text Diffusion Model for Multiple Generation Tasks

Nov 29, 2024
Via arXiv

Framer: Interactive Frame Interpolation

Oct 24, 2024
Via arXiv

Animate-X: Universal Character Image Animation with Enhanced Motion Representation

Oct 14, 2024
Via arXiv

LoTLIP: Improving Language-Image Pre-training for Long Text Understanding

Oct 07, 2024
Via arXiv

MePT: Multi-Representation Guided Prompt Tuning for Vision-Language Model

Aug 19, 2024
Via arXiv

Paying More Attention to Image: A Training-Free Method for Alleviating Hallucination in LVLMs

Jul 31, 2024
Via arXiv