Yujun Shen

Benchmarking Large Vision-Language Models via Directed Scene Graph for Comprehensive Image Captioning

Dec 12, 2024

Learning Visual Generative Priors without Text

Dec 10, 2024

PlanarSplatting: Accurate Planar Surface Reconstruction in 3 Minutes

Dec 04, 2024

Mimir: Improving Video Diffusion Models for Precise Text Understanding

Dec 04, 2024

MagicQuill: An Intelligent Interactive Image Editing System

Nov 14, 2024

Framer: Interactive Frame Interpolation

Oct 24, 2024

Rectified Diffusion Guidance for Conditional Generation

Oct 24, 2024

LoTLIP: Improving Language-Image Pre-training for Long Text Understanding

Oct 07, 2024

Zero-shot Image Editing with Reference Imitation

Jun 11, 2024

Learning Temporally Consistent Video Depth from Video Diffusion Priors

Jun 04, 2024