Picture for Jiebo Luo

Jiebo Luo

Ouroboros-Diffusion: Exploring Consistent Content Generation in Tuning-free Long Video Diffusion

Add code
Jan 15, 2025
Viaarxiv icon

SafeCFG: Redirecting Harmful Classifier-Free Guidance for Safe Generation

Add code
Dec 20, 2024
Viaarxiv icon

How Vision-Language Tasks Benefit from Large Pre-trained Models: A Survey

Add code
Dec 11, 2024
Viaarxiv icon

Latent-Reframe: Enabling Camera Control for Video Diffusion Model without Training

Add code
Dec 08, 2024
Viaarxiv icon

Personalized Multimodal Large Language Models: A Survey

Add code
Dec 03, 2024
Viaarxiv icon

Identity-Preserving Text-to-Video Generation by Frequency Decomposition

Add code
Nov 26, 2024
Figure 1 for Identity-Preserving Text-to-Video Generation by Frequency Decomposition
Figure 2 for Identity-Preserving Text-to-Video Generation by Frequency Decomposition
Figure 3 for Identity-Preserving Text-to-Video Generation by Frequency Decomposition
Figure 4 for Identity-Preserving Text-to-Video Generation by Frequency Decomposition
Viaarxiv icon

FINECAPTION: Compositional Image Captioning Focusing on Wherever You Want at Any Granularity

Add code
Nov 23, 2024
Figure 1 for FINECAPTION: Compositional Image Captioning Focusing on Wherever You Want at Any Granularity
Figure 2 for FINECAPTION: Compositional Image Captioning Focusing on Wherever You Want at Any Granularity
Figure 3 for FINECAPTION: Compositional Image Captioning Focusing on Wherever You Want at Any Granularity
Figure 4 for FINECAPTION: Compositional Image Captioning Focusing on Wherever You Want at Any Granularity
Viaarxiv icon

Exploring the Adversarial Vulnerabilities of Vision-Language-Action Models in Robotics

Add code
Nov 22, 2024
Figure 1 for Exploring the Adversarial Vulnerabilities of Vision-Language-Action Models in Robotics
Figure 2 for Exploring the Adversarial Vulnerabilities of Vision-Language-Action Models in Robotics
Figure 3 for Exploring the Adversarial Vulnerabilities of Vision-Language-Action Models in Robotics
Figure 4 for Exploring the Adversarial Vulnerabilities of Vision-Language-Action Models in Robotics
Viaarxiv icon

Video-RAG: Visually-aligned Retrieval-Augmented Long Video Comprehension

Add code
Nov 20, 2024
Figure 1 for Video-RAG: Visually-aligned Retrieval-Augmented Long Video Comprehension
Figure 2 for Video-RAG: Visually-aligned Retrieval-Augmented Long Video Comprehension
Figure 3 for Video-RAG: Visually-aligned Retrieval-Augmented Long Video Comprehension
Figure 4 for Video-RAG: Visually-aligned Retrieval-Augmented Long Video Comprehension
Viaarxiv icon

CRTRE: Causal Rule Generation with Target Trial Emulation Framework

Add code
Nov 10, 2024
Figure 1 for CRTRE: Causal Rule Generation with Target Trial Emulation Framework
Figure 2 for CRTRE: Causal Rule Generation with Target Trial Emulation Framework
Figure 3 for CRTRE: Causal Rule Generation with Target Trial Emulation Framework
Figure 4 for CRTRE: Causal Rule Generation with Target Trial Emulation Framework
Viaarxiv icon