Licheng Yu

Sid

Building a Mind Palace: Structuring Environment-Grounded Semantic Graphs for Effective Long Video Analysis with LLMs

Jan 08, 2025

Apollo: An Exploration of Video Understanding in Large Multimodal Models

Dec 13, 2024

Accelerating Multimodel Large Language Models by Searching Optimal Vision Token Reduction

Nov 30, 2024

ROICtrl: Boosting Instance Control for Visual Generation

Nov 27, 2024

Movie Gen: A Cast of Media Foundation Models

Oct 17, 2024

The Llama 3 Herd of Models

Jul 31, 2024

SceneTextGen: Layout-Agnostic Scene Text Image Synthesis with Diffusion Models

Jun 03, 2024

Animated Stickers: Bringing Stickers to Life with Video Diffusion

Feb 08, 2024

FlowVid: Taming Imperfect Optical Flows for Consistent Video-to-Video Synthesis

Dec 29, 2023

Fairy: Fast Parallelized Instruction-Guided Video-to-Video Synthesis

Dec 20, 2023