Xide Xia

Sid

DirectorLLM for Human-Centric Video Generation

Dec 19, 2024
Via arXiv

Apollo: An Exploration of Video Understanding in Large Multimodal Models

Dec 13, 2024
Via arXiv

Accelerating Multimodel Large Language Models by Searching Optimal Vision Token Reduction

Nov 30, 2024
Via arXiv

The Llama 3 Herd of Models

Jul 31, 2024
Via arXiv

Learning Video Context as Interleaved Multimodal Sequences

Jul 31, 2024
Via arXiv

GenAI-Bench: Evaluating and Improving Compositional Text-to-Visual Generation

Jun 19, 2024
Via arXiv

Evaluating Text-to-Visual Generation with Image-to-Text Generation

Apr 01, 2024
Via arXiv

DIME-FM: DIstilling Multimodal and Efficient Foundation Models

Mar 31, 2023
Via arXiv

Real-time Localized Photorealistic Video Style Transfer

Oct 20, 2020
Via arXiv

Joint Bilateral Learning for Real-time Universal Photorealistic Style Transfer

Apr 27, 2020
Via arXiv