Xiaoqian Shen

LongVU: Spatiotemporal Adaptive Compression for Long Video-Language Understanding

Oct 22, 2024

Openstory++: A Large-scale Dataset and Benchmark for Instance-aware Open-domain Visual Storytelling

Aug 07, 2024

Goldfish: Vision-Language Understanding of Arbitrarily Long Videos

Jul 17, 2024

iMotion-LLM: Motion Prediction Instruction Tuning

Jun 11, 2024

MiniGPT4-Video: Advancing Multimodal LLMs for Video Understanding with Interleaved Visual-Textual Tokens

Apr 04, 2024

Large Language Models as Consistent Story Visualizers

Dec 04, 2023

MiniGPT-v2: large language model as a unified interface for vision-language multi-task learning

Oct 26, 2023

Affective Visual Dialog: A Large-Scale Benchmark for Emotional Reasoning Based on Visually Grounded Conversations

Sep 12, 2023

MiniGPT-4: Enhancing Vision-Language Understanding with Advanced Large Language Models

Apr 20, 2023

HRS-Bench: Holistic, Reliable and Scalable Benchmark for Text-to-Image Models

Apr 11, 2023