Picture for Xilin Wei

Xilin Wei

InternLM-XComposer2.5-OmniLive: A Comprehensive Multimodal System for Long-term Streaming Video and Audio Interactions

Add code
Dec 12, 2024
Viaarxiv icon

MMDU: A Multi-Turn Multi-Image Dialog Understanding Benchmark and Instruction-Tuning Dataset for LVLMs

Add code
Jun 17, 2024
Viaarxiv icon

ShareGPT4Video: Improving Video Understanding and Generation with Better Captions

Add code
Jun 06, 2024
Figure 1 for ShareGPT4Video: Improving Video Understanding and Generation with Better Captions
Figure 2 for ShareGPT4Video: Improving Video Understanding and Generation with Better Captions
Figure 3 for ShareGPT4Video: Improving Video Understanding and Generation with Better Captions
Figure 4 for ShareGPT4Video: Improving Video Understanding and Generation with Better Captions
Viaarxiv icon

InternLM-XComposer2: Mastering Free-form Text-Image Composition and Comprehension in Vision-Language Large Model

Add code
Jan 29, 2024
Viaarxiv icon

Evaluating and Improving Tool-Augmented Computation-Intensive Math Reasoning

Add code
Jun 04, 2023
Viaarxiv icon