Rui Qian

FA-BARF: Frequency Adapted Bundle-Adjusting Neural Radiance Fields

Mar 15, 2025

DiT-Air: Revisiting the Efficiency of Diffusion Model Architecture Design in Text to Image Generation

Mar 13, 2025

DynCIM: Dynamic Curriculum for Imbalanced Multimodal Learning

Mar 09, 2025

OVO-Bench: How Far is Your Video-LLMs from Real-World Online Video Understanding?

Jan 09, 2025

Dispider: Enabling Video LLMs with Active Real-Time Interaction via Disentangled Perception, Decision, and Reaction

Jan 06, 2025

Reasoning to Attend: Try to Understand How <SEG> Token Works

Dec 23, 2024

InternLM-XComposer2.5-OmniLive: A Comprehensive Multimodal System for Long-term Streaming Video and Audio Interactions

Dec 12, 2024

SimC3D: A Simple Contrastive 3D Pretraining Framework Using RGB Images

Dec 06, 2024

SAM2Long: Enhancing SAM 2 for Long Video Segmentation with a Training-Free Memory Tree

Oct 21, 2024

Imagen 3

Aug 13, 2024