Siran Chen

LVAgent: Long Video Understanding by Multi-Round Dynamical Collaboration of MLLM Agents

Mar 13, 2025
Via arXiv

H-MBA: Hierarchical MamBa Adaptation for Multi-Modal Video Understanding in Autonomous Driving

Jan 08, 2025
Via arXiv

Percept, Chat, and then Adapt: Multimodal Knowledge Transfer of Foundation Models for Open-World Video Recognition

Feb 29, 2024
Via arXiv

M-BEV: Masked BEV Perception for Robust Autonomous Driving

Dec 19, 2023
Via arXiv