Rongkun Zheng

ViLLa: Video Reasoning Segmentation with Large Language Model

Jul 18, 2024
Via arXiv

InternVideo2: Scaling Video Foundation Models for Multimodal Video Understanding

Mar 22, 2024
Via arXiv

TMT-VIS: Taxonomy-aware Multi-dataset Joint Training for Video Instance Segmentation

Dec 12, 2023
Via arXiv