Chi Chen

StreamingBench: Assessing the Gap for MLLMs to Achieve Streaming Video Understanding

Nov 06, 2024
Via arXiv

PlaneSAM: Multimodal Plane Instance Segmentation Using the Segment Anything Model

Oct 21, 2024
Via arXiv

ActiView: Evaluating Active Perception Ability for Multimodal Large Language Models

Oct 07, 2024
Via arXiv

Co-Fix3D: Enhancing 3D Object Detection with Collaborative Refinement

Aug 15, 2024
Via arXiv

PagPassGPT: Pattern Guided Password Guessing via Generative Pretrained Transformer

Apr 07, 2024
Via arXiv

Goldfish: An Efficient Federated Unlearning Framework

Apr 04, 2024
Via arXiv

CODIS: Benchmarking Context-Dependent Visual Comprehension for Multimodal Large Language Models

Feb 21, 2024
Via arXiv

Model Composition for Multimodal Large Language Models

Feb 20, 2024
Via arXiv

Browse and Concentrate: Comprehending Multimodal Content via prior-LLM Context Fusion

Feb 19, 2024
Via arXiv

Filling the Image Information Gap for VQA: Prompting Large Language Models to Proactively Ask Questions

Nov 20, 2023
Via arXiv