Chi Chen

Migician: Revealing the Magic of Free-Form Multi-Image Grounding in Multimodal Large Language Models

Jan 13, 2025
Via arXiv

ChartCoder: Advancing Multimodal Large Language Model for Chart-to-Code Generation

Jan 11, 2025
Via arXiv

Jailbreaking Multimodal Large Language Models via Shuffle Inconsistency

Jan 09, 2025
Via arXiv

LLaVA-UHD v2: an MLLM Integrating High-Resolution Feature Pyramid via Hierarchical Window Transformer

Dec 18, 2024
Via arXiv

StreamingBench: Assessing the Gap for MLLMs to Achieve Streaming Video Understanding

Nov 06, 2024
Via arXiv

PlaneSAM: Multimodal Plane Instance Segmentation Using the Segment Anything Model

Oct 21, 2024
Via arXiv

ActiView: Evaluating Active Perception Ability for Multimodal Large Language Models

Oct 07, 2024
Via arXiv

Co-Fix3D: Enhancing 3D Object Detection with Collaborative Refinement

Aug 15, 2024
Via arXiv

PagPassGPT: Pattern Guided Password Guessing via Generative Pretrained Transformer

Apr 07, 2024
Via arXiv

Goldfish: An Efficient Federated Unlearning Framework

Apr 04, 2024
Via arXiv