
Shunian Chen

Both Text and Images Leaked! A Systematic Analysis of Multimodal LLM Data Contamination

Nov 06, 2024

VLFeedback: A Large-Scale AI Feedback Dataset for Large Vision-Language Models Alignment

Oct 12, 2024

Less is More: A Simple yet Effective Token Reduction Method for Efficient Multi-modal LLMs

Sep 17, 2024

LongLLaVA: Scaling Multi-modal LLMs to 1000 Images Efficiently via Hybrid Architecture

Sep 04, 2024

Open-FinLLMs: Open Multimodal Large Language Models for Financial Applications

Aug 20, 2024

HuatuoGPT-Vision, Towards Injecting Medical Visual Knowledge into Multimodal LLMs at Scale

Jun 27, 2024

MileBench: Benchmarking MLLMs in Long Context

Apr 29, 2024

Humans or LLMs as the Judge? A Study on Judgement Biases

Feb 20, 2024

ALLaVA: Harnessing GPT4V-synthesized Data for A Lite Vision-Language Model

Feb 18, 2024

Silkie: Preference Distillation for Large Visual Language Models

Dec 17, 2023