Xinyu Fang

V-LoRA: An Efficient and Flexible System Boosts Vision Applications with LoRA LMM

Nov 01, 2024

ProSA: Assessing and Understanding the Prompt Sensitivity of LLMs

Oct 16, 2024

MuMA-ToM: Multi-modal Multi-Agent Theory of Mind

Aug 22, 2024

JieHua Paintings Style Feature Extracting Model using Stable Diffusion with ControlNet

Aug 21, 2024

A New Chinese Landscape Paintings Generation Model based on Stable Diffusion using DreamBooth

Aug 16, 2024

VLMEvalKit: An Open-Source Toolkit for Evaluating Large Multi-Modality Models

Jul 16, 2024

Prism: A Framework for Decoupling and Assessing the Capabilities of VLMs

Jun 20, 2024

MMBench-Video: A Long-Form Multi-Shot Benchmark for Holistic Video Understanding

Jun 20, 2024

LoRA-Switch: Boosting the Efficiency of Dynamic LLM Adapters via System-Algorithm Co-design

May 28, 2024

Open Visual Knowledge Extraction via Relation-Oriented Multimodality Model Prompting

Oct 28, 2023