Title: VLMEvalKit: An Open-Source Toolkit for Evaluating Large Multi-Modality Models

Jul 16, 2024

Haodong Duan, Junming Yang, Yuxuan Qiao, Xinyu Fang, Lin Chen, Yuan Liu, Xiaoyi Dong, Yuhang Zang, Pan Zhang, Jiaqi Wang (+2 more)

Abstract: We present VLMEvalKit, an open-source, PyTorch-based toolkit for evaluating large multi-modality models. The toolkit aims to provide a user-friendly and comprehensive framework for researchers and developers to evaluate existing multi-modality models and publish reproducible evaluation results. In VLMEvalKit, we implement over 70 different large multi-modality models, including both proprietary APIs and open-source models, as well as more than 20 different multi-modal benchmarks. A new model can be added by implementing a single interface; the toolkit then automatically handles the remaining workloads, including data preparation, distributed inference, prediction post-processing, and metric calculation. Although the toolkit is currently used mainly for evaluating large vision-language models, its design is compatible with future updates that incorporate additional modalities, such as audio and video. Based on the evaluation results obtained with the toolkit, we host the OpenVLM Leaderboard, a comprehensive leaderboard to track the progress of multi-modality learning research. The toolkit is released at https://github.com/open-compass/VLMEvalKit and is actively maintained.
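
The "single interface" idea in the abstract can be illustrated with a minimal sketch. This is not VLMEvalKit's actual API: the names `BaseVLM`, `generate`, `Sample`, `postprocess`, and `evaluate` are hypothetical, the toy model stands in for a real one, and distributed inference is left out. It only shows how a shared evaluation loop could handle post-processing and metric calculation once a model exposes one method.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Sample:
    """One benchmark item (hypothetical layout, not the toolkit's format)."""
    image_path: str
    question: str
    answer: str


class BaseVLM(ABC):
    """Hypothetical single interface that a new model would implement."""

    @abstractmethod
    def generate(self, image_path: str, prompt: str) -> str:
        """Return the model's raw text response for one image and prompt."""


class AlwaysAModel(BaseVLM):
    """Toy stand-in model: always answers 'a' with surrounding noise."""

    def generate(self, image_path: str, prompt: str) -> str:
        return " a. "


def postprocess(raw: str) -> str:
    """Normalize a raw response to a bare upper-case option letter."""
    return raw.strip().strip(".").upper()


def evaluate(model: BaseVLM, samples: Sequence[Sample]) -> float:
    """Run inference, post-process predictions, and return exact-match accuracy."""
    if not samples:
        return 0.0
    correct = 0
    for sample in samples:
        prediction = postprocess(model.generate(sample.image_path, sample.question))
        correct += int(prediction == sample.answer)
    return correct / len(samples)


# Assumed toy data: two multiple-choice items with different gold answers.
samples = [
    Sample("img_0.jpg", "Which option is correct?", "A"),
    Sample("img_1.jpg", "Which option is correct?", "B"),
]
accuracy = evaluate(AlwaysAModel(), samples)
```

In this sketch, adding another model means writing one more `BaseVLM` subclass; the evaluation loop stays unchanged, which is the separation of concerns the abstract describes.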
