Fanrui Zhang

MDK12-Bench: A Comprehensive Evaluation of Multimodal Large Language Models on Multidisciplinary Exams

Aug 09, 2025
Via arXiv

Sekai: A Video Dataset towards World Exploration

Jun 18, 2025
Via arXiv

A High-Quality Dataset and Reliable Evaluation for Interleaved Image-Text Generation

Jun 11, 2025
Via arXiv

Fact-R1: Towards Explainable Video Misinformation Detection with Deep Reasoning

May 22, 2025
Via arXiv

MDK12-Bench: A Multi-Discipline Benchmark for Evaluating Reasoning in Multimodal Large Language Models

Apr 08, 2025
Via arXiv

ProJudge: A Multi-Modal Multi-Discipline Benchmark and Instruction-Tuning Dataset for MLLM-based Process Judges

Mar 09, 2025
Via arXiv

ARMOR v0.1: Empowering Autoregressive Multimodal Understanding Model with Interleaved Multimodal Generation via Asymmetric Synergy

Mar 09, 2025
Via arXiv

ForgeryGPT: Multimodal Large Language Model For Explainable Image Forgery Detection and Localization

Oct 14, 2024
Via arXiv

Multi-perspective Memory Enhanced Network for Identifying Key Nodes in Social Networks

Mar 22, 2024
Via arXiv

Hierarchical Information Enhancement Network for Cascade Prediction in Social Networks

Mar 22, 2024
Via arXiv