Xiaoshuai Sun

Persistent Story World Simulation with Continuous Character Customization

Mar 17, 2026

Test-Time Computing for Referring Multimodal Large Language Models

Feb 23, 2026

MICON-Bench: Benchmarking and Enhancing Multi-Image Context Image Generation in Unified Multimodal Models

Feb 23, 2026

CSMCIR: CoT-Enhanced Symmetric Alignment with Memory Bank for Composed Image Retrieval

Jan 07, 2026

CIR-CoT: Towards Interpretable Composed Image Retrieval via End-to-End Chain-of-Thought Reasoning

Oct 09, 2025

MIHBench: Benchmarking and Mitigating Multi-Image Hallucinations in Multimodal Large Language Models

Aug 01, 2025

AIGI-Holmes: Towards Explainable and Generalizable AI-Generated Image Detection via Multimodal Large Language Models

Jul 03, 2025

RePrompt: Reasoning-Augmented Reprompting for Text-to-Image Generation via Reinforcement Learning

May 23, 2025

Zooming In on Fakes: A Novel Dataset for Localized AI-Generated Image Detection with Forgery Amplification Approach

Apr 16, 2025

An Efficient and Mixed Heterogeneous Model for Image Restoration

Apr 15, 2025