Picture for Guangtao Zhai

Guangtao Zhai

Enhancing Image Quality Assessment Ability of LMMs via Retrieval-Augmented Generation

Add code
Jan 13, 2026
Viaarxiv icon

KidVis: Do Multimodal Large Language Models Possess the Visual Perceptual Capabilities of a 6-Year-Old?

Add code
Jan 13, 2026
Viaarxiv icon

Agentic Retoucher for Text-To-Image Generation

Add code
Jan 08, 2026
Viaarxiv icon

EvolMem: A Cognitive-Driven Benchmark for Multi-Session Dialogue Memory

Add code
Jan 07, 2026
Viaarxiv icon

Robust Mesh Saliency GT Acquisition in VR via View Cone Sampling and Geometric Smoothing

Add code
Jan 06, 2026
Viaarxiv icon

VTONQA: A Multi-Dimensional Quality Assessment Dataset for Virtual Try-on

Add code
Jan 06, 2026
Viaarxiv icon

Generative Human-Object Interaction Detection via Differentiable Cognitive Steering of Multi-modal LLMs

Add code
Dec 19, 2025
Figure 1 for Generative Human-Object Interaction Detection via Differentiable Cognitive Steering of Multi-modal LLMs
Figure 2 for Generative Human-Object Interaction Detection via Differentiable Cognitive Steering of Multi-modal LLMs
Figure 3 for Generative Human-Object Interaction Detection via Differentiable Cognitive Steering of Multi-modal LLMs
Figure 4 for Generative Human-Object Interaction Detection via Differentiable Cognitive Steering of Multi-modal LLMs
Viaarxiv icon

Embodied Image Compression

Add code
Dec 12, 2025
Figure 1 for Embodied Image Compression
Figure 2 for Embodied Image Compression
Figure 3 for Embodied Image Compression
Figure 4 for Embodied Image Compression
Viaarxiv icon

Using GUI Agent for Electronic Design Automation

Add code
Dec 12, 2025
Figure 1 for Using GUI Agent for Electronic Design Automation
Figure 2 for Using GUI Agent for Electronic Design Automation
Figure 3 for Using GUI Agent for Electronic Design Automation
Figure 4 for Using GUI Agent for Electronic Design Automation
Viaarxiv icon

ManipShield: A Unified Framework for Image Manipulation Detection, Localization and Explanation

Add code
Nov 18, 2025
Figure 1 for ManipShield: A Unified Framework for Image Manipulation Detection, Localization and Explanation
Figure 2 for ManipShield: A Unified Framework for Image Manipulation Detection, Localization and Explanation
Figure 3 for ManipShield: A Unified Framework for Image Manipulation Detection, Localization and Explanation
Figure 4 for ManipShield: A Unified Framework for Image Manipulation Detection, Localization and Explanation
Viaarxiv icon