Guangtao Zhai

Q-Bench-Portrait: Benchmarking Multimodal Large Language Models on Portrait Image Quality Perception

Jan 26, 2026
Via arXiv

QualiRAG: Retrieval-Augmented Generation for Visual Quality Understanding

Jan 26, 2026
Via arXiv

Enhancing Image Quality Assessment Ability of LMMs via Retrieval-Augmented Generation

Jan 13, 2026
Via arXiv

KidVis: Do Multimodal Large Language Models Possess the Visual Perceptual Capabilities of a 6-Year-Old?

Jan 13, 2026
Via arXiv

Agentic Retoucher for Text-To-Image Generation

Jan 08, 2026
Via arXiv

EvolMem: A Cognitive-Driven Benchmark for Multi-Session Dialogue Memory

Jan 07, 2026
Via arXiv

VTONQA: A Multi-Dimensional Quality Assessment Dataset for Virtual Try-on

Jan 06, 2026
Via arXiv

Robust Mesh Saliency GT Acquisition in VR via View Cone Sampling and Geometric Smoothing

Jan 06, 2026
Via arXiv

Generative Human-Object Interaction Detection via Differentiable Cognitive Steering of Multi-modal LLMs

Dec 19, 2025
Via arXiv

Embodied Image Compression

Dec 12, 2025
Via arXiv