Picture for Kaihang Pan

Kaihang Pan

Iris: Breaking GUI Complexity with Adaptive Focus and Self-Refining

Add code
Dec 13, 2024
Figure 1 for Iris: Breaking GUI Complexity with Adaptive Focus and Self-Refining
Figure 2 for Iris: Breaking GUI Complexity with Adaptive Focus and Self-Refining
Figure 3 for Iris: Breaking GUI Complexity with Adaptive Focus and Self-Refining
Figure 4 for Iris: Breaking GUI Complexity with Adaptive Focus and Self-Refining
Viaarxiv icon

STEP: Enhancing Video-LLMs' Compositional Reasoning by Spatio-Temporal Graph-guided Self-Training

Add code
Nov 29, 2024
Viaarxiv icon

AnyEdit: Mastering Unified High-Quality Image Editing for Any Idea

Add code
Nov 24, 2024
Figure 1 for AnyEdit: Mastering Unified High-Quality Image Editing for Any Idea
Figure 2 for AnyEdit: Mastering Unified High-Quality Image Editing for Any Idea
Figure 3 for AnyEdit: Mastering Unified High-Quality Image Editing for Any Idea
Figure 4 for AnyEdit: Mastering Unified High-Quality Image Editing for Any Idea
Viaarxiv icon

Unified Generative and Discriminative Training for Multi-modal Large Language Models

Add code
Nov 01, 2024
Figure 1 for Unified Generative and Discriminative Training for Multi-modal Large Language Models
Figure 2 for Unified Generative and Discriminative Training for Multi-modal Large Language Models
Figure 3 for Unified Generative and Discriminative Training for Multi-modal Large Language Models
Figure 4 for Unified Generative and Discriminative Training for Multi-modal Large Language Models
Viaarxiv icon

Towards Unified Multimodal Editing with Enhanced Knowledge Collaboration

Add code
Sep 30, 2024
Viaarxiv icon

Auto-Encoding Morph-Tokens for Multimodal LLM

Add code
May 03, 2024
Viaarxiv icon

Improving Vision Anomaly Detection with the Guidance of Language Modality

Add code
Oct 04, 2023
Viaarxiv icon

ControlRetriever: Harnessing the Power of Instructions for Controllable Retrieval

Add code
Aug 19, 2023
Viaarxiv icon

Empowering Vision-Language Models to Follow Interleaved Vision-Language Instructions

Add code
Aug 10, 2023
Viaarxiv icon

Meta-augmented Prompt Tuning for Better Few-shot Learning

Add code
Mar 28, 2023
Viaarxiv icon