Juncheng Li

Iris: Breaking GUI Complexity with Adaptive Focus and Self-Refining

Dec 13, 2024
Via arXiv

Mastering Collaborative Multi-modal Data Selection: A Focus on Informativeness, Uniqueness, and Representativeness

Dec 09, 2024
Via arXiv

SILMM: Self-Improving Large Multimodal Models for Compositional Text-to-Image Generation

Dec 08, 2024
Via arXiv

HumanEdit: A High-Quality Human-Rewarded Dataset for Instruction-based Image Editing

Dec 05, 2024
Via arXiv

STEP: Enhancing Video-LLMs' Compositional Reasoning by Spatio-Temporal Graph-guided Self-Training

Nov 29, 2024
Via arXiv

AnyEdit: Mastering Unified High-Quality Image Editing for Any Idea

Nov 24, 2024
Via arXiv

Unified Generative and Discriminative Training for Multi-modal Large Language Models

Nov 01, 2024
Via arXiv

RADAR: Robust Two-stage Modality-incomplete Industrial Anomaly Detection

Oct 02, 2024
Via arXiv

Towards Unified Multimodal Editing with Enhanced Knowledge Collaboration

Sep 30, 2024
Via arXiv

Align$^2$LLaVA: Cascaded Human and Large Language Model Preference Alignment for Multi-modal Instruction Curation

Sep 27, 2024
Via arXiv