Jiayi Ji

IPDN: Image-enhanced Prompt Decoding Network for 3D Referring Expression Segmentation

Jan 09, 2025
Via arXiv

RG-SAN: Rule-Guided Spatial Awareness Network for End-to-End 3D Referring Expression Segmentation

Dec 03, 2024
Via arXiv

Mixed Degradation Image Restoration via Local Dynamic Optimization and Conditional Embedding

Nov 25, 2024
Via arXiv

Any-to-3D Generation via Hybrid Diffusion Supervision

Nov 22, 2024
Via arXiv

Video-RAG: Visually-aligned Retrieval-Augmented Long Video Comprehension

Nov 20, 2024
Via arXiv

Synergistic Dual Spatial-aware Generation of Image-to-Text and Text-to-Image

Oct 20, 2024
Via arXiv

γ-MoD: Exploring Mixture-of-Depth Adaptation for Multimodal Large Language Models

Oct 17, 2024
Via arXiv

I2EBench: A Comprehensive Benchmark for Instruction-based Image Editing

Aug 26, 2024
Via arXiv

TraDiffusion: Trajectory-Based Training-Free Image Generation

Aug 19, 2024
Via arXiv

ControlMLLM: Training-Free Visual Prompt Learning for Multimodal Large Language Models

Jul 31, 2024
Via arXiv