Jiayi Ji

Zooming In on Fakes: A Novel Dataset for Localized AI-Generated Image Detection with Forgery Amplification Approach

Apr 16, 2025

An Efficient and Mixed Heterogeneous Model for Image Restoration

Apr 15, 2025

JavisDiT: Joint Audio-Video Diffusion Transformer with Hierarchical Spatio-Temporal Prior Synchronization

Mar 30, 2025

MLLM-Selector: Necessity and Diversity-driven High-Value Data Selection for Enhanced Visual Instruction Tuning

Mar 26, 2025

QuoTA: Query-oriented Token Assignment via CoT Query Decouple for Long Video Comprehension

Mar 11, 2025

IPDN: Image-enhanced Prompt Decoding Network for 3D Referring Expression Segmentation

Jan 09, 2025

RG-SAN: Rule-Guided Spatial Awareness Network for End-to-End 3D Referring Expression Segmentation

Dec 03, 2024

Mixed Degradation Image Restoration via Local Dynamic Optimization and Conditional Embedding

Nov 25, 2024

Any-to-3D Generation via Hybrid Diffusion Supervision

Nov 22, 2024

Video-RAG: Visually-aligned Retrieval-Augmented Long Video Comprehension

Nov 20, 2024