Zongxin Yang

The Devil is in Temporal Token: High Quality Video Reasoning Segmentation

Jan 15, 2025
Via arXiv

3DIS-FLUX: simple and efficient multi-instance generation with DiT rendering

Jan 09, 2025
Via arXiv

Noise-Tolerant Hybrid Prototypical Learning with Noisy Web Data

Jan 05, 2025
Via arXiv

Generalizable Origin Identification for Text-Guided Image-to-Image Diffusion Models

Jan 04, 2025
Via arXiv

Collaborative Hybrid Propagator for Temporal Misalignment in Audio-Visual Segmentation

Dec 11, 2024
Via arXiv

3DIS: Depth-Driven Decoupled Instance Synthesis for Text-to-Image Generation

Oct 16, 2024
Via arXiv

MIGC++: Advanced Multi-Instance Generation Controller for Image Synthesis

Jul 02, 2024
Via arXiv

MIGC: Multi-Instance Generation Controller for Text-to-Image Synthesis

Feb 08, 2024
Via arXiv

Explore Synergistic Interaction Across Frames for Interactive Video Object Segmentation

Feb 04, 2024
Via arXiv

DoraemonGPT: Toward Understanding Dynamic Scenes with Large Language Models

Jan 16, 2024
Via arXiv