Jinbin Bai

Muddit: Liberating Generation Beyond Text-to-Image with a Unified Discrete Diffusion Model

May 29, 2025
Via arXiv

Conditional Panoramic Image Generation via Masked Autoregressive Modeling

May 22, 2025
Via arXiv

An Empirical Study of GPT-4o Image Generation Capabilities

Apr 08, 2025
Via arXiv

Decouple and Track: Benchmarking and Improving Video Diffusion Transformers for Motion Transfer

Mar 21, 2025
Via arXiv

Evaluating Image Caption via Cycle-consistent Text-to-Image Generation

Jan 08, 2025
Via arXiv

HumanEdit: A High-Quality Human-Rewarded Dataset for Instruction-based Image Editing

Dec 05, 2024
Via arXiv

RelationBooth: Towards Relation-Aware Customized Object Generation

Oct 30, 2024
Via arXiv

MagicTailor: Component-Controllable Personalization in Text-to-Image Diffusion Models

Oct 17, 2024
Via arXiv

Meissonic: Revitalizing Masked Generative Transformers for Efficient High-Resolution Text-to-Image Synthesis

Oct 10, 2024
Via arXiv

An Item is Worth a Prompt: Versatile Image Editing with Disentangled Control

Mar 07, 2024
Via arXiv