Qihang Yu

FlowTok: Flowing Seamlessly Across Text and Image Tokens

Mar 13, 2025
Via arXiv

Beyond Next-Token: Next-X Prediction for Autoregressive Visual Generation

Feb 27, 2025
Via arXiv

Dictionary-based Framework for Interpretable and Consistent Object Parsing

Feb 26, 2025
Via arXiv

COCONut-PanCap: Joint Panoptic Segmentation and Grounded Captions for Fine-Grained Understanding and Generation

Feb 04, 2025
Via arXiv

Democratizing Text-to-Image Masked Generative Models with Compact Text-Aware One-Dimensional Tokens

Jan 13, 2025
Via arXiv

FlowAR: Scale-wise Autoregressive Image Generation Meets Flow Matching

Dec 19, 2024
Via arXiv

Randomized Autoregressive Visual Generation

Nov 01, 2024
Via arXiv

MaskBit: Embedding-free Image Generation via Bit Tokens

Sep 24, 2024
Via arXiv

Alleviating Distortion in Image Generation via Multi-Resolution Diffusion Models

Jun 13, 2024
Via arXiv

An Image is Worth 32 Tokens for Reconstruction and Generation

Jun 11, 2024
Via arXiv