Kai Qiu

MageBench: Bridging Large Multimodal Models to Agents

Dec 05, 2024

XQ-GAN: An Open-source Image Tokenization Framework for Autoregressive Generation

Dec 02, 2024

REDUCIO! Generating 1024$\times$1024 Video within 16 Seconds using Extremely Compressed Motion Latents

Nov 20, 2024

ImageFolder: Autoregressive Image Generation with Folded Tokens

Oct 02, 2024

Efficient Autoregressive Audio Modeling via Next-Scale Prediction

Aug 16, 2024

ControlVAR: Exploring Controllable Visual Autoregressive Modeling

Jun 14, 2024

$\text{R}^2$-Bench: Benchmarking the Robustness of Referring Perception Models under Perturbations

Mar 07, 2024

Exploring Transferability for Randomized Smoothing

Dec 14, 2023

MicroCinema: A Divide-and-Conquer Approach for Text-to-Video Generation

Nov 30, 2023

ART$\boldsymbol{\cdot}$V: Auto-Regressive Text-to-Video Generation with Diffusion Models

Nov 30, 2023