Kai Qiu

HiTVideo: Hierarchical Tokenizers for Enhancing Text-to-Video Generation with Autoregressive Large Language Models

Mar 14, 2025
Via arXiv

Robust Latent Matters: Boosting Image Generation with Sampling Error

Mar 11, 2025
Via arXiv

Phi-4-Mini Technical Report: Compact yet Powerful Multimodal Language Models via Mixture-of-LoRAs

Mar 03, 2025
Via arXiv

Logic-RL: Unleashing LLM Reasoning with Rule-Based Reinforcement Learning

Feb 20, 2025
Via arXiv

Three-dimensional attention Transformer for state evaluation in real-time strategy games

Jan 07, 2025
Via arXiv

MageBench: Bridging Large Multimodal Models to Agents

Dec 05, 2024
Via arXiv

XQ-GAN: An Open-source Image Tokenization Framework for Autoregressive Generation

Dec 02, 2024
Via arXiv

REDUCIO! Generating 1024$\times$1024 Video within 16 Seconds using Extremely Compressed Motion Latents

Nov 20, 2024
Via arXiv

ImageFolder: Autoregressive Image Generation with Folded Tokens

Oct 02, 2024
Via arXiv

Efficient Autoregressive Audio Modeling via Next-Scale Prediction

Aug 16, 2024
Via arXiv