Picture for Xiuye Gu

Xiuye Gu

Language-Guided Image Tokenization for Generation

Add code
Dec 08, 2024
Viaarxiv icon

VideoPoet: A Large Language Model for Zero-Shot Video Generation

Add code
Dec 21, 2023
Figure 1 for VideoPoet: A Large Language Model for Zero-Shot Video Generation
Figure 2 for VideoPoet: A Large Language Model for Zero-Shot Video Generation
Figure 3 for VideoPoet: A Large Language Model for Zero-Shot Video Generation
Figure 4 for VideoPoet: A Large Language Model for Zero-Shot Video Generation
Viaarxiv icon

CLIP as RNN: Segment Countless Visual Concepts without Training Endeavor

Add code
Dec 21, 2023
Viaarxiv icon

Pixel Aligned Language Models

Add code
Dec 14, 2023
Viaarxiv icon

Photorealistic Video Generation with Diffusion Models

Add code
Dec 11, 2023
Viaarxiv icon

PolyMaX: General Dense Prediction with Mask Transformer

Add code
Nov 09, 2023
Figure 1 for PolyMaX: General Dense Prediction with Mask Transformer
Figure 2 for PolyMaX: General Dense Prediction with Mask Transformer
Figure 3 for PolyMaX: General Dense Prediction with Mask Transformer
Figure 4 for PolyMaX: General Dense Prediction with Mask Transformer
Viaarxiv icon

Language Model Beats Diffusion -- Tokenizer is Key to Visual Generation

Add code
Oct 09, 2023
Viaarxiv icon

DaTaSeg: Taming a Universal Multi-Dataset Multi-Task Segmentation Model

Add code
Jun 02, 2023
Figure 1 for DaTaSeg: Taming a Universal Multi-Dataset Multi-Task Segmentation Model
Figure 2 for DaTaSeg: Taming a Universal Multi-Dataset Multi-Task Segmentation Model
Figure 3 for DaTaSeg: Taming a Universal Multi-Dataset Multi-Task Segmentation Model
Figure 4 for DaTaSeg: Taming a Universal Multi-Dataset Multi-Task Segmentation Model
Viaarxiv icon

A Simple Zero-shot Prompt Weighting Technique to Improve Prompt Ensembling in Text-Image Models

Add code
Feb 13, 2023
Viaarxiv icon

Open-Vocabulary Temporal Action Detection with Off-the-Shelf Image-Text Features

Add code
Dec 20, 2022
Viaarxiv icon