Xin Tao

Koala-36M: A Large-scale Video Dataset Improving Consistency between Fine-grained Conditions and Video Content

Oct 10, 2024
Via arXiv

SEA: Supervised Embedding Alignment for Token-Level Visual-Textual Integration in MLLMs

Aug 21, 2024
Via arXiv

VideoTetris: Towards Compositional Text-to-Video Generation

Jun 06, 2024
Via arXiv

SG-Adapter: Enhancing Text-to-Image Generation with Scene Graph Guidance

May 24, 2024
Via arXiv

UNIAA: A Unified Multi-modal Image Aesthetic Assessment Baseline and Benchmark

Apr 15, 2024
Via arXiv

Perception-Oriented Video Frame Interpolation via Asymmetric Blending

Apr 10, 2024
Via arXiv

Motion Inversion for Video Customization

Mar 29, 2024
Via arXiv

DVIS++: Improved Decoupled Framework for Universal Video Segmentation

Dec 20, 2023
Via arXiv

Stable Segment Anything Model

Dec 05, 2023
Via arXiv

1st Place Solution for the 5th LSVOS Challenge: Video Instance Segmentation

Aug 28, 2023
Via arXiv