Xiaoming Wei

Meituan

Active Intelligence in Video Avatars via Closed-loop World Modeling

Dec 23, 2025

UFVideo: Towards Unified Fine-Grained Video Cooperative Understanding with Large Language Models

Dec 12, 2025

LongCat-Image Technical Report

Dec 08, 2025

InfiniteTalk: Audio-driven Video Generation for Sparse-Frame Video Dubbing

Aug 19, 2025

DAM-VSR: Disentanglement of Appearance and Motion for Video Super-Resolution

Jul 01, 2025

PosterCraft: Rethinking High-Quality Aesthetic Poster Generation in a Unified Framework

Jun 12, 2025

LLIA -- Enabling Low-Latency Interactive Avatars: Real-Time Audio-Driven Portrait Video Generation with Diffusion Models

Jun 06, 2025

Let Them Talk: Audio-Driven Multi-Person Conversational Video Generation

May 28, 2025

LayoutCoT: Unleashing the Deep Reasoning Potential of Large Language Models for Layout Generation

Apr 15, 2025

Omni-Dish: Photorealistic and Faithful Image Generation and Editing for Arbitrary Chinese Dishes

Apr 14, 2025