Xiaoming Wei

Meituan

High-Resolution Image Synthesis via Next-Token Prediction

Nov 22, 2024

Faster Multi-GPU Training with PPLL: A Pipeline Parallelism Framework Leveraging Local Learning

Nov 19, 2024

Denoising with a Joint-Embedding Predictive Architecture

Oct 02, 2024

FreeEdit: Mask-free Reference-based Image Editing with Multi-modal Instruction

Sep 26, 2024

Dynamic Prompting of Frozen Text-to-Image Diffusion Models for Panoptic Narrative Grounding

Sep 12, 2024

Unleashing the Temporal-Spatial Reasoning Capacity of GPT for Training-Free Audio and Language Referenced Video Object Segmentation

Aug 28, 2024

Kangaroo: A Powerful Video-Language Model Supporting Long-context Video Input

Aug 28, 2024

Deformable 3D Shape Diffusion Model

Jul 31, 2024

Fine-gained Zero-shot Video Sampling

Jul 31, 2024

BEM: Balanced and Entropy-based Mix for Long-Tailed Semi-Supervised Learning

Apr 01, 2024