Xiaodan Liang

WISA: World Simulator Assistant for Physics-Aware Text-to-Video Generation

Mar 11, 2025

Can Atomic Step Decomposition Enhance the Self-structured Reasoning of Multimodal Large Models?

Mar 08, 2025

Structured Preference Optimization for Vision-Language Long-Horizon Task Planning

Feb 28, 2025

UniGS: Unified Language-Image-3D Pretraining with Gaussian Splatting

Feb 25, 2025

TransMamba: Fast Universal Architecture Adaption from Transformers to Mamba

Feb 21, 2025

ComposeAnyone: Controllable Layout-to-Human Generation with Decoupled Multimodal Conditions

Jan 21, 2025

CatV2TON: Taming Diffusion Transformers for Vision-Based Virtual Try-On with Temporal Concatenation

Jan 20, 2025

DreamFit: Garment-Centric Human Generation via a Lightweight Anything-Dressing Encoder

Dec 23, 2024

Dynamic Try-On: Taming Video Virtual Try-on with Dynamic Attention Mechanism

Dec 13, 2024

RoomTour3D: Geometry-Aware Video-Instruction Tuning for Embodied Navigation

Dec 11, 2024