Xiao Dong

WonderHuman: Hallucinating Unseen Parts in Dynamic 3D Human Reconstruction

Feb 03, 2025
Via arXiv

ComposeAnyone: Controllable Layout-to-Human Generation with Decoupled Multimodal Conditions

Jan 21, 2025
Via arXiv

CatV2TON: Taming Diffusion Transformers for Vision-Based Virtual Try-On with Temporal Concatenation

Jan 20, 2025
Via arXiv

RMAvatar: Photorealistic Human Avatar Reconstruction from Monocular Video Based on Rectified Mesh-embedded Gaussians

Jan 13, 2025
Via arXiv

LiTformer: Efficient Modeling and Analysis of High-Speed Link Transmitters Using Non-Autoregressive Transformer

Nov 18, 2024
Via arXiv

A Survey of Foundation Models for Music Understanding

Sep 15, 2024
Via arXiv

CatVTON: Concatenation Is All You Need for Virtual Try-On with Diffusion Models

Jul 21, 2024
Via arXiv

OV-DINO: Unified Open-Vocabulary Detection with Language-Aware Selective Fusion

Jul 10, 2024
Via arXiv

SurgicalGaussian: Deformable 3D Gaussians for High-Fidelity Surgical Scene Reconstruction

Jul 06, 2024
Via arXiv

ConsistentID: Portrait Generation with Multimodal Fine-Grained Identity Preserving

Apr 25, 2024
Via arXiv