Xi Wang

ZSVC: Zero-shot Style Voice Conversion with Disentangled Latent Diffusion Models and Adversarial Training

Jan 08, 2025
Via arXiv

AKiRa: Augmentation Kit on Rays for optical video generation

Dec 18, 2024
Via arXiv

LineArt: A Knowledge-guided Training-free High-quality Appearance Transfer for Design Drawing with Diffusion Model

Dec 16, 2024
Via arXiv

GEM: A Generalizable Ego-Vision Multimodal World Model for Fine-Grained Ego-Motion, Object Dynamics, and Scene Composition Control

Dec 15, 2024
Via arXiv

Adaptive Visual Perception for Robotic Construction Process: A Multi-Robot Coordination Framework

Dec 15, 2024
Via arXiv

Holistic Understanding of 3D Scenes as Universal Scene Description

Dec 02, 2024
Via arXiv

Understanding the World's Museums through Vision-Language Reasoning

Dec 02, 2024
Via arXiv

InTraGen: Trajectory-controlled Video Generation for Object Interactions

Nov 25, 2024
Via arXiv

Look a Group at Once: Multi-Slide Modeling for Survival Prediction

Nov 18, 2024
Via arXiv

Responsible AI in Construction Safety: Systematic Evaluation of Large Language Models and Prompt Engineering

Nov 13, 2024
Via arXiv