Xi Wang

DeepCircuitX: A Comprehensive Repository-Level Dataset for RTL Code Understanding, Generation, and PPA Analysis

Feb 25, 2025

GOD model: Privacy Preserved AI School for Personal Assistant

Feb 24, 2025

FreeTumor: Large-Scale Generative Tumor Synthesis in Computed Tomography Images for Improving Tumor Recognition

Feb 23, 2025

CosyAudio: Improving Audio Generation with Confidence Scores and Synthetic Captions

Jan 28, 2025

ZSVC: Zero-shot Style Voice Conversion with Disentangled Latent Diffusion Models and Adversarial Training

Jan 08, 2025

AKiRa: Augmentation Kit on Rays for optical video generation

Dec 18, 2024

LineArt: A Knowledge-guided Training-free High-quality Appearance Transfer for Design Drawing with Diffusion Model

Dec 16, 2024

Adaptive Visual Perception for Robotic Construction Process: A Multi-Robot Coordination Framework

Dec 15, 2024

GEM: A Generalizable Ego-Vision Multimodal World Model for Fine-Grained Ego-Motion, Object Dynamics, and Scene Composition Control

Dec 15, 2024

Understanding the World's Museums through Vision-Language Reasoning

Dec 02, 2024