Yuxin Guo

Towards Unified Song Generation and Singing Voice Conversion with Accompaniment Co-Generation

Jun 05, 2026

Separating Intrinsic Ambiguity from Estimation Uncertainty in Deep Generative Models for Linear Inverse Problems

May 14, 2026

Enjoy Your Layer Normalization with the Computational Efficiency of RMSNorm

May 14, 2026

UniSonate: A Unified Model for Speech, Music, and Sound Effect Generation with Text Instructions

Apr 24, 2026

GraphWalker: Graph-Guided In-Context Learning for Clinical Reasoning on Electronic Health Records

Apr 08, 2026

From Single Scan to Sequential Consistency: A New Paradigm for LIDAR Relocalization

Feb 03, 2026

MM-Sonate: Multimodal Controllable Audio-Video Generation with Zero-Shot Voice Cloning

Jan 08, 2026

Klear: Unified Multi-Task Audio-Video Joint Generation

Jan 07, 2026

CVD-STORM: Cross-View Video Diffusion with Spatial-Temporal Reconstruction Model for Autonomous Driving

Oct 09, 2025

AudioStory: Generating Long-Form Narrative Audio with Large Language Models

Aug 27, 2025