Ye Zhu

GASS: Geometry-Aware Spherical Sampling for Disentangled Diversity Enhancement in Text-to-Image Generation

Feb 19, 2026

PEAR: Pixel-aligned Expressive humAn mesh Recovery

Jan 30, 2026

Mass Distribution versus Density Distribution in the Context of Clustering

Jan 14, 2026

MANGO: Natural Multi-speaker 3D Talking Head Generation via 2D-Lifted Enhancement

Jan 05, 2026

LEMAS: A 150K-Hour Large-scale Extensible Multilingual Audio Suite with Generative Speech Models

Jan 04, 2026

D2D: Detector-to-Differentiable Critic for Improved Numeracy in Text-to-Image Generation

Oct 22, 2025

MelCap: A Unified Single-Codebook Neural Codec for High-Fidelity Audio Compression

Oct 02, 2025

CanonSwap: High-Fidelity and Consistent Video Face Swapping via Canonical Space Modulation

Jul 03, 2025

BNMusic: Blending Environmental Noises into Personalized Music

Jun 12, 2025

Dynamic Diffusion Schrödinger Bridge in Astrophysical Observational Inversions

Jun 11, 2025